NIRSPEC Reliability Improvement 
Progress report: 12 April 2004

Overview:

Steady progress continues to be made toward a release version of the new keyword and rotator server codes. Development and testing of the new codes account for much of our effort over the past month. The target date for release has slipped to the end of April.

Evidence continues to accumulate that power supplies and/or power quality may cause some server crashes. Pursuing this possibility is our second major area of effort.

Time lost on sky continues to decline despite no significant reduction in the rate of server crashes.

Speeding recovery from server crashes:

We are now confident that the crash recovery procedures developed by this project are reducing time lost on sky. During this reporting period, frozen GUIs or genuine server crashes prompted the recovery script to be run on seven occasions. For the three of these that were genuine crashes, the run-and-recover time averaged less than ten minutes. On the other four occasions, the script correctly diagnosed the situation and restarted the problematic GUIs, and no time was lost.

Prevention of server crashes: reduction of communications traffic

Parts of four engineering half-nights have now been used to test the new keyword and rotator server codes. These codes should prevent the server crashes that result when the transputers are swamped by high rates of motor commands. Our most recent chance to test on sky was two hours on the night of March 31. Although we got past our previous failure point, the testing revealed two problems with the rotator code. The next unclaimed engineering time is a half-night on May 30. Pending the results of day testing starting April 12, we may approach an observer who has time at the end of April and offer to trade it for time on May 30.

Prevention of server crashes: power quality?

Evidence continues to accumulate that some fraction of server crashes may result from problems with power quality. The correlation between power interruptions and server crashes continued with the most recent brownouts on April 2.

Issues and Concerns: