Troubleshooting Swap Space Problems
The NIRSPEC software is a conspicuous and continuous consumer of
memory on the host computer waimea. This results in a
steadily shrinking amount of available disk swap space. If the swap
space runs to zero during observing, the system will crash and a reboot of waimea will almost
certainly be necessary.
The amount of swap space available on waimea has recently
(as of 18 December 2000) been increased from about 300 MB to about
500 MB, and then to about 1.5 GB (yes, that's GIGAbytes). It is
devoutly to be hoped that swap space problems will not come up during
normal observing after these increases, but if they do, please
be sure to have the OA report them in the nightlog.
There is a cron job that checks the amount of swap space every 10
minutes. If it detects that the available space has fallen below 50
MB, a recorded announcement will be played in Remote Ops 2: "Warning!
Waimea's tmp directory has fallen below 50 megabytes. You will run
out of swap space soon!" If you hear this, take it seriously!
On any terminal window with a waimea> prompt, enter the command:
waimea> swap -s
You should see something like the following output:
total: 237832k bytes allocated + 37280k reserved = 275112k used, 1310608k available
The last number, the number of kilobytes of swap available, is the critical number. If this drops below 50000k you will start to hear the recorded warnings.
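The same check can be scripted. What follows is a minimal Python sketch, not the actual cron job running on waimea: it assumes the `swap -s` output has exactly the form shown above, and the names `available_kb`, `is_low` and `WARN_THRESHOLD_KB` are hypothetical.

```python
import re

# Threshold taken from the text: warnings start below 50000k available.
WARN_THRESHOLD_KB = 50000


def available_kb(swap_line):
    """Return the 'available' figure, in kilobytes, from one line of swap -s output."""
    match = re.search(r"(\d+)k\s+available", swap_line)
    if match is None:
        raise ValueError("unrecognized swap -s output: %r" % swap_line)
    return int(match.group(1))


def is_low(swap_line, threshold_kb=WARN_THRESHOLD_KB):
    """True when available swap has dropped below the warning threshold."""
    return available_kb(swap_line) < threshold_kb


# The sample line quoted above.
sample = ("total: 237832k bytes allocated + 37280k reserved = 275112k used, "
          "1310608k available")
print(available_kb(sample), is_low(sample))
```

With the sample line, the available figure is well above the threshold, so no warning would be due.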
Available swap space is logged 7 times per hour (every 10 minutes plus an extra entry 48 min after the hour) by a cron job running on waimea. The log file is:
Check the last 12 hours' worth of entries with the command:
issued in a window on waimea. By the way, if you are logged into a waimea window as user "nirspec", you can go straight to the log directory with the alias cdlog.
A line of the logfile looks like this:
and of course the last number is the one to watch.
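As a rough guide to how many log lines 12 hours of logging produces, the schedule described above can be worked out in a short Python sketch. The counts come straight from the text (one entry every 10 minutes plus one extra per hour). `last_number` is a hypothetical helper that relies only on the statement that the last number on a line is the one to watch; it makes no other assumption about the log format.

```python
import re

# From the text: one entry every 10 minutes, plus one extra entry each hour.
REGULAR_PER_HOUR = 60 // 10
EXTRA_PER_HOUR = 1
ENTRIES_PER_HOUR = REGULAR_PER_HOUR + EXTRA_PER_HOUR  # 7, matching the text


def entries_in(hours):
    """Number of log entries written over the given number of hours."""
    return hours * ENTRIES_PER_HOUR


def last_number(log_line):
    """Return the last run of digits on a log line as an integer."""
    numbers = re.findall(r"\d+", log_line)
    if not numbers:
        raise ValueError("no number found in line: %r" % log_line)
    return int(numbers[-1])


print(entries_in(12))
```

So the last 12 hours correspond to the final 84 lines of the log file.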
When you hear the recorded warning, you probably have about 1 hour of operation left before the world comes to an abrupt end. The warning threshold is set to give observers time to finish their current observing sequence and take any necessary calibration frames before shutting down, but don't push your luck. A graceful shutdown and restart may take 5 to 10 minutes, whereas recovering from a crash will take 30 minutes or more. Shutting down and restarting gracefully frees up enough memory that you should be able to finish the night without further incidents.
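To judge how much time is actually left, one can extrapolate from two readings of available swap. The Python sketch below is a simple linear estimate, assuming swap is consumed at a steady rate; the text describes continuous consumption but gives no rate. The function name and the sample readings are made up for illustration.

```python
def minutes_until_empty(earlier_kb, later_kb, interval_min):
    """Linear estimate of the minutes left until available swap reaches zero.

    earlier_kb and later_kb are two readings of available swap (in kilobytes)
    taken interval_min minutes apart. Returns None if swap is not shrinking.
    """
    consumed_kb = earlier_kb - later_kb
    if consumed_kb <= 0:
        return None
    rate_kb_per_min = consumed_kb / interval_min
    return later_kb / rate_kb_per_min


# Hypothetical readings taken 10 minutes apart.
print(minutes_until_empty(58000, 50000, 10))
```

Treat the result as a rough guide only; starting or stopping IDL tools will change the rate.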
The NIRSPEC software consumes swap space continuously while it is running, even if no data are being taken, no motors being moved, etc. Also, all the IDL processes running for the QuickLook Displays, Rotator GUI, EFS GUI, Slitmove widget and other tools consume a lot of swap. There are two very important things every observer should do to prevent having to shut down during the night: