Summary

Computers & Software

XLRIS/FIGDISP response is slow

Symptom
LRIS widgets such as XLRIS and FIGDISP are slow to respond to keyboard and mouse input on hanauma, but the response time from local windows on hanauma is normal.
Problem
Either punaluu is overloaded or the network connection to the summit is overloaded.
Solution
  1. If you are running a Netscape window on punaluu, then kill it.
  2. Check the CPU load on punaluu by typing the following command in a punaluu window:
      ps -aux | more
  3. Check listings under the %CPU column to identify any jobs consuming large fractions of CPU time. Contact computer support to halt such processes.
  4. If no high-CPU processes are found, then another user may be transferring large amounts of data to or from the summit, or the network may be slow, or perhaps punaluu needs rebooting. Contact computer support for assistance.
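
The %CPU check in step 3 can be partly automated. The following is a minimal Python sketch, not part of the LRIS software, that scans text captured from ps -aux and lists processes above a chosen %CPU threshold. It assumes a header row containing PID and %CPU columns and no embedded spaces in any column before COMMAND; the function name high_cpu, the 50% threshold, and the sample text are illustrative only.

```python
def high_cpu(ps_text, threshold=50.0):
    """Return (pid, cpu, command) for each process whose %CPU exceeds threshold."""
    lines = ps_text.strip().splitlines()
    header = lines[0].split()
    cpu_col = header.index("%CPU")
    pid_col = header.index("PID")
    ncols = len(header)
    hits = []
    for line in lines[1:]:
        # The last column (COMMAND) may contain spaces, so limit the split.
        fields = line.split(None, ncols - 1)
        if len(fields) < ncols:
            continue
        try:
            cpu = float(fields[cpu_col])
            pid = int(fields[pid_col])
        except ValueError:
            continue  # skip rows that do not match the assumed layout
        if cpu > threshold:
            hits.append((pid, cpu, fields[-1].strip()))
    return hits


# Made-up sample in BSD-style layout, for illustration only.
SAMPLE = """\
USER       PID %CPU %MEM   SZ  RSS TT       S    START  TIME COMMAND
lris      1234 92.5  3.1 9000 4000 pts/1    R 10:00:00  5:12 netscape
kics       154  0.3  0.5 1200  800 ?        S 09:00:00  0:01 infoman l_info v=lris
"""

print(high_cpu(SAMPLE))
```

Any process reported this way should still be referred to computer support rather than killed directly, as step 3 instructs.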

Can't reconfigure gratings, slitmasks, filters, etc.

Symptom
Attempts to use the configure command to change the grating, slitmask or filter list give error messages.
Problem
The grating/slitmask/filter list is corrupted in memory and on disk.
Solution
  1. Log into punaluu as user kics.
  2. Go to directory /usr/common/music/info.
  3. Inspect appropriate file (graname.str, redfilt.str, or slitname.str) for obvious errors.
  4. Create a backup copy of the offending file; e.g.,
    cp slitname.str backups/slitname.str.11oct1999
  5. Edit the file (emacs, vi, etc.) to remove the offending entry.
  6. Determine the process number (PID) of infoman using ct.
  7. Use the following command to halt and restart infoman, replacing 154 with the PID found in step 6:
    kill -9 154 ; /usr/local/music/bin/infoman l_info v=lris &
  8. Re-try the configure command.
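
The dated backup name in step 4 follows a day-month-year pattern (slitname.str.11oct1999). As a small sketch of that naming convention in Python, the hypothetical helper backup_path below builds such a name; the zero-padding of single-digit days is an assumption, since the source shows only one example.

```python
from datetime import date

# Lowercase month abbreviations, spelled out to avoid locale-dependent output.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def backup_path(filename, day):
    """Build a dated backup path such as backups/slitname.str.11oct1999."""
    stamp = "%02d%s%04d" % (day.day, MONTHS[day.month - 1], day.year)
    return "backups/%s.%s" % (filename, stamp)


print(backup_path("slitname.str", date(1999, 10, 11)))
```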

XLRIS dies immediately after startup

Symptom
When XLRIS is started, the window appears and then vanishes within a few seconds without any warning messages.
Problem
The grating/slitmask/filter list is empty or corrupted.
Solution
  1. Type the following command to check the configuration lists:
    consort
    This will print out sorted lists of the configurations for all configurable stages. Check for a list which has either
    • no entries
    • invalid entries
    • duplicate entries (two positions with the same string)
  2. If one (or more) of these lists is empty or corrupted, fix it as follows:
    • delete bad entries using the syntax
      configure -d keyword alias
    • create new entries using the syntax
      configure keyword alias=N
      where N is the stage number to which to assign the string alias
  3. Restart XLRIS.
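
The three conditions listed in step 1 can be expressed as a short check. This Python sketch is illustrative only: the format of the consort output is not shown here, so the list is represented as a hypothetical dictionary mapping position number to alias, and "invalid" is assumed to mean a position outside the stage's range or a blank alias.

```python
from collections import Counter


def list_problems(entries, valid_positions):
    """Check one stage's configuration list.

    entries: dict of position number -> alias string (hypothetical layout).
    valid_positions: iterable of position numbers that exist on the stage.
    Returns a list of human-readable problem descriptions.
    """
    if not entries:
        return ["no entries"]
    problems = []
    valid = set(valid_positions)
    for pos, alias in sorted(entries.items()):
        # ASSUMPTION: "invalid" means out-of-range position or blank alias.
        if pos not in valid or not alias.strip():
            problems.append("invalid entry at position %s" % pos)
    counts = Counter(a for a in entries.values() if a.strip())
    for alias, n in sorted(counts.items()):
        if n > 1:
            problems.append("duplicate alias %r" % alias)
    return problems


# Example with made-up aliases: two positions share the same string.
print(list_problems({1: "alias_a", 2: "alias_a"}, range(1, 6)))
```

Each problem reported would then be fixed with the configure -d and configure commands described in step 2.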

Blue LICKSERV dies immediately after startup

Symptom
The blue lickserv process dies. Attempting to restart it appears to work briefly, but it dies within a minute of being restarted.
Problem
The observer attempted to window the blue CCD, which is not permitted, and the blue crate is in a bad state.
Solution
  1. Reboot the blue CCD crate: connect to the crate from the desktop menu using Engineering > CCD Crates... > TIP line to BLUE CCD crate, then type the command reboot.
  2. Restart the blue-side lickserv process after the reboot has completed, using the pulldown menu item LRIS Control Menu > Subcomponents... > Re-start LICKSERVER... > Re-start Lickserver (BLUE).
It is not necessary to cycle power on the blue saddle bag.

FIGDISP accelerator keys don't work

Symptom
The FIGDISP shortcut keys used to make row plots, column plots, FWHM measurements, etc., do not work.
Problem
The new version of FIGDISP uses different shortcut keys, so some of the old ones no longer work.
Solution
See the new keybindings list on the FIGDISP help page and learn to use the new shortcut keys.

FIGDISP will not display image

Symptom
When the first exposure completes, FIGDISP does not display the image.
Problem
The lickserv process which passes the information to FIGDISP is either dead or is owned by another observer and cannot communicate with FIGDISP.
Solution
In a punaluu window, type ctx to check the status of the key LRIS processes. Normal output looks like this:
Process    Expected   Got        Status    
-------    --------   ---        ------    
archmon    1          1          OK
watch_ccd  1          1          OK
infoman    1          1          OK
kfigdisp   1          1          OK
xlris      1          1          OK
lickserv   2          2          OK
xpose      1          1          OK
traffic    1          1          OK
If the message
ERROR: wrong owner
appears in the status list, then follow the steps below. If not, then this is not the problem.
  1. Log in to punaluu as lris.
  2. Issue the command kill_lris. This will attempt to seek out and destroy any instances of Xpose, XLRIS, FIGDISP, lickserv, and watch_ccd which may be running under any username.
  3. Run ctx again and verify that only infoman, traffic, and cserv are running.
  4. Re-start LRIS software using the pulldown menu and proceed with observing.
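
The ctx status table can also be checked programmatically. The Python sketch below parses text in the layout shown above and reports any row where the Expected and Got counts differ or the Status is not OK. It assumes that a message such as ERROR: wrong owner appears in the Status column of the affected row, which the example output above does not confirm; the function name ctx_problems is hypothetical.

```python
def ctx_problems(ctx_text):
    """Return (name, expected, got, status) for every row that is not healthy."""
    bad = []
    for line in ctx_text.splitlines():
        fields = line.split()
        # Data rows look like: <process> <expected> <got> <status...>
        if len(fields) < 4 or not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header, separator, or unrelated line
        name, expected, got = fields[0], int(fields[1]), int(fields[2])
        status = " ".join(fields[3:])
        if expected != got or status != "OK":
            bad.append((name, expected, got, status))
    return bad


# The normal output quoted above.
NORMAL = """\
Process    Expected   Got        Status
-------    --------   ---        ------
archmon    1          1          OK
watch_ccd  1          1          OK
infoman    1          1          OK
kfigdisp   1          1          OK
xlris      1          1          OK
lickserv   2          2          OK
xpose      1          1          OK
traffic    1          1          OK
"""

print(ctx_problems(NORMAL))  # an empty list means every process is as expected
```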

FIGDISP IMPORT button does not work

Symptom
When using the IMPORT button in FIGDISP to read in an image, a new FIGDISP appears and hangs the system.
Problem
The IMPORT feature is currently broken on the blue FIGDISP tool.
Solution
  1. Quit and restart the affected FIGDISP tool
  2. Use the dlpr command to display an image on the red FIGDISP window
  3. Use the dlpb command to display an image on the blue FIGDISP window

SKY can't find catalogs

Symptom
When the SKY program is started up, it complains that it can't open the catalog server.
Problem
The catalog server process running on the host skyserver has died.
Solution
Restart the catalog server process:

Can't show Keywords

Symptom
Attempts to show keywords elicit the following output:
	     startproc: connect failed
             startproc: connect failed
             startproc: connect failed
             startproc: connect failed
             startproc: giving up on connect; exiting ...
             Sorry, the show command was not able to contact the control system: Can't connect to traffic:
         

After starting a crate session with start_crate [red | blue], the following output is printed to the screen:

             Got network configuration
             Broadcast message sent on ln0
             Broadcast not sent on lo0
             tropen: our host = ted.
             read_remote:  pathname=/local/kroot/data/music/services
             read_remote:  service file open
             tropen:  remote host = punaluup
             startproc: parameter hostname = punaluup
             startproc: gethostname returned ted
             read_remote:  pathname=/local/kroot/data/music/services
             read_remote:  service file open
             startproc: connect failed
             read_remote:  pathname=/local/kroot/data/music/services
             read_remote:  service file open
             startproc: connect failed
             read_remote:  pathname=/local/kroot/data/music/services
             read_remote:  service file open
             startproc: connect failed
             read_remote:  pathname=/local/kroot/data/music/services
             read_remote:  service file open
             startproc: connect failed
             read_remote:  pathname=/local/kroot/data/music/services
             read_remote:  service file open
             startproc: connect failed
             startproc: giving up on connect; exiting ...
             "cserv.c", line 106: cserv, da8398, "cserv ()",
             Error, #41: Can't make connection to traffic controller. Unable to make connection to traffic controller, retrying...
             Timeout waiting for ISERV, retrying
         
Problem
The low_level_software is not running or has died.
Solution
Restart the low level software

Red & Blue Mechanism Moves

Grating/turret move fails

Symptom
When changing gratings or adjusting grating tilt, the move fails soon after being initiated.
Problem
The grating brake drive has failed to reach the unclamp limit.
Solution
The solution is to restart the move by selecting 'set' in either the grating selection subwindow, or the wavelength selection subwindow. This will normally drive the brake unclamp to limit, prior to completing the rest of the compound move sequence.

Grating move fails

Symptom
When attempting to change gratings, the move fails. Running check_turret on the motor crate reveals that the grating is "not at home."
Problem
The homing operation fails because the puka (hole) in the rotor is plugged and the optical homing switch cannot detect light through it.
Solution
The puka must be cleaned by technical staff. If this does not fix the problem, then see below.

Grating move fails

Symptom
When attempting to change gratings, the move fails. Running check_turret on the motor crate reveals that the grating is "not at home."
Problem
The homing operation fails due to some failure of the electronics.
Solution
Follow these instructions to disable the problematic grating port from operation.

Grating move fails:

Symptom
When attempting to change gratings, the move fails. In the Motor Log tail the following output is seen:
      
           Jul 10 11:31:43 [5981] lrs_mcs.c,345,mcs_com:  
            Error message from API unit received, stage grating_tilt         
           Jul 10 11:31:43 [5981] lrs_mcs.c,346,mcs_com:  
                Error message:  0 "#23 HOME NOT FOUND"
          
Problem
The homing operation fails due to some failure of the hardware. The likely hardware failure is that the home switch has failed. The software will not permit you to move a grating if the home switches for the grating tilt are not set.
Solution
An instrument technician will have to examine the home switches. In the meantime, avoid the troublesome port. If the grating is needed for observing, it will need to be swapped with a second grating in a port that is operating. Follow these instructions for manually moving the grating turret away from the failing port.
  1. Have an instrument technician confirm that the grating is tilted to a safe location. An instrument technician has access to the grating port ...
    • Safe (homed) grating position: INSERT PICTURE
    • Unsafe (tilted) grating position: INSERT PICTURE
  2. If the grating is unsafe, change the tilt at the API level.
    • telnet tsred 3016 
    • :0 
    • stat 
      - and check the current position
    • mov  1000  
      - move 1000 steps (you may need to move more or less than this value to reach an acceptable tilt).
  3. Once the grating is tilted to a safe position, in a punaluu window release the grating detent by:
     m gdetent=0 
  4. At the grating service port, an instrument technician will be able to manually rotate the turret away from the troublesome port.
  5. Swap gratings if necessary.
  6. Update SIAS and the instrument software with any grating changes.

Mechanism move fails

Symptom
When attempting to change gratings, slitmasks, or filters (from either GUI or keyword level), the computer responds as if the move completed properly, but the instrument is unchanged.
Problem
The software is confused. It believes that the stage you selected is already in place.
Solution
  1. Issue the following command from a punaluu command line to force the software to re-read the instrument settings:
    m init=3
    Try the move again.
  2. If it still fails, try selecting different filters, gratings, or slitmasks until one works. Then return to the slitmask, grating, or filter you wanted.

Mechanism move fails (mread select timeout)

Symptom
When attempting to change a mechanism (from either GUI or keyword level), the move fails immediately. The following message appears in either the motor log window or the window from which the keyword command was issued:
move failed:
	mread select timeout
Problem
The motor control software is hung.
Solution
  1. Issue the following command from a punaluu window to restart the motor control software and all other low-level software:
    restart_low_level_software

Mechanism move fails (drive fault)

Symptom
A requested mechanism move fails with the following message printed in the motorlog:
DRIVE FAULT
Problem
One of the API motor controllers is confused and needs to be reset.
Solution
  1. Issue the following command from a punaluu window to cycle power to the entire instrument:
    allpower cycle
  2. When it comes back up, issue the following command to turn the guiders back on:
    guiderpower on
  3. Use the tdl command to check communication with the crates:
    tdl red
    tdl blue
    See below for information on dealing with crate problems.

Mirror gets set to GRANGLE=0

Symptom
When completing a mirror move using either XLRIS or a script such as R-band, the mirror gets set to grating angle 0 rather than the nominal angle for imaging.
Problem
Either the mirror move is failing due to a bad limit switch, or the MIRGRANG keyword has been set to zero.
Solution

Grating tilt stops prematurely

Symptom
Although the mirror (grating position 1 in optical port) can be commanded to any GRANGLE, other gratings cannot be tilted more than about 5 degrees.
Problem
The turret switch is not reporting the proper grating position; hence, the electronics are always trying to drive the mirror stage (corresponding to turret switch output 0000) regardless of which stage is actually in the service or optical port. The likely problem is no signal from the turret switch, which multiplexes the signals to the proper grating station.

To verify the diagnosis, check the bits on the API unit as follows:

Bits 3-6 of word I1 are supposed to encode the position of the grating turret. If these bits are all set to 1 and do not change when gratings are changed, then no signal from the switches is the likely problem.
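
The comparison described above can be sketched in Python. The example takes I1 words as 8-character strings of 0s and 1s, recorded by hand before and after changing gratings, and reports whether bits 3-6 are stuck at 1. The source does not say how the bits are numbered; this sketch ASSUMES bit 1 is the leftmost character, which should be confirmed against the API documentation before use. The function names are hypothetical.

```python
def turret_bits(i1_word, first=3, last=6):
    """Return bits first..last of an I1 word given as a string like '11110111'.

    ASSUMPTION: bit 1 is the leftmost character of the string.
    """
    return i1_word[first - 1:last]


def switch_signal_missing(readings):
    """True if bits 3-6 read 1111 in every recorded I1 word."""
    return bool(readings) and all(turret_bits(r) == "1111" for r in readings)


# Two readings taken with different gratings selected (made-up values).
print(switch_signal_missing(["11111111", "11111111"]))  # bits never change
print(switch_signal_missing(["11111111", "11011111"]))  # bits do change
```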
Solution
Check for a broken wire connecting the turret switch to the electronics.

Trapdoor won't close

Symptom
Attempts to close the trapdoor from within XLRIS fail.
Problem
Due to an overflexible design, the trapdoor often fails to reach its closed limit in certain orientations; for optical purposes, however, it is essentially closed.
Solution
Try again to close the trapdoor using XLRIS. If it repeatedly fails, then assume that the door is closed and proceed.

Blue mechanism move fails due to latch out of position

Symptom
When trying to move a blue filter, dichroic, or grism, the move fails with the following error:
Error reading crate:  Carousel cell retainer latch is in an intermediate position.  Check air pressure!
See log files for further details.
Problem
The cell retainer latch for the appropriate stage has moved from the latched position to an undefined position in the middle of the move. The carousel cannot be rotated until the latch is latched, but the latch cannot be moved until the carousel is in a valid position.
Solution

Red focus or movable guider position invalid

Symptom
Following installation or removal of the polarimeter module, attempts to move the red focus stage or the moveable guider stage fail with a serial port timeout error. Motor log may show an entry similar to this:
Nov 13 16:43:01 [9881] sta_init:  Getting stage position for red_camera_focus
Nov 13 16:43:05 [9881] lrs_mcs.c,1456,mcs_timeout():  Serial port (select) timeout, stage red_camera_focus     , retrying select/read
Nov 13 16:43:09 [9881] lrs_mcs.c,1456,mcs_timeout():  Serial port (select) timeout, stage red_camera_focus     , retrying select/read
Nov 13 16:43:13 [9881] lrs_mcs.c,1456,mcs_timeout():  Serial port (select) timeout, stage red_camera_focus     , retrying select/read
Nov 13 16:43:13 [9881] lrs_mcs.c,1468,mcs_timeout():  Aborting because of serial port timeout, stage red_camera_focus
Nov 13 16:43:13 [9881] lrs_encoder.c,168,read_abs_encoder:  Serial port timeout, trying to read AR encoder, stage red_camera_focus
Problem
The polarimeter install/remove script was not completed properly, leaving the encoder powered off. The system cannot communicate with the red focus and movable guider encoders.
Solution
Turn on power to the polarimeter encoder using the LRIS Power Control GUI.
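
To confirm which stages are affected before restoring power, the motor log can be searched for the abort message. This Python sketch is only an illustration: it assumes each log entry occupies a single line (any terminal wrapping removed) and matches the "Aborting because of serial port timeout" text quoted above. The function name is hypothetical.

```python
import re

# Matches the abort message and captures the stage name that follows it.
TIMEOUT_RE = re.compile(r"Aborting because of serial port timeout, stage (\w+)")


def stages_with_timeouts(log_text):
    """Return the sorted, de-duplicated stage names that aborted on a serial timeout."""
    return sorted(set(TIMEOUT_RE.findall(log_text)))


# One entry from the log excerpt above, joined onto a single line.
LOG = ("Nov 13 16:43:13 [9881] lrs_mcs.c,1468,mcs_timeout():  "
       "Aborting because of serial port timeout, stage red_camera_focus     \n")

print(stages_with_timeouts(LOG))
```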

Invalid controller unit number

Symptoms
Attempt to make motor move fails with the following error message:
Error setting grating: Error reading crate:  
Controller unit number != unit number expected.
Problem
The API unit used for this motor is not echoing command input back to the motor control software.
Solution
Enable command echoing on the appropriate API unit by issuing the following command sequence from a punaluu xterm:
acom tsred 16 :N
acom tsred 16 echo
where N is the appropriate API unit number as determined from consulting the Motor Stage Data file.

Polarimeter waveplate will not rotate to all angles.

Symptoms
The wave plate may move to an angle of 0 degrees and some other angles, but not all the normal angles. As an example, it may move from -90 to +30 but not to the normal angle of +90. Check the plangle and plangler keywords. For a plangle=0, the plangler should be 1946876.
Problem
The parameter file that holds the default encoder value for the home or 0 degree angle is corrupt. This has happened at least twice in the past when a new software version is released. Two known corruptions were that 1) the file dropped the least significant digit for the encoder value and 2) the encoder value was deleted. It is unknown why this happens during a new software release.
Solution

Grism fails to move: Unlatch Failed

Symptoms
Attempts to move the grism result in the following error, as seen in the motor log:
      UNLATCH failed, stage grism_carousel       
      Both carousel positions switches are not set.  Latch will
      not move., #1=1, #2=0, stage grism_carousel    
Problem
The flags that indicate that the carousel is in position are not "made", suggesting that the carousel is misaligned. Because the flags are not made, the carousel will not deploy. The grism will not unlatch if one or both of the flags are not made (#1=1 and #2=0, #1=0 and #2=1, or #1=0 and #2=0).
Solution
Try to nudge the carousel into position at the API level. Please use caution.
  1. determine which flags are not made by issuing the command in a punaluu window
     cari1 tsblue 16
  2. If in position the output for bits 6 and 7 will read
                       I1 Bit 6 set.      Cell is in position according to Switch #1.
                       I1 Bit 7 set.      Cell is in position according to Switch #2.
    		
  3.  telnet tsblue 3016 
    log in to the terminal server on the blue side
  4. check the flags by
     stat 
    and looking at
                             I1 - STATES, INPUT SET 1   = 11110111
    	      
    Bits 6 and 7 should be set to one if the carousel is in position. If either is 0, then proceed to the next step.
  5.  set 4 
    - this releases the carousel brake.
  6.  mov -500 
    move 500 steps in the negative direction.
  7. check the flags by
     stat 
    and looking at
                             I1 - STATES, INPUT SET 1   = 11110111
    		    
    Try the other direction if the negative direction fails.
  8. If you are unable to set the flag, this position will have to be avoided until someone is able to look inside the instrument and verify the position of the element and the flags.
  9. If the flags are set, try to deploy the grism at the keyword level
     m gtran=1 
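
The UNLATCH failure message already reports the state of both position switches (#1=..., #2=...), so the flag that is not made can be read directly from the motor log before going to the API level. The following Python sketch extracts it; the function names are hypothetical and the pattern is based only on the single message quoted in the Symptoms above.

```python
import re


def carousel_switches(log_line):
    """Return (switch1_made, switch2_made) parsed from '#1=x, #2=y', or None."""
    m = re.search(r"#1=([01]),\s*#2=([01])", log_line)
    if m is None:
        return None
    return m.group(1) == "1", m.group(2) == "1"


def missing_flags(log_line):
    """Return the switch numbers that are not made, or None if no match."""
    state = carousel_switches(log_line)
    if state is None:
        return None
    return [n for n, made in zip((1, 2), state) if not made]


print(missing_flags("not move., #1=1, #2=0, stage grism_carousel"))
```

For the message quoted in the Symptoms, this reports that switch #2 is the one not made.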

Slitmask Alignment

Xbox can't find boxes

Symptom
When the xbox program to align slitmasks is executed, it cannot find any of the alignment boxes.
Problem
The xoff and/or yoff parameters are not adjusted properly.
Solution
  1. Check the coordinates printed out by xbox indicating the location of the first box it is trying to find.
  2. Display the image in IRAF and move the image cursor onto the box.
  3. Compare the cursor location to the coordinates printed by xbox. If they are off by more than 10 pixels, you might have problems.
  4. If needed, change the value of the xoff parameter by the amount x_m - x_p, where x_m is the measured column location and x_p is the location predicted by xbox.
  5. If needed, change the value of yoff by the amount y_m - y_p, where y_m is the measured row location and y_p is the location predicted by xbox.
  6. Run xbox again.
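
The arithmetic in steps 3 to 5 can be written out as follows. This Python sketch interprets "change the value by the amount x_m - x_p" as adding that difference to the current parameter value, which is an assumption about the sign convention; the function names and the sample coordinates are illustrative.

```python
def needs_adjustment(measured, predicted, tolerance=10.0):
    """True if either axis differs by more than tolerance pixels (step 3)."""
    return (abs(measured[0] - predicted[0]) > tolerance
            or abs(measured[1] - predicted[1]) > tolerance)


def corrected_offsets(xoff, yoff, measured, predicted):
    """Apply steps 4 and 5: shift xoff by x_m - x_p and yoff by y_m - y_p."""
    x_m, y_m = measured
    x_p, y_p = predicted
    return xoff + (x_m - x_p), yoff + (y_m - y_p)


# Made-up example: box measured at (512, 300), predicted at (500, 310).
measured, predicted = (512.0, 300.0), (500.0, 310.0)
if needs_adjustment(measured, predicted):
    print(corrected_offsets(0.0, 0.0, measured, predicted))
```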

Guiders

Offset guider move fails

Symptom
When changing the moving guider position, the move fails, sending the stage to a limit.
Problem
The guider stage encoder reading has become corrupted, and the software is misreporting the position: the next move tries to correct for the bad reading.
Solution
Re-home the offset guider using the following steps:
  1. If needed, start tailing the motor crate log by typing the following command in a punaluu window:
       start_motor_log_tail
  2. From a punaluu window, bring up the GUI for homing mechanisms:
      home.tcl &
  3. Click the button marked OFFSET GUIDER and follow the progress of the homing operation in the motor log window.
  4. When this completes, type the following in a punaluu window to display the current position of the movable guider stage:
      s tv1fpos
  5. If the returned value is between 200 and 400 mm, you have fixed the problem. If not, proceed with the following steps.
  6. Restart the LRIS motor crate software by typing the following command in a punaluu window:
      restart_low_level_software
  7. When the software is restarted, again type
      s tv1fpos
    in the punaluu window.
  8. If the returned value is between 200 and 400 mm, you have fixed the problem. If not, seek assistance.

Guider images are saturated

Symptom
Guider images are saturated (signal > 32000DN).
Problem
  1. LRIS lamps are on, OR
  2. CCD controllers are not on, OR
  3. Glycol flow is off to the guiders, OR
  4. Saddlebag is overheating
Solutions

CCD Images

Images show severe light contamination

Symptoms
[Figure: LED-contaminated image]
Images taken with LRIS show severe light contamination with an illumination pattern strongest at the top, resembling the image at right. Spectra of the contamination show it to be predominantly red in color.
Problem
The light-emitting diodes (LEDs) used in the grating tilt optical homing switches have been left on, probably due to a failure in homing the grating tilt.
Solution
Please consult your instrument specialist for assistance.
  1. Start a motor crate session.
  2. At the API level on the motor crate, type:
      :0
      reset 1
      quit

Random image rows are reversed

Symptom
CCD images read out in dual-amp mode show strange patterns, with the rows from the left-side of the chip randomly appearing on the right and vice versa.
Problem
Fiber connections between the CCD and the crate are flaky, causing random bits to be dropped and the sense of certain rows to appear reversed.
Solution
Ask Instrument Technician to test all fiber connections to the CCD for light loss.

No light in half or entire image

Symptom
Images taken in single-amp mode appear to have no signal, and images taken in dual-amp mode have signal in only one amp but not the other.
Problem
One of the CCD amplifiers is not working, perhaps because the saddlebag is overheating or a card in the saddlebag is poorly mounted.
Solution
  1. Check the saddlebag temperature using the command
    show -s lris utbtemp
    If it is above 25°C, then the saddlebag should be inspected for failed glycol or failed fan.
  2. If the saddlebag temperature is okay, then the saddlebag should be opened and the cards reseated.

Images saturated but no lights on

Symptom
Images are saturated (signal above 65000DN) but the trapdoor is closed and no lights are on.
Problem
One or both of the fibers allowing the saddlebag to communicate with the VME crate in the control room are broken.
Solution
  1. Change to a new (good) set of fibers between the NAS deck interconnect panel and the control room interconnect panel.
  2. If this does not fix the problem, check the transmission of the other fibers connecting the Leach board in the saddlebag to the VME crate in the control room.

Bias level very high on half or all of image

Symptom
Images show an elevated bias level on half of the image, or on the entire image. Most commonly this is seen in dual-amp mode where the right-side amplifier suddenly has a bias level of over 40,000, as seen in the pre- and post-scan columns. Aside from the elevated bias level, the image may look normal (light can be detected).
Problem
The A/D converter for one amplifier has a stuck bit which is raising all pixel values by some constant amount.
Solution
  1. Try taking a dark image to verify the elevated bias level.
  2. Power cycle the offending CCD crate using the following command on punaluu (assuming the red side is causing the problem):
    ccdpower red cycle
    Remember to issue the lda 0 command on the CCD crate to re-enable temperature control.
  3. If problem persists and affects only one amp, switch to single-amp readout mode and use the good amplifier.
  4. If this is not possible, attempt to shorten exposure times so that the elevated bias level and accordingly reduced dynamic range of the image do not cause saturated pixels for science target.
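
To illustrate step 4, the reduced dynamic range can be estimated with simple arithmetic. This Python sketch ASSUMES a 16-bit converter that saturates at 65535 DN (the source cites saturation only as "above 65000 DN") and a signal that grows linearly with exposure time. The 40000 DN bias comes from the Symptom above; the peak signal and exposure time in the example are made up.

```python
FULL_SCALE_DN = 65535  # ASSUMPTION: 16-bit converter


def headroom(bias_dn, full_scale=FULL_SCALE_DN):
    """Counts available above the bias level before saturation."""
    return max(full_scale - bias_dn, 0)


def max_exposure(t_exp, peak_signal_dn, bias_dn, full_scale=FULL_SCALE_DN):
    """Longest exposure that keeps the peak pixel unsaturated.

    peak_signal_dn is the bias-subtracted peak reached in an exposure of
    length t_exp; the result is in the same units as t_exp.
    """
    room = headroom(bias_dn, full_scale)
    if peak_signal_dn <= room:
        return t_exp  # already fits within the reduced range
    return t_exp * room / peak_signal_dn


print(headroom(40000))  # 25535 DN left above a 40000 DN bias
print(max_exposure(600.0, 50000.0, 40000))
```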

Image regions are filled with zeros

Symptom
After reading out part of an image successfully, the readout pauses and the remainder of the image is filled with zeros.
Problem
Communications between the CCD crate and the saddlebag were lost during image readout.
Solution
  1. Use the tdl command on the CCD crate to check communications.
  2. It is likely you will find that the crate can't talk to the VME interface card (i.e., tdl 0,1 fails). If so, the card must be reset. The image data cannot be recovered.
  3. After resetting the card, use the tdl command to verify that communications with the saddlebag are restored.
  4. If the problem persists, try switching to single amp readout mode in order to reduce the data flow rate and hence prevent lockups.

Images have central depression

Symptom
Illuminated images show a circular patch in the image center which has reduced flux (20% or more). Objects seen in images may look fuzzy.
Problem
Water vapor has condensed onto the dewar window.
Solution

Bias jumps in image

Symptom
Bias level jumps randomly between two values during readout, affecting both data columns and pre/post-scan columns, as shown in this image.
Problem
A cable in the Leach CCD saddlebag readout electronics is loose.
Solution
Inspect and reseat cables in the affected saddlebag.

Image did not readout - Exposure did not complete

Symptom
The exposure did not read out, leaving a partial image in the figdisp window, or the exposure never finished, leaving the exposure meter and timer stuck. You see partial images in figdisp, or the elapsed time is no longer incrementing.
Problem
The exposure did not read out or did not complete because the crate crashed. It is possible to save the image if the crash occurred before the image readout.
Solution
  1. Tell the OA to reboot the appropriate side crate (blue side crate or the red side crate)
  2. in a punaluu window log into the crate:
     start_crate [blue,red] 
  3. monitor the crate session to ensure that the crate boots appropriately.
  4. if the crate is stuck at the [VxWorks Boot]: prompt, type @ and press Return to initiate the rest of the boot sequence
  5. At the end of the boot sequence, the prompt will read done.
  6. log out of the crate: ^] followed by quit at the telnet prompt.
  7. If the crash occurred before readout began, it is possible to save the exposure. The observer must decide whether it would be valuable to save the image. Run the recover_image [red,blue] script at the command line or from the engineering portion of the pulldown menu.
  8. take a 1 sec test frame to ensure everything is okay. If the recover image script is run, it prompts you to acquire a test image.
  9. If still having problems, run testAll

Image did not readout - Exposure did not complete on the Blue side.

Symptom
The exposure did not read out. "testAll", "tdl blue", and "ctx" all indicate that all processes are running normally. Blue exposures finish, but the images are not displayed in figdisp and the data are not written to disk. The "Start" button on the XPOSE GUI becomes inactive, and the GUI needs to be restarted before the button is active again.
Problem
We don't know what the problem is at this time. Somehow the software got tied in knots.
Solution
    These are some things to try
  1. take an exposure using the command line: goib
  2. reboot the blue crate
  3. Do a full software shutdown using the desktop menu
  4. restart XPOSE, figdisp, and lickserv on the blue side

Crates

Can't boot crate

Symptom
When crate is rebooted, the boot sequence doesn't complete. Logging in via TIP session (start_crate [red|blue] command) and pressing Enter yields only the following prompt:
[VxWorks Boot]: 
Problem
The startup script hung for some reason.
Solution

Can't boot crate

Symptom
When crate is rebooted, the boot sequence fails before the startup script is run. The following error message appears on the crate console (or TIP line):
Attaching network interface ln0... done.
interrupt: ln0 lnInt: no carrier
Loading... interrupt: ln0 lnInt: no carrier
interrupt: ln0 lnInt: no carrier
interrupt: ln0 lnInt: no carrier
interrupt: ln0 lnInt: no carrier
 
Error loading file: errno = 0x3c.
Can't load boot file!!
 
[VxWorks Boot]: 
Problem
Communications between the crate and punaluu are broken.
Solution
Check ethernet connections between the crates and punaluu.

Sudden inoperation (CCD cannot start an exposure)

Symptom:
CCD system suddenly cannot start an exposure after an extended period of normal operation.
Problem:
It is likely that too many serv_ tasks have piled up on the crate. Verify the problem by starting a CCD crate session (see the solution below) and using the i command to check processes running. Here is an example showing an excess of suspended tasks:
-> i

  NAME        ENTRY       TID    PRI   STATUS      PC       SP     ERRNO  DELAY
---------- ------------ -------- --- ---------- -------- -------- ------- -----
tExcTask   _excTask       3f8ea0   0 PEND          37f60   3f8cf0  3d0001     0
tLogTask   _logTask       3f6d48   0 PEND          37f60   3f6b98   d0003     0
tShell     _shell         3c5818   1 READY         525cc   3c5208   30065     0
tRlogind   _rlogind       3d3138   2 PEND          6f5cc   3d2d68       0     0
tTelnetd   _telnetd       3d1260   2 PEND          6f5cc   3d0fc0       0     0
tNetTask   _netTask       3f1410  50 PEND          6f5cc   3f1268  3d0002     0
tPortmapd  _portmapd      3cfd20 100 PEND          6f5cc   3cfa30      16     0
MLOG_STDOUT_start_mlog_   2b0da8 100 PEND          37f60   2b0878  3d0001     0
cserv      _cserv         298550 100 PEND          6f5cc   298028  3d0002     0
ccdClock   _ccdClock      27fcf8 100 PEND          6f5cc   27f880  3d0002     0
broad_mon  _broadcast_m   2767f0 100 DELAY         70f2c   2766c0  3d0002  2595
responder  _responder     24b2f8 100 PEND          37f60   24a010  3d0001     0
serv_4644  _s_show_auto   3b7ed0 100 SUSPEND       70924   3b7d40  3d0002     0
serv_4624  _s_show_numa   231a68 100 SUSPEND       70924   2318d8  3d0002     0
serv_4644  _s_show_auto   22f1a0 100 SUSPEND       70924   22f010  3d0002     0
serv_4632  _s_show_voff   22c8d8 100 SUSPEND       70924   22c748  3d0002     0
serv_4644  _s_show_auto   22a010 100 SUSPEND       70924   229e80  3d0002     0
serv_4622  _s_show_ccdg   226710 100 SUSPEND       70924   226580  3d0002     0
serv_4624  _s_show_numa   222e10 100 SUSPEND       70924   222c80  3d0002     0
serv_4624  _s_show_numa   21f510 100 SUSPEND       70924   21f380  3d0002     0
serv_4604  _s_show_binn   21cc48 100 SUSPEND       70924   21cab8  3d0002     0
serv_4630  _s_show_voff   21a380 100 SUSPEND       70924   21a1f0  3d0002     0
serv_4604  _s_show_binn   216a80 100 SUSPEND       70924   2168f0  3d0002     0
serv_4604  _s_show_binn   2141b8 100 SUSPEND       70924   214028  3d0002     0
serv_4606  _s_show_elap   2118f0 100 SUSPEND       70924   211760  3d0002     0
serv_4604  _s_show_binn   20f028 100 SUSPEND       70924   20ee98  3d0002     0
serv_4904  _s_show_raw_   20c760 100 SUSPEND       70924   20c5d0  3d0002     0
serv_4610  _s_show_wind   209e98 100 SUSPEND       70924   209d08  3d0002     0
serv_4606  _s_show_elap   206598 100 SUSPEND       70924   206408  3d0002     0
serv_4606  _s_show_elap   203cd0 100 SUSPEND       70924   203b40  3d0002     0
serv_4502  _s_set_binni   2003d0 100 SUSPEND       70924   200240  3d0002     0
serv_4816  _s_set_utb_d   1fcad0 100 PEND          6f5cc   1fc898  3d0002     0
rccd       _rccd          395390 150 SUSPEND      3fe730   395010  3d0002     0
value = 3951892 = 0x3c4d14
Solution
Reboot the CCD crate.
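
A listing like the one above can be scanned for suspended tasks automatically. The following is a minimal sketch, assuming the columns follow the usual VxWorks task-listing order (NAME, ENTRY, TID, PRI, STATUS, PC, SP, ERRNO, DELAY); the function names are illustrative and not part of any LRIS tool. Fields are read from the right because a long task name can run into the ENTRY column, as in the MLOG_STDOUT line.

```python
from collections import Counter

# A few lines copied from the listing above.
SAMPLE = """\
tShell     _shell         3c5818   1 READY         525cc   3c5208   30065     0
MLOG_STDOUT_start_mlog_   2b0da8 100 PEND          37f60   2b0878  3d0001     0
broad_mon  _broadcast_m   2767f0 100 DELAY         70f2c   2766c0  3d0002  2595
serv_4644  _s_show_auto   3b7ed0 100 SUSPEND       70924   3b7d40  3d0002     0
serv_4624  _s_show_numa   231a68 100 SUSPEND       70924   2318d8  3d0002     0
rccd       _rccd          395390 150 SUSPEND      3fe730   395010  3d0002     0
value = 3951892 = 0x3c4d14
"""


def parse_task_line(line):
    """Return a dict for one task line, or None if the line is not a task."""
    fields = line.split()
    if len(fields) < 8:
        return None
    # Assumed column order, counted from the right-hand side.
    tid, pri, status, pc, sp, errno, delay = fields[-7:]
    try:
        return {"name": fields[0], "tid": tid, "pri": int(pri),
                "status": status, "delay": int(delay)}
    except ValueError:
        # Header or separator lines have no numeric PRI/DELAY fields.
        return None


def suspended_tasks(listing):
    """Return the parsed tasks whose STATUS is SUSPEND."""
    tasks = (parse_task_line(line) for line in listing.splitlines())
    return [t for t in tasks if t and t["status"] == "SUSPEND"]


stuck = suspended_tasks(SAMPLE)
counts = Counter(t["name"] for t in stuck)
for name, n in sorted(counts.items()):
    print(f"{name}: {n} suspended")
```

Pasting the full listing into SAMPLE gives a per-task count of suspended entries, which is quicker to read than the raw table.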

VME message exchange errors on CCD crate

Symptom
vmemsgxchng (VME message exchange) errors such as the following are seen on the CCD crate output:
vmemsgxchng:  DSP reply timeout.  Check power/hardware.
E1:Time=209340:"util_mem_rd.c", line 31: broad_mon, 24da70, "util_mem_rd ()",
Error, #1: Undefined error condition. Bad return 12 from vmemsgxchng() sending RDM packet

E1:Time=209340:"get_dsp_data.c", line 108: broad_mon, 24da70, "get_dsp_data ()",
Error, #1: Undefined error condition. util_mem_rd returned -1007 reading loc 40000d of camera 0

E1:Time=209340:"broadcast_ccd_analog_inputs.c", line 72: broad_mon, 24da70, "broadcast_ccd_analog_inputs ()",
Error, #1: Undefined error condition. get_dsp_data returned 3 reading UTB adc chan 6 from camera 0
Problem
These messages indicate an inability of the CCD crate to communicate with the utility board in the LRIS saddlebag. Possible causes are:
Solution
Use the tdl command on the CCD crate to determine the failure point and take actions recommended at the bottom of that web page.
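
The error entries quoted above come in pairs: a header line naming the source file, line number, task, and function, followed by an "Error, #N:" line with the detail. The following is a minimal sketch that splits such a log into records so the chain of failing calls is easier to read. The regular expressions are inferred only from the excerpts in this document, so treat the field layout as an assumption.

```python
import re

# Excerpt copied from the messages above.
SAMPLE = '''\
vmemsgxchng:  DSP reply timeout.  Check power/hardware.
E1:Time=209340:"util_mem_rd.c", line 31: broad_mon, 24da70, "util_mem_rd ()",
Error, #1: Undefined error condition. Bad return 12 from vmemsgxchng() sending RDM packet

E1:Time=209340:"get_dsp_data.c", line 108: broad_mon, 24da70, "get_dsp_data ()",
Error, #1: Undefined error condition. util_mem_rd returned -1007 reading loc 40000d of camera 0
'''

# The "E1:Time=...:" prefix is optional because some excerpts omit it.
HEADER = re.compile(
    r'^(?:E\d+:Time=(?P<time>\d+):)?'
    r'"(?P<file>[^"]+)", line (?P<lineno>\d+): '
    r'(?P<task>\w+), (?P<tid>[0-9a-f]+), "(?P<func>[^"]+)",\s*$'
)
DETAIL = re.compile(r'^Error, #(?P<code>\d+): (?P<text>.*)$')


def parse_errors(log):
    """Pair each header line with the 'Error, #N:' line that follows it."""
    records, pending = [], None
    for raw in log.splitlines():
        line = raw.strip()
        head = HEADER.match(line)
        if head:
            pending = head.groupdict()
            continue
        detail = DETAIL.match(line)
        if detail and pending:
            pending.update(detail.groupdict())
            pending["func"] = pending["func"].replace(" ()", "")
            records.append(pending)
            pending = None
    return records


records = parse_errors(SAMPLE)
for r in records:
    print(f'{r["file"]}:{r["lineno"]} {r["func"]}: {r["text"]}')
```

Read top to bottom, the records run from the lowest-level failure (here the vmemsgxchng call) up to the task that reported it.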

Shutter

Can't start an exposure

Symptom
When the START button on the Xpose widget is pressed, the exposure does not start. Logging into the CCD crate reveals shutter error messages such as:
check_shutter_status: returning -1103
E1:Time=41488:"s_cshutter.c", line 491: serv_4714, 3b9310, "check_shutter_status ()",
Error, #1: Undefined error condition. Camera shutter is partially open!

E1:Time=41488:"s_cshutter.c", line 318: serv_4714, 3b9310, "do_shutter ()",
Error, #1010: Error number not found in error message table. Bad return from check_shutter_status.

E1:Time=41488:"s_set_erase.c", line 234: serv_4714, 3b9310, "set_erase ()",
Error, #1010: Error number not found in error message table. Bad return from close_shutter()

E1:Time=41488:"s_expose.c", line 254: serv_4714, 3b9310, "s_expose ()",
Error, #1: Undefined error condition. Bad return from set_erase()
Problem
The shutter is stuck partially open.
Solution
  1. Cycle the shutter open and closed manually as directed in this procedure. Then try pressing the ABORT key on the Xpose widget and attempt another exposure.
  2. If this fails, have a technician cycle the shutter open and closed manually via the toggle switch on the LRIS shutter controller.
  3. If this fails, have a technician cycle power on the LRIS shutter controller.
  4. If this fails, check the cabling from the Leach saddlebag to the shutter control box, and from there to the shutter housing in front of the dewar.

Shutter will not operate

Symptom
Attempts to start exposure or cycle the shutter fail. Logging into the CCD crate reveals shutter error messages such as:
safe_do_shutter:  calling do_shutter.
"s_cshutter.c", line 304: serv_4740, fdb698, "do_shutter ()",
Error, #1010: Error number not found in error message table. CSH command unsuccessful, reply was 455252

"s_cshutter.c", line 406: serv_4740, fdb698, "safe_do_shutter ()",
Error, #1010: Error number not found in error message table. Bad return from do_shutter()

"s_cshutter.c", line 153: serv_4740, fdb698, "s_cshutter ()",
Error, #1010: Error number not found in error message table. Bad return from safe_do_shutter()

Problem
The shutter is not operating.
Solution
  1. Execute the test_shutter command from punaluu to verify operation of the shutter.
  2. If the shutter fails to move, the script will print the following message:
    ------------------------------------------------------------------------
    	    ERROR: The $side shutter is not functioning properly
    ------------------------------------------------------------------------
    
    I tried to cycle the $side shutter open and closed, and it FAILED to work.
    Should I power cycle the $side CCD saddlebag and the $side shutter controller
    and then try again? (y/n) [y]:
  3. Press Enter to power cycle the components. The script will try again to open the shutter. Quit the script by answering n if it does not succeed on the second try.
  4. If the shutter is still not working, log into the CCD crate and use the tdl 0,3 command to verify that the crate can communicate with the utility board, the card that actually commands the shutter to open. If tdl fails, try cycling power to the saddlebag, crate, and/or VME interface card to restore communications.
  5. If you can talk to the utility board, have a technician cycle the shutter open and closed manually via the toggle switch on the LRIS shutter controller, which is located in the red-side enclosure. The controller has three lights:
    • red means closed
    • green means open
    • amber means unknown
    An amber light is bad news and could indicate a mechanical problem, possibly a broken linkage.
  6. If the shutter can be opened and closed manually, the problem is not mechanical and hence may be electrical or electronic.
    • Check the cabling from the Leach saddlebag to the shutter control box, and from there to the shutter housing in front of the dewar.
    • If the cabling looks fine, use a voltmeter to check the ?? cable for evidence of the +5V signal from the utility card to the shutter controller which triggers the open shutter operation. If this is not found, then try to gain access to the utility card and probe ?? for the +5V signal.
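
The reply value 455252 in the CSH error message above is easier to interpret once decoded. The following is a minimal sketch, assuming the reply is printed in hexadecimal with one ASCII character per byte; the document does not state the encoding, so this is an inference. The helper name decode_reply is illustrative.

```python
def decode_reply(reply_hex):
    """Interpret a hex reply word as ASCII text, one character per byte."""
    digits = reply_hex.strip().lower()
    if digits.startswith("0x"):
        digits = digits[2:]
    if len(digits) % 2:
        # Pad to a whole number of bytes.
        digits = "0" + digits
    raw = bytes.fromhex(digits)
    # Show non-printable bytes as "?" rather than raw control characters.
    return "".join(chr(b) if 32 <= b < 127 else "?" for b in raw)


print(decode_reply("455252"))  # prints ERR
```

Under that assumption the controller answered the shutter command with the text ERR, which is consistent with the "CSH command unsuccessful" message.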

Shutter will not operate - Blue Side

Symptom
Exposures on the blue side look like bias images or they have "star trails" above a star caused by the erase and below the star caused by the readout.
Problem
The shutter is not operating normally.
Solution
    Try the following:
  1. Execute the cycle_shutter blue 5 command from punaluu to cycle the shutter open and closed. This may free the shutter.
  2. Execute the test_shutter command from punaluu to verify operation of the shutter.
  3. If this fails, have a technician cycle power on the LRIS shutter controller.
  4. If this fails, have a technician cycle the shutter open and closed manually via the toggle switch on the LRIS shutter controller.
      The technician may access the shutter controller as follows:
    • Remove the blue side electronics bay cover, which is at the bottom of the instrument when the dewars are at the fill position (3 o'clock when standing behind LRIS).
    • Locate the shutter control box at the left. It has red and black buttons with a toggle switch for manual and auto modes.
    • Switch to manual and open and close the shutter.
    • The technician should hear a click as the shutter opens and closes.
    • If it opens, observers can try automatic shutter mode or choose to observe in a shutterless mode.
  5. If this fails, check the cabling from the Leach saddlebag to the shutter control box, and from there to the shutter housing in front of the dewar.
  6. Instructions for using the trapdoor as a shutter may be found here

ADC

ADC System is "In Forward Soft Limit"

Symptom
The GUIs report that the system is in a forward software limit. The status message reads:
 In fwd-soft limit 
Problem
The ADC is in a forward software limit.
Solution
To clear the limit status:
  1. Select the "track" option to see if it will restart.
  2. Move to a legal elevation and select "track" to initiate tracking.

ADC System will not track

Symptom
The ADC will not change modes.
Problem
The ADC daemons are not running properly.
Solution
To fix the problem:
  1. Restart the daemons:
    1. Log in to the ADC server using the appropriate account.
    2. Execute the following command:
       adc restart daemons
  2. From the desktop menus, restart the ADC GUI.

Status never says Track

Symptom
The observer GUI status never says that it is tracking. The engineering GUI indicates that the separation is oscillating between values.
Problem
There is a delay between information being received for updating the position, and this delay is keeping the mechanism motion out of phase with requested position updates.
Solution
To fix the problem:
  • From the engineering GUI:
    1. Halt the ADC.
    2. Select Tracking again.
    3. If the problem persists, halt the ADC.
    4. Put the ADC into POS mode and set the position to 20 mm.
    5. Select Tracking again.
    5. Select Tracking again.

    Did not reach intended position

    Symptom
    When in position mode, the ADC does not reach the intended position because it times out.
    Problem
    During long moves, the software may time out before the ADC reaches the destination. This is due to a combination of a relatively short timeout and a slow slew speed.
    Solution
    Re-execute the move command.
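
The workaround above amounts to re-issuing the same move until it completes. The following is a minimal sketch of that retry pattern. Everything in it is hypothetical: the move callable stands in for whatever command actually moves the ADC (not named in this document), FakeAdc is a simulation with made-up distances, and it assumes the ADC holds its position when a move times out so that each retry starts closer to the target.

```python
def move_with_retry(move, target_mm, max_attempts=3):
    """Call move(target_mm), re-issuing it after each timeout.

    Returns the number of attempts used; raises TimeoutError if every
    attempt times out.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            move(target_mm)
            return attempt
        except TimeoutError:
            print(f"attempt {attempt} timed out; re-issuing move to {target_mm} mm")
    raise TimeoutError(f"move did not finish after {max_attempts} attempts")


class FakeAdc:
    """Hypothetical stage that covers at most reach_mm before timing out."""

    def __init__(self, position_mm, reach_mm):
        self.position_mm = position_mm
        self.reach_mm = reach_mm

    def move(self, target_mm):
        gap = target_mm - self.position_mm
        # Travel toward the target, limited by what fits inside the timeout.
        step = max(-self.reach_mm, min(self.reach_mm, gap))
        self.position_mm += step
        if self.position_mm != target_mm:
            raise TimeoutError("move timed out")


adc = FakeAdc(position_mm=0.0, reach_mm=150.0)
attempts = move_with_retry(adc.move, 400.0)
print(attempts, adc.position_mm)
```

Capping the number of attempts keeps a genuinely stuck mechanism from being commanded indefinitely.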

    Primary hardware limit set

    Symptom
    The ADC will not move because the primary hardware limit is set.
    Problem
    The primary hardware limit is set.
    Solution
    1. Change to Position mode and set the destination to 20 mm. This should get the ADC out of the limit.
    2. Re-initialize the ADC.
    3. Start Track mode to begin tracking.

    Primary and Secondary limits set

    Symptom
    The ADC will not move or initialize. The status indicates that the primary and secondary hardware limits are set.
    Problem
    The primary and secondary hardware limits are set. This requires manual intervention. If this happens at night, the protocol is to operate LRIS with the ADC stuck in this state. If this occurs in the afternoon when trained staff members are available, then they should address the problem.
    Solution
    To move the ADC out of the primary and secondary limits, a trained staff member has to manually turn the lead screw on the back of the ADC. The ADC does not need to be removed from the CASS position; however, if LRIS is in the beam, LRIS may need to be pulled back to gain access to the lead screw. Someone should monitor the ADC status while the lead screw is turned. As the lead screw is turned, the secondary limit and then the primary limit will disappear from the status message window on the engineering and observer GUIs. After the limits are cleared, ensure that the technician is safely outside the instrument and then re-initialize the ADC. Set the mode to track to test whether the ADC will track appropriately.

    Temperatures

    Temperature warnings

    Symptom
    The temperatures of the CCD and saddlebags on the red side, the blue side, or both are out of range. The LRIS dewar status monitor task "/home/punaluu/lris/bin/monitor_tempdet2" running on the LRIS host computer "punaluu" may have detected the problem and is now sending e-mail warnings.
    Problems
    Solution
    1. Try running "testAll" on punaluu.
    2. Try running "tdl red" or "tdl blue" on punaluu and follow instructions if there are problems.
    3. If both tests pass, run "temps" on punaluu. Typical output is:
      [69] lris@punaluu: temps
      System               Current Temp  Acceptable Range  Status
      -------------------  ------------  ----------------  ------
      Blue CCD temp              -120.0  -121.0 to -119.0  OK
      Red CCD temp               -100.0  -101.0 to  -99.0  OK
      Blue saddlebag temp          -3.5   -10.0 to    0.0  OK
      Red saddlebag temp           -4.5    -8.0 to    0.0  OK
      
    4. If several successive temps outputs indicate that the temperature is converging to the acceptable range, then continue to monitor these messages until the CCD temperature converges.
    5. If several successive temps outputs indicate that the temperature is rising, i.e. continuing to warm up, then the dewar is indeed dry and an immediate LN2 fill is necessary to prevent warm-up and loss of vacuum. Contact the support astronomer on duty if this happens.
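
Steps 4 and 5 depend on comparing several temps outputs over time. The following is a minimal sketch that parses the table format shown above and classifies a series of readings for one system. The row pattern is inferred from the sample output only, and the series passed to trend() are made-up illustrative values, not measurements.

```python
import re

# Sample output copied from step 3 above.
SAMPLE = """\
System               Current Temp  Acceptable Range  Status
-------------------  ------------  ----------------  ------
Blue CCD temp              -120.0  -121.0 to -119.0  OK
Red CCD temp               -100.0  -101.0 to  -99.0  OK
Blue saddlebag temp          -3.5   -10.0 to    0.0  OK
Red saddlebag temp           -4.5    -8.0 to    0.0  OK
"""

ROW = re.compile(
    r'^\s*(?P<system>.+?)\s+(?P<temp>-?\d+\.\d+)\s+'
    r'(?P<low>-?\d+\.\d+)\s+to\s+(?P<high>-?\d+\.\d+)\s+(?P<status>\S+)\s*$'
)


def parse_temps(output):
    """Map each system name to (current, low, high) in the table's units."""
    readings = {}
    for line in output.splitlines():
        m = ROW.match(line)
        if m:
            readings[m["system"]] = (
                float(m["temp"]), float(m["low"]), float(m["high"]))
    return readings


def trend(samples, low, high):
    """Classify successive readings of one system, oldest first."""
    def distance(t):
        # How far a reading lies outside the acceptable range (0 if inside).
        return max(low - t, 0.0, t - high)

    first, last = distance(samples[0]), distance(samples[-1])
    if last == 0.0:
        return "in range"
    return "converging" if last < first else "not converging"


readings = parse_temps(SAMPLE)
current, low, high = readings["Blue CCD temp"]
print(trend([-110.0, -114.0, -117.5], low, high))  # illustrative values
print(trend([-110.0, -105.0, -98.0], low, high))   # illustrative values
```

A "not converging" result on a CCD temperature corresponds to the case in step 5, where the support astronomer on duty should be contacted.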

    Reconfig software

    Reconfig software hangs

    Symptom
    Reconfig software hangs while trying to retrieve slitmask information from the slitmask database.
    Problem
    Polo may be hung. Polo serves the slitmask retrieval software because it is authenticated with the slitmask database.
    Solution
    In one instance, on 30 July 2008, polo hung because a disk from waikoko (a NIRSPEC host computer) was cross-mounted on polo while waikoko was having problems. JC unmounted the waikoko disk, polo recovered, and the reconfig software then ran normally.