Calibration unit

Error (85): Axis has not been homed

Symptom
While trying to move one of the elements of the calibration unit, or while running a focus/offset script, the command fails with Error (85): Axis has not been homed.
Problem
The calibration unit has not been initialized.
Solution
  1. From the command line, run:
    	kcwiHomeCalUnit

 

Computers & Software

For science detector problems, see the Science Detectors section below.

DCS or instrument keywords missing in image headers

Symptom
Keywords such as RA, Dec, ROTPPOSN, or keywords related to the instrument status are not found in the image headers. These keywords can still be obtained from the kcwiserver command line via show -s dcs commands or by showing the keywords of individual servers.
Problem
The kcwiheaders process has lost its connection to DCS or to one of the servers, or it is not running.
Solution
Use this procedure to restart kcwiheaders, as well as kcwi global and kcwihistory just in case:
  1. Wait for the exposure to end; do not begin another
  2. kcwi stop kcwi head hist
  3. kcwi start kcwi (wait 60 sec, until gshow -s kcwi %wav% returns values)
  4. kcwi start head (full name kcwiheaders, can be abbreviated head)
  5. kcwi start hist (full name kcwihistory, can be abbreviated hist)
  6. After 60 sec, check whether kcwiheaders has started using kcwi status head
Newly-acquired images should now have these keywords in the image header.
If CCDCFG or other blue detector keywords are missing from the headers, see the entry "Missing blue header keywords such as CCDCFG" under Science Detectors.
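
The restart order above can be sketched as a small shell function. This is only an illustrative sketch: the names run, restart_header_chain, and DRY_RUN are hypothetical helpers introduced here, and the fixed 60 sec sleeps stand in for the "wait until gshow returns values" check, which still has to be judged by eye. By default it only prints the commands.

```shell
# Sketch of the restart order above. DRY_RUN=1 (the default) only prints
# the commands; set DRY_RUN=0 on kcwiserver to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_header_chain() {
    run kcwi stop kcwi head hist
    run kcwi start kcwi
    run sleep 60                 # kcwi global needs about 60 sec to start
    run gshow -s kcwi '%wav%'    # should return values before continuing
    run kcwi start head
    run kcwi start hist
    run sleep 60                 # kcwiheaders can take 60 sec to start
    run kcwi status head
}
```

Calling restart_header_chain with DRY_RUN unset lists the eight commands in order, which is a quick way to review the sequence before typing it by hand.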

MIRA fails with "cp: missing second argument" error

Symptom
The MIRA procedure starts, but as soon as a focal plane camera image is taken, the script ends with an error about a missing second argument to "cp".
Problem
The filename of the focal plane camera images is wrong.
Solution
The filename of the focal plane camera should be in the format: kfYYMMDD_<number>.fits
  1. Check the expected filename with the command framerootfpc
  2. Check the current filename with the command show -s kfcs outfile
  3. If they are different, use the following command to fix the problem: modify -s kfcs outfile=`framerootfpc`
  4. Re-run MIRA
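
The comparison in steps 1 to 3 can be sketched with two small helpers. Both function names are hypothetical, and the pattern assumes that <number> is one or more decimal digits, which the format above does not state explicitly. The helpers work on plain strings only; extracting the value from the show output is left to the operator because its exact format is not given here.

```shell
# Return 0 if the name matches kfYYMMDD_<number>.fits.
# Assumption: <number> is one or more decimal digits.
is_fpc_filename() {
    printf '%s\n' "$1" | grep -Eq '^kf[0-9]{6}_[0-9]+\.fits$'
}

# Return 0 if the current outfile value differs from the expected one,
# i.e. the modify command in step 3 is needed.
# Both arguments are plain strings supplied by the caller.
outfile_needs_fix() {
    expected="$1"
    current="$2"
    [ "$expected" != "$current" ]
}
```

On kcwiserver the expected value would come from framerootfpc and the current value from show -s kfcs outfile.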

Starting the eventsounds GUI

Symptom
No sounds (Exposure Complete, Readout Complete, etc.) are playing.
Problem
The eventsounds GUI or soundboard has crashed or failed to start.
Solution
The eventsounds GUI should be restarted.
  1. From a kcwirun@kcwiserver terminal, run kcwi start eventsounds

Starting kcwihistory or kcwiheaders

Symptom
kcwihistory or kcwiheaders is sending software alarm emails, with messages such as "kcwihistory resuscitation count is excessive".
Problem
kcwihistory or kcwiheaders has crashed or failed to start.
Solution
kcwihistory or kcwiheaders should be restarted.
  1. From any terminal, run kcwi stop kcwihistory (or kcwiheaders)
  2. Check if it has stopped with kcwi status kcwihistory
  3. Run kcwi start kcwihistory
  4. After 30 sec, check with kcwi status hist to see if it is running.
  5. You may also need to check all the rest of the processes, including kcwi global, to see if they are the reason the other ones crashed. You can check them all by running kcwi status from kcwiserver.
______________
Old directions
  1. From a kcwirun@moolelo terminal, run $RELDIR/etc/init.d/kcwihistory stop! . It will stop even as it tells you that kcwirun lacks the right permissions to do so.
  2. Once kcwihistory has stopped (you can check with the status option), run $RELDIR/etc/init.d/kcwihistory start . Check with status to see that kcwihistory is happily running one process.

Mechanisms

Generic errors running mechanisms

Symptom
Mechanism moves fail with errors such as:
Problem
Either the Galil controller or the server is in a bad state, or both are. A stage might also not be initialized.
Solution
1. Check if the stages are homed/initialized.
 
Calibration Unit:
If the failure is related to the calibration unit, it is always safe to initialize it again.
From the background menu, Home/Initialize Components > Initialize Calibration Unit
This can take a long time.
If you are sure of which stage is not initialized, you can use the low level engineering GUI:
KCWI Engineering > Low Level Engineering GUIs > Calibration Unit
The GUI is self-explanatory, and stages can be homed one by one.
 
Blue Exchanger
The only stage that needs to be initialized is the grating rotator. The status can be checked with:
show -s kbes grhomed
If it needs to be initialized, use:
modify -s kbes grhome=1
 
Blue Camera Focus
The status can be checked with:
show -s kbms fochomed
If it needs to be initialized, use:
modify -s kbms fochome=1
Once this is done, you need to restore a sensible focus value, such as focusb -1.85
 
Red Exchanger
The only stage that needs to be initialized is the grating rotator. The status can be checked with:
show -s kres grhomed
If it needs to be initialized, use:
modify -s kres grhome=1
 
Red Camera Focus
The status can be checked with:
show -s krms fochomed
If it needs to be initialized, use:
modify -s krms fochome=1
Once this is done, you need to restore a sensible focus value, such as focusr -1.40
 
2. If this doesn't solve the problem after sending the move again, we need to look at restarting the servers.
 
The calibration unit is controlled by the kcas server, the blue exchanger is on kbes, the blue camera focus stage is on kbms, the red exchanger is on kres, and the red camera focus stage is on krms.
Any server can be restarted with kcwi restart <server>
It is always a good idea to restart kcwi global, kcwiheaders, and kcwihistory after restarting a server:
kcwi stop kcwi head hist
kcwi start kcwi (wait 60s)
kcwi start head; kcwi start hist (head can take 60 sec to finish starting, check with kcwi status head)
 
3. If the problem persists, it might be time to power cycle the Galils.
 
The main command is kcwiPower, with kcwiPowerStatus showing everything.
Note that the Galils are coupled. The rotator, the IFU stage, and the calibration unit are on the same port (server 3, port 8); the blue exchanger and the blue mechanisms are on the same port (server 1, port 7); and the red exchanger and the red mechanisms are on the same port (server 2, port 7). Power cycling one of them will power cycle the others on the same port too, so a stage might need to be initialized again even if it was working before.
The procedure to follow is:
- shut down the corresponding servers with kcwi stop <server>
- power cycle the galil with kcwiPower
- wait about a minute for the power up procedure of the galil to complete
- restart the servers with kcwi start <server>
- verify stages are initialized (see above)
- restart kcwi global, kcwiheaders, and kcwihistory
 
 
4. If that doesn't work, it's time to call the instrument master
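
The server and keyword pairs from steps 1 and 2 can be collected into a lookup so that the right show and modify commands are easy to print. This is a sketch only: the short mechanism labels (blue-exchanger and so on) and both function names are hypothetical, and the calibration unit is left out because it is initialized from the background menu or the engineering GUI.

```shell
# Map a mechanism label to "server homed-keyword home-keyword", using only
# the keywords quoted in step 1. The labels are hypothetical names chosen
# for this sketch.
homing_keywords() {
    case "$1" in
        blue-exchanger) echo "kbes grhomed grhome" ;;
        blue-focus)     echo "kbms fochomed fochome" ;;
        red-exchanger)  echo "kres grhomed grhome" ;;
        red-focus)      echo "krms fochomed fochome" ;;
        *)              return 1 ;;
    esac
}

# Print the status and home commands for one mechanism without running them.
print_homing_commands() {
    entry=$(homing_keywords "$1") || return 1
    # Split the entry into server, homed keyword, and home keyword.
    set -- $entry
    echo "show -s $1 $2"
    echo "modify -s $1 $3=1"
}
```

For example, print_homing_commands red-focus prints the two commands listed under Red Camera Focus. Remember that the focus stages also need a sensible focus value restored after homing.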
 

Hatch not opening using keywords or GUI

Symptom
The hatch keywords and commands (hatch open/close) seem to have no effect on the hatch, no light is visible, or hatch close returns a "2".
Problem
The hatch motors are confused between limits (hatch close returns code 2) or in a bad state, or the Galil is confused.
Solution
First try rehoming the hatch with modify -s khas bladecal=reset bladecal=homed
After this, test the hatch with hatch open (exit code 1 is open) and hatch close (exit code 0 if closed, 2 if still confused between limits).
 
If attempting to rehome returned the error: "Error setting bladecal: bladecal: ERR_WRONG_CTRL_MODE (-5254) Wrong control mode for request" do:
- modify -s khas blademod=pos
- modify -s khas bladecal=reset bladecal=homed
- test the hatch with hatch open and hatch close
 
If that didn't work, power cycle the hatch with the kcwiPower 1 2 command
Restart khas with kcwi restart khas
Test the hatch with hatch open/close
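
The codes quoted above can be turned into readable text with a small helper. The function name describe_hatch_code is hypothetical, and the sketch assumes the code is available to the caller as a number (for example as the shell exit status of the hatch command).

```shell
# Translate a hatch code into text, using the meanings given above:
# 0 = closed, 1 = open, 2 = confused between limits.
describe_hatch_code() {
    case "$1" in
        0) echo "closed" ;;
        1) echo "open" ;;
        2) echo "between limits, rehome needed" ;;
        *) echo "unknown code $1" ;;
    esac
}
```

If the code is indeed the exit status, a call such as hatch close; describe_hatch_code $? would report the result right after the move.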

Blue Camera Focus mechanism is not homing, reports spurious errors

Symptom
The focus mechanism is not working. The errors can be "Move already in progress" or "Moves are locked"
Problem
Unless there really is a move in progress or the moves are locked, this is likely a problem with the kbms server not being able to communicate with the hardware. The most likely cause is a failure of the Lantronix terminal server.
Solution
- It is worth looking at the output of the watchlogs command, which can be accessed from the background menu under the troubleshooting submenu. The KBMS server might be complaining. Try to stop and restart the server with kcwi restart kbms, and watch the output of the logs mentioned above. If the errors persist, it is possible that the Lantronix terminal server needs to be restarted.
- This can be done using the kcwiPower command. It takes 30 to 40 seconds for the Lantronix to restart.
- See ticket: K2-28841
- After all is solved, don't forget to restart kcwi/kcwiheaders/kcwihistory.

Rotator

Rotator does not seem to communicate with TCSU, is stuck on Idle, Halt, or Slew

Symptom
Facsum reports an incorrect status, skypa cannot be set
Problem
The rotator is not fully initialized, or watchrot has either died or is in a bad state
Solution
1. Restart watchrot from a terminal on kcwiserver:
kcwi stop watchrot
kcwi start watchrot
2. If that doesn't work, stop watchrot and initialize the rotator from the background menu

Rotator has high errors, not tracking closely (less extreme)

Symptom
High rotator errors, drive/tracking lags or catches up in big jumps.
Problem
The rotator is not fully initialized, or watchrot has either died or is in a bad state, or the kros service has died or is lagging.
Solution
1. Restart watchrot
From a terminal on kcwiserver use:
kcwi stop watchrot
kcwi start watchrot
2. If that doesn't work, stop watchrot and initialize the rotator from the background menu
3. If that doesn't work, restart kros. From a terminal on kcwiserver:
kcwi stop kros
kcwi start kros
kcwi status kros
(check if it started; give it 30 sec).
If kros started, then restart kcwi global, kcwiheaders, kcwihistory:
 kcwi stop kcwi
 kcwi start kcwi
(wait 60 sec for it to start)
 gshow -s kcwi %wav%
(only will display once kcwi global is successfully started, then proceed)
 kcwi restart kcwihistory
 kcwi restart kcwiheaders
(wait 60 sec, gshow it to check if it started successfully)

Rotator stops tracking at random times, does not reach the requested position

Symptom
There are many possible symptoms: the rotator might stop tracking and stay in fixed position, it might try to get to a requested PA without getting there, it might go back and forth between tracking and slewing, or stay stuck in "slew"
Problem
One possible problem is overheating of the rotator galil. This causes 1 or 2 of the rotator encoders to read wrong values, which causes the rotator tracking software to shut down tracking.
Solution
1. Verify that the encoders are reading wrong values. Bring up the rotator engineering GUI, and look for the values read by encoders ABDE. Usually encoders D and E are the ones showing errors. Also, on the top right part of the GUI, look for Encoder Error = True
 
2. Check the temperature of the electronic cabinet. Use command kcwiTempStatus
The temperature to check is Cab Int, cabinet interior. It should be below 20 C. If there is no value returned, it is possible that kt2s is running in simulation mode. If that is the case, stop and start kt2s.
 
3. If the temperature is too high, we need to cool down the cabinet. If possible at all, increase the glycol flow or at least check that the glycol valve is open. A good reference value is 1 GPM. If that is not possible, ask somebody at the summit to open the electronic cabinet door and leave it open. This will have undesired effects, such as releasing heat in the dome and possible light contamination.
 
4. Stop watchrot with kcwi stop watchrot
 
5. Reset the rotator galil with modify -s kros reset=1
 
6. Restart the rotator server kcwi restart kros
(then restart kcwi global, kcwiheaders, kcwihistory)
 
7. Initialize the rotator from the background menu
 
8. From the background menu, KCWI Control -> Enable KCWI rotator
 
9. If this doesn't work, it might be necessary to remove 2 of the encoders from the standard position measurement. To do this, log in to the rotator galil (see below) and follow these instructions:
 
    telnet rotgalil
    SN AB
    ^]
    
Note that this setting goes away if the galil is reset.
 
10. Ultimate hammer: call the Instrument Master to restart the power to the rotator. This is a last resort since power cycling it will mean you need to rehome the rotator, the calibration unit, and the IFU stage since they share the power switch. Avoid doing this on-sky if possible.
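
The temperature check in step 2 can be sketched as a helper that classifies a Cab Int reading against the 20 C limit. The function name classify_cab_temp is hypothetical, and the reading is passed in as a plain number because the output format of kcwiTempStatus is not described here.

```shell
# Classify a Cab Int reading in degrees C against the 20 C limit from step 2.
# An empty reading is reported separately, since it may mean that kt2s is
# running in simulation mode.
classify_cab_temp() {
    if [ -z "$1" ]; then
        echo "no-reading"
        return
    fi
    # awk handles the floating-point comparison that plain sh cannot.
    awk -v t="$1" 'BEGIN { if (t + 0 < 20) print "ok"; else print "too-hot" }'
}
```

A result of too-hot points to step 3 (cooling the cabinet), while no-reading points to stopping and starting kt2s.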
 

Sending a Sky PA change is interrupted by a Virtual Circuit Disconnect

Symptom
The observer attempted to set the Sky PA position angle using the Blue Exposure GUI but failed with some version of the following error:
**********************************************
Virtual circuit disconnect 192, 192
vm-k2epicsserver:7502
**********************************************
Error setting ROTMODE: Can't put k2:dcs:pnt:fast.DROT to channel access (User specified timeout on IO operation expired)
Problem
The virtual disconnect interfered with changing the PA, so it was never sent or only partially sent.
Solution
Resend the "Set Sky PA" command by reclicking the button on the Blue Exposure GUI. Check on FACSUM that the rotator successfully goes to the requested angle. If not, follow the other rotator troubleshooting directions.

Science Detectors

Blue/Red Detectors have almost no light after a Mira (FPCam still in beam)

Symptom
The observers are trying to start their science/standard exposure right after a Mira, and there is no light or almost no light on either detector. What light there is does not look normal. See picture below for an 800sec blue and 300sec red example.

Problem
The observers have probably forgotten to re-send their configuration, so the Focal Plane Camera is still in beam instead of their slicer. The Blue Config GUI shows a yellow light next to IFU, indicating that the FPCam is still in place.
Solution
Have the observer send their configuration from the Configuration Manager in Firefox using the green play button. After pressing "Execute", wait until the move is complete, indicated by the Execute window closing AND the Blue Config GUI showing the correct slicer installed.

Blue Detector stops taking images or rejects input

Symptom
It is impossible to start a new exposure. Neither the command line nor the buttons are able to initiate an exposure.
Problem
The detector server is in a bad state. This usually results from pausing an exposure and then following it with an abort or stop.
Solution
The detector server should be restarted.
  1. From the login menu, open a terminal as the CURRENT user
  2. Run kcwiRecoverBlueDetector
  3. Take a test exposure tintb 0; goib

Missing blue header keywords such as CCDCFG

Symptom
The CCDCFG or other blue detector keywords are missing from the blue header.
Problem
The blue detector server is confused.
Solution
The detector server should be restarted.
  1. From the login menu, open a terminal as the CURRENT user
  2. Run kcwiRecoverBlueDetector
  3. Check if kcwiheaders is started with kcwi status head; if not restart it and wait 60sec.
  4. Take a test exposure tintb 0; goib

Blue Detector server is not starting or stopping

Symptom
The detector server appears to be stuck. It does not start and, if it is running, it does not shutdown.
Problem
The most likely cause is dirty fibers, which cause a loss of communication between the ARC board and the controller. If you look at the log stored in /var/log/kroot/kcwi.log on kcwitarg (not kcwiserver), you will see entries such as:
Dec 18 16:08:36 kcwitarg kbds_server: arc_api.c,517: Failed to setup ARC device: ( CArcPCIe::Command() ): ( CArcPCIe::ReadReply() ): Time Out [ 2 sec ] while waiting for status Exception Details: 0x203 TDL 0x7d 0xffffffff 0xffffffff 0xffffffff
Dec 18 16:08:36 kcwitarg kbds_server: arc.c,208: Failed to setup ARC device
Solution
The fiber should be cleaned.
  1. The problem is likely to be in the control room, but it might also be on the instrument side
  2. Once the fibers are cleaned, the detector server must be stopped manually (find the process using a ps -ef | grep rpc on kcwitarg)
  3. It can then be restarted with kcwi start kbds
  4. It might be worth verifying that the power to the detector is on (kcwiPowerStatus)
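
A quick way to confirm the diagnosis is to count the ARC setup failures in the log. The sketch below uses a hypothetical function name, count_arc_failures, and searches for the message text quoted in the log excerpt above; it accepts an alternative path so it can be tried on a copy of the log.

```shell
# Count log lines that report an ARC setup failure. The default path is the
# one quoted above (on kcwitarg); pass another path to check a copy.
count_arc_failures() {
    log="${1:-/var/log/kroot/kcwi.log}"
    grep -c 'Failed to setup ARC device' "$log"
}
```

A nonzero count after a failed server start is consistent with the fiber problem described here; a count of zero suggests looking elsewhere, for example at the detector power.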

Red Detector server has crashed, recovery

Symptom
The red detector server krds has returned errors or exited with a nonzero value. This typically happens when the observer double-clicked "Expose" or tried to change a detector setting (ampmode, gain, etc.) while the detector was still exposing or reading out.
It is impossible to start a new exposure. Neither the command line nor the buttons are able to initiate an exposure.
Problem
The detector server is in a bad state.
Solution
The detector server should be restarted.
  1. From the login menu, open a terminal as the CURRENT user
  2. Run kcwiRecoverRedDetector and follow directions.
  3. Take a test exposure tintr 0; goir
 
Explicitly, the red recovery script performs the steps below, and also restarts the GUIs and eventsounds.
  1. Stop the red detector server
     kcwi stop! krds
  2. Start the red detector server and then give it 60 sec to start.
     kcwi start krds
  3. (then restart kcwi global, kcwiheaders, kcwihistory)
  4. Initialize the red side, which resets detector parameters, directories, prefixes, autoshut=1, ccdpower=1
     kcwiInit -red

Guider and Focal Plane Camera

The Focal Plane Camera is not exposing, is rejecting keyword changes

Symptom
Exposures fail, attempts to change the exposure time fail, or the camera cannot be powered up.
Problem
Either the camera hardware or the kfcs server are in a bad state
Solution
  1. First check if the focal plane camera is in by typing slicer into a terminal and seeing if you get fpcam back. If not, FPCam is not in beam, fix with slicer fpcam
  2. Use these commands to restore camera functionalities (MUST BE DONE FROM KCWISERVER):
    fpcamPower off
    fpcamPower on
    If you are lucky, the camera turned on and successfully set the 4x4 binning. If not, continue.

     

  3. Use these commands to restore camera functionalities (MUST BE DONE FROM KCWISERVER):
    fpcamPower off
    kcwi restart kfcs
    fpcamPower on
    If the camera turned on and successfully set the 4x4 binning, continue to the next step to restore all the settings, services, and displays. If not, try the above set one more time, then call the instrument master.

     

  4. Restart the AutodisplayFPC from the KCWI Individual GUIs background menu
  5. Restart the kcwi global, kcwiheaders, kcwihistory
    kcwi stop kcwi; kcwi start kcwi
    kcwi stop head; kcwi start head
    kcwi stop hist; kcwi start hist
  6. Restore the correct prefix and data directory for the kfcs server. Use "outdir -all" to see what is the directory for tonight (like /s/sdata1400/kcwi1/<UTDATE>):
    kcwiInit
If it is necessary to continue displaying FPC images, the FPC image display must also be restarted. This can be done from the background menu, under KCWI Individual GUIs.

The NEW KCWIA guider is not happy.

Symptom
The KCWIA andor camera guider will not take an exposure.
Problem
The k2-andor-camserver2 KCWIA camera server is not happy. The mechanisms run on k2-magiq-camserver1, and while it could be unhappy, it normally is not.
Solution
See Shui's page for SWOC to guide your steps: https://www.keck.hawaii.edu/twiki/bin/view/Software/SoftwareOnCallNotes#K2KCWIA
If they ask you to power cycle the KCWIA camera, the steps are:
  • kcwiPower 2 2 off
  • kcwiPower 2 1 off
  • Wait about 5-10 seconds.
  • kcwiPower 2 2 on (Guider Fiber Converter)
  • Wait 30 sec, Fiber converter needs to be on before camera power.
  • kcwiPower 2 2 on (Guider Camera Power)

The OLD guider reports "INTERNAL MODE" or shows two vertical black lines

    Symptom
    Using the EXPOSE button produces errors for "INTERNAL MODE"
    Problem
    The kcwi magiq camera server is not working correctly.
    Solution
    The OA can fix this problem by restarting the KCWI magiq camera server and the magiq server.

    If you want to try to fix it yourself, follow these steps:
  • Bring up a TCSGUI and select MAGIQ, which will pop up a small separate MAGIQ icon.
  • From the MAGIQ-K2_Operations_Menu icon, pull down and choose KCWI and then Restart KCWI Camserver. Wait a minute or two for it to complete the restart.
  • Then from the same Operations Menu pulldown, select Restart Server * , which is the first option. The restart will take a couple of minutes.

  • Test that the Guider is now working correctly:
  • From the MAGIQ icon pulldown, Select “Start OA Gui”
  • On the bottom left of the MAGIQ GUI area, select "kcwi" (instead of "NULL")
  • Click "Make Guider"
  • Click "Expose"
  • Inspect the images as it continuously exposes
  • Check that the Guider Log to the far left does not say "INTERNAL MODE" (shown as red text)
  • Click "Stop Exp"

    If the actions above fail to resolve the problem, the guider hardware can be power cycled from the command line:
  • kcwiPower 2 2 off
  • kcwiPower 2 1 off
  • Wait about 5-10 seconds.
  • kcwiPower 2 2 on (Guider Fiber Converter)
  • Wait 30 sec, Fiber converter needs to be on before camera power.
  • kcwiPower 2 2 on (Guider Camera Power)

General OLD GUIDER problems: No images, no control of mechanisms, rpc errors...

    Symptom
    Often after an instrument swap, the guider is unresponsive: no images, no control of mechanism, generic rpc errors on the console
    Problem
    A number of problems can cause this: incorrect power up sequence, problems with the fiber connections, Galil in a confused state, spurious software running on the magiq servers, incorrect initialization of the magiq interface card in the back of k2-magiq-camserver1
    Solution
    The first approach is to power cycle the magiq server using the kcwiPower command, then restart the camera server and the magiq server.
    If that doesn't work, this is a ticket submitted by Liz (K2-24972):
    Long story - 
    - k2-magiq-camserver1 had been up 19 days. a NIRSPECM rpc_cam task was still running , but i do not think a KCWI rpc_cam task was up.  

    - Luca folks had done power-cycle, RestartKCWI for MAGIQ but no joy 
    - Following commands from 'Magiq Status' webpage, Luca did 'magiqCamera initcam 0 0 (for nirspec first), then 0 1 (for kcwi)', still no joy 

    - decided to reboot k2-magiq-camserver1 (telnet k2consoles-3 2012; solaris 5.10) 
    - after server back up, did the initcam commands again; now no NIRSPECM tasks running 
    - did 'RestartKCWI' from MAGIQ menu and rpc_cam for KCWI running 
    - able to take images now 
    - tried moving the filter wheel - nada happened, apparently from KCWI, these mechanisms not usually moved (or are they working?) 
    - per Luca, we only need to worry about focus, which doesn't move much around 137.0 
    - tried moving focus to 150 to test, didn't seem to move from main MAGIQ UI, but looks like it eventually did when Luca brought up MAGIQ camDiagostnics? tried moving it to 130.0; seemed to take a while, then eventually it refreshed on the camDiagnostics gui (I think you have to hit refresh). Still said 149.0 on main MAGIQ UI but we left it since camDiagnostics appeared to reflect the current state. 
     
    In summary, this problem requires the intervention of software on call, and might require rebooting the magiq server k2-magiq-camserver1.