NIRC Diagnostics

Table of contents

  1. NIRC Crashes
  2. Rabbit Tip Session: no output
  3. Rabbit daemons not running
  4. Image writes fail. Images take a long time
  5. XSHOW does not update
  6. Rabbit not responding
  7. Guider images not normal

NIRC Crashes

Quick Summary:

  • Try selecting "Reset NIRC software" from the "NIRC Tools" menu in the OpenWindows menu.

  • If this fails, try rebooting rabbit:

    Try selecting "Reboot Rabbit" from the "NIRC Tools" desktop menu. The "Reboot Rabbit" selection runs a script called "reboot_rabbit", located in "/home/maili/nirc/p3/bin". Alternatively, type "p3/bin/reboot_rabbit" in a maili xterm.

      The "reboot_rabbit" script will:
    • start an xterm
    • shut down the NIRC GUIs
    • run the rabbit reboot sequence
    • prompt you to restart the NIRC software from the desktop menu

  • If rebooting fails, you may need to cycle power on the module manually. This requires bringing the telescope to horizon and having someone at the summit climb into the tertiary tower to cycle the power.

  • If all else fails, you may need to reboot maili. Have the OA do this by using "/etc/oareboot".

Details:

The new ("P3") NIRC software crashes less often than the old software, so this page should rarely be needed. Still, there are ways in which the new software can crash. One diagnostic is activity in the rabbit tip window. This window normally shows almost no output except during rabbit reboots and DSP reboots. If you see a matrix of numbers together with lines containing "passthru", you have likely suffered a crash.

An example...

May  2 14:40:42 rabbit vmunix: unit=0 model=BCE,270-1722-02 Rev.1, file=../driver_share/hostport.c, line=249, unit_addr=fb05c000
May  2 14:40:42 rabbit vmunix: unit=0 csr=94400100, count=0, addr=0xfff05000
May  2 14:40:42 rabbit vmunix: count=0, addr=0xfff05000
May  2 14:40:42 rabbit last message repeated 3 times
May  2 14:40:42 rabbit vmunix: unit=0 currentPage=4 count_saved=65668 addr_base=0xfff05000
May  2 14:40:42 rabbit vmunix: unit=0 endPage=0 endByte=0, waitPage=0 waitByte=0
May  2 14:40:42 rabbit vmunix: unit=0 icr=8, cvr=12, isr=0, ivr=f
May  2 14:40:42 rabbit vmunix: hpGetBlock: failed after 0 of 1 words, receive register empty.
May  2 14:40:43 rabbit vmunix: doing reset
May  2 14:40:43 rabbit vmunix: doing reset
May  2 14:40:42 rabbit rpc.collectd: Starting debugInfo.
May  2 14:40:42 rabbit rpc.collectd: passthru isr = 0
May  2 14:40:42 rabbit rpc.collectd: passthru cvr = 12
May  2 14:40:43 rabbit rpc.collectd: passthru y1     = $2000
May  2 14:40:43 rabbit rpc.collectd: passthru DB_VEC = $128022
May  2 14:40:43 rabbit rpc.collectd: passthru DB_PBC = $1d
May  2 14:40:43 rabbit rpc.collectd: passthru DB_PAC = $2
May  2 14:40:43 rabbit rpc.collectd: passthru DB_GAC = $0
May  2 14:40:43 rabbit rpc.collectd: passthru DB_BUG = $0
May  2 14:40:43 rabbit rpc.collectd:             icr cvr isr ivr dum hi  mid lo
May  2 14:40:43 rabbit rpc.collectd: mapped      31  71  f1  31  71  71  f1  31
May  2 14:40:43 rabbit rpc.collectd: slot 0x0e00 8f  0f  4f  0f  cf  0f  8f  0f
May  2 14:40:43 rabbit rpc.collectd: slot 0x0f00 4f  0f  8f  0f  8f  0f  87  07
May  2 14:40:43 rabbit rpc.collectd: debugInfo done.
May  2 14:40:44 rabbit vmunix: unit=0 model=BCE,270-1722-02 Rev.1, file=../driver_share/hostport.c, line=249, unit_addr=fb05c000
May  2 14:40:44 rabbit vmunix: unit=0 csr=90000000, count=0, addr=0xfff05000
May  2 14:40:44 rabbit vmunix: count=0, addr=0xfff05000
May  2 14:40:44 rabbit last message repeated 3 times
May  2 14:40:44 rabbit vmunix: unit=0 currentPage=4 count_saved=65668 addr_base=0xfff05000
May  2 14:40:44 rabbit vmunix: unit=0 endPage=0 endByte=0, waitPage=0 waitByte=0
May  2 14:40:44 rabbit vmunix: unit=0 icr=8, cvr=12, isr=0, ivr=f
May  2 14:40:44 rabbit vmunix: hpGetBlock: failed after 0 of 1 words, receive register empty.
May  2 14:40:45 rabbit vmunix: doing reset
May  2 14:40:45 rabbit vmunix: doing reset
May  2 14:40:44 rabbit rpc.collectd: Starting debugInfo.
May  2 14:40:44 rabbit rpc.collectd: passthru isr = 0
May  2 14:40:44 rabbit rpc.collectd: passthru cvr = 12
May  2 14:40:45 rabbit rpc.collectd: passthru y1     = $2000
May  2 14:40:45 rabbit rpc.collectd: passthru DB_VEC = $1d
May  2 14:40:45 rabbit rpc.collectd: passthru DB_PBC = $2
May  2 14:40:45 rabbit rpc.collectd: passthru DB_PAC = $0
May  2 14:40:45 rabbit rpc.collectd: passthru DB_GAC = $0
May  2 14:40:45 rabbit rpc.collectd: passthru DB_BUG = $0
May  2 14:40:45 rabbit rpc.collectd:             icr cvr isr ivr dum hi  mid lo
May  2 14:40:45 rabbit rpc.collectd: mapped      31  31  b1  31  f1  31  b1  31
May  2 14:40:45 rabbit rpc.collectd: slot 0x0e00 cf  0f  87  0f  cf  0f  8f  0f
May  2 14:40:45 rabbit rpc.collectd: slot 0x0f00 8f  0f  8f  0f  0f  0f  cf  0f
May  2 14:40:45 rabbit rpc.collectd: debugInfo done.

To recover from such a crash, it is usually enough to select "Reset NIRC Software" from the pull-down menu. This automatically kills and restarts various components of the software. It then runs the same startup sequence used when the software is first started, so you will again be asked for your data directory.
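The crash signature described above (a register matrix plus "passthru" lines from rpc.collectd) can also be checked mechanically. The following is a minimal Python sketch, not part of the NIRC software; the function name summarize_tip_output is hypothetical, and it assumes the tip output has been captured as a list of text lines in the format shown in the example.

```python
def summarize_tip_output(lines):
    """Summarize crash indicators in captured rabbit tip lines (hypothetical helper)."""
    dumps_started = 0
    passthru_lines = 0
    saw_register_matrix = False
    for line in lines:
        if "rpc.collectd: Starting debugInfo." in line:
            dumps_started += 1
        if "rpc.collectd: passthru" in line:
            passthru_lines += 1
        # Header row of the register matrix printed by rpc.collectd.
        if "icr cvr isr ivr" in line:
            saw_register_matrix = True
    return {
        "dumps_started": dumps_started,
        "passthru_lines": passthru_lines,
        "saw_register_matrix": saw_register_matrix,
        # Assumption: a register matrix plus passthru lines means a likely crash.
        "likely_crash": saw_register_matrix and passthru_lines > 0,
    }
```

If "likely_crash" is true, the recovery steps in this section apply; a log with only occasional vmunix messages would not trigger it.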

Some crashes are more insidious. For example, if communications between rabbit and maili are lost, you may get the following in the rabbit tip window...

May 14 02:30:52 rabbit vmunix: hostIntrTimeout occurred
May 14 02:30:55 rabbit vmunix: hostIntrRet is -1
May 14 02:30:55 rabbit vmunix: hostIntrRet is -1
May 14 02:30:55 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:55 rabbit rpc.collectd: passthru isr = e
May 14 02:30:55 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit last message repeated 3 times
May 14 02:30:56 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit rpc.collectd: passthru isr = e
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit last message repeated 2 times
May 14 02:30:56 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit last message repeated 2 times
May 14 02:30:56 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:56 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit rpc.collectd: passthru isr = e
May 14 02:30:57 rabbit vmunix: hostIntrRet is -1
May 14 02:30:56 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:56 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:56 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:56 rabbit rpc.collectd: passthru isr = e
May 14 02:30:56 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:57 rabbit vmunix: hostIntrRet is -1
May 14 02:30:57 rabbit vmunix: hostIntrRet is -1
May 14 02:30:57 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:57 rabbit vmunix: hostIntrRet is -1
May 14 02:30:57 rabbit last message repeated 3 times
May 14 02:30:57 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:57 rabbit rpc.collectd: passthru isr = e
May 14 02:30:57 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:57 rabbit vmunix: hostIntrRet is -1
May 14 02:30:58 rabbit last message repeated 3 times
May 14 02:30:58 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:58 rabbit vmunix: hostIntrRet is -1
May 14 02:30:58 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:58 rabbit rpc.collectd: passthru isr = e
May 14 02:30:58 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:58 rabbit vmunix: hostIntrRet is -1
May 14 02:30:58 rabbit last message repeated 3 times
May 14 02:30:58 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:58 rabbit vmunix: hostIntrRet is -1
May 14 02:30:59 rabbit vmunix: hostIntrRet is -1
May 14 02:30:59 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:59 rabbit rpc.collectd: passthru isr = e
May 14 02:30:59 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:59 rabbit vmunix: hostIntrRet is -1
May 14 02:30:59 rabbit last message repeated 3 times
May 14 02:30:59 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:59 rabbit vmunix: hostIntrRet is -1
May 14 02:30:59 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:59 rabbit rpc.collectd: passthru isr = e
May 14 02:30:59 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:59 rabbit vmunix: hostIntrRet is -1
May 14 02:30:59 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:59 rabbit vmunix: hostIntrRet is -1
May 14 02:30:59 rabbit last message repeated 3 times
May 14 02:30:59 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:59 rabbit vmunix: hostIntrRet is -1
May 14 02:31:00 rabbit vmunix: hostIntrRet is -1
May 14 02:31:00 rabbit rpc.collectd: Starting debugInfo.
May 14 02:31:00 rabbit rpc.collectd: passthru isr = e
May 14 02:31:00 rabbit rpc.collectd: passthru cvr = 15
May 14 02:31:00 rabbit vmunix: hostIntrRet is -1
May 14 02:31:00 rabbit last message repeated 3 times
May 14 02:31:00 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:31:02 rabbit nirc_klib: SysErr: Device busy ioctl(fd=3, DspSetStart)
May 14 02:31:02 rabbit rpc.collectd: Starting debugInfo.
May 14 02:31:02 rabbit rpc.collectd: passthru isr = e
May 14 02:31:02 rabbit rpc.collectd: passthru cvr = 15
May 14 02:31:02 rabbit vmunix: hostIntrRet is -1
May 14 02:31:02 rabbit last message repeated 3 times
May 14 02:31:02 rabbit rpc.collectd: passthru ATTACH failed

Corresponding output in the "log tail" window is

May 14 02:30:55 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:55 rabbit rpc.collectd: passthru isr = e
May 14 02:30:55 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:56 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:56 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:56 rabbit rpc.collectd: passthru isr = e
May 14 02:30:56 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:56 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:56 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:56 rabbit rpc.collectd: passthru isr = e
May 14 02:30:56 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:56 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:56 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:56 rabbit rpc.collectd: passthru isr = e
May 14 02:30:56 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:57 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:57 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:57 rabbit rpc.collectd: passthru isr = e
May 14 02:30:57 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:58 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:58 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:58 rabbit rpc.collectd: passthru isr = e
May 14 02:30:58 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:58 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:59 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:59 rabbit rpc.collectd: passthru isr = e
May 14 02:30:59 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:59 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:30:59 rabbit rpc.collectd: Starting debugInfo.
May 14 02:30:59 rabbit rpc.collectd: passthru isr = e
May 14 02:30:59 rabbit rpc.collectd: passthru cvr = 15
May 14 02:30:59 rabbit rpc.collectd: passthru ATTACH failed
May 14 02:31:00 rabbit rpc.collectd: Starting debugInfo.
May 14 02:31:00 rabbit rpc.collectd: passthru isr = e
May 14 02:31:00 rabbit rpc.collectd: passthru cvr = 15
May 14 02:31:00 rabbit rpc.collectd: passthru ATTACH failed
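The pattern in the two listings above (repeated "passthru ATTACH failed" lines, usually accompanied by "hostIntrRet is -1") can be counted with a short script. This is a rough Python sketch under stated assumptions: the function name comms_loss_indicators and the threshold of three failures are illustrative choices, not values taken from the NIRC software.

```python
def comms_loss_indicators(lines, min_attach_failures=3):
    """Count the log messages that accompany a loss of communications.

    min_attach_failures is an assumed threshold, chosen only for illustration.
    """
    attach_failures = sum(1 for line in lines if "passthru ATTACH failed" in line)
    intr_errors = sum(1 for line in lines if "hostIntrRet is -1" in line)
    timed_out = any("hostIntrTimeout occurred" in line for line in lines)
    return {
        "attach_failures": attach_failures,
        "intr_errors": intr_errors,
        "timed_out": timed_out,
        "likely_comms_loss": attach_failures >= min_attach_failures,
    }
```

Counting ATTACH failures rather than debugInfo starts is deliberate: the ordinary crash dump shown earlier in this section also starts debugInfo, but it does not report ATTACH failures.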

To recover, first try the "Reset NIRC Software" menu item. If this does not work, you may have to reboot rabbit, either with the "Reboot Rabbit" option on the pull-down menu or by running reboot_rabbit from a maili xterm. When the script completes, restart the NIRC observing software. Once it is up, typing a carriage return in the rabbit tip window will give you a "login:" prompt.

If all the scripts fail, reboot rabbit manually from a maili xterm as follows:

     > telnet k1consoles 2016
     > ^]   (control right bracket)
     > send break
     > boot   (at the ok prompt)

You may skip the first step if the "rabbit tip via telnet" xterm is still available. If it is, use that xterm to complete the reboot sequence.

Only one telnet session may be connected at any given time. If the "tip" window fails to appear, check whether another maili tip window is still open by issuing the command:

     ps -axw | grep nircxterm | grep tip
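The same filter can be applied to saved ps output, which is handy when the listing is long. The sketch below mirrors the two grep stages in Python; the function name find_tip_sessions is hypothetical, and it assumes the output of "ps -axw" has already been captured as a single string.

```python
def find_tip_sessions(ps_output):
    """Return the ps lines that mention both 'nircxterm' and 'tip'.

    Equivalent in spirit to: ps -axw | grep nircxterm | grep tip
    """
    matches = []
    for line in ps_output.splitlines():
        if "nircxterm" in line and "tip" in line:
            matches.append(line.strip())
    return matches
```

A non-empty result suggests a tip session is still holding the telnet connection and should be closed before a new one is started.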

Rabbit Tip Via Telnet Session: no "rabbit login" prompt

Symptom: Pressing return in the xterm labeled "Rabbit Tip Via Telnet" does not lead to a "rabbit login:" prompt.

    Problem and Solution:
  • It may be that rabbit is not properly connected at the instrument. Check that the RS232 connection through the Lantronix terminal server is properly connected.
  • The serial converter may have crashed. Try rebooting it. The pictures below show its location. The serial converters are located to the left of MAILI (looking from the back, see Pic 1). To power cycle the converter, either pull the plug from the right side of the converter or pull the wall-wart out of the outlet.

Pic 1: Serial converter is located left of MAILI

Pic 2: Close-up of serial converter.

Rabbit daemons not running

Symptom: Rabbit daemons are not running. Typing ct on maili or rabbit does not yield the following results.
    on maili...

    on rabbit...
    nirc     18017  0.0  0.2  132  124 ?  S    12:56   0:03 rpc.xycomd
    nirc     18016  0.0  0.2  136  116 ?  S    12:56   0:00 rpc.motord
    nirc     18015  0.0  0.4  304  264 ?  S    12:56   0:04 rpc.collectd

    Problem and Solution:
  • Daemons are not running on rabbit.
  • rlogin to rabbit from maili
  • type "cdi" - to change to install directory
  • type "startup" - to restart daemons
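To compare a process listing against the expected rabbit daemons, a small check like the following may help. It is a sketch only: the function name missing_daemons is hypothetical, the daemon names are taken from the rabbit listing above, and it assumes the listing has the command name as the last column, as in that example.

```python
EXPECTED_RABBIT_DAEMONS = ("rpc.xycomd", "rpc.motord", "rpc.collectd")

def missing_daemons(ps_output, expected=EXPECTED_RABBIT_DAEMONS):
    """Return the expected daemons that do not appear in the process listing."""
    running = set()
    for line in ps_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Assumption: the command is the last column; strip any leading path.
        name = fields[-1].rsplit("/", 1)[-1]
        if name in expected:
            running.add(name)
    return [daemon for daemon in expected if daemon not in running]
```

An empty list means all three daemons were found. Any names returned indicate that the "cdi" and "startup" steps above are needed.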

Images fail to write to a scratch directory, or images take a long time to write. Image does not display

Symptom: Images take a long time to complete, and you may think that the failure mode is a rabbit crash. Images also do not display in figdisp. A 1 sec exposure takes on the order of one minute to complete. The output on rabbit is the following:
Mar 12 17:19:34 rabbit acquire_nirc: RFits_create: clnt_create failed with" 
                : RPC: Remote system error - Connection timed out".
Mar 12 17:19:34 rabbit acquire_nirc: Couldn't open nirc@maili:/sdata309/nirc/2008mar13.
Mar 12 17:19:34 rabbit acquire_nirc: createFile failed, trying alternates.
Mar 12 17:20:49 rabbit acquire_nirc: RFits_create: clnt_create failed with "
                 : RPC: Remote system error - Connection timed out".
Mar 12 17:20:49 rabbit acquire_nirc: Couldn't open nirc@maili:/scratch.
Mar 12 17:20:55 rabbit nirc_klib: createFile succeeded with alternate(nirc@maili:/scratch2).
    Problem and Solution:
  • The data directory has not been properly defined.
  • Run "newdir".
  • Show the "outdir" keyword. Ensure that the directory from newdir agrees with outdir, and correct it if necessary.
  • Take a test image and make sure that it displays in figdisp and that a hotdata.fits is written to the data directory.
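The log excerpt above shows which directories failed and which alternate was finally used. A short parser can pull that out of a longer log. This is a hedged Python sketch: the function name summarize_write_fallback and the regular expressions are assumptions based solely on the message wording in the excerpt above.

```python
import re

# Patterns assumed from the acquire_nirc / nirc_klib messages shown above.
FAIL_RE = re.compile(r"Couldn't open (?:\w+@\w+:)?(\S+?)\.?$")
ALT_RE = re.compile(r"createFile succeeded with alternate\((?:\w+@\w+:)?([^)]+)\)")

def summarize_write_fallback(lines):
    """Collect directories that could not be opened and the alternate that worked."""
    failed = []
    alternate = None
    for line in lines:
        line = line.rstrip()
        m = FAIL_RE.search(line)
        if m:
            failed.append(m.group(1))
        m = ALT_RE.search(line)
        if m:
            alternate = m.group(1)
    return {"failed": failed, "alternate": alternate}
```

If "alternate" is set, images are being written somewhere other than the intended data directory, which is the situation that "newdir" and the "outdir" check are meant to correct.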

XSHOW does not update when observing parameters are updated.

Symptom: XSHOW appears to be hung. Updated observing parameters are not displayed in the XSHOW window.
    Problem and Solution:
  • XSHOW can't read the log file that records parameter changes
  • Test whether the log file is corrupt by issuing the following command:
     echo This is a test >> /sdata310/log/nirclog 
    In the log tail window, you should see "This is a test" displayed. Adjust permissions if necessary. Type "cdlog" to change to the log directory.
  • The syslog process may need to be restarted.
    • Log in on maili as root.
    • Identify the syslog process by issuing the command
       ps -auxww | grep sysl 
    • Restart syslog with
       kill -HUP pid_of_syslog  
  • Test whether XSHOW will now update by modifying observing parameters, for example tint=10 or object=Santa.
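Picking the syslog PID out of the ps listing by eye is easy to get wrong, since the grep command itself also matches. The sketch below extracts the PID from captured "ps -auxww" output. The function name find_syslog_pids is hypothetical, and it assumes the 11-column BSD layout (USER, PID, ..., COMMAND) with no spaces inside the START column.

```python
def find_syslog_pids(ps_output):
    """Return PIDs of processes whose command contains 'sysl', excluding grep."""
    pids = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the full command stays in one piece.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skips the header row and malformed lines
        command = fields[10]
        if "sysl" in command and not command.startswith("grep"):
            pids.append(int(fields[1]))
    return pids
```

The PID returned is the value to use in place of pid_of_syslog in the kill -HUP step above.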

Rabbit not responding

Symptom: Data taking has halted. A ping check of rabbit from maili indicates that you can't connect to rabbit. The terminal server connection also exhibits behavior as if it can't talk to rabbit. TKloger pops up a warning that there is something wrong with the DSP. After trying to reboot rabbit from the pulldown menu, the p3 window looks like it is hung with the last output reading ...
Starting tip session ...[1] 13837
Starting log tail session ...[4] 13844
Checking rabbit software ...[1]    Done                   kill_caRepeater >& /dev/null
 
Timed out waiting for rabbit software to respond.
Will try to initialize...
    Problem and Solution: Rabbit is not powered. There could be several reasons for this. It may be that the electronics have overheated. There is a thermal switch which cuts power at a temperature that is unknown. Below are some things to check.
  • Bring the telescope to horizon and rotate so that you have access to NIRC.
  • Check the glycol supply and return valves. Both valves should be open (handle in line with the hose).
  • Check the glycol flow indicator on the instrument. The flow indicator should be rotating so fast that you can't see it rotating, but you should hear it moving.
  • Check that the rocker switches for the NIRC fans are on and that clean and commercial power are on.

Guider images are not normal

Symptom: The guider images do not look normal, even though the afternoon guider checkout indicated that the guider was functioning normally. Current images exhibit bad row readouts, as seen in the two screen grabs below.

Problem and Solution:

This may indicate that a connector somewhere in the cable chain is not well seated, so that moving in elevation changes the pin contacts. Check the connectors for the guider. Remember that NIRC is removed from FWD Cass so that the instrument may be filled with LHe and LN2, so it is likely that the connectors at the FWD Cass connector panel need to be reseated. Below is a picture of the FWD Cass connector panel, which shows the photometric guide camera connector. Power down the guider and then try reseating the connectors.
