1) How can one determine whether IP forwarding on keamano is enabled? i.e., what software should be running and how do you determine whether it's operating properly?

2) What remedial action was necessary? i.e., what commands did you execute on keamano and under what account are they to be run?


Short version

  1. Check IP forwarding on keamano by running routeadm (as either root or kics). The output will be something like:
    [1016] kics@keamano% routeadm
                  Configuration   Current              Current
                         Option   Configuration        System State
    ---------------------------------------------------------------
                   IPv4 routing   disabled             disabled
                   IPv6 routing   disabled             disabled
                IPv4 forwarding   enabled              enabled
                IPv6 forwarding   disabled             disabled
    
        As of Feb 2015, following an upgrade to keamano, the third row
        (IPv4 forwarding) says whether IP forwarding is enabled or disabled.
        Both the Current Configuration and the Current System State columns
        need to read "enabled".
    
      
    
  2. Turn on IP forwarding: If IP forwarding is disabled, we need to enable it. Ideally, once we enable IP forwarding, we will never need to correct this again. It was last corrected on a date near to my heart: 14 Feb 2015.
    1. To enable it in the current configuration, run:
      1. routeadm -d ipv4-forwarding
      2. routeadm -e ipv4-forwarding
    2. You then need to set the current system state to apply the changes above:

      routeadm -u

  3. It is likely that the rotator software is in a confused state since it was completely unable to talk to DCS. Therefore restart it:
    as dmoseng on polo run deirot_restart
    or as kics on keamano run deimos restart deirot
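
The check in step 1 can be scripted. Below is a minimal sketch, assuming routeadm prints its table in the layout shown above. The function names ipv4_forwarding_state and forwarding_ok are made up for illustration and are not part of any existing DEIMOS or Keck software.

```shell
#!/bin/sh
# ipv4_forwarding_state: read routeadm-style output on stdin and print the
# two columns of the "IPv4 forwarding" row as "<configuration> <system state>".
ipv4_forwarding_state() {
    awk '$1 == "IPv4" && $2 == "forwarding" { print $3, $4 }'
}

# forwarding_ok: succeed (exit status 0) only when both columns read "enabled".
forwarding_ok() {
    [ "$(ipv4_forwarding_state)" = "enabled enabled" ]
}
```

On keamano this could be used as: routeadm | forwarding_ok || echo "IPv4 forwarding needs attention". The sketch only reports the state; the routeadm commands in step 2 still have to be run by hand.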

Long Version

Background

Polo and keamano have two network interfaces: one to the public .keck.hawaii.edu network, and the other to the DEIMOS private network. Roto, however, has a connection *only* to the private network. It can send packets to hosts outside the private net because it's configured to route such packets through keamano, and keamano is configured to do "IP forwarding" of packets between its own two network interfaces. Other hosts -- such as the DCS crates, DNS servers, etc -- can successfully reply to roto only if they have been configured to know that packets to roto (192.168.6.6) must be routed via keamano. I believe that only tdc2 and aux2 have been given such "static" routes, whereas other hosts such as Keck's DNS servers have not been told a route to roto. Therefore, K2's DCS crates are the only hosts outside the DEIMOS private network that can reply to any message from roto.
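
To make the routing rule concrete, here is a toy sketch of the decision roto makes for each outgoing packet. It assumes the two-entry routing table shown in step 4 of the long version (private net 192.168.6.0/24, default gateway 192.168.6.1, i.e. keamano). The function name next_hop is hypothetical; it only models the rule and does not touch any real routing table.

```shell
#!/bin/sh
# next_hop DEST: print how roto would send a packet to the IPv4 address DEST,
# assuming the private net is 192.168.6.0/24 and keamano (192.168.6.1) is the
# default gateway.
next_hop() {
    case "$1" in
        192.168.6.*) echo "direct" ;;            # on the DEIMOS private network
        *)           echo "via 192.168.6.1" ;;   # everything else goes through keamano
    esac
}
```

For example, next_hop 192.168.6.6 prints "direct", while any address outside the private network prints "via 192.168.6.1". The reply only gets back if the remote host in turn knows to route 192.168.6.6 via keamano.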

Which takes us to: troubleshooting when rotation control appears to work in all aspects except that DCS tracking fails.

According to the DEIMOS master GD Wirth the Second (MK), this has happened twice in DEIMOS's history.

Symptoms:

Other email communication

Below appears to be communication that was captured during troubleshooting efforts. The suggestions below should be tried with caution....

From 2003: However, if deirotd had already learned that CURRINST = deimos before IP forwarding was disabled, it is possible that the precise symptom would be different. This would be a useful test to try today: put deirot into dcs-tracking mode, disable IP forwarding, and see what happens in the logfile.

  1. In an effort to find out if this is a roto-specific problem, a deirot-specific problem, or a more general problem, try showing a DCS keyword from either polo or keamano, and then try showing the same keyword from roto.
    	kics@polo% show -s dcs2 currinst
    		currinst = DEIMOS
    If this hadn't worked, the problem would plainly be more widespread than roto. Go investigate other sources of trouble, e.g. ensure that DCS is up and functioning.
  2. If the "show" command in step (1) does work from polo, log into roto as kics, and do the same thing:
    	rotop:kics% show -s dcs2 currinst
    (Note that the service name *must* be "dcs2", not "dcs"; otherwise the command will fail with the complaint that the shareable library libdcs_keyword.so cannot be found. This is done to avoid any ambiguity about which DCS service is used by roto -- we don't want roto accidentally sending rotator state updates to K1!) If the show command succeeds, the problem is within deirot itself. Try restarting the rotator software: "deimos restart deirot".
  3. If the "show" command in step (2) fails, saying that it can't contact the DCS service, check if roto can ping the aux2 and tdc2 crates:
    	rotop:kics% ping aux2
    	...
    	rotop:kics% ping tdc2
    	...
    If the ping command succeeds, the source of trouble is completely unclear to me... as an act of desperation, reboot roto by opening the Bay C panel on DEIMOS, and power cycling the rotation computer (it's in a pizza-box style enclosure).
  4. If the ping command in step (3) fails, then roto can't talk to the crates -- yet we know they must be functioning, because the "show" command worked from polo (or keamano). That suggests that roto can't "see" outside the DEIMOS private network. This might be due to (a) roto having a misconfigured routing table, so that it doesn't know to send non-private-network packets via keamano, (b) keamano having IP forwarding turned off, or (c) aux2 and/or tdc2 not having a route back to roto.

    Check roto's routes:

        rotop:kics 102 % netstat -nr
        Kernel IP routing table
        Destination   Gateway       Genmask       Flags MSS Window  irtt Iface
        192.168.6.0   0.0.0.0       255.255.255.0 U       0 0          0 eth0
        0.0.0.0       192.168.6.1   0.0.0.0       UG      0 0          0 eth0
    This shows that all packets going outside the DEIMOS private network are routed through 192.168.6.1, i.e. keamano.

    If the routing is wrong, you can try reconfiguring the routes ("route add default gw 192.168.6.1") or brutally rebooting as described in step (3).

  5. If the routing in step (4) is correct, check that IP forwarding is enabled on keamano:
    	keamano%  /etc/init.d/ip_forward
    The output will be something like:
    	Usage: /etc/init.d/ip_forward { start | stop }
    	Currently: ndd -get /dev/tcp ip_forwarding = 1
    The second line says that ip_forwarding is either on (ip_forwarding = 1) or off (ip_forwarding = 0). If it is on, ask your Keck experts to check that aux2 and tdc2 have static routes to roto (192.168.6.6) that go via keamano.
  6. If the test in step (5) shows IP forwarding is off, turn it on by becoming root on keamano and typing
    	keamano# /etc/init.d/ip_forward start
    It is likely that the rotator software is in a confused state, since it was completely unable to talk to DCS. Therefore restart it:
        kics@polo%  deimos restart deirot
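
The sequence in steps 1-6 can be summarized as a small decision function. This is only a sketch: the function names default_gateway and next_action are hypothetical, each check still has to be run by hand, and default_gateway assumes the Linux-style "netstat -nr" layout shown in step 4.

```shell
#!/bin/sh
# default_gateway: read "netstat -nr" output on stdin and print the gateway
# of the default route (destination 0.0.0.0), if there is one.
default_gateway() {
    awk '$1 == "0.0.0.0" { print $2; exit }'
}

# next_action SHOW_POLO SHOW_ROTO PING ROUTE FORWARDING
# Each argument is "ok" or "fail": the outcome of the "show" from polo,
# the "show" from roto, the ping from roto, the route check on roto, and
# the IP forwarding check on keamano. Prints the action the steps suggest.
next_action() {
    if [ "$1" != ok ]; then
        echo "problem is wider than roto: check that DCS is up"
    elif [ "$2" = ok ]; then
        echo "problem is within deirot: deimos restart deirot"
    elif [ "$3" = ok ]; then
        echo "cause unclear: power cycle roto"
    elif [ "$4" != ok ]; then
        echo "fix roto default route (should be 192.168.6.1) or reboot roto"
    elif [ "$5" != ok ]; then
        echo "enable IP forwarding on keamano, then restart deirot"
    else
        echo "ask Keck experts to check static routes on aux2 and tdc2"
    fi
}
```

For instance, if "show" works from polo but not from roto, the pings fail, the route is correct, and forwarding is off, then next_action ok fail fail ok fail prints the step (6) remedy.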