The ESI Dispatchers and Galil Controllers
The ESI Dispatchers
The ESI software on the UNIX side talks to the two Galil controllers
through "dispatchers." Galil controller 0 uses dispatcher 0 and Galil
controller 1 uses dispatcher 1. The dispatchers are represented in the "Esicon"
control panel by two buttons. If the button is not blue, then the dispatcher
is not running. You can also check from the esieng account on koki by typing
"esi status dispatcher2.0" and "esi status dispatcher2.1".
Finally, you can type (on koki) "ps -auxww | grep dispatcher"
which should show you the dispatcher processes. Note that they are of the
form:
1642 ? S 0:00 dispatcher2 -s esi -n 1 -l /kroot/var/log/dispatcher_lo
1670 ? S 0:00 dispatcher2 -s esi -n 0 -l /kroot/var/log/dispatcher_lo
In this output note that the "2" in "dispatcher2" has
nothing whatsoever to do with which dispatcher is run by that process. The
"-n 0" and "-n 1" parts of the command
give this information.
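As a minimal sketch of reading that output programmatically (the function name dispatcher_numbers is hypothetical and not part of the ESI software, and it assumes lines in the abbreviated form shown above, with the PID as the first field), the following Python pulls the dispatcher number from the "-n" argument and ignores the "2" in the program name:

```python
# Sample lines copied from the ps output shown above
# (the log file paths are truncated there and are left as-is).
PS_OUTPUT = """\
1642 ? S 0:00 dispatcher2 -s esi -n 1 -l /kroot/var/log/dispatcher_lo
1670 ? S 0:00 dispatcher2 -s esi -n 0 -l /kroot/var/log/dispatcher_lo
"""


def dispatcher_numbers(ps_text):
    """Map dispatcher number (the value after -n) to PID.

    Assumes the PID is the first whitespace-separated field, as in the
    sample above; full 'ps -auxww' output may put a user name first.
    """
    found = {}
    for line in ps_text.splitlines():
        fields = line.split()
        # Skip unrelated lines, such as the grep process itself.
        if "dispatcher2" not in fields or "-n" not in fields:
            continue
        n_index = fields.index("-n")
        if n_index + 1 >= len(fields):
            continue  # "-n" with no value after it
        found[int(fields[n_index + 1])] = int(fields[0])
    return found


if __name__ == "__main__":
    for number, pid in sorted(dispatcher_numbers(PS_OUTPUT).items()):
        print(f"dispatcher {number} is process {pid}")
```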
Normally, the dispatchers are started automatically when koki is booted.
If the dispatchers are stopped, they need to be manually restarted from
the esieng account on koki, using the commands: "esi start dispatcher2.0"
and "esi start dispatcher2.1". These are also available
from the esieng account's "Engineering" pull-down menu as "Restart
dispatcher 0" and "Restart dispatcher 1".
The Galil controllers can also be accessed through the Lantronix terminal
server and the main Galil serial ports. Galil 0 connects to server port
2001 and Galil 1 to port 2003. For example:
telnet 192.168.5.2 2001
will attempt to connect to port 2001 and Galil 0. However, note that while
the dispatchers are running, they "own" their respective ports
on the terminal server, and attempts to telnet to the port will be refused.
For example, if dispatcher 0 is running, any attempt to telnet to port 2001
will be refused by the Lantronix. Should you need to obtain a telnet connection
to a Galil controller's serial port, you need to stop the corresponding
dispatcher first.
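A quick way to see whether a port is currently free is to attempt a plain TCP connection, which is what telnet does first. The Python sketch below is only an illustration (port_accepts_connection is a hypothetical helper, not part of the ESI software). A False result means the connection was refused or timed out, which is consistent with a running dispatcher owning the port but does not prove it. A True result means the probe itself briefly held the port, so do not run it while a dispatcher is trying to start.

```python
import socket


def port_accepts_connection(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds.

    Returns False if the connection is refused, times out, or fails
    for any other network reason.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError and timeouts are both OSError subclasses.
        return False


if __name__ == "__main__":
    # Address and port numbers are the ones given in the text above.
    for galil, port in ((0, 2001), (1, 2003)):
        free = port_accepts_connection("192.168.5.2", port)
        print(f"Galil {galil} port {port}: {'accepts' if free else 'refuses'} connections")
```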
Known Problems
The Lantronix needs to have a valid, properly terminated ThinNet Ethernet
connection before you apply power to it in order to boot correctly. Unfortunately,
booting correctly doesn't guarantee that it will come up in
a useful state.
The fundamental problem here is that neither the Lantronix nor the Galils
behave deterministically when power cycled. It appears that their serial
interface hardware can spit out some stray characters when power is first
applied. If the Lantronix senses these characters, it wrongly concludes
that the Galil controller is trying to log into the Lantronix, and thus
it puts that port into "Job Service". Once that happens, you can't
telnet into the port nor can the dispatcher establish a connection to that
port. The only recourse is to log into the main port (i.e., portless) connection,
set privileged mode, and logout the spurious "connections" on
ports 2001 and 2003, as described below. We would love to find a way to
get the Lantronix to ignore these stray characters on power up, or to prevent
the Galil from generating them on power up. So far we haven't. The problem
can be corrected as follows:
- From the esieng account on koki, first stop any ESI dispatchers using
the commands:
esi stop dispatcher2.0
esi stop dispatcher2.1
- Telnet directly into the IP address of the Lantronix, but without specifying
any port number:
telnet 192.168.5.2
- It should respond with:
Trying 192.168.5.2...
Connected to 192.168.5.2.
Escape character is '^]'.
Lantronix ETS8P Version V3.5/1(970325)
Type HELP at the 'Local_10> ' prompt for assistance
Username>
- In response to the username prompt, type "esi",
then hit the RETURN key. It should respond with: Local_10>
- To display the status of port 2001 (i.e., the one that connects to
the main serial port of Galil controller 0), in response to the "Local_10>"
prompt, type:
show port 1
- It should respond with something like:
Local_10> show port 1
Port 1 : Username:                       Physical Port 1 (Job Service)
Char Size/Stop Bits:      8/1            Input Speed:     19200
Flow Ctrl:                Cts/Rts        Output Speed:    19200
Parity:                   None           Modem Control:   None
Access:                   Remote         Local Switch:    None
Backward:                 None           Port Name:       Port_1
Break Ctrl:               Local          Session Limit:   4
Forward:                  None           Terminal Type:   Soft()
Preferred Services:       (Lat) (Telnet)
Authorized Groups :       0
(Current) Groups :        0
Characteristics:          Autoprompt Broadcast Loss Notify Verify Remote Conf Telnet Pad
- If you have indeed stopped dispatcher 0 and no one else has already
telnetted into this port, then the port should display as (Idle), i.e.,
the top line of the display should look like this:
Port 1 : Username:                       Physical Port 1 (Idle)
If, with dispatcher 0 stopped and no one telnetted into port 2001, the
port is shown as being in (Job Service), then the Lantronix is confused
and needs to be set straight. Our initial output from the "show
port 1" command illustrates this state. To correct the Lantronix's
confusion, issue the following commands in response to the "Local_10>"
prompt:
set priv
followed by hitting the RETURN key, to which it will
respond with:
Password>
In response to the "Password>" prompt, type the
password. If you don't know the password, ask one of the software
on-call people.
- If you have done this correctly, the Lantronix should respond with
a slightly different prompt: Local_10>> The second ">"
character confirms that you have put the Lantronix into privileged mode.
- Now type the command:
logout port 1
which should put port 2001 back into the "Idle" state. You can confirm
this by issuing another "show port 1" command.
- To reset port 2003 (i.e., the one that connects to the main serial
port on Galil controller 1), you would do the same sequence of operations,
except substituting a "3" for the "1", e.g.:
show port 3
logout port 3
show port 3
- Once you have used the commands "show port 1" and
"show port 3" to confirm that both ports are in the
"Idle" state, you should be able to establish telnet connections
to the respective port numbers 2001 and 2003.
- To check that Galil controller 0 is responding appropriately:
telnet 192.168.5.2 2001
Once connected, hit the return key several times. If the Galil is responding,
it should echo back a colon character (i.e., ":" ) as a prompt.
If, instead, it echoes back a question mark, hit several more return keys
until you get a ":" prompt. If you are unable to obtain a colon
prompt, then that Galil controller probably needs to be reset or power
cycled. Once you have confirmed that Galil controller 0 is responding and
it has issued a ":" prompt, carefully (and without hitting any
other characters) enter the telnet escape sequence (i.e., while holding
down the control key hit the right-bracket key) to terminate the telnet
connection to Lantronix port 2001.
- To check that Galil controller 1 is responding appropriately:
telnet 192.168.5.2 2003
and do the same as you did for the other controller. Once you have confirmed
that Galil controller 1 is responding and it has issued a ":"
prompt, carefully (and without hitting any other characters) enter the
telnet escape sequence (i.e., while holding down the control key hit the
right-bracket key) to terminate the telnet connection to Lantronix port
2003.
- At this point, you have confirmed that both the Lantronix and Galil
controllers are properly initialized, so you should be able to restart
both dispatchers from the Engineering pull-down menu in the esieng account
on koki.
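The check at the heart of this procedure, reading the state shown in parentheses after "Port N" in the "show port" output, can be sketched in Python as follows. This is an illustration only: the helper names are hypothetical, the port mapping covers just the two ports named in this document, and the state is meaningful only when the matching dispatcher is stopped and nobody is telnetted into the port.

```python
import re

# TCP port on the terminal server -> Lantronix physical port number,
# limited to the two pairs named in this document.
PHYSICAL_PORT = {2001: 1, 2003: 3}


def port_state(show_port_text):
    """Return the text in parentheses after 'Port N', e.g. 'Idle'.

    Returns None if no such state is found in the text.
    """
    match = re.search(r"Port\s+\d+\s+\(([^)]*)\)", show_port_text)
    return match.group(1) if match else None


def needs_logout(show_port_text):
    """True if the port is stuck in Job Service.

    Only valid with the dispatcher stopped and no telnet session open.
    """
    return port_state(show_port_text) == "Job Service"


if __name__ == "__main__":
    sample = "Port 1 : Username: Physical Port 1 (Job Service)"
    port = PHYSICAL_PORT[2001]
    if needs_logout(sample):
        print(f"In privileged mode, issue: logout port {port}")
```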
To determine which keywords are handled by which dispatcher, use one
of the commands:
show -s esi memes | grep -i esi_dispatch_0
show -s esi memes | grep -i esi_dispatch_1
Trouble-Shooting the Dispatchers
If you tail the most recent dispatcher log file for a given dispatcher,
it should give you some idea as to what the problem might be. The "esi
status dispatcher2.0" or "esi status dispatcher2.1" command,
run on koki, should give you the path to this log file. To determine whether
or not the Lantronix has booted correctly:
- Try pinging it, i.e.:
ping -sv 192.168.5.2
- Try logging into its master port (i.e., don't specify a port):
telnet 192.168.5.2
If you can do both of these steps, then the Lantronix has successfully
booted. However, as noted earlier, if the instrument (or pieces of it)
have been power cycled, the Lantronix may have erroneously put either port
2001 or 2003 into "Job Service". In that case, follow the 14-step
procedure outlined above to clear the problem.
ESI Master
3 January 2000