NIRC Boot Problem Diagnostics
Following is an e-mail from Al Conrad which describes various induced failures and the output produced. This should prove useful in diagnosing failures of the NIRC test station.

From aconrad Wed Jul 2 16:20:05 1997
To: jp@ssl.berkeley.edu, lipman@isi9.ssl.berkeley.edu, pjb@hypercam.ssl.berkeley.edu, wishnow@hypercam.ssl.berkeley.edu
Subject: boot tests on kaimu/taz
Cc: aconrad, jchock, randyc

I ran 8 boot tests using kaimu/taz

    boot1 - normal boot
    boot2 - pulled taz cable from hub
    boot3 - pulled kaimu cable from hub
    boot4 - killed the rarpd hme0 daemon
    boot5 - killed the other four rarpd
    boot6 - type "boot" at ok prompt instead of "boot hme"
    boot7 - mv /etc/ethers /etc/ethers.bak --- booted fine
    boot8 - mv /etc/ethers /etc/ethers.bak and restart rarpd --- booted fine

The screen dumps are appended below. Some highlights:

1. Of these, the only boot that duplicates the error message seen at Berkeley was boot6 (typing "boot" instead of "boot hme"). It gives this four line message at 1Hz:

    Timeout waiting for ARP/RARP packet
    Lost Carrier (transceiver cable problem?)
    Cable problem or twisted pair hub link-test disabled.
    Use the PROM command "help ethernet" for more information.

2. boot2 (crate side pulled from hub) gives this two line message at 1Hz:

    Timeout reading Link status. Check cable and try again.
    Timeout waiting for AutoNegotiation Status to be updated.

3. boot3 (kaimu cable pulled) and boot5 (no rarpd on kaimu) give this one line message at 1Hz:

    Timeout waiting for ARP/RARP packet

4. note that although all three, boot3, boot5, and boot6 give the one line message

    Timeout waiting for ARP/RARP packet

   only boot6 gives the additional 3 line message

    Lost Carrier (transceiver cable problem?)
    Cable problem or twisted pair hub link-test disabled.
    Use the PROM command "help ethernet" for more information.

5. I don't think boot4, boot7, or boot8 are relevant, but I include them for the hell of it.

6. Jon tells me that the only explanation for why we have to type "boot hme" and you don't is that the non-volatile RAM (EEPROM) on lwc1 can be configured this way.

7. There are other EEPROM settings that Jon (on taz) and Everett (on lwc1) had to make to get 100baseT to work. Everett sent these to me. I will dig them up and send them in a successive mail (these are covered on pages 4-13 and 4-14 in the manual that came with the board "SunFastEthernet Adapter 2.0 Installation and User's Guide").

8. as per #7 above, swapping the force board won't work until the eeprom settings are changed on the new board.

Hope this helps,
Al

---------- boot1 - normal boot ------------------------------------------

Type help for more information
ok boot hme
Resetting ...
initializing TLB
initializing cache
Allocating SRMMU Context Table
Setting SRMMU Context Register
Setting SRMMU Context Table Pointer Register
Allocating SRMMU Level 1 Table
Mapping RAM
Mapping ROM
ttya initialized
Probing Memory Bank #0 8 Megabytes
Probing Memory Bank #1 8 Megabytes
Probing Memory Bank #2 Nothing there
Probing Memory Bank #3 Nothing there
Probing Memory Bank #4 Nothing there
Probing Memory Bank #5 Nothing there
Probing Memory Bank #6 Nothing there
Probing Memory Bank #7 Nothing there
Probing CPU FMI,MB86904
VMEbus Interface Test
Slave Base Register test ... passed.
Control Register test ... passed.
S4 Slave Map Register ... passed.
S4 Mailbox Register ... passed.
S4 Mailbox Interrupt Level Register ... passed.
Performing system configuration (Watchdog Timer, Abort Switch, etc) ... done!
Probing /iommu@0,10000000/sbus@0,10001000 at 5,0 espdma esp sd st SUNW,bpp ledma le
Probing /iommu@0,10000000/sbus@0,10001000 at 1,0 Nothing there
Probing /iommu@0,10000000/sbus@0,10001000 at 2,0 SUNW,hme
Probing /iommu@0,10000000/sbus@0,10001000 at 3,0 BCE,SBusDSP
screen not found.
Can't open input device.
Keyboard not present. Using tty for input and output.
SPARC CPU-5CE, No Keyboard
ROM Rev. 2.15.1, 16 MB memory installed, Serial #9138410.
Ethernet address 0:80:42:b:0:ea, Host ID: 808b70ea.
Initialising VMEbus device ... clearing SYSFAIL* signal ... done!
Rebooting with command: hme
Boot device: /iommu@0,10000000/sbus@0,10001000/hme@2,8c00000 File and args:
Using Onboard Transceiver - Link Up.
1d400
Server IP address: 192.168.2.100
Client IP address: 192.168.2.101
Using Onboard Transceiver - Link Up.
Using Onboard Transceiver - Link Up.
Using IP Address 192.168.2.101 = C0A80265
hostname: tazp
domainname: yp2.keck.hawaii.edu
server name 'kaimup'
root pathname '/export/root/tazp'
root on kaimup:/export/root/tazp fstype nfs
Boot: vmunix
Size: 1318912..... [i stopped it at this point]

---------- boot2 - pulled taz cable from hub ----------------------------

ok boot hme
Resetting ...
initializing TLB
initializing cache
Allocating SRMMU Context Table
Setting SRMMU Context Register
Setting SRMMU Context Table Pointer Register
Allocating SRMMU Level 1 Table
Mapping RAM
Mapping ROM
ttya initialized
Probing Memory Bank #0 8 Megabytes
Probing Memory Bank #1 8 Megabytes
Probing Memory Bank #2 Nothing there
Probing Memory Bank #3 Nothing there
Probing Memory Bank #4 Nothing there
Probing Memory Bank #5 Nothing there
Probing Memory Bank #6 Nothing there
Probing Memory Bank #7 Nothing there
Probing CPU FMI,MB86904
VMEbus Interface Test
Slave Base Register test ... passed.
Control Register test ... passed.
S4 Slave Map Register ... passed.
S4 Mailbox Register ... passed.
S4 Mailbox Interrupt Level Register ... passed.
Performing system configuration (Watchdog Timer, Abort Switch, etc) ... done!
Probing /iommu@0,10000000/sbus@0,10001000 at 5,0 espdma esp sd st SUNW,bpp ledma le
Probing /iommu@0,10000000/sbus@0,10001000 at 1,0 Nothing there
Probing /iommu@0,10000000/sbus@0,10001000 at 2,0 SUNW,hme
Probing /iommu@0,10000000/sbus@0,10001000 at 3,0 BCE,SBusDSP
screen not found.
Can't open input device.
Keyboard not present. Using tty for input and output.
SPARC CPU-5CE, No Keyboard
ROM Rev. 2.15.1, 16 MB memory installed, Serial #9138410.
Ethernet address 0:80:42:b:0:ea, Host ID: 808b70ea.
Initialising VMEbus device ... clearing SYSFAIL* signal ... done!
Rebooting with command: hme
Boot device: /iommu@0,10000000/sbus@0,10001000/hme@2,8c00000 File and args:
Using Onboard Transceiver - Timeout waiting for AutoNegotiation Status to be updated.
Timeout reading Link status. Check cable and try again.
Timeout waiting for AutoNegotiation Status to be updated.
Timeout reading Link status. Check cable and try again.
Timeout waiting for AutoNegotiation Status to be updated.
Timeout reading Link status. Check cable and try again.
Timeout waiting for AutoNegotiation Status to be updated.
<over and over at 1Hz>

---------- boot3 - pulled kaimu cable from hub --------------------------

ok boot hme
Boot device: /iommu@0,10000000/sbus@0,10001000/hme@2,8c00000 File and args:
Using Onboard Transceiver - Link Up.
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
~
ok

---------- boot4 - killed the rarpd hme0 daemon --------------------------

kaimu# ps auxw | grep rarp
root 119 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd hme0
root 120 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd hme0
root 122 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 123 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
root 124 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 125 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
root 20842 0.0 0.7 32 200 p5 S 15:28 0:00 grep rarp
kaimu# grep rarpd /etc/rc.local
# added the next line in to force rarpd to look at the
rarpd hme0; echo -n ' rarpd hme0 '
rarpd -a; echo -n ' rarpd'
kaimu# kill -9 119
kaimu# !ps
ps auxw | grep rarp
root 20852 0.0 0.7 32 200 p5 S 15:29 0:00 grep rarp
root 122 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 123 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
root 124 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 125 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
kaimu#

then it booted up fine:

Performing system configuration (Watchdog Timer, Abort Switch, etc) ... done!
Probing /iommu@0,10000000/sbus@0,10001000 at 5,0 espdma esp sd st SUNW,bpp ledma le
Probing /iommu@0,10000000/sbus@0,10001000 at 1,0 Nothing there
Probing /iommu@0,10000000/sbus@0,10001000 at 2,0 SUNW,hme
Probing /iommu@0,10000000/sbus@0,10001000 at 3,0 BCE,SBusDSP
screen not found.
Can't open input device.
Keyboard not present. Using tty for input and output.
SPARC CPU-5CE, No Keyboard
ROM Rev. 2.15.1, 16 MB memory installed, Serial #9138410.
Ethernet address 0:80:42:b:0:ea, Host ID: 808b70ea.
Initialising VMEbus device ... clearing SYSFAIL* signal ... done!
Rebooting with command: hme
Boot device: /iommu@0,10000000/sbus@0,10001000/hme@2,8c00000 File and args:
Using Onboard Transceiver - Link Up.
1d400
Server IP address: 192.168.2.100
Client IP address: 192.168.2.101
Using Onboard Transceiver - Link Up.
Using Onboard Transceiver - Link Up.
Using IP Address 192.168.2.101 = C0A80265
hostname: tazp
domainname: yp2.keck.hawaii.edu
server name 'kaimup'
root pathname '/export/root/tazp'
root on kaimup:/export/root/tazp fstype nfs
Boot: vmunix
Size: 1318912+426864+111136 bytes

----------- boot5 - killed the other four rarpd ---------------------------

kaimu# !ps
ps auxw | grep rarp
root 20852 0.0 0.7 32 200 p5 S 15:29 0:00 grep rarp
root 122 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 123 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
root 124 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 125 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
kaimu# kill -9 122
kaimu# !ps
ps auxw | grep rarp
root 20932 0.0 0.7 32 200 p5 S 15:31 0:00 grep rarp
root 124 0.0 0.0 48 0 ? IW Jun 9 0:00 rarpd -a
root 125 0.0 0.0 24 0 ? IW Jun 9 0:00 rarpd -a
kaimu# kill -9 124
kaimu# !ps
ps auxw | grep rarp
root 20936 0.0 0.7 32 200 p5 S 15:31 0:00 grep rarp
kaimu#

boot timed out with:

ok boot hme
Resetting ...
initializing TLB
initializing cache
Allocating SRMMU Context Table
Setting SRMMU Context Register
Setting SRMMU Context Table Pointer Register
Allocating SRMMU Level 1 Table
Mapping RAM
Mapping ROM
ttya initialized
Probing Memory Bank #0 8 Megabytes
Probing Memory Bank #1 8 Megabytes
Probing Memory Bank #2 Nothing there
Probing Memory Bank #3 Nothing there
Probing Memory Bank #4 Nothing there
Probing Memory Bank #5 Nothing there
Probing Memory Bank #6 Nothing there
Probing Memory Bank #7 Nothing there
Probing CPU FMI,MB86904
VMEbus Interface Test
Slave Base Register test ... passed.
Control Register test ... passed.
S4 Slave Map Register ... passed.
S4 Mailbox Register ... passed.
S4 Mailbox Interrupt Level Register ... passed.
Performing system configuration (Watchdog Timer, Abort Switch, etc) ... done!
Probing /iommu@0,10000000/sbus@0,10001000 at 5,0 espdma esp sd st SUNW,bpp ledma le
Probing /iommu@0,10000000/sbus@0,10001000 at 1,0 Nothing there
Probing /iommu@0,10000000/sbus@0,10001000 at 2,0 SUNW,hme
Probing /iommu@0,10000000/sbus@0,10001000 at 3,0 BCE,SBusDSP
screen not found.
Can't open input device.
Keyboard not present. Using tty for input and output.
SPARC CPU-5CE, No Keyboard
ROM Rev. 2.15.1, 16 MB memory installed, Serial #9138410.
Ethernet address 0:80:42:b:0:ea, Host ID: 808b70ea.
Initialising VMEbus device ... clearing SYSFAIL* signal ... done!
Rebooting with command: hme
Boot device: /iommu@0,10000000/sbus@0,10001000/hme@2,8c00000 File and args:
Using Onboard Transceiver - Link Up.
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
Timeout waiting for ARP/RARP packet
<every second over and over>

---------- boot6 - type "boot" at ok prompt instead of "boot hme" --------

ok boot
Resetting ...
initializing TLB
initializing cache
Allocating SRMMU Context Table
Setting SRMMU Context Register
Setting SRMMU Context Table Pointer Register
Allocating SRMMU Level 1 Table
Mapping RAM
Mapping ROM
ttya initialized
Probing Memory Bank #0 8 Megabytes
Probing Memory Bank #1 8 Megabytes
Probing Memory Bank #2 Nothing there
Probing Memory Bank #3 Nothing there
Probing Memory Bank #4 Nothing there
Probing Memory Bank #5 Nothing there
Probing Memory Bank #6 Nothing there
Probing Memory Bank #7 Nothing there
Probing CPU FMI,MB86904
VMEbus Interface Test
Slave Base Register test ... passed.
Control Register test ... passed.
S4 Slave Map Register ... passed.
S4 Mailbox Register ... passed.
S4 Mailbox Interrupt Level Register ... passed.
Performing system configuration (Watchdog Timer, Abort Switch, etc) ... done!
Probing /iommu@0,10000000/sbus@0,10001000 at 5,0 espdma esp sd st SUNW,bpp ledma le
Probing /iommu@0,10000000/sbus@0,10001000 at 1,0 Nothing there
Probing /iommu@0,10000000/sbus@0,10001000 at 2,0 SUNW,hme
Probing /iommu@0,10000000/sbus@0,10001000 at 3,0 BCE,SBusDSP
screen not found.
Can't open input device.
Keyboard not present. Using tty for input and output.
SPARC CPU-5CE, No Keyboard
ROM Rev. 2.15.1, 16 MB memory installed, Serial #9138410.
Ethernet address 0:80:42:b:0:ea, Host ID: 808b70ea.
Initialising VMEbus device ... clearing SYSFAIL* signal ... done!
Rebooting with command:
Boot device: /iommu/sbus/ledma@5,8400010/le@5,8c00000 File and args:
Lost Carrier (transceiver cable problem?)
Cable problem or twisted pair hub link-test disabled.
Use the PROM command "help ethernet" for more information.
Timeout waiting for ARP/RARP packet
Lost Carrier (transceiver cable problem?)
Cable problem or twisted pair hub link-test disabled.
Use the PROM command "help ethernet" for more information.
Timeout waiting for ARP/RARP packet
Lost Carrier (transceiver cable problem?)
Cable problem or twisted pair hub link-test disabled.
Use the PROM command "help ethernet" for more information.
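
The failure signatures listed in the e-mail's highlights (items 1 through 4) can be turned into a small lookup for captured console output. The following is only a sketch and is not part of the original e-mail: the function name classify_boot_output and the wording of the returned labels are hypothetical, and it assumes the console text has been saved as a plain string containing the messages exactly as they appear in the screen dumps above. The "Lost Carrier" check comes first because the boot6 output also contains the ARP/RARP timeout line.

```python
# Sketch only: the names below are hypothetical, and the message strings are
# copied from the screen dumps in the e-mail above.
ARP_TIMEOUT = "Timeout waiting for ARP/RARP packet"
LOST_CARRIER = "Lost Carrier (transceiver cable problem?)"
LINK_TIMEOUT = "Timeout reading Link status. Check cable and try again."
AUTONEG_TIMEOUT = "Timeout waiting for AutoNegotiation Status to be updated."


def classify_boot_output(console_text: str) -> str:
    """Map captured console output to the induced failure it resembles."""
    # boot6 prints the ARP/RARP timeout too, so test for Lost Carrier first.
    if LOST_CARRIER in console_text:
        return 'boot6: booted from le (typed "boot" instead of "boot hme")'
    if LINK_TIMEOUT in console_text or AUTONEG_TIMEOUT in console_text:
        return "boot2: crate-side (taz) cable pulled from hub"
    if ARP_TIMEOUT in console_text:
        return "boot3/boot5: kaimu cable pulled, or no rarpd running on kaimu"
    return "no known failure signature"


if __name__ == "__main__":
    sample = (
        "Using Onboard Transceiver - Link Up.\n"
        "Timeout waiting for ARP/RARP packet\n"
    )
    print(classify_boot_output(sample))
```

A match only says which induced test the output resembles. As item 3 notes, boot3 and boot5 produce the same message, so the output alone cannot tell a pulled server cable from a missing rarpd.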