Subject: more ESP bugs and OFW ramblings....
To: NetBSD/sparc Discussion List <port-sparc@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: port-sparc
Date: 10/18/2002 20:45:29
So I finally got a second carrier for another disk in the SS5 that's to
become my new router and I thought I'd try an experiment with inserting
it into the live system.  Well that just didn't work very well at all,
despite the fact that I didn't even get a chance to probe it or
anything.  Suddenly the existing disk became very slow and in some cases
inaccessible, but there were no error messages on the console.  I was
able to log in via rlogin, and on the console, but couldn't access the
binaries for some commands (including scsictl, shutdown, and halt), nor
could I even "ls -l" some files (including some /dev files, but not all).

Unfortunately there currently seems to be no way to reset the internal
SCSI bus in a sparcstation other than by power-cycling it.  Dropping down
to the OpenBoot firmware and running "probe-scsi" didn't help.  However,
it did reveal that the new disk had become the only visible one:

 telnet> send brk
 Stopped at      cpu_Debugger+0x4:       jmpl            [%o7 + 0x8], %g0
 db> machine prom
 Type  'go' to resume
 Type  help  for more information
 ok probe-scsi
 Target 1 
   Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
                     Copyright (c) 1996 Seagate
                     All rights reserved 0000
 ok

The original target ID#3 Conner seems to have become invisible.
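
A quick way to spot a vanished target is to compare a captured
"probe-scsi" listing against the set of target IDs that ought to be
there.  The following is only a minimal sketch, assuming the listing
format shown above; the helper names parse_probe_scsi and
missing_targets are made up for illustration and are not part of any
existing tool.

```python
import re

def parse_probe_scsi(text):
    """Map target number -> {unit number: first description line}."""
    targets = {}
    current = None
    for line in text.splitlines():
        m = re.match(r"\s*Target\s+(\d+)", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"\s*Unit\s+(\d+)\s+(.*\S)", line)
        if m and current is not None:
            targets.setdefault(current, {})[int(m.group(1))] = m.group(2)
    return targets

def missing_targets(expected, text):
    """Return the expected target IDs that the listing does not show."""
    return sorted(set(expected) - set(parse_probe_scsi(text)))

# Listing captured after the hot insertion (copied from the transcript).
AFTER_HOTPLUG = """\
Target 1
  Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
                    Copyright (c) 1996 Seagate
                    All rights reserved 0000
"""

print(missing_targets({1, 3}, AFTER_HOTPLUG))  # [3]
```

Here targets 1 and 3 are expected, and only target 3 is reported as
missing, which matches what the PROM showed.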

DDB's "reboot" command panicked because it couldn't write to the disk.

This kind of screws up my plan to show that an older sparcstation can be
made into a very reliable little server with hot-swappable mirrored
drives.  I'm hoping I can help fix and enhance the esp driver so that
this goal is eventually achievable.  I.e. I'm hoping that with the right
driver hooks the bus really can be reset properly on a live system.


Even an OFW "reset" command didn't seem to reset the bus: afterwards the
new disk, the SEAGATE, was still the only one that showed up!

 Resetting ... 
 SPARCstation 5, No Keyboard
 ROM Rev. 2.15 Pilot, 32 MB memory installed, Serial #3540954.
 Ethernet address 8:0:20:21:99:db, Host ID: 803607da.
 
 
 
 Rebooting with command:                                               
 Boot device: /iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@3,0  File and args: 
 Type  help  for more information
 ok probe-scsi
 Target 1 
   Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
                     Copyright (c) 1996 Seagate
                     All rights reserved 0000
 ok reset
 Resetting ... 
 SPARCstation 5, No Keyboard
 ROM Rev. 2.15 Pilot, 32 MB memory installed, Serial #3540954.
 Ethernet address 8:0:20:21:99:db, Host ID: 803607da.
 
 
 
                                                                       
 Type  help  for more information
 ok probe-scsi
 Target 1 
   Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
                     Copyright (c) 1996 Seagate
                     All rights reserved 0000
 ok 

Unfortunately I didn't think soon enough to try "cd /iommu/sbus/dma/esp"
and then execute that node's "reset" word (and I'm not really sure I'd
know exactly how to do this for real).

Only after I power-cycled the box did the new disk probe properly (and
of course then I got bitten by the stupid fact that target ID#3 is the
bottom slot and had been sd0, whereas the new disk, as target ID#1, now
became sd0 instead, but was unbootable....)

 ok probe-scsi
 Target 1 
   Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
                     Copyright (c) 1996 Seagate
                     All rights reserved 0000
 Target 3 
   Unit 0   Disk     CONNER  CFP1080E SUN1.0546496BDB
 ok 
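
The renumbering described above is what one would expect if sd unit
numbers are handed out in ascending target order as disks are found.
That ordering is an assumption inferred from the behaviour reported
here, not something checked against the kernel's autoconfiguration
code, and assign_sd_units is a hypothetical name.  A minimal sketch of
that model:

```python
def assign_sd_units(targets_present):
    """Assumed model: wildcarded sd units are numbered in ascending target order."""
    return {f"sd{unit}": target
            for unit, target in enumerate(sorted(targets_present))}

# Original setup: only the disk at target 3 is present.
print(assign_sd_units([3]))     # {'sd0': 3}

# With the new disk added at target 1, it takes over the sd0 name.
print(assign_sd_units([1, 3]))  # {'sd0': 1, 'sd1': 3}
```

Under this model the disk holding the root filesystem silently moves
from sd0 to sd1 as soon as a disk with a lower target ID appears.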

I flipped those drives over so that the GENERIC kernel would still work! :-)

 ok probe-scsi
 Target 1 
   Unit 0   Disk     CONNER  CFP1080E SUN1.0546496BDB
 Target 3 
   Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
                     Copyright (c) 1996 Seagate
                     All rights reserved 0000
 ok 


I'm also still a little confused by the default device aliases in the
2.x PROM.  "boot disk" now fails and I have to say "boot disk1".  This
part does make sense since of course "disk" should be the same as
"disk3", right?

 ok boot disk  
 Boot device: /iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@3,0  File and args: 
 Bad magic number in disk label
 Can't open disk label package
 
 Can't open boot device
 
 ok boot disk1
 Boot device: /iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@1,0  File and args: 
 >> NetBSD/sparc Secondary Boot, Revision 1.9
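
The "Boot device:" lines show what each alias expanded to, and the
unit address after the "@" in the final component gives the SCSI
target and LUN.  The following is a minimal sketch that pulls those
numbers out of the two paths printed above; the function name
scsi_unit_from_path is hypothetical, and the table holds only the
expansions actually observed on this machine.

```python
import re

# Alias expansions copied from the "Boot device:" lines printed by this PROM.
OBSERVED_EXPANSIONS = {
    "disk": "/iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@3,0",
    "disk1": "/iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@1,0",
}

def scsi_unit_from_path(path):
    """Return (target, lun) from a final path component such as 'sd@3,0'."""
    last = path.rstrip("/").split("/")[-1]
    m = re.fullmatch(r"(?:sd|st)@([0-9a-fA-F]+),([0-9a-fA-F]+)(?::.*)?", last)
    if m is None:
        raise ValueError(f"no target,lun unit address in {last!r}")
    # OpenBoot unit addresses are written in hexadecimal.
    return int(m.group(1), 16), int(m.group(2), 16)

for alias, path in OBSERVED_EXPANSIONS.items():
    target, lun = scsi_unit_from_path(path)
    print(f"{alias}: target {target}, lun {lun}")
```

This prints target 3 for "disk" and target 1 for "disk1", so "disk"
still points at the bottom slot, where the unlabelled new drive now
sits.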


However this part doesn't make sense -- there are no aliases!

 telnet> send brk
 Stopped at      cpu_Debugger+0x4:       jmpl            [%o7 + 0x8], %g0
 db> machine prom
 Type  'go' to resume
 ok devalias
 ok devalias disk
 disk ?
 ok .version
 Release 2.15 Pilot Version 0 created 93/12/21 16:00:45
 ok 

Using the phantom aliases with "cd" doesn't seem to work:

 ok cd disk3
 ok pwd
 /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd
 ok cd /
 ok cd disk 
 ok pwd
 /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd
 ok cd /
 ok cd disk1
 ok pwd
 /iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd
 ok cd /
 ok device-end

Last time I got this deep into OFW stuff it was with a 3.x PROM on a
SunFire VF150, and there the devalias did show what I expected (note
that on that machine I'd modified the alias for "disk"):

 ok devalias
 disk                     /pci@1f,0/ide@d/disk@2,0
 rtc                      /pci@1f,0/isa@7/rtc@0,70
 usb                      /pci@1f,0/usb@a
 flash                    /pci@1f,0/isa@7/flashprom@1f,0
 lom                      /pci@1f,0/isa@7/SUNW,lomh@0,8010
 i2c-nvram                /pci@1f,0/pmu@3/i2c@0,0/i2c-nvram@0,aa
 net1                     /pci@1f,0/ethernet@5
 dload1                   /pci@1f,0/ethernet@5:,
 dload                    /pci@1f,0/ethernet@c:,
 net0                     /pci@1f,0/ethernet@c
 net                      /pci@1f,0/ethernet@c
 cdrom                    /pci@1f,0/ide@d/cdrom@3,0:f
 disk3                    /pci@1f,0/ide@d/disk@3,0
 disk2                    /pci@1f,0/ide@d/disk@2,0
 disk1                    /pci@1f,0/ide@d/disk@1,0
 disk0                    /pci@1f,0/ide@d/disk@0,0
 ide                      /pci@1f,0/ide@d
 floppy                   /pci@1f,0/isa@7/dma/floppy
 ttyb                     /pci@1f,0/isa@7/serial@0,2e8
 ttya                     /pci@1f,0/isa@7/serial@0,3f8
 ok 

It also works as expected on an Axil 325 (ss20 clone) I have here, and
it claims to have an only slightly newer OFW version:

 login: [halt sent]
 Stopped at      cpu_Debugger+0x4:       jmpl            [%o7 + 0x8], %g0
 db> machine prom
 Type  'go' to resume
 ok .version
 Release 2.19 Version 106 created 95/04/11 16:57:02
 ok devalias
 ttyb           /obio/zs@0,100000:b
 ttya           /obio/zs@0,100000:a
 keyboard!      /obio/zs@0,0:forcemode
 keyboard       /obio/zs@0,0
 floppy         /obio/SUNW,fdtwo
 scsi           /iommu/sbus/espdma@f,400000/esp@f,800000
 net-aui        /iommu/sbus/ledma@f,400010:aui/le@f,c00000
 net-tpe        /iommu/sbus/ledma@f,400010:tpe/le@f,c00000
 net            /iommu/sbus/ledma@f,400010/le@f,c00000
 disk           /iommu/sbus/espdma@f,400000/esp@f,800000/sd@3,0
 cdrom          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@6,0:d
 tape           /iommu/sbus/espdma@f,400000/esp@f,800000/st@4,0
 tape0          /iommu/sbus/espdma@f,400000/esp@f,800000/st@4,0
 tape1          /iommu/sbus/espdma@f,400000/esp@f,800000/st@5,0
 disk3          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@0,0
 disk2          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@2,0
 disk1          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@1,0
 disk0          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@3,0
 ok go
 db> cont

The machine I'm currently typing on is a real Sun SS20, also with ROM
release 2.15, and "devalias" works on it just fine too.

Does anyone know why I don't see something equivalent on my "new" SS5?

-- 
        Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>