Summary: sbp2 feature: integrate with scsi_wait_scan module? (was: Failure to re-scan SCSI devices after firewire modules loaded, doesn't see firewire disks)
Product: Drivers
Component: IEEE1394
Reporter: Don Krajewski (dkrajews)
Assignee: Stefan Richter (stefanr)
Description Don Krajewski 2004-08-16 14:42:14 UTC
Distribution: Gentoo

Hardware Environment:
- Dell Inspiron 8200
- 1.9 GHz Pentium 4
- 30 GB Hard Disk (100 MB boot Ext2 partition, rest NTFS WindowsXP)
- Western Digital 120 GB FireWire Drive (2 GB Swap, rest ext3 partitions - root, var, user, home, data)
- 512 MB RAM
- Nvidia Video (GeForce 440 2 Go 64MB)

Software Environment:
- GCC 3.3.3r6
- Gentoo-dev-sources (2.6.7-r14)

Problem Description:
Upon boot, the SCSI modules are loaded first, then the others as the system finds them. SCSI devices are not rescanned afterwards, which matters in particular for the emulation and recognition of USB or FireWire drives. The modules load, but the disks attached via USB or FireWire are not seen.

Note: after boot you can run 'echo "scsi add-single-device 0 0 0 0" > /proc/scsi/scsi' or "rescan-scsi-bus.sh", and the device works fine. The problem occurs when booting from the FireWire device, before proc is mounted or bash is available.

I have attempted to modify initrd.scripts so that it executes other scripts (copied by gen_initrd.sh during Gentoo's genkernel process) as the last step of the "modules_scan" procedure at boot time. Nothing I have inserted seems to work, as the resources are only found after the system has booted.

Gentoo's kernel team indicates that the bug is upstream from them and that I should file a bug report at kernel.org (here). I am hoping that someone can guide me in the right direction: getting the SCSI emulation module unloaded and reloaded if it was loaded earlier, not loading it until the very last module so that all SCSI buses are scanned, or coming up with a workaround that works reliably.

An article at IBM's developerWorks describes the problem on the 2.4.X kernel: http://www-106.ibm.com/developerworks/linux/library/l-fireboot.html?ca=dgr-lnxw06FireBoot

Steps to reproduce:
1) Install Linux with a 100 MB boot partition on the primary hard disk and the remaining Linux partitions on the FireWire drive.
2) Install a kernel with initrd boot (initial ramdisk).
3) Install the FireWire drivers as modules (building them into the kernel doesn't help, either).
4) Boot the system. It loads the FireWire modules, but doesn't see any of the /dev/sda disks.
5) The system reports the sda disk as invalid ("the device /dev/sda1 is invalid"), then asks for a root partition or offers to drop into a shell.

If SCSI emulation rescanned the bus, it should see the disk.
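For reference, the post-boot workaround mentioned in the description can be wrapped in a small helper. This is only a sketch: the function name scsi_add_single_device and its optional fifth argument (the file to write to) are assumptions introduced here so the helper can be tried against a scratch file; the command string itself is the one quoted in the report.

```shell
# Hypothetical helper: ask the SCSI layer to probe one host/channel/id/lun
# by writing the "add-single-device" command to the control file.
# The fifth argument is an assumption for dry runs; by default the real
# control file is used, which needs root and a mounted /proc.
scsi_add_single_device() {
    host=$1
    channel=$2
    id=$3
    lun=$4
    target=${5:-/proc/scsi/scsi}
    echo "scsi add-single-device $host $channel $id $lun" > "$target"
}

# Usage (as root, after boot):
#   scsi_add_single_device 0 0 0 0
```

This only helps once /proc is mounted and a shell is available, which is exactly what is missing at the point where the root filesystem has to be found.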
Comment 1 Adrian Bunk 2006-12-06 01:48:45 UTC
Is this issue still present with kernel 2.6.19?
Comment 2 Stefan Richter 2006-12-06 05:08:04 UTC
The problem still exists in principle. Note that it's not just a FireWire thing; all more modern SCSI transports like SAS or iSCSI, as well as USB storage, have to deal with it one way or another. The process of device discovery is practically non-deterministic, unlike with the old parallel SCSI hardware.

The FireWire driver stack really only deals with the hot-plug case and has no special provisions for cold-plugging. Note though that some distributors have successfully set up environments for "install from FireWire disk" or even "root filesystem on FireWire disk". The way to go is a basic hot-plug capable environment in an initrd.

Recently though, infrastructure was added to the SCSI stack for asynchronous parallelized SCSI scanning. I intend to look into its usefulness for sbp2 once I have time for actual feature additions. AFAICS we would have to let the administrator specify a timeout and/or a minimum number of SBP-2 units to wait for.

I suggest we keep this bug entry as a reminder for integration of sbp2 with async SCSI scanning, although it is not an ieee1394 subsystem bug per se.

Two side notes:
- The "scsi add-single-device" method is not useful for FireWire disks under Linux 2.6 anymore; it's only a necessary hack for Linux 2.4.
- Of course each bugfix related to device recognition in the hot-plug case is relevant to the cold-plug case too. Needless to say, there have been a lot of improvements in this department since 2004.
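To illustrate the "timeout and/or minimum number of units" idea from user space: the following is a rough sketch, not how sbp2 or the SCSI stack implements anything. The function names count_matches and wait_for_units are made up for this example, and counting /sys/block/sd* entries as a stand-in for "SBP-2 units" is an assumption (it would count any SCSI disk, not just FireWire ones).

```shell
# Count how many existing paths a glob pattern expands to.
# An unmatched pattern stays literal and fails the -e test, giving 0.
count_matches() {
    n=0
    for p in $1; do
        if [ -e "$p" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Wait until at least MIN paths match PATTERN, or give up after
# TIMEOUT seconds. Returns 0 on success, 1 on timeout.
wait_for_units() {
    pattern=$1
    min=$2
    timeout=$3
    elapsed=0
    while [ "$(count_matches "$pattern")" -lt "$min" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Usage (values are assumptions): wait up to 30 s for one disk.
#   wait_for_units "/sys/block/sd*" 1 30 || echo "no disk appeared"
```

The pattern is passed quoted and expanded inside count_matches, so each poll sees the current state of the directory.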
Comment 3 Don Krajewski 2006-12-06 05:56:28 UTC
The final solution was to write a custom initrd for the system where two scans are performed (not just one) with a bit of testing to make sure the timing would always work. I haven't tried setting up booting from a firewire or USB drive recently, but know of people who currently do something similar to this that don't have this problem in the 2.6.19 kernel. Thank you for reclassifying as a "low" level problem. DK
Comment 4 Don Krajewski 2007-09-04 08:02:51 UTC
I will be out of the office until Monday, 10 Sept. 2007. I will not be able to check e-mail while gone and will respond as soon as I am able to do so. If you need a quicker response, please contact CNC through the web service request form on the CNC web site (http://www.engr.iupui.edu/cnc). Thanks, Don Krajewski
Comment 5 Natalie Protasevich 2007-09-19 05:41:32 UTC
Stefan, is this "project" material? Thanks.
Comment 6 Stefan Richter 2007-09-19 06:32:40 UTC
I have listed it at http://wiki.linux1394.org/ToDo which can be reached via http://kernelnewbies.org/KernelProjects. If bugzilla.kernel.org is meant to be narrowed down to real bugs (as opposed to missing functionality, or "missed" functionality), then this bug entry should be "rejected" as "invalid" or maybe "documented".
Comment 7 Don Krajewski 2007-09-20 16:38:55 UTC
Maybe the question to ask is whether this is a problem with the kernel module or with other packages. If the answer is that this is a problem with the kernel module, was this intended, or was it something that was "missed" in design? Finally, is it still relevant today?

From the various projects I have worked on over the past year, this hasn't shown up, and I haven't been forced to use an initrd (workaround) for systems like this. On the other hand, I haven't tested this fully enough to know whether this still is or isn't a problem.

My $0.02: Close this issue as "rejected" and mark the resolution as "documented". This issue is probably no longer relevant, and this was "missed" functionality from my understanding of the design at the time the bug was reported.

DK
Comment 8 Stefan Richter 2007-09-20 23:01:26 UTC
Distros which support booting from FireWire typically have it solved in the initrd by (a) loading the required drivers, (b) periodically checking whether the desired block device is there (I guess by means of sysfs), and (c) proceeding to mount the root filesystem.

The new infrastructure for parallelized SCSI scanning which I mentioned in comment #2 has a side product, the scsi_wait_scan kernel module. AFAIU, "modprobe scsi_wait_scan" will look up which SCSI low-level drivers support it and then sleep until all these low-level drivers tell it that they are finished with (parallelized asynchronous) scanning. I.e. the SCSI folks thus got back the synchronization point which they had before they implemented parallelized scanning.

So, this scsi_wait_scan side product could be useful for boot from FireWire too. But since it isn't essential, and since it would be a new feature and is tracked as such on the linux1394 TODO list, I close this bug entry as rejected/documented.
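Steps (a) to (c) could look roughly like the sketch below in an initrd script. It is not taken from any particular distribution. The function names wait_for_path and mount_firewire_root are invented for illustration, and the module list (ohci1394, sbp2, sd_mod), the sysfs path that is polled, and the presence of the /dev node for the partition are assumptions that depend on the kernel configuration and on what the initrd provides.

```shell
# (b) Poll until PATH exists or TIMEOUT seconds have passed.
# Returns 0 if the path showed up, 1 on timeout.
wait_for_path() {
    path=$1
    timeout=$2
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Hypothetical wrapper for steps (a)-(c).
# Arguments: disk name, partition name, mount point, timeout in seconds.
mount_firewire_root() {
    disk=$1
    part=$2
    mnt=$3
    timeout=$4
    # (a) load the drivers (module names are assumptions for a 2.6 kernel)
    modprobe ohci1394 && modprobe sbp2 && modprobe sd_mod || return 1
    # (b) wait for the disk to be registered in sysfs
    wait_for_path "/sys/block/$disk" "$timeout" || return 1
    # (c) mount the root filesystem read-only; assumes /dev/$part exists
    mount -o ro "/dev/$part" "$mnt"
}

# Usage (values are assumptions):
#   mount_firewire_root sda sda1 /newroot 30
```

Polling for one specific device keeps the script independent of how long bus enumeration takes, at the price of a fixed upper bound that has to be chosen by whoever builds the initrd.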