Speeding up initialization of USB devices appears to have created race conditions for consoles and auto-configured network devices. Waiting until no USB events have been received for a while before the kernel init opens /dev/console appears to make the USB console work, but I have not found any such work-around for handling network devices initialized with ip= parameters on the kernel command line. Note that this problem first arose in 2.6.28, but it has taken a while to narrow it down.

My configuration consists of a USB EHCI hub with a CP2101 serial device and an RTL8150 Ethernet device. We have a very small INITRAMFS boot filesystem and not much else in the way of devices, so we get to the IP autoconfig quite quickly. The RTL8150 isn't enumerated yet, so autoconfig gives up after its standard two tries. When we try to open /dev/console, we haven't got to the CP2101 and hence have no console devices registered. This causes the open of /dev/console to fail, although this is properly ignored.

We're an embedded system, so that's all the devices we have to communicate with at this point. Then the system just sits there, deaf and blind. It's rather sad, really.
What happens if you stick a call to async_synchronize_full(); at the start of net/ipv5/ipconfig.c:ip_auto_config()? That will make all the new asynchronous stuff complete before this function runs, and it looks like it might be needed for the old-style kernel NFS root.
net/ipv4 rather..
USB does not use the async infrastructure, so that's unlikely to help you much (other than maybe being an equivalent of mdelay()). USB discovery is async by nature; the devices come "online" some arbitrary time after powering on the chip. That your setup worked was sort of "luck", a side effect of boot just taking enough time for the network USB chip to come online. In 2.6.28 the USB code was changed to not wait 100 msec for each port, but to wait 100 msec for all ports together; the boot got faster as a result, with the apparent result that now your network USB chip does not come online early enough. I suspect that your userland DHCP just needs to retry a few times...
On Wed, 25 Mar 2009, Andrew Morton wrote:

> > http://bugzilla.kernel.org/show_bug.cgi?id=12944
> >
> > Summary: Parallel initialization of USB creates races for boot
> >          devices
>
> > Speeding up initialization of USB devices appears to have created race
> > conditions for consoles and auto-configured network devices.

Isn't this really a more general problem? You can't set up devices before they have been detected -- that's just as true for serial and Ethernet devices as anything else.

What you really need is a way to wait until the devices have been registered. How about polling in a loop before running the initramfs setup routine?

Alan Stern
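The "poll in a loop until the devices have been registered" idea can be sketched roughly as follows. This is only a model of the logic in Python, not code from the thread: the helper names `wait_for` and `non_loopback_interfaces`, the timeout values, and the use of /sys/class/net as the place to look are all assumptions made for illustration. A real initramfs would more likely do this in shell or C.

```python
import os
import time


def wait_for(predicate, timeout_s=10.0, interval_s=0.1,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns true or timeout_s elapses.

    Returns True if the predicate succeeded, False on timeout.
    clock and sleep are injectable so the loop can be tested without
    really waiting (hypothetical convenience, not from the thread).
    """
    deadline = clock() + timeout_s
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def non_loopback_interfaces(sysfs_net="/sys/class/net"):
    """List registered network interfaces other than 'lo'.

    Assumes sysfs is mounted at /sys; if the directory cannot be read,
    this is treated as "nothing registered yet".
    """
    try:
        names = os.listdir(sysfs_net)
    except OSError:
        return []
    return sorted(name for name in names if name != "lo")
```

An init script built on this might call `wait_for(lambda: bool(non_loopback_interfaces()), timeout_s=10.0)` and only then start network configuration. The bound matters: without it, a board with no network device at all would never finish booting.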
On Thu, 26 Mar 2009 14:31:06 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12944
>
> --- Comment #6 from Alan Stern <stern@rowland.harvard.edu> 2009-03-26 14:31:05 ---
> On Wed, 25 Mar 2009, Andrew Morton wrote:
>
> > > http://bugzilla.kernel.org/show_bug.cgi?id=12944
> > >
> > > Summary: Parallel initialization of USB creates races
> > >          for boot devices
> >
> > > Speeding up initialization of USB devices appears to have created
> > > race conditions for consoles and auto-configured network devices.
>
> Isn't this really a more general problem? You can't set up devices
> before they have been detected -- that's just as true for serial and
> Ethernet devices as anything else.
>
> What you really need is a way to wait until the devices have been
> registered. How about polling in a loop before running the initramfs
> setup routine?

The hard one here is that it's just pure luck; a side effect of booting faster. USB devices come online asynchronously by nature... if you boot fast enough and your device is slow enough, you're going to miss it.

Alan Stern can correct me if I'm wrong, but my understanding is that for USB you just cannot know if you're done probing; the things just come in asynchronously from the hardware level...
On Thu, 26 Mar 2009, Alan Stern wrote:

> > > Speeding up initialization of USB devices appears to have created race
> > > conditions for consoles and auto-configured network devices.
>
> What you really need is a way to wait until the devices have been
> registered. How about polling in a loop before running the initramfs
> setup routine?

Sorry, I missed the part about this being needed for a kernel-command-line driven IP autoconfig.

I'm not so sure that adding a call to async_synchronize_full() will help. The asynchronous USB probe routines will return before the root hub's children have been detected, because detection occurs in a separate thread (khubd).

In fact, the meaning of the ip= parameter isn't really clear. To what network interface is it supposed to apply? The first one? But what if the first one hasn't been detected yet when the autoconfig code runs?

Maybe the ip_auto_config stuff should be changed so that it is invoked when the first (non-loopback) interface is registered, instead of at a predefined point during the system startup.

At any rate this doesn't seem to be a USB problem, or at least, not a bug in the USB stack.

Alan Stern
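The suggestion of running autoconfiguration when the first non-loopback interface is registered, rather than at a fixed point in startup, might look something like this toy model. The classes `InterfaceRegistry` and `AutoConfig` are hypothetical stand-ins invented for illustration; they are not kernel APIs and say nothing about how ip_auto_config is actually structured.

```python
class InterfaceRegistry:
    """Toy stand-in for the kernel's list of network interfaces."""

    def __init__(self):
        self._interfaces = []
        self._listeners = []

    def subscribe(self, callback):
        """Ask to be called as callback(name, loopback) on each registration."""
        self._listeners.append(callback)

    def register(self, name, loopback=False):
        """Record a new interface and notify every subscriber."""
        self._interfaces.append(name)
        for callback in list(self._listeners):
            callback(name, loopback)


class AutoConfig:
    """Runs configure(name) once, for the first non-loopback interface."""

    def __init__(self, registry, configure):
        self.configured = None  # name of the interface we configured, if any
        self._configure = configure
        registry.subscribe(self._on_register)

    def _on_register(self, name, loopback):
        # Skip loopback, and do nothing once an interface has been configured.
        if loopback or self.configured is not None:
            return
        self._configure(name)
        self.configured = name
```

With this shape it no longer matters how late the USB Ethernet device shows up: configuration is triggered by the registration itself. The open question raised above still stands, though, since "first registered" is not necessarily the interface the ip= parameter was meant for.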
> I'm not so sure that adding a call to async_synchronize_full() will
> help. The asynchronous USB probe routines will return before the
> root hub's children have been detected, because detection occurs in a
> separate thread (khubd).

For USB I don't think it will; for non-USB, however, it looks like it is needed to get back the old behaviour.

> At any rate this doesn't seem to be a USB problem, or at least, not a
> bug in the USB stack.

ipconfig in the kernel is kind of deprecated and certainly doesn't work event-driven.

--
"Alan, I'm getting a bit worried about you." -- Linus Torvalds
For many (most, I think) of us in embedded Linuxland, we MUST have USB consoles AND the ability to use NFS over USB network devices. I do apologize for shouting but I don't see that this point is understood. Saying that it is "luck" that things worked before may very well be greeted with quite a hostile reaction in the embedded community.

That being said, I think we have some latitude about how this is done. I'm much more of a processor weenie than a USB god, but I understand that initialization is asynchronous. Things are easier if device enumeration is synchronous, but I suspect this is not the case. If it is, then you simply wait until enumeration of serial and network devices is complete and, if you have any, you wait for those types of devices right before you need them.

My feeling is that it is not unreasonable to bound the amount of time the kernel is willing to wait for USB devices used during boot. And it is not always necessary to wait. For example:

o You should not wait if USB is not configured
o You should not wait for USB serial devices if CONFIG_USB_CONSOLE is not set
o You should not wait for any network devices if you don't have ip= on the command line. We can mandate that all USB network devices depend on a configurable (not sure, but we might be able to use USB_USBNET) and then, if that isn't set, we don't have to wait for USB network devices.

One thing I've already been playing around with a bit is waiting until we don't get USB events for some period of time and interpreting this to mean that initialization is done. I know that this is not the right solution, but it may be that we can wait until we don't see some subset of messages used only during initialization for some period.
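The two ideas in this comment, deciding up front which device classes are worth waiting for and then waiting (with a bound) until the bus has been quiet for a while, can be sketched like this. It is a Python model under stated assumptions, not kernel code: `classes_to_wait_for`, `wait_until_quiet`, the `next_event` callable, and the default time values are all hypothetical names and numbers chosen for illustration.

```python
import time


def classes_to_wait_for(usb_enabled, usb_console_enabled, ip_param_present):
    """Return the set of device classes worth waiting for during boot.

    The three booleans stand for "USB is configured", "the USB console
    option is set", and "ip= is on the command line".
    """
    wanted = set()
    if not usb_enabled:
        return wanted          # no USB at all: never wait
    if usb_console_enabled:
        wanted.add("serial")   # a USB console may still appear
    if ip_param_present:
        wanted.add("network")  # ip= autoconfig needs an interface
    return wanted


def wait_until_quiet(next_event, quiet_s=1.0, max_wait_s=10.0,
                     clock=time.monotonic):
    """Wait until no event arrives for quiet_s seconds.

    next_event(timeout) is assumed to block for up to `timeout` seconds
    and return an event, or None if nothing arrived in that time.
    Returns True once a full quiet period has passed, or False if events
    were still arriving when max_wait_s ran out. The last quiet-period
    check may overrun max_wait_s by up to quiet_s.
    """
    start = clock()
    while clock() - start < max_wait_s:
        if next_event(quiet_s) is None:
            return True
    return False
```

As the comment itself says, a quiet period is a heuristic: a slow device can still arrive after the bus has looked idle, so this bounds the damage rather than closing the race.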
The USB case is quite sucky. It's not the software that is asynchronous, it is the hardware. And there is no way for the software to know that the hardware is done enumerating; the devices just come in as they go. Some come quickly, some take a long time. I don't think the USB standard has a time within which a device needs to respond after you apply power. It could easily be 5+ seconds!

As for solutions, I'd much rather have something like

* If you had no console, and a suitable USB device comes on post boot, THEN make it the console.

USB is hotplug after all... might as well deal with it.
The console could be handled that way, but I'm pretty sure that's not going to work for mounting NFS over USB network devices.
well that's what your initramfs is for...
Umm, my INITRAMFS filesystem is 30 MiB. By the time I rebuild it, download it, burn it into flash, and reboot my system, I have had plenty of time to miss the ease of dropping a tweak into my NFS root filesystem. This is going to be a rather unpopular suggestion. Additionally, I'm afraid that, though it seemed like a good idea at first, there is a problem with making a USB serial device a console after boot is complete. This works great for buffered output, such as when the kernel dumps log_buf to a console when it gets registered. The problem is unbuffered output, such as from the shell (via the kernel init) running on /dev/console. The bottom line is that you won't even see the shell prompt, along with all other previous post-boot output, because it's already been discarded by the time the USB device is ready. This is unusable. USB consoles and network devices have been working for years and the embedded community relies on them. I understand the repeated statement that this has been "luck", but that's not a solution. I think an approach that depends on USB devices responding within some timeframe is quite viable and can be a separate configurable in order to avoid impacting boot time on systems that will never use USB devices during boot. There are undoubtedly other approaches, as well. I would stress, though, that this is a *big* deal if we can't solve it. It will hurt adoption of Linux on embedded systems at a time it is being dubbed "ubiquitous" on such systems.
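The difference between buffered and unbuffered output described above can be illustrated with a small model. `ConsoleMux` and its methods are hypothetical names made up for this sketch; they only mimic the described behaviour (log buffer replayed when a console registers, direct writes lost when no console exists) and are not the kernel's actual console code.

```python
from collections import deque


class ConsoleMux:
    """Toy model of kernel log output versus writes to /dev/console."""

    def __init__(self, log_capacity=1024):
        # Bounded buffer of kernel messages, standing in for log_buf.
        self.log_buf = deque(maxlen=log_capacity)
        self.consoles = []

    def printk(self, msg):
        """Buffered path: always recorded, and sent to any live console."""
        self.log_buf.append(msg)
        for sink in self.consoles:
            sink.append(msg)

    def write_dev_console(self, data):
        """Unbuffered path: returns False when the data is simply lost."""
        if not self.consoles:
            return False
        for sink in self.consoles:
            sink.append(data)
        return True

    def register_console(self, sink):
        """A late console first receives everything still in log_buf."""
        sink.extend(self.log_buf)
        self.consoles.append(sink)
```

In this model, a shell prompt written before the USB serial device registers never reaches it, while earlier kernel messages do get replayed. That is the asymmetry that makes "just make it the console when it shows up" insufficient on its own.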
Your initrd for NFS root testing simply needs to sit on its backside waiting for the console and network to appear, then NFS mount the rootfs. That should be in kilobytes not megabytes, especially if you use busybox like everyone else already does.
Lest anyone get the wrong impression, I *do* use busybox. I will freely admit that my system has somewhat odd requirements but nothing that really pushes Linux. Three observations about writing an initrd as suggested:

1. We need to fix /dev/console so that it works even if no consoles have been registered. It also has to work when a console is registered after /dev/console is open. This is not a terribly big deal, and should be done anyway. I'll add it to my list of things to do but I'll be perfectly happy if someone beats me to the punch.

2. I'm comfortable with coming up with an initrd that waits for a console device, configures it, waits for a network device, configures that, and finally, if desired, mounts and switchroots to an NFS filesystem. I'm still concerned about this approach, though, because the first chunk of the setup has to be done blind. There is no console, so if you make a mistake, you're going to have to figure out where things went wrong with no output. You will need to mount /sys and /dev, configure hotplugging, use whatever method you've chosen to wait for the console and configure it. Not such a big deal for someone experienced but this could be painful for someone who is not.

3. A distinction has now been created between USB and non-USB devices that never existed before. Well, I'm sure that some would claim the distinction has always been there, but it has never been visible before. One of the major claims of UNIX-like systems is the simplicity with which devices have been created and this chips away at that.

So, I'd still like to see this issue addressed. If the long-term goal is to transition towards an initrd approach, let's do that but let's give people enough warning so we can develop some standard approaches.
Okay, I think I've talked myself into Alan Stern's point of view. By treating this as a more general issue, and splitting it into waiting for network devices and waiting for consoles, life gets easier. The infrastructure for registering consoles already exists and starting the kernel init is late enough that waiting is easy, so that's a pretty straightforward thing to do. Some code rearrangement may be necessary for network devices, but I have to dig into that.