Bug 17261
Summary: | Freezes on bootup | ||
---|---|---|---|
Product: | Other | Reporter: | Dan Dart (dandart) |
Component: | Other | Assignee: | other_other |
Status: | CLOSED OBSOLETE | ||
Severity: | normal | CC: | akpm, alan, bjorn.helgaas, florian, maciej.rutecki, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.35.x | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 16055 | ||
Attachments: |
Output of sudo lshw
.config for 2.6.35.2
dmesg output (plugging trick)
dmesg output (pci=nocrs)
dmesg output (boots faster for some reason)
dmesg output (23/07, shows problem still occuring)
dmesg output (no usb devices) |
Created attachment 28291 [details]
.config for 2.6.35.2
This could be due to the number of USB devices attached and the kernel not correctly detecting the number of ports I have on my mainboard (it is fairly new). After about 5 minutes with both USB and PS2 keyboards plugged in (it seems to be a toss-up whether they work), lots of messages appeared about udevd worker threads failing. I removed my phone from the USB port to take a photo, and the system booted before I could.
Yes, this is more likely to be related to USB devices, because I can make the system boot by plugging and unplugging a device a couple of times. Also, when the system is in use and too many devices are connected, one device (such as the wi-fi card) gets kicked off. Is this the correct behaviour?
If you can get the system to boot, please attach a dmesg log. If you can't get it to boot, please boot with "ignore_loglevel vga=0x0f07" and attach a picture of the console when it's hung.
You might also try "pci=nocrs". If that helps, please attach the dmesg log.
Yes, it boots if I try the plugging trick or if I wait a long time. Some other odd messages sometimes pop up after a few minutes; I'll post a dmesg of those when that happens again, but here's my "plugging" boot dmesg:
Created attachment 30022 [details]
dmesg output (plugging trick)
Adding pci=nocrs seems to let me use the keyboard when it wouldn't usually be able to, after the "ureadahead" message appears. Attaching new dmesg.
Created attachment 30042 [details]
dmesg output (pci=nocrs)
Dan, can you double-check that "pci=nocrs" really makes a difference? I don't see any real difference in the dmesg logs you posted, other than the resources available on bus 04, which doesn't have any devices on it.
It seems to. But recently it's been starting up much faster - perhaps a particular USB device is bogging it down? I didn't have to mess around with USB devices this time. (uploading log)
Created attachment 30692 [details]
dmesg output (boots faster for some reason)
(In reply to comment #0)
> I first noticed this problem in 2.6.35.
So would you say this is a regression? If so, since which kernel version?
Yes, this hasn't happened before 2.6.35, I think. I don't recall seeing it in 2.6.34, and I don't think it happened in Ubuntu's 2.6.32 kernel.
I'm confused. Is this still a problem? In comments 10 and 11 (with no special boot options) it sounds like things are working normally -- you didn't have to mess around with USB devices. Maybe the problem would be clearer if you could take a video of the boot with "ignore_loglevel".
It still happens - I had to do it this evening. (uploading log)
Created attachment 31142 [details]
dmesg output (23/07, shows problem still occuring)
So the problem seems intermittent. From attachment 31142 [details]:
[ 5.693079] hub 6-0:1.0: Cannot enable port 3. Maybe the USB cable is bad?
[ 6.484122] udev: starting version 151
[ 329.607825] cfg80211: Calling CRDA to update world regulatory domain
[ 329.668289] ACPI: resource piix4_smbus [io 0x0b00-0x0b07] conflicts with ACPI region SOR1 [io 0x0b00-0x0b0f pref disabled]
I notice the USB cable question ... is there any possibility there is a
problem with an unreliable cable or hub? Does it make any difference if
you boot with no USB devices attached at all?
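The excerpt above jumps from about 6 seconds to about 329 seconds between the udev line and the cfg80211 line, which is where the boot appears to stall. As a rough sketch (not something posted in this report), a gap like that can be located in a saved dmesg log by comparing consecutive timestamps. The function name `find_gaps` and the 30-second threshold are illustrative assumptions.

```python
import re

# Matches the "[ seconds.micros]" prefix the kernel puts on each log line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def find_gaps(lines, threshold=30.0):
    """Return (gap_seconds, previous_message, next_message) for each pair of
    consecutive timestamped lines more than `threshold` seconds apart.

    `threshold` is an arbitrary illustrative value, not a kernel setting.
    """
    gaps = []
    prev_time = None
    prev_msg = None
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # skip lines that carry no timestamp
        time, msg = float(match.group(1)), match.group(2)
        if prev_time is not None and time - prev_time > threshold:
            gaps.append((round(time - prev_time, 3), prev_msg, msg))
        prev_time, prev_msg = time, msg
    return gaps


# The four lines quoted from attachment 31142 in this comment.
excerpt = [
    "[ 5.693079] hub 6-0:1.0: Cannot enable port 3. Maybe the USB cable is bad?",
    "[ 6.484122] udev: starting version 151",
    "[ 329.607825] cfg80211: Calling CRDA to update world regulatory domain",
    "[ 329.668289] ACPI: resource piix4_smbus [io 0x0b00-0x0b07] conflicts with "
    "ACPI region SOR1 [io 0x0b00-0x0b0f pref disabled]",
]

for gap, before, after in find_gaps(excerpt):
    print(f"{gap:.1f}s stall between {before!r} and {after!r}")
```

Run against a full log (for example `find_gaps(open("dmesg.log"))`, where the file name is a placeholder), this only shows where the kernel log went quiet. It does not say what user-space was doing during that time.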
Created attachment 31152 [details]
dmesg output (no usb devices)
Uploaded file dmesg.nousb.log (dmesg with no USB devices attached). I let it run and it still froze, and I had to plug the keyboard in to get it to respond. (I suppose plugging in a device means "wake up" somewhere.)
By the way, I'm now on 2.6.35.4 - and .3 and .4 haven't helped, if that's useful to anyone.
Will now try the video suggestion.
Where would be the best place to put it? I could youtube->tinyogg and link?
Hope this helps. http://www.tinyogg.com/watch/k88KB/
Ok, let me back up a bit. Please correct any misconceptions below:
1) The problem is a hang after we've started running user-mode init scripts (udev, etc).
2) It never happens with 2.6.34.
3) It sometimes happens with 2.6.35, but not always.
4) When the system is hung, plugging in a USB device gets it going again.
Let's see if we can figure out what is hanging. In /etc/udev/udev.conf, set udev_log to "debug". The user-mode output doesn't go to dmesg, so you'll have to dig around in /var/log or capture it with video.
Both 2.6.34 and 2.6.35 have "pci=use_crs" turned on by default. If the problem never happens with 2.6.34, I'm even more confused about how "pci=nocrs" can make a difference in 2.6.35. 2.6.34 and 2.6.35 should be basically the same in that area. Maybe it'd be worth collecting a dmesg log from 2.6.34 and comparing it with the 2.6.35 one.
If the problem is reproducible enough, I suppose bisection between 2.6.34 and 2.6.35 would be one option.
Not sure about 2) - I'll have to test it - but I don't recall it. (2nd to last paragraph).
3) - I think it depends on the power the system is using.
4) - Well, after a few goes, anyway.
Going to boot 2.6.35.4 now with pci=nocrs ignore_loglevel and film it.
Strange - without pci=nocrs, making the change to the udev.conf file just made it boot sensibly! No idea what's up with that.
Dan, do you still experience problems with current kernels? Can you still reproduce the hang or did you find out what made udev hang at bootup?
Still no idea - but it's fixed in 2.6.37 at least. Possibly 2.6.36 too. I'm still puzzled how making udev more verbose actually helped. |
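For the suggestion made earlier in this thread to collect a dmesg log from 2.6.34 and compare it with the 2.6.35 one, a minimal sketch follows. It strips the timestamp prefix so that identical messages logged at different times do not show up as differences, then produces a unified diff with Python's standard `difflib`. The function names and default file labels are hypothetical, and nobody in this report is known to have used this approach.

```python
import difflib
import re
import sys

# Leading "[ seconds.micros]" stamp that differs on every boot.
TIMESTAMP_PREFIX = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(lines):
    """Drop the timestamp prefix so identical messages compare equal."""
    return [TIMESTAMP_PREFIX.sub("", line.rstrip("\n")) for line in lines]


def diff_dmesg(old_lines, new_lines,
               old_name="dmesg-2.6.34.log", new_name="dmesg-2.6.35.log"):
    """Return unified-diff lines between two dmesg logs, ignoring timestamps.

    The default names are placeholder labels for the diff header only.
    """
    return list(difflib.unified_diff(
        strip_timestamps(old_lines),
        strip_timestamps(new_lines),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


# Only runs when two log paths are given on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as old_file, open(sys.argv[2]) as new_file:
        for line in diff_dmesg(old_file.readlines(), new_file.readlines(),
                               sys.argv[1], sys.argv[2]):
            print(line)
```

Boot messages from independent drivers can appear in a different order from one boot to the next, so some of the reported differences will be reordering noise rather than real changes between kernel versions.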
Created attachment 28281 [details]
Output of sudo lshw
Kernel freezes on bootup with message:
init: bridge-network-interface (lo) pre-start process (335) terminated with status 2
at which it hangs. I used to be able to get past the screen by alerting the computer (usually Alt+SysRq alone would do the trick) but I'm having more difficulty doing so now that that PS2 keyboard is screwed. After those messages there are similar ones for eth0 and wlan1, but they don't freeze the bootup and they've been harmless.
When the USB and (broken) PS2 keyboards were connected this morning, I could not get past, nor could I get any sort of USB keyboard noticed by Grub2. Using only the USB keyboard didn't work so well either. Having just the PS2 keyboard connected, it managed to get past the freezing, but I had to connect the USB one afterward anyway to get Linux to recognise I had a keyboard attached. Note that at that early stage the Magic SysRq keys did not work.
I first noticed this problem in 2.6.35.