Bug 1373 - uhci-hcd fails after software suspend
Summary: uhci-hcd fails after software suspend
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB (show other bugs)
Hardware: i386 Linux
: P2 normal
Assignee: Alan Stern
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-10-18 15:34 UTC by John Mock
Modified: 2004-05-11 13:49 UTC (History)
2 users (show)

See Also:
Kernel Version: 2.6.0-test8
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
'dmesg' output (11.64 KB, text/plain)
2003-10-18 15:36 UTC, John Mock
Details
linux-2.6.0-test8/.config (29.71 KB, text/plain)
2003-10-18 15:37 UTC, John Mock
Details
Patch to fix UHCI suspend problem (803 bytes, patch)
2004-02-23 06:39 UTC, Alan Stern
Details | Diff
/proc/driver/uhci/0000:00:1d.0 after a fresh reboot (641 bytes, text/plain)
2004-03-02 12:02 UTC, Aleksey Nogin
Details
/proc/driver/uhci/0000:00:1d.0 after an rmmod, suspend, boot with resume, modprobe (641 bytes, text/plain)
2004-03-02 12:03 UTC, Aleksey Nogin
Details
/proc/driver/uhci/0000:00:1d.0 after suspend/resume w/o rmmod (647 bytes, text/plain)
2004-03-02 12:10 UTC, Aleksey Nogin
Details
[with patch] /proc/driver/uhci/0000:00:1d.0 after a fresh reboot (641 bytes, text/plain)
2004-03-02 22:57 UTC, Aleksey Nogin
Details
[with patch] /proc/driver/uhci/0000:00:1d.0 after suspend/resume w/o rmmod (647 bytes, text/plain)
2004-03-02 23:08 UTC, Aleksey Nogin
Details
Reinitialize the Framelist Base Address register (626 bytes, patch)
2004-03-03 07:54 UTC, Alan Stern
Details | Diff

Description John Mock 2003-10-18 15:34:58 UTC
Distribution: kernel.org
Hardware Environment: Sony VAIO R505RE (see 'dmesg' attachment for details)
    00:00.0 Host bridge: Intel Corp. 82830 830 Chipset Host Bridge (rev 04)
    00:02.0 VGA compatible controller: Intel Corp. 82830 CGC [Chipset Graphics
Controller] (rev 04)
    00:02.1 Display controller: Intel Corp. 82830 CGC [Chipset Graphics Controller]
    00:1d.0 USB Controller: Intel Corp. 82801CA/CAM USB (Hub #1) (rev 02)
    00:1d.1 USB Controller: Intel Corp. 82801CA/CAM USB (Hub #2) (rev 02)
    00:1d.2 USB Controller: Intel Corp. 82801CA/CAM USB (Hub #3) (rev 02)
    00:1e.0 PCI bridge: Intel Corp. 82801BAM/CAM PCI Bridge (rev 42)
    00:1f.0 ISA bridge: Intel Corp. 82801CAM ISA Bridge (LPC) (rev 02)
    00:1f.1 IDE interface: Intel Corp. 82801CAM IDE U100 (rev 02)
    00:1f.3 SMBus: Intel Corp. 82801CA/CAM SMBus (rev 02)
    00:1f.5 Multimedia audio controller: Intel Corp. 82801CA/CAM AC'97 Audio
(rev 02)
    00:1f.6 Modem: Intel Corp. 82801CA/CAM AC'97 Modem (rev 02)
    02:02.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000
Controller (PHY/Link)
    02:05.0 CardBus bridge: Ricoh Co Ltd RL5c475 (rev 80)
    02:08.0 Ethernet controller: Intel Corp. 82801CAM (ICH3) PRO/100 VE (LOM)
Ethernet Controller (rev 42)
Software Environment: (see dnesg attachment)

Problem Description:
    After software suspend (either flavor), USB controller seems hopelessly
confused:

Steps to Reproduce:

    Debian GNU/Linux testing/unstable tvr-vaio tty1

    tvr-vaio login: root
    Password:
    Last login: Sat Oct 18 14:53:29 2003 on tty1
    Linux tvr-vaio 2.6.0-test8 #3 Sat Oct 18 10:51:09 PDT 2003 i686 GNU/Linux
    You have new mail.
    tvr-vaio:~# cat /proc/swaps
    Filename                                Type            Size    Used    Priority
    /dev/hda7                                partition      500460  0       -1
    tvr-vaio:~# sync; echo 4b > /proc/acpi/sleep

	[press power button]  lilo: 2.6.0-test8 resume=/dev/hda7

    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    hub 3-0:1.0: new USB device on port 1, assigned address 3
    tvr-vaio:~# e100: eth0 NIC Link is Up 100 Mbps Full duplex
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    usb 3-1: control timeout on ep0out
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    drivers/usb/host/uhci-hcd.c: 1840: host controller halted. very bad
    usb 3-1: control timeout on ep0out

    tvr-vaio:~#
    tvr-vaio:~# cat > /tmp/console.log

'dmesg' and '.config' attachments to follow.  Machine will also not shutdown
cleanly at this point, and 'rmmod uhci-hcd' gets an 'oops' (details available
cheerfully upon request). (Even using 'sysrq' to sync and unmount still doesn't
avoid an 'fsck' in this case.)
Comment 1 John Mock 2003-10-18 15:36:13 UTC
Created attachment 1091 [details]
'dmesg' output
Comment 2 John Mock 2003-10-18 15:37:47 UTC
Created attachment 1092 [details]
linux-2.6.0-test8/.config
Comment 3 John Mock 2003-10-18 16:11:11 UTC
I forgot to mention that the workaround is to 'rmmod uhci-hcd' in the hiberation
script, but this isn't a great solution for something with a file system, like a
digital camera.
Comment 4 Greg Kroah-Hartman 2003-12-14 00:02:19 UTC
still fail on 2.6.0-test11?
Comment 5 John Mock 2003-12-14 05:20:22 UTC
Yes.  There is no apparent change, alas.
Comment 6 Alexander Willner 2003-12-18 21:04:05 UTC
I's the same very with me (kernel 2.6final) 
Comment 7 Greg Kroah-Hartman 2004-02-18 09:32:30 UTC
Still happen on 2.6.3?
Comment 8 nbdifferentiation 2004-02-21 18:07:36 UTC
The following may or may not be related...

With uhci-hcd, my USB mouse fails after resuming from APM suspend.  It worked
well in 2.4.xx, and hasn't worked since 2.6.0 (and still doesn't work in 2.6.3).

I receive the "control timeout on ep0out" message that was already mentioned.

Sometimes (I don't know under what circumstances), the USB mouse will partially
work (update its location once every few seconds -- making it completely
unusable).  However, most of the time it completely fails.  One more thing: it's
an optical mouse and sometimes the light comes on and sometimes it doesn't.

At the same time the USB mouse doesn't work, my PS/2 touchpad and PS/2 pointing
stick work just fine.

Thanks!  This bug drives me crazy!  :(

Nick.
Comment 9 Alan Stern 2004-02-23 06:39:46 UTC
Created attachment 2216 [details]
Patch to fix UHCI suspend problem

I think this may the solution.	Apparently on some systems the UHCI controllers
get their interrupts disabled during power management suspend/resume.  This
patch will re-enable interrupts.
Comment 10 Aleksey Nogin 2004-03-01 18:47:11 UTC
I see this too. What's worse, after an ACPI software suspend w/o unloading
uhci_hcd first, I try unplugging my external USB hub, machine freezes.

% lspci -v | grep USB
00:1d.0 USB Controller: Intel Corp. 82801CA/CAM USB (Hub #1) (rev 02) (prog-if
00 [UHCI])

I am seeing this with Red Hat's kernel-2.6.3-1.116
Comment 11 Alan Stern 2004-03-02 07:11:53 UTC
Are you using my patch from attachment id=2216?

Do you get any interrupt requests from the UHCI controller after resume?
Comment 12 Aleksey Nogin 2004-03-02 09:21:50 UTC
> Are you using my patch from attachment id=2216?

Not yet, will try that next.

> Do you get any interrupt requests from the UHCI controller after resume?

I do not know. What I do see is

uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host controller halted, very bad!

first thing right after resume.
Comment 13 Alan Stern 2004-03-02 09:57:53 UTC
Can you post the contents of the /proc/driver/uhci/0000:00:1d.0 file, both
before suspending and after resuming?
Comment 14 Aleksey Nogin 2004-03-02 12:02:06 UTC
Created attachment 2269 [details]
/proc/driver/uhci/0000:00:1d.0 after a fresh reboot
Comment 15 Aleksey Nogin 2004-03-02 12:03:07 UTC
Created attachment 2270 [details]
/proc/driver/uhci/0000:00:1d.0 after an rmmod, suspend, boot with resume, modprobe
Comment 16 Aleksey Nogin 2004-03-02 12:10:31 UTC
Created attachment 2271 [details]
/proc/driver/uhci/0000:00:1d.0 after suspend/resume w/o rmmod

In the first two cases (fresh reboot or suspend/resume with rmmod/modprobe) I
saw that every time I access the /proc/driver/uhci/0000:00:1d.0 file, at least
two lines (usbfrnum and flbaseadd) would always change. After a non-rmmod
suspend/resume, file stayed completely static (and none of the USB stuff worked
either).

Next, I tried "rmmod uhci_hcd". I observed the following:

Mar  2 11:30:37 xxx kernel: uhci_hcd 0000:00:1d.0: remove, state 1
Mar  2 11:30:37 xxx kernel: usb usb1: USB disconnect, address 1
Mar  2 11:30:37 xxx kernel: usb 1-1: USB disconnect, address 2
Mar  2 11:30:37 xxx kernel: usb 1-1.2: USB disconnect, address 3
Mar  2 11:30:37 xxx kernel: drivers/usb/class/usblp.c: usblp0: removed
Mar  2 11:30:37 xxx kernel: usb 1-1.3: USB disconnect, address 4
Mar  2 11:30:37 xxx kernel: floppy0: unexpected interrupt
Mar  2 11:30:37 xxx kernel: floppy0: sensei repl[0]=c0 repl[1]=0
Mar  2 11:30:37 xxx kernel: floppy0: sensei repl[0]=c1 repl[1]=0
Mar  2 11:30:37 xxx kernel: floppy0: sensei repl[0]=c2 repl[1]=0
Mar  2 11:30:37 xxx kernel: floppy0: sensei repl[0]=c3 repl[1]=0

This floppy0 stuff right after USB stuff is reproducible - this is not the
first time I see this combination (and I do not have a floppy drive). And the
rmmod never actaully finished (it just hang, and when I tried running lsmod, I
hanged too).

Next I tried unplugging the external USB hub from the machine and at this point
it froze - I was able to switch between virtual consoles and some consoles
would echo back what I was typing, but beyond that the machine was frozen.

Still haven't tried your patch yet.
Comment 17 Aleksey Nogin 2004-03-02 22:57:28 UTC
Created attachment 2274 [details]
[with patch] /proc/driver/uhci/0000:00:1d.0 after a fresh reboot

I am now using using Red Hat's kernel kernel-2.6.3-1.118 (which is based on
2.6.4-rc1-bk2) _with_ the patch you provided. It still does not work...
Comment 18 Aleksey Nogin 2004-03-02 23:08:55 UTC
Created attachment 2275 [details]
[with patch] /proc/driver/uhci/0000:00:1d.0 after suspend/resume w/o rmmod

1) During the resume I saw

...
[nosave c035b000]<4>Freeing prev allocated pagedir
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: host controller process error, something bad happened!
uhci_hcd 0000:00:1d.0: host controller halted, very bad!
orinoco_lock() called with hw_unavailable (dev=dd88abc0)
...

2) After the resume, /proc/driver/uhci/0000:00:1d.0 had HCHalted in it (see
attachment).

3) I tried unpluggin the external hub. This time the machine did _not_ freeze
(at least there is some progress!), giving me the following

Mar  2 22:21:17 xxx kernel: usb 1-1: USB disconnect, address 2
Mar  2 22:21:17 xxx kernel: usb 1-1.2: USB disconnect, address 3
Mar  2 22:21:17 xxx kernel: drivers/usb/class/usblp.c: usblp0: removed
Mar  2 22:21:17 xxx kernel: usb 1-1.3: USB disconnect, address 4
Mar  2 22:21:18 xxx kernel: floppy0: unexpected interrupt
Mar  2 22:21:18 xxx kernel: floppy0: sensei repl[0]=c0 repl[1]=0
Mar  2 22:21:18 xxx kernel: floppy0: sensei repl[0]=c1 repl[1]=0
Mar  2 22:21:18 xxx kernel: floppy0: sensei repl[0]=c2 repl[1]=0
Mar  2 22:21:18 xxx kernel: floppy0: sensei repl[0]=c3 repl[1]=0
Mar  2 22:21:19 xxx kernel: uhci_hcd 0000:00:1d.0: host controller process
error, something bad happened!
Mar  2 22:21:19 xxx kernel: uhci_hcd 0000:00:1d.0: host controller halted, very
bad!
Mar  2 22:21:20 xxx kernel: hub 1-0:1.0: connect-debounce failed, port 1
disabled

4) Plugged it back in:

Mar  2 22:22:09 xxx kernel: uhci_hcd 0000:00:1d.0: host controller process
error, something bad happened!
Mar  2 22:22:09 xxx kernel: uhci_hcd 0000:00:1d.0: host controller halted, very
bad!
Mar  2 22:22:10 xxx kernel: usb 1-1: new full speed USB device using address 5
Mar  2 22:22:15 xxx kernel: usb 1-1: control timeout on ep0out

5) rmmod uhci-hcd

Mar  2 22:23:12 xxx kernel: uhci_hcd 0000:00:1d.0: remove, state 1
Mar  2 22:23:12 xxx kernel: usb usb1: USB disconnect, address 1

(but rmmod did not complete)

6) Unplug the hub

Mar  2 22:23:32 dell kernel: uhci_hcd 0000:00:1d.0: host controller process
error, something bad happened!
Mar  2 22:23:32 dell kernel: uhci_hcd 0000:00:1d.0: host controller halted,
very bad!
Mar  2 22:23:32 xxx kernel: usb 1-1: device not accepting address 5, error -108

Mar  2 22:23:33 xxx kernel: hub 1-0:1.0: Cannot enable port 1.	Maybe the USB
cable is bad?
Mar  2 22:23:33 xxx kernel: hub 1-0:1.0: cannot disable port 1 (err = -19)
Mar  2 22:23:33 xxx kernel: hub 1-0:1.0: hub_hub_status failed (err = -19)
Mar  2 22:23:33 xxx kernel: hub 1-0:1.0: get_hub_status failed
Mar  2 22:23:33 xxx kernel: uhci_hcd 0000:00:1d.0: USB bus 1 deregistered

and rmmod finishes!

7) modprobe uhci-hcd

Mar  2 22:24:09 xxx kernel: usb 1-1: new full speed USB device using address 2
Mar  2 22:24:09 xxx kernel: hub 1-1:1.0: USB hub found
Mar  2 22:24:09 xxx kernel: hub 1-1:1.0: 4 ports detected
Mar  2 22:24:10 xxx kernel: usb 1-1.2: new full speed USB device using address
3
Mar  2 22:24:11 xxx kernel: drivers/usb/class/usblp.c: usblp0: USB
Bidirectional printer dev 3 if 0 alt 1 proto 2 vid 0x03F0 pid 0x0317
Mar  2 22:24:11 xxx kernel: usb 1-1.3: new low speed USB device using address 4

Mar  2 22:24:13 xxx kernel: input: USB HID v1.10 Mouse [Logitech USB Receiver]
on usb-0000:00:1d.0-1.3

and things work again... Fun!

P.S. I guess this is as much testing as I can do without further guidance.
Comment 19 John Mock 2004-03-02 23:36:16 UTC
I've seen improvement as early as 2.6.1-rc1 and definitely since then.  By
compiling 'usb-uchi' into the kernel rather than loading it as a module, not
only does UCHI come up without significant complaints, but i can even suspend
with a digital camera connected ('usb-storage') and have its file system still 
work afterwards.  This was something that i didn't expect.  So that's a big
change for me.  (.config available upon request.)

Surprisingly enough, under 2.6.3, even the PCMCIA IDE card came back up
after hibernation, albeit with a few minor complaints. So in some ways,
the current version may be better if one doesn't need certain modules to
work.

However, that does make it difficult to disconnect USB device in software,
which is unfortunate.  So it's an improvement but not a solution.

I tried to make the module flavor work under 2.6.1-rc1, but i didn't get
very far.  I could get it to come up without major complaints, but could
not get it to retain any connections.  That may be fine for an HID device,
but not much good for a file-oriented device.  It really isn't much better
than rmmod'ing just before suspending (as the effect is the same).  I have
not tried the Stern patch yet, and will do so when i'm less busy.

By the way, software suspend under X is still as bad as before, but that 
is not the subject of this bug. Alas it does mean i still feel stuck with 
Windows when i'm likely to be running off of batteries.
Comment 20 Alan Stern 2004-03-03 07:54:52 UTC
Created attachment 2278 [details]
Reinitialize the Framelist Base Address register

Aleksey, try this patch in addition to the one from before.  Do a suspend
without rmmod, and look at that /proc file both before and after the suspend. 
The values should be pretty much the same, except for usbfrnum and the last
three digits of flbaseadd (which always agree with usbfrnum).  It's normal for
the usbfrnum value to change in realtime, by the way.

John, try using this patch along with latest 2.6 system you can get your hands
on, and try compiling the driver as a module.  If it doesn't work, I'd like to
find out why.

Just in case either of you has problems applying the patch, here is the text of
the uhci_resume() function from the driver.  If necessary, you can just copy it
directly into the source file.

---------------
static int uhci_resume(struct usb_hcd *hcd)
{
	struct uhci_hcd *uhci = hcd_to_uhci(hcd);

	pci_set_master(to_pci_dev(uhci_dev(uhci)));

	if (uhci->state == UHCI_SUSPENDED) {

		/*
		 * Some systems clear the Interrupt Enable register during
		 * PM suspend/resume, so reinitialize it.  Likewise, set
		 * the Framelist Base Address register.
		 */
		outw(USBINTR_TIMEOUT | USBINTR_RESUME | USBINTR_IOC |
				USBINTR_SP, uhci->io_addr + USBINTR);
		outl(uhci->fl->dma_handle, io_addr + USBFLBASEADD);
		uhci->resume_detect = 1;
	} else {
		reset_hc(uhci);
		start_hc(uhci);
	}
	uhci->hcd.state = USB_STATE_RUNNING;
	return 0;
}
Comment 21 Fabrice Bellet 2004-03-04 03:46:04 UTC
This patch resolved the part of the problem warning me about "host controller
halted, very bad!" in my logs. But the external USB mouse still doesn't work
after resume from ACPI suspend (echo 3 > /proc/acpi/sleep). The file
/proc/driver/uhci/0000:00:1d.1 contains these differences :

before suspend :

  usbstat   =     0000   
Skeleton QH's
- skel_int8_qh
    [f7db0270] link (37db0152) element (37db2080)
      0: [f7db2080] link (00000001) e3 SPD LS IOC Active NAK Length=7ff MaxLen=3
DT1 EndPt=1 Dev=2, PID=69(IN) (buf=37462020)

and after suspend :
  usbstat   =     0001   USBINT 
Skeleton QH's
- skel_int8_qh
    [f7db0240] link (37db0152) element (00000001)
     Element != First TD
      0: [f7db2040] link (00000001) e3 SPD LS IOC Length=3 MaxLen=3 DT0 EndPt=1
Dev=2, PID=69(IN) (buf=37462020)

Then I unplug the mouse :

  usbstat   =     0021   HCHalted USBINT 
  stat2     =     0080  

And I replug it :

  usbstat   =     0025   HCHalted ResumeDetect USBINT 
  stat2     =     0185   LowSpeed Enabled Connected
Skeleton QH's
- skel_ls_control_qh
    [f7db0270] link (37db01b2) element (37db2080)
      0: [f7db2080] link (37db20c0) e3 LS Active Length=0 MaxLen=7 DT0 EndPt=0
Dev=0, PID=2d(SETUP) (buf=2674faf8)
      1: [f7db20c0] link (00000001) e3 LS IOC Active Length=0 MaxLen=7ff DT1
EndPt=0 Dev=0, PID=69(IN) (buf=00000000)

The machine is a thinkpad T40, running Fedora kernel-2.6.3-1.118 modified with
your patch.
Comment 22 Karol Kozimor 2004-03-04 14:52:07 UTC
Hi,
I want to report that with 2.6.4-rc1-mm1 (which apparently has the patch #2216
merged) + patch #2278 (typo: s/io_addr/uhci->io_addr) I can finally use S3 not
having to unload USB modules before suspend. Great work, thanks a lot!

Note You need to log in before you can comment on or make changes to this bug.