Bug 6193

Summary: with uhci_hcd loaded S3 resume immediately
Product: Drivers
Reporter: Tim Dijkstra (newsuser)
Component: USB
Assignee: Alan Stern (stern)
Status: CLOSED CODE_FIX
Severity: normal
CC: greg
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.15.4
Subsystem:
Regression: ---
Bisected commit-id:
Bug Depends on:
Bug Blocks: 5089
Attachments: dmesg of immediate resume
lspci -vvv
dmidecode
Disable EGSM on ASUS motherboard

Description Tim Dijkstra 2006-03-08 07:19:28 UTC
Most recent kernel where this bug did not occur: ?
Distribution: Debian
Hardware Environment: A7V8X motherboard (VT8235 PCI Bridge, VT82xxxxx UHCI USB
1.1 Controller)
Software Environment: Vanilla 2.6.15.4 kernel
Problem Description: 

When I put the system in suspend-to-ram 
 
  echo mem > /sys/power/state

all seems well, but the system resumes immediately (but normally). After this
the system seems perfectly stable.

If I try the same after unloading the driver, suspend-to-ram works as 
expected.

Steps to reproduce:

modprobe uhci_hcd
echo mem > /sys/power/state
Comment 1 Tim Dijkstra 2006-03-08 07:20:41 UTC
Created attachment 7533
dmesg of immediate resume
Comment 2 Tim Dijkstra 2006-03-08 07:23:10 UTC
Created attachment 7534
lspci -vvv
Comment 3 Greg Kroah-Hartman 2006-03-08 10:12:29 UTC
Alan, this is for you :)
Comment 4 Alan Stern 2006-03-08 12:35:19 UTC
From the log:

uhci_hcd 0000:00:10.0: uhci_resume
uhci_hcd 0000:00:10.0: uhci_check_and_reset_hc: legsup = 0x2000

This is bad; it indicates the BIOS is setting bits that it doesn't control.  I
have no way of knowing whether this could be the cause of the immediate resume,
however.  (I sort of doubt it; after all, the BIOS shouldn't care whether or not
uhci-hcd is loaded.)

Here's a good test to try.  In drivers/usb/core/hcd-pci.c:usb_hcd_pci_suspend(),
find the two lines that call pci_enable_wake() and comment them out.  In theory
that will prevent the UHCI controllers from issuing wakeup requests.  In
practice...  The conditions under which a controller will issue a wakeup request
aren't documented anywhere, nor are the actions taken by the BIOS or the ACPI
interpreter.  So there's no way to tell without trying it.
Comment 5 Tim Dijkstra 2006-03-08 14:10:02 UTC
Nope, commenting out pci_enable_wake() didn't help, unfortunately.
Comment 6 Alan Stern 2006-03-09 08:22:03 UTC
I'm inclined to call this a bug in the BIOS.  Why else should the system wake up
immediately when the controllers aren't enabled for making wakeup requests?

Just out of curiosity, does it make any difference if you unplug all your USB
devices before suspending?

Also, it's worth a shot fiddling with the USB settings in your BIOS setup. 
Maybe you can convince the BIOS to stop interfering with things the operating
system is supposed to be in control of.
Comment 7 Tim Dijkstra 2006-03-09 11:39:47 UTC
I have a card reader built into the machine. I just tried disconnecting it from
the motherboard, and indeed, now suspend works as expected!

So it has something to do with the card-reader... but still the controller is
doing something wrong, isn't it?

[as of tomorrow I will be away for a week, so my response will lag a bit]
Comment 8 Alan Stern 2006-03-09 12:01:32 UTC
Or it has something to do with the fact that a device is connected.  But yes,
something is going wrong somewhere.  It's not so easy to tell what or where,
however.  There's no way to debug the BIOS or fix problems in it.

You can try doing this (with the card reader plugged back in):

  echo 3 >/sys/devices/pci0000:00/0000:00:10.2/power/state

which should suspend just that one UHCI controller, and then do

  lspci -vvv -s10.2

This should show whether or not the suspended controller is sending a wakeup signal.

In fact, do the test twice: once with the original kernel and once with those
pci_enable_wake() calls commented out.
Comment 9 Tim Dijkstra 2006-03-09 16:12:32 UTC
I guess I need "USB selective suspend/resume and wakeup" (CONFIG_USB_SUSPEND)
for that, or not?

Also, what should I see in the output of lspci -vvv to see if a controller is
sending wake events?
Comment 10 Alan Stern 2006-03-10 07:29:26 UTC
For this test it's better if you don't define CONFIG_USB_SUSPEND.  Here's the
relevant part of your earlier lspci output:

0000:00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 80) (prog-if 00 [UHCI])
	Subsystem: Asustek Computer, Inc. VT6202 USB2.0 4 port controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping-
SERR- FastB2B-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort-
<MAbort- >SERR- <PERR-
	Latency: 32, Cache Line Size: 0x08 (32 bytes)
	Interrupt: pin C routed to IRQ 18
	Region 4: I/O ports at a800 [size=32]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

The important parts are on the last line.  The "PME-" means that the controller
doesn't want to assert the PME (Power Management Event) signal, and the
"PME-Enable-" means that it's not allowed to assert PME even if it wants to.
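
To make the two flags concrete: both live in the 16-bit Power Management Control/Status register that lspci decodes on the "Status:" line. Below is a minimal C sketch of that decoding, assuming the standard PCI Power Management layout (D-state in bits 1:0, PME enable in bit 8, PME status in bit 15). The struct and function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the PCI Power Management Control/Status register. */
#define PMCSR_STATE_MASK 0x0003u /* bits 1:0, current D-state (0..3) */
#define PMCSR_PME_ENABLE 0x0100u /* bit 8, lspci prints "PME-Enable+/-" */
#define PMCSR_PME_STATUS 0x8000u /* bit 15, lspci prints the final "PME+/-" */

struct pm_status {
    unsigned d_state;   /* 0 for D0 ... 3 for D3 */
    bool pme_enabled;   /* device is allowed to assert PME */
    bool pme_asserted;  /* device wants to assert PME */
};

static struct pm_status decode_pmcsr(uint16_t pmcsr)
{
    struct pm_status s;

    s.d_state = pmcsr & PMCSR_STATE_MASK;
    s.pme_enabled = (pmcsr & PMCSR_PME_ENABLE) != 0;
    s.pme_asserted = (pmcsr & PMCSR_PME_STATUS) != 0;
    return s;
}

/* A wakeup request goes out only if the device both wants it and may send it. */
static bool pme_signalled(struct pm_status s)
{
    return s.pme_enabled && s.pme_asserted;
}
```

For the status line quoted above (D0, PME-Enable-, PME-) all of these fields are zero, so pme_signalled() would return false.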
Comment 11 Tim Dijkstra 2006-03-20 13:52:12 UTC
I recompiled the usbcore module both with and without the lines commented out,
as described in comment #4. I rmmod'ed and modprobe'd to change between them (no
reboot); if this is not sufficient please tell me.

Then I carried out the tests; in all cases the status line stays the same:
    Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Note that after 
   echo 3 >/sys/devices/pci0000:00/0000:00:10.2/power/state
this file never contains anything but a 0, even for controllers with nothing
connected to them.
Comment 12 Alan Stern 2006-03-21 12:30:51 UTC
I forgot a couple of things for the test...  Before doing

   echo 3 >/sys/devices/pci0000:00/0000:00:10.2/power/state

you first have to do this:

   rmmod usb-storage
   echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/3-2/3-2:1.0/power/state
   echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/3-2/power/state
   echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/3-0:1.0/power/state
   echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/power/state

That's because the device PM core won't allow you to suspend a device without
suspending all its children first.
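
The ordering rule amounts to a post-order walk of the device tree: every child is handled before its parent. Below is a small C sketch, assuming the topology implied by the nested sysfs paths above. The node names are taken from those paths; the struct and function are hypothetical and are not the PM core's own code.

```c
#include <stddef.h>

#define MAX_CHILDREN 4

struct dev_node {
    const char *name;
    /* NULL-terminated unless all MAX_CHILDREN slots are used */
    const struct dev_node *children[MAX_CHILDREN];
};

/* Topology assumed from the sysfs paths: PCI device -> root hub -> devices. */
static const struct dev_node intf_3_2 = { "3-2:1.0", { NULL } };
static const struct dev_node dev_3_2  = { "3-2", { &intf_3_2 } };
static const struct dev_node intf_rh  = { "3-0:1.0", { NULL } };
static const struct dev_node root_hub = { "usb3", { &dev_3_2, &intf_rh } };
static const struct dev_node uhci_pci = { "0000:00:10.2", { &root_hub } };

/*
 * Append device names to out[] so that each device comes after all of its
 * descendants. n is the number of entries already stored; the new count is
 * returned. The caller must provide room for every node in the tree.
 */
static size_t suspend_order(const struct dev_node *dev, const char **out,
                            size_t n)
{
    for (size_t i = 0; i < MAX_CHILDREN && dev->children[i] != NULL; i++)
        n = suspend_order(dev->children[i], out, n);
    out[n++] = dev->name;
    return n;
}
```

Calling suspend_order(&uhci_pci, out, 0) fills out[] children-first, ending with the PCI device itself.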

After running this test, you can try making a change to the UHCI driver.  In
drivers/usb/host/uhci-hcd.c, find the resume_detect_interrupts_are_broken()
routine and make it always return 1.  It will be interesting to see if this
causes a change in behavior.
Comment 13 Tim Dijkstra 2006-03-22 13:33:01 UTC
OK, I did:

rmmod usb-storage
echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/3-2/3-2:1.0/power/state
echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/3-2/power/state
echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/3-0:1.0/power/state
echo 3 >/sys/devices/pci0000:00/0000:00:10.2/usb3/power/state
echo 3 >/sys/devices/pci0000:00/0000:00:10.2/power/state
lspci -vvv -s 10.2

With all four combinations of choosing one from each of:
  pci_enable_wake() commented out / not commented out 
  resume_detect_interrupts_are_broken() returning 1 / doing the normal thing

This was done without CONFIG_USB_SUSPEND.
After that I also tried 'return 1 from resume_detect..' with CONFIG_USB_SUSPEND=y.

In all cases the last line of lspci -vvv was the same, and all power/state
files still contained 0, except for the first one, which has a 1.
Comment 14 Alan Stern 2006-03-22 13:51:26 UTC
The power/state files should contain 2 or 3.  I don't know why they don't.  What
shows up in the dmesg log when you run that test?

Also, did changing resume_detect_... fix the behavior?
Comment 15 Alan Stern 2006-03-23 09:48:27 UTC
Just did some checking...  It was a foolish mistake on my part.  You have to write

   echo -n 3 >...

on each line.  Without the "-n" it doesn't work.
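
A plausible explanation, and only an assumption here, is that the handler behind power/state compares the written bytes exactly, so the newline that plain echo appends turns "3" into a two-byte string that no longer matches. The C sketch below shows the difference between a strict check and a newline-tolerant one. Both functions are hypothetical illustrations, not the kernel's code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical strict handler: accepts only the single byte '3'. */
static bool strict_store_accepts(const char *buf, size_t n)
{
    return n == 1 && buf[0] == '3';
}

/* Hypothetical tolerant handler: ignores one trailing newline first. */
static bool tolerant_store_accepts(const char *buf, size_t n)
{
    if (n > 0 && buf[n - 1] == '\n')
        n--;
    return n == 1 && buf[0] == '3';
}
```

Under that assumption, "echo 3" delivers the two bytes "3\n" and is rejected by the strict variant, while "echo -n 3" delivers the single byte "3" and is accepted.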
Comment 16 Tim Dijkstra 2006-03-23 13:13:41 UTC
OK, that makes the test have more diverse output;) With no modifications the
last line of lspci now reads:

   Status: D3 PME-Enable+ DSel=0 DScale=0 PME-

With pci_enable_wake() commented out:

   Status: D3 PME-Enable- DSel=0 DScale=0 PME-

Having resume_detect_interrupts_are_broken() return 1 doesn't make a difference
and doesn't fix the problem.
Comment 17 Alan Stern 2006-03-23 13:41:45 UTC
So in one case PME is enabled and in the other it isn't, as you would expect. 
But in neither case is it on, so it's not the cause of your immediate resume.

At this point I'm quite sure there's something wrong with your BIOS.  You could
check to see if an upgrade is available.

In the meantime, here's something else to try.  Keep the resume_detect...
change, and also edit the suspend_rh() routine just below.  Change the line that
says

	outw(USBCMD_EGSM | USBCMD_CF, uhci->io_addr + USBCMD);

to

	outw(USBCMD_CF, uhci->io_addr + USBCMD);

In other words, get rid of the USBCMD_EGSM.  (The string EGSM occurs only twice
in the source file, so it should be easy to spot.)  This will cause the
suspended controller to be left in essentially the same state as if uhci-hcd had
never been loaded.
Comment 18 Tim Dijkstra 2006-03-23 14:02:21 UTC
Removing USBCMD_EGSM helped. Now resume/suspend works as it should. Does this
mean there's indeed a bios bug? I'll check the asus site for updates...
Comment 19 Alan Stern 2006-03-24 07:32:42 UTC
It means there's a bug in either the BIOS or the controller hardware.  Maybe
both.  Or perhaps it's an undocumented "feature".  If you can get any
information out of Asus it might help.  VIA refuses to answer questions about
this sort of thing.
Comment 20 Tim Dijkstra 2006-03-24 11:09:07 UTC
OK, I can try that, but what do I have to ask precisely? What is the bios doing
wrong? What does USBCMD_EGSM mean? 

Does it hurt to keep USBCMD_EGSM removed from that outw()?
Comment 21 Alan Stern 2006-03-24 11:45:29 UTC
You could ask them under what conditions the UHCI controller will wake up a
system in various sleep states.  In particular, ask them why it would wake up
the system when a device is connected but there is no connect change or wakeup
request pending.  (At least, I assume the card reader isn't sending a wakeup
request.  If it is, that might explain why the computer wakes up immediately. 
But it's not supposed to be sending a wakeup request.  You could find out for
certain by doing all those "echo -n 3 >..." commands, then mounting
/sys/kernel/debug with "-t debugfs", and then copying the contents of
/sys/kernel/debug/uhci/0000:00:10.2.)

I don't know exactly what the BIOS is doing.  But one thing it's doing wrong is
setting the USBPIRQDEN bit in the UHCI LEGSUP register.  That bit is supposed to
be controlled entirely by the operating system; the BIOS is not supposed to
touch it at all.
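
For reference, the legsup = 0x2000 value logged in comment #4 is exactly that bit. Below is a hedged C sketch of the check, assuming the usual UHCI LEGSUP layout with USBPIRQDEN at bit 13. The read-only and write-to-clear masks are quoted from memory of the Linux quirk code and should be treated as assumptions; the function names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

#define LEGSUP_USBPIRQDEN 0x2000u /* bit 13: USB interrupt routed to PIRQ */
#define LEGSUP_RO         0x5040u /* read-only/reserved bits (assumed value) */
#define LEGSUP_RWC        0x8f00u /* write-1-to-clear bits (assumed value) */

/* True if any writable configuration bit is set in the register. */
static bool legsup_has_config_bits(uint16_t legsup)
{
    return (legsup & ~(LEGSUP_RO | LEGSUP_RWC)) != 0;
}

/* True if the PIRQ enable bit is set. */
static bool legsup_pirq_enabled(uint16_t legsup)
{
    return (legsup & LEGSUP_USBPIRQDEN) != 0;
}
```

With the logged value 0x2000 both functions return true: a writable bit is set when the driver resumes, and it is the PIRQ enable bit.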

USBCMD_EGSM is the "Enter Global Suspend Mode" bit in the UHCI USB Command
register.  It tells the controller to suspend the entire USB bus and to respond
to wakeup requests.  It's okay to leave it turned off, provided you also make
sure that resume_detect_interrupts... always returns 1.  (If you don't make that
second change then the controller won't do anything when you plug in a USB device!)
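
The bit positions can be sketched in C as follows. The values follow the UHCI USB Command register layout as commonly documented; the table and the describe_usbcmd() helper are illustrative names, not driver code.

```c
#include <stdint.h>
#include <stdio.h>

#define USBCMD_RS      0x0001u /* Run/Stop */
#define USBCMD_HCRESET 0x0002u /* Host Controller reset */
#define USBCMD_GRESET  0x0004u /* Global reset */
#define USBCMD_EGSM    0x0008u /* Enter Global Suspend Mode */
#define USBCMD_FGR     0x0010u /* Force Global Resume */
#define USBCMD_SWDBG   0x0020u /* Software debug */
#define USBCMD_CF      0x0040u /* Configure Flag */
#define USBCMD_MAXP    0x0080u /* Max packet: 64 bytes if set, else 32 */

static const struct {
    uint16_t bit;
    const char *name;
} usbcmd_bits[] = {
    { USBCMD_MAXP, "MAXP" },   { USBCMD_CF, "CF" },
    { USBCMD_SWDBG, "SWDBG" }, { USBCMD_FGR, "FGR" },
    { USBCMD_EGSM, "EGSM" },   { USBCMD_GRESET, "GRESET" },
    { USBCMD_HCRESET, "HCRESET" }, { USBCMD_RS, "RS" },
};

/* Write the names of the set bits, space separated, into out[0..len-1]. */
static void describe_usbcmd(uint16_t cmd, char *out, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof usbcmd_bits / sizeof usbcmd_bits[0]; i++) {
        int w;

        if (!(cmd & usbcmd_bits[i].bit))
            continue;
        w = snprintf(out + used, len - used, "%s%s",
                     used ? " " : "", usbcmd_bits[i].name);
        if (w < 0 || (size_t)w >= len - used)
            break; /* output truncated, stop here */
        used += (size_t)w;
    }
}
```

The outw() in comment #17 writes USBCMD_EGSM | USBCMD_CF, which is 0x0048; the modified line writes only USBCMD_CF, which is 0x0040.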
Comment 22 Tim Dijkstra 2006-03-25 01:29:21 UTC
After the series of 'echo -n 3 > ...', this is the contents of 
/sys/kernel/debug/uhci/0000:00:10.2

Root-hub state: suspended
HC status
  usbcmd    =     0048   Maxp32 CF EGSM
  usbstat   =     0020   HCHalted
  usbint    =     0002
  usbfrnum  =   (1)144
  flbaseadd = 031bb000
  sof       =       40
  stat1     =     0480   OverCurrent
  stat2     =     1495   Suspend OverCurrent Enabled Connected
Frame List
Skeleton QHs
Comment 23 Alan Stern 2006-03-25 11:07:03 UTC
Since the only bit set in the USB Status register is the HCHalted bit, we can be
certain that the controller doesn't want to issue an interrupt.  We can also be
pretty sure that it isn't trying to wake up the system.  But apparently some
combination of the BIOS firmware and motherboard circuitry is causing it to do
so regardless.
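
That reading of the registers can be written down as two small predicates. This is a sketch assuming the standard UHCI status and port status bit layouts; the function names are invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* USB Status register bits (assumed UHCI layout). */
#define USBSTS_USBINT 0x0001u /* transfer interrupt */
#define USBSTS_ERROR  0x0002u /* transfer error interrupt */
#define USBSTS_RD     0x0004u /* resume detect */
#define USBSTS_HSE    0x0008u /* host system error */
#define USBSTS_HCPE   0x0010u /* host controller process error */
#define USBSTS_HCH    0x0020u /* controller halted, not an interrupt source */

/* Port status bits used here (assumed UHCI layout). */
#define USBPORTSC_CCS 0x0001u /* device currently connected */
#define USBPORTSC_CSC 0x0002u /* connect status changed */
#define USBPORTSC_RD  0x0040u /* resume detected on the port */

/* True if any status bit that can raise an interrupt is set. */
static bool hc_has_pending_event(uint16_t usbsts)
{
    return (usbsts & (USBSTS_USBINT | USBSTS_ERROR | USBSTS_RD |
                      USBSTS_HSE | USBSTS_HCPE)) != 0;
}

/* True if the port shows a connect change or a resume (wakeup) request. */
static bool port_has_wakeup_reason(uint16_t portsc)
{
    return (portsc & (USBPORTSC_CSC | USBPORTSC_RD)) != 0;
}
```

Applied to the dump in comment #22, usbstat = 0x0020 gives no pending event, and neither stat1 = 0x0480 nor stat2 = 0x1495 shows a connect change or resume request, even though stat2 has the connected bit set.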
Comment 24 Tim Dijkstra 2006-04-18 12:17:47 UTC
OK, I tried to get some info out of ASUS, but I can't get past the people at
their support that say "sorry we don't do linux".

Some more questions from me before I give up on fixing this issue. :( 
What functionality am I missing if I remove USBCMD_EGSM (Comment #17)? It
doesn't seem to hurt, but why is it there in the first place?
Comment 25 Alan Stern 2006-04-19 06:48:29 UTC
Without the EGSM you might lose a remote wakeup capability.  For example, if you
have a USB mouse plugged into the controller and both the root hub and the mouse
are suspended, the system might not automatically wake them up when you move the
mouse or press a button.  I'm not sure because I've never tried it.

Also, you lose a very small amount of system overhead.  With EGSM set, the
controller will interrupt the CPU whenever a device is plugged or unplugged. 
Without it, the CPU has to poll the controller 4 times per second.  It's not a
big deal.
Comment 26 Pavel Machek 2006-09-29 04:18:28 UTC
Can you retry with 2.6.18? USB suspend changed a lot...
Comment 27 Tim Dijkstra 2006-09-29 12:21:31 UTC
Problem still exists in 2.6.18. But I think Alan concluded it was a bug in my
hardware.
Comment 28 Alan Stern 2006-09-29 13:35:28 UTC
That's what it looks like.  The only remaining question is whether the driver
should change, say by detecting your particular type of motherboard and then
avoiding EGSM.  Or would you be happy enough just leaving the built-in card
reader unplugged?

Here's an interesting idea that just occurred to me.  Suppose you do unplug the
card reader but then plug in an external USB device (it would have to be a
full-speed or low-speed device, or else you would have to rmmod ehci-hcd). 
Would you then see the same immediate resume behavior?
Comment 29 Tim Dijkstra 2006-09-30 00:41:03 UTC
I just tried with the card reader unplugged and a usb camera plugged in; the
same behaviour occurs.

And, yes, I'd rather have the workaround in the kernel; that way I don't have to
patch the kernel I compile with every release. Unplugging a built-in card reader
every time you resume is a bit inconvenient...
Comment 30 Alan Stern 2006-10-04 08:25:23 UTC
The workaround needs to be specific to your sort of motherboard; as far as I
know nobody else has the same kind of problem.  Please attach the output from
dmidecode for your system.
Comment 31 Tim Dijkstra 2006-10-04 11:39:06 UTC
Created attachment 9153
dmidecode
Comment 32 Alan Stern 2006-10-04 14:21:47 UTC
Created attachment 9155
Disable EGSM on ASUS motherboard

Here's a patch for 2.6.18 you can try out.  It avoids turning on EGSM if it
sees a motherboard of your type with a device connected.
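
The logic of such a workaround can be sketched roughly as below. This is a user-space illustration, not the attached patch: the board-name string is taken from the hardware description at the top of this report and may not match the exact DMI string, and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define USBPORTSC_CCS 0x0001u /* port status: device currently connected */

/* Assumption: the affected board can be recognised by this name alone. */
static bool board_has_broken_wakeup(const char *board_name)
{
    return board_name != NULL && strcmp(board_name, "A7V8X") == 0;
}

/*
 * Decide whether to leave EGSM off when suspending the root hub:
 * only on the affected board, and only if some port has a device attached.
 */
static bool should_skip_egsm(const char *board_name,
                             const uint16_t *portsc, size_t nports)
{
    if (!board_has_broken_wakeup(board_name))
        return false;
    for (size_t i = 0; i < nports; i++) {
        if (portsc[i] & USBPORTSC_CCS)
            return true;
    }
    return false;
}
```

With the port values from comment #22 (0x0480 and 0x1495) the second port is connected, so on the affected board the sketch would skip EGSM; on any other board, or with nothing plugged in, the normal path is kept.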
Comment 33 Tim Dijkstra 2006-10-06 12:34:43 UTC
Yup, that works.

Thanks for working on this btw!
Comment 34 Alan Stern 2006-10-06 19:11:14 UTC
Okay, I'll queue up the patch for submission and mark this bug report closed.