Bug 6560
Summary: | Engaged ehci_hcd raises CPU temperature. Prevents fan slowdown. | | |
---|---|---|---|
Product: | Drivers | Reporter: | Mats Johannesson (spamcan) |
Component: | USB | Assignee: | David Brownell (dbrownell) |
Status: | RESOLVED WILL_NOT_FIX | | |
Severity: | normal | CC: | alan, greg, protasnb, stern |
Priority: | P2 | | |
Hardware: | i386 | | |
OS: | Linux | | |
Kernel Version: | 2.6.22-rc5-git3 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Bug Depends on: | | | |
Bug Blocks: | 5089 | | |
Attachments: | experimental ehci unlink patch, usb-test-proper.txt, usb-test-patched.txt | | |
Description
Mats Johannesson
2006-05-15 13:02:54 UTC
Created attachment 8299
experimental ehci unlink patch
Yeech, VIA again. We know there are hardware issues with this,
it doesn't issue some IRQs it's supposed to issue. Try the patch
I've attached here, which changes how those hardware issues get
worked around ... maybe it will help, maybe not.
Also, with CONFIG_USB_DEBUG, when it's getting this overheat thing,
please look at /sys/class/usb_host/.../registers for that controller
(the file will say inside that it's EHCI). Look at it several times,
see if its contents are changing during this overheat thing, and
please attach a copy of it (plus a description of any changes you
noticed).
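The "look at it several times, see if its contents are changing" check above could be automated roughly as follows. This is only a sketch: `count_changes` is a hypothetical helper name, not part of any kernel tooling, and the file to sample is passed in as an argument because the exact sysfs path depends on the machine.

```shell
#!/bin/sh
# Hypothetical helper: sample a file SAMPLES times, INTERVAL seconds apart,
# and print how many consecutive samples differed from the previous one.
# Usage: count_changes FILE SAMPLES INTERVAL
count_changes() {
    file=$1
    samples=$2
    interval=$3
    changes=0
    prev=$(cat "$file")
    i=1
    while [ "$i" -lt "$samples" ]; do
        sleep "$interval"
        cur=$(cat "$file")
        if [ "$cur" != "$prev" ]; then
            changes=$((changes + 1))
        fi
        prev=$cur
        i=$((i + 1))
    done
    echo "$changes"
}
```

For example, `count_changes /sys/class/usb_host/usb_host1/registers 10 2` would take ten samples two seconds apart on a system where the EHCI controller shows up as `usb_host1`; a result of 0 would suggest the register dump is static during the test.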
Compiled two 2.6.17-rc6-git4 kernels with CONFIG_USB_DEBUG and used the patch on one of them. Wrote a script to capture the data you requested - I'm in no position to 'notice' changes in this area, except the temperature. Unfortunately the temp didn't stay put with the patched kernel.

Testing procedure was: Cold boot. Wait for the core CPU temp to reach 49C. Start script. Wait 1 minute. Plug in HD (no mount). Wait 1 minute. Unplug HD.

Search for "plugged" to find the crossover data points.

#!/bin/sh

if ! grep -q ehci /proc/modules; then
  modprobe ehci_hcd
fi

echo "" >usb-test.txt
echo "USB Test Begin" >>usb-test.txt
echo "**********" >>usb-test.txt

touch /root/.usb

while (true)
do
  if [ -e /root/.usb ]; then
    if grep -q Prolific /proc/bus/usb/devices; then
      rm -f /root/.usb
      echo "" >>usb-test.txt
      echo "**********" >>usb-test.txt
      echo "HD plugged!" >>usb-test.txt
      echo "**********" >>usb-test.txt
    fi
  fi

  echo "----------" >>usb-test.txt
  date >>usb-test.txt
  echo "----------" >>usb-test.txt
  cat /proc/acpi/thermal_zone/*/temperature >>usb-test.txt
  cat /sys/class/usb_host/usb_host1/registers >>usb-test.txt

  if ! [ -e /root/.usb ]; then
    if ! grep -q Prolific /proc/bus/usb/devices; then
      echo "" >>usb-test.txt
      echo "**********" >>usb-test.txt
      echo "HD unplugged!" >>usb-test.txt
      echo "**********" >>usb-test.txt
      for i in 1 2 3 4 5; do
        echo "----------" >>usb-test.txt
        date >>usb-test.txt
        echo "----------" >>usb-test.txt
        cat /proc/acpi/thermal_zone/*/temperature >>usb-test.txt
        cat /sys/class/usb_host/usb_host1/registers >>usb-test.txt
        sleep 2s
      done
      rmmod ehci_hcd
      echo "" >>usb-test.txt
      echo "**********" >>usb-test.txt
      echo "ehci driver unloaded..." >>usb-test.txt
      echo "**********" >>usb-test.txt
      for i in 1 2 3 4 5; do
        echo "----------" >>usb-test.txt
        date >>usb-test.txt
        echo "----------" >>usb-test.txt
        cat /proc/acpi/thermal_zone/*/temperature >>usb-test.txt
        sleep 2s
      done
      exit
    fi
  fi
  sleep 2s
done

Created attachment 8303
usb-test-proper.txt
unpatched kernel
Created attachment 8304
usb-test-patched.txt
patched kernel
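To compare the two attached logs, the temperature readings can be pulled out and labelled by test phase, using the "HD plugged!", "HD unplugged!" and "ehci driver unloaded..." markers the test script writes. The following is a sketch only: `temps_by_phase` is a hypothetical helper, and it assumes each reading appears as a line of the form `temperature:   49 C`, which is how /proc/acpi/thermal_zone/*/temperature is believed to have printed on kernels of that era.

```shell
#!/bin/sh
# Hypothetical helper: print "<phase> <temperature>" for every temperature
# line in a usb-test log, read from the named files or from stdin.
# Assumes readings look like "temperature:   49 C" (value in field 2).
temps_by_phase() {
    awk -v phase=idle '
        /HD plugged!/          { phase = "plugged" }
        /HD unplugged!/        { phase = "unplugged" }
        /ehci driver unloaded/ { phase = "unloaded" }
        /^temperature:/        { print phase, $2 }
    ' "$@"
}
```

Running `temps_by_phase usb-test-proper.txt` and `temps_by_phase usb-test-patched.txt` side by side should make it easier to see whether the patched kernel changes the temperature rise after the drive is plugged in.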
Still broken in 2.6.18 final

Mats, any updates on the problem? How are the new releases working for you?
Thanks,
--Natalie

Natalie, linux-2.6.22-rc5-git3 under Ubuntu 7.04. No change. Running my test-script above shows the core CPU temperature rising from 43C to 45C eight seconds after the HD was plugged in.

For me this is no longer a problem. I've done surgery on the notebook and installed passive cooling through various fins and plates on all hotspots, drilled extra ventilation holes and, most importantly, attached a variable resistor to the fan. The machine is whisper quiet.

This is a great workaround, should be offered as a patch ;) But seriously, this way the test system is no longer available, Mats! Can you please put it all back as it was before...

Are there known errata on this chipset? Maybe this problem needs to be brought to the attention of the ACPI people?

Eh... the test system is exactly as before in terms of _symptom_ (2 to 4 degrees core CPU temp rise on HD engagement through ehci_hcd), it's only the _consequence_ (high fan speed == noise) that has been mitigated through my hardware modifications. The changes are irrevocable, unless you want me to desolder resistors and plug forty one mm ventilation holes etc (it's a work of engineering art ;-)

I don't know about chipset errata, but as you see in comment #1 the USB people know about VIA... From what I've seen on the net, VIA is not particularly friendly vis-a-vis open source developers. According to kernel sources, and other evidence, people worked under an NDA when e.g. doing IDE stuff for this southbridge (the VT8235).

As for involving ACPI, I can't see a future there. The fan speeds seem to be controlled purely through hardware, reacting to certain temperature thresholds (nothing in /proc/acpi controls it).

Copying to Alan, to help sort out the problem with EHCI and overheating (and to decide whether to keep this bug open).

I have no idea what's wrong, other than the fact that some VIA EHCI chips are known to configure themselves incorrectly.

It would be good to try 2.6.25-rc6; that kernel includes a fix for a problem known to affect lots of EHCI controllers (including VIA's). Also, a patch was submitted last week to prevent some of them from hogging the PCI bus (not applicable to the vt8235, unfortunately). Maybe something similar is needed to prevent the overheating.

FYI, the bus-hogging patch is <http://marc.info/?l=linux-usb&m=120599996404777&w=2>