Bug 6395 - Fail to resume on Tecra M2 with ADM1032 and Intel 82801DBM
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: I2C
Hardware: i386 Linux
Importance: P2 high
Assignee: Jean Delvare
Reported: 2006-04-16 06:12 UTC by Daniele Gaffuri
Modified: 2007-03-24 07:42 UTC

Kernel Version: 2.6.16 - 2.6.17-rc1


Attachments
Fix i2c-i801 resume when PEC is enabled (1.22 KB, patch)
2006-04-16 07:51 UTC, Jean Delvare
Fix i2c-i801 suspend and shutdown when PEC is enabled (1.58 KB, patch)
2006-04-16 10:42 UTC, Daniele Gaffuri
Clear PEC bit after every transaction (974 bytes, patch)
2006-04-18 13:50 UTC, Jean Delvare

Description Daniele Gaffuri 2006-04-16 06:12:38 UTC
Hi

This is probably a BIOS problem with my laptop rather than a kernel bug; I've
googled a lot without finding anything similar. Anyway, having identified the
patch that introduces the problem, I'm posting here to make this info public.

Most recent kernel where this bug did not occur: 2.6.15

Distribution: Gentoo

Hardware Environment: Toshiba Tecra M2 laptop

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 13
model name      : Intel(R) Pentium(R) M processor 2.00GHz
stepping        : 6
cpu MHz         : 600.000
cache size      : 2048 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe up est tm2
bogomips        : 1198.08

# lspci
00:00.0 Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev 21)
00:01.0 PCI bridge: Intel Corporation 82855PM Processor to AGP Controller (rev 21)
00:1d.0 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2 (rev 03)
00:1d.2 USB Controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801DB/DBM (ICH4/ICH4-M) USB2 EHCI Controller (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 83)
00:1f.0 ISA bridge: Intel Corporation 82801DBM (ICH4-M) LPC Interface Bridge (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801DBM (ICH4-M) IDE Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) SMBus Controller (rev 03)
00:1f.5 Multimedia audio controller: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Audio Controller (rev 03)
00:1f.6 Modem: Intel Corporation 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) AC'97 Modem Controller (rev 03)
01:00.0 VGA compatible controller: nVidia Corporation NV34M [GeForce FX Go5200 32M/64M] (rev a1)
02:05.0 Network controller: Intel Corporation PRO/Wireless 2200BG Network Connection (rev 05)
02:07.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
02:09.0 Ethernet controller: Intel Corporation 82540EP Gigabit Ethernet Controller (LOM) (rev 03)
02:0b.0 CardBus bridge: Toshiba America Info Systems ToPIC100 PCI to Cardbus Bridge with ZV Support (rev 32)
02:0b.1 CardBus bridge: Toshiba America Info Systems ToPIC100 PCI to Cardbus Bridge with ZV Support (rev 32)
02:0d.0 System peripheral: Toshiba America Info Systems SD TypA Controller (rev 03)

# sensors
adm1032-i2c-0-4c
Adapter: SMBus I801 adapter at d880

Software Environment: lm_sensors-2.10.0

Problem Description:

Unable to resume after power down: pressing the power button turns the power
LED on, but the laptop doesn't start. I have to keep the power button pressed
for 5 seconds to power off and then press it once again to restart.

When rebooting, the power goes off after a blinking cursor has been shown for a
few seconds, before the GRUB screen is shown. The same procedure is required to
restart.

When suspending to RAM I get a "failure to resume" message from the BIOS. The
same procedure is required to restart.

Steps to reproduce: load the lm90 kernel module in 2.6.16 or 2.6.17-rc1 and halt
or reboot.

Using git-bisect (great feature) I've been able to identify the offending patch:

[PATCH] i2c: i2c-i801 explicitly enables/disables PEC

This patch tweaks i2c-i801.c so that the driver always sets the SMBAUXCTL
register (which enables/disables PEC) explicitly before each transaction.

Signed-off-by: Mark M. Hoffman <mhoffman at lightlink.com>
Signed-off-by: Jean Delvare <khali at linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh at suse.de>

---
commit 2e3e13f8e9d9b2111404cdccaa4e1b988b70acce
tree de95ee215c2189cbfb98829e32e7fb117c94a160
parent 46f25dffbaba48c571d75f5f574f31978287b8d2
author Mark M. Hoffman <mhoffman at lightlink.com> Sun, 06 Nov 2005 23:04:51 +0100
committer Greg Kroah-Hartman <gregkh at suse.de> Thu, 05 Jan 2006 22:16:20 -0800

 drivers/i2c/busses/i2c-i801.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/i2c/busses/i2c-i801.c b/drivers/i2c/busses/i2c-i801.c
index ac3eafa..1c752dd 100644
--- a/drivers/i2c/busses/i2c-i801.c
+++ b/drivers/i2c/busses/i2c-i801.c
@@ -468,8 +468,7 @@ static s32 i801_access(struct i2c_adapte
 		return -1;
 	}
 
-	if (hwpec)
-		outb_p(1, SMBAUXCTL);	/* enable hardware PEC */
+	outb_p(hwpec, SMBAUXCTL);	/* enable/disable hardware PEC */
 
 	if(block)
 		ret = i801_block_transaction(data, read_write, size, hwpec);
@@ -478,9 +477,6 @@ static s32 i801_access(struct i2c_adapte
 		ret = i801_transaction();
 	}
 
-	if (hwpec)
-		outb_p(0, SMBAUXCTL);	/* disable hardware PEC */
-
 	if(block)
 		return ret;
 	if(ret)

If I restore these two lines

	if (hwpec)
		outb_p(0, SMBAUXCTL);	/* disable hardware PEC */

the problem disappears. If I disable hardware PEC via sysfs after loading the
sensors modules and then run the sensors program at least once, reboot works too.
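
Both observations fit the idea that what matters is the value left in SMBAUXCTL after the last transaction. Below is a small user-space sketch of that idea, not driver code: the `smbauxctl` variable only stands in for the real register, and `fake_outb`, `access_clearing_pec` and `access_leaving_pec` are names made up for illustration. It assumes that bit 0 is the PEC enable bit, as the outb_p(1, SMBAUXCTL) call in the diff above suggests.

```c
#include <assert.h>

/* Stand-in for the SMBAUXCTL I/O register (assumption: bit 0 = hardware PEC). */
static unsigned char smbauxctl;

/* Hypothetical replacement for outb_p(value, SMBAUXCTL). */
void fake_outb(unsigned char value)
{
	assert((value & ~1u) == 0);	/* this sketch models bit 0 only */
	smbauxctl = value;
}

/* 2.6.15 behaviour: set the bit only for PEC transfers, clear it afterwards. */
unsigned char access_clearing_pec(int hwpec)
{
	if (hwpec)
		fake_outb(1);	/* enable hardware PEC */
	/* ... the SMBus transaction would run here ... */
	if (hwpec)
		fake_outb(0);	/* disable hardware PEC */
	return smbauxctl;	/* value left behind in the register */
}

/* 2.6.16 behaviour: write hwpec before the transfer and never clear it. */
unsigned char access_leaving_pec(int hwpec)
{
	fake_outb((unsigned char)hwpec);	/* enable/disable hardware PEC */
	/* ... the SMBus transaction would run here ... */
	return smbauxctl;	/* value left behind in the register */
}
```

In this model access_leaving_pec(1) leaves the register at 1, which is the state my laptop apparently cannot reboot from, while access_clearing_pec(1) leaves it at 0. A later access_leaving_pec(0) also brings it back to 0, which would explain why disabling PEC and running sensors once is enough.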

Let me know if I need to post more info.

Thanks in advance.
Comment 1 Jean Delvare 2006-04-16 07:50:55 UTC
Thanks for the very complete bug report and the detailed analysis.

So it looks like your system won't resume properly if the i801 PEC bit was set
when suspend happened.

Please undo your own change and try the following patch instead. I don't use
suspend myself, and my ADM1032 chip is not on my i801 adapter anyway, so I can't
test it. I don't know much about suspend and resume either, so this is a first
try, which may or may not work. Please let me know.
Comment 2 Jean Delvare 2006-04-16 07:51:55 UTC
Created attachment 7882
Fix i2c-i801 resume when PEC is enabled
Comment 3 Daniele Gaffuri 2006-04-16 10:42:10 UTC
Created attachment 7883
Fix i2c-i801 suspend and shutdown when PEC is enabled

Thank you very much for the quick answer. Your patch works correctly for the
suspend/resume case, and I've elaborated on it to cover shutdown and reboot.

I'm not a hacker, so I hope I've used the correct hooks in the pci_driver struct:
I've added similar code in the shutdown and in the remove functions to cover
both the built-in and the module configurations.

Tested on 2.6.17-rc1, it works both built into the kernel and as a module.

Thank you again for your help.
Comment 4 Jean Delvare 2006-04-18 05:08:01 UTC
Daniele, maybe you can still report the problem to Toshiba? They really should
fix their BIOS code.
Comment 5 Jean Delvare 2006-04-18 13:50:56 UTC
Created attachment 7896
Clear PEC bit after every transaction

After some discussion, this simpler fix seems to be preferred. It should
work just as well, and is also more robust with regard to unclean reboots.
Please test.
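
To illustrate the robustness argument, here is a toy user-space model, not the attached patch. All names in it are made up, `aux_reg` merely stands in for SMBAUXCTL, and it assumes that the hook-based patches clear the PEC bit only when a suspend, shutdown or remove hook gets to run. It also ignores a crash in the middle of a transaction.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated SMBAUXCTL; only bit 0 (hardware PEC enable) is modelled. */
static unsigned char aux_reg;

/* Hook-based model: the bit stays set after a PEC transaction ... */
void hook_model_transaction(bool hwpec)
{
	aux_reg = hwpec ? 1 : 0;
	/* ... the SMBus transaction would run here ... */
}

/* ... and is cleared only if a suspend/shutdown/remove hook runs. */
void hook_model_cleanup(void)
{
	aux_reg = 0;
}

/* Per-transaction model: the bit is cleared before the access returns. */
void clearing_model_transaction(bool hwpec)
{
	aux_reg = hwpec ? 1 : 0;
	/* ... the SMBus transaction would run here ... */
	aux_reg = 0;
}

/* Register value left behind when the machine goes down (hook model). */
unsigned char hook_model_final_state(bool clean_shutdown)
{
	hook_model_transaction(true);
	if (clean_shutdown)
		hook_model_cleanup();	/* skipped on an unclean reboot */
	return aux_reg;
}

/* Same question for the per-transaction model; no hook is involved. */
unsigned char clearing_model_final_state(void)
{
	clearing_model_transaction(true);
	assert(aux_reg == 0);	/* never set between transactions */
	return aux_reg;
}
```

In this model hook_model_final_state(false), the unclean reboot case, leaves the bit set, while clearing_model_final_state() leaves it clear without depending on any hook.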
Comment 6 Daniele Gaffuri 2006-04-18 13:58:51 UTC
Hi Jean

that's exactly the first patch I tried, and it worked. I'll test it again and
let you know. I'm also trying to find out how to report the problem to Toshiba,
but there's no obvious way to do it.

Thanks again

Daniele
Comment 7 Daniele Gaffuri 2006-04-18 14:33:06 UTC
I confirm that the last patch works, tested on 2.6.17-rc1.
Comment 8 Frans Pop 2006-04-20 18:32:23 UTC
I've tested this patch and it solves the identical problem I had on my Toshiba 
Satellite A40. 
 
But more importantly, it also solves the problem I've been having since 2.6.16 
that my fan no longer started automatically when the processor heats up, 
allowing it to overheat dangerously. 
See http://bugzilla.kernel.org/show_bug.cgi?id=6315. 
 
This patch magically restores fan/temperature control to what I was used to 
with 2.6.15. 
Please push this patch through for both 2.6.17 and 2.6.16. 
 
Comment 9 Jean Delvare 2006-04-21 00:54:19 UTC
Frans, does the Tecra M2 have automatic fan speed regulation, as Frans described
for the Satellite A40?
Comment 10 Daniele Gaffuri 2006-04-21 01:02:18 UTC
Hi Jean

I suppose you're asking me. Yes, the Tecra M2 also has auto fan control. In
fact I've noticed some strange fan behaviour, but in my case the fan didn't
turn off at low temps, so I didn't worry too much. It also seems that this
doesn't happen anymore after patching.
Comment 11 Jean Delvare 2006-04-21 01:08:25 UTC
Oops, yes that was a question for you, Daniele.

So this explains why the SMBus was hidden on both laptops. Toshiba seem to know
their business. I think we will have to drop both quirks and leave the SMBus
hidden again on these laptops. I'm sorry about that, but the current situation
is unsafe, and we don't want users to burn their hardware.
Comment 12 Daniele Gaffuri 2006-04-21 14:42:19 UTC
Jean, users don't want to fry their hardware either :-). I'm sad because the
quirk for the Tecra M2 was my first and only patch to the Linux kernel, but I
realize it probably should be dropped until a safe solution is found. I've
searched a lot without finding any documentation. Do you think that you, as a
kernel developer, could ask Toshiba for some clarification?

Cheers

Daniele
Comment 13 Jean Delvare 2006-04-23 08:21:05 UTC
No, I have no technical contact at Toshiba.

I won't remove the quirks right away. Let's take some time to discuss the
alternatives and try a few hacks first. Maybe the ACPI folks will have an idea.
Comment 14 Jean Delvare 2007-03-24 07:42:12 UTC
As was discussed in bug #6315, it is now clear that on these Toshiba laptops
(Satellite A40 and Tecra M2) the thermal management is done by SMM code, which
can access the SMBus at any time. Thus the only safe option is to remove the
quirk that was unhiding the SMBus. I am sorry about that, I understand that you
liked that quirk and I agree that the risk that something actually goes wrong
with it is thin, but still this is a risk I am not willing to take.
