Bug 5082 - speedstep_smi produces FATAL error on first load, works on second load
Summary: speedstep_smi produces FATAL error on first load, works on second load
Status: CLOSED UNREPRODUCIBLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: i386 Linux
Importance: P2 normal
Assignee: Venkatesh Pallipadi
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-08-17 11:03 UTC by Sanjoy Mahajan
Modified: 2007-07-25 21:39 UTC

See Also:
Kernel Version: 2.6.15.1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
dmesg on first loading speedstep-smi (which fails) (1.95 KB, application/octet-stream)
2005-10-25 14:37 UTC, Sanjoy Mahajan
dmesg output on second load of speedstep-smi (3.38 KB, application/octet-stream)
2005-10-25 14:38 UTC, Sanjoy Mahajan
let the system settle down a bit first (2.12 KB, patch)
2006-01-08 14:22 UTC, Dominik Brodowski
patch to call notify_smm (but it didn't help) (950 bytes, patch)
2006-02-04 23:00 UTC, Sanjoy Mahajan

Description Sanjoy Mahajan 2005-08-17 11:03:19 UTC
New?  No, I've seen this with all 2.6 kernels since at least 2.6.11 (maybe
earlier but I don't remember).

Distribution: Debian testing/unstable
Hardware Environment: TP 600X, Pentium III (Coppermine) 650MHz, 440BX bridge,
                      (the 650-MHz TP 600X model supports speed changing)
Software Environment: 
Problem Description: The first time I modprobe speedstep_smi (which usually
happens during boot since it's in /etc/modules), it gives a FATAL error.  The
second time, it seems to work (and I can control the speed by catting to the
file in /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed):

# modprobe speedstep_smi

FATAL: Error inserting speedstep_smi
(/lib/modules/2.6.13-rc6-git8-local01/kernel/arch/i386/kernel/cpu/cpufreq/speedstep-smi.ko):
No such device
# 
# modprobe speedstep_smi

# 
# 
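
For reference, controlling the speed through that sysfs file can be sketched roughly as below. This is only an illustration, not the reporter's exact commands: the helper names `set_speed_khz` and `current_speed_khz` and the `cpufreq_dir` parameter are hypothetical, and the sketch assumes the `userspace` governor is active, since `scaling_setspeed` is normally writable only under that governor. Frequencies in these files are in kHz.

```python
from pathlib import Path

# Directory taken from the report. It is only populated once a cpufreq
# driver such as speedstep-smi has loaded successfully.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def set_speed_khz(khz, cpufreq_dir=CPUFREQ_DIR):
    """Request a CPU frequency in kHz (hypothetical helper; needs root)."""
    (Path(cpufreq_dir) / "scaling_setspeed").write_text(f"{khz}\n")


def current_speed_khz(cpufreq_dir=CPUFREQ_DIR):
    """Read back the frequency that the cpufreq core currently reports."""
    return int((Path(cpufreq_dir) / "scaling_cur_freq").read_text().strip())
```

Both helpers take the directory as a parameter so they can be tried against a scratch directory before touching the real sysfs tree.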

However several issues are slightly dubious:

1. The 'dmesg' shows:

  cpufreq: change failed with new_state 0 and result 2

2. The extra blank lines (as if the return at the end of the modprobe command
was being doubled, similar to what happens sometimes with 'acpi').

3. Once, with 2.6.13-rc6-git8, the system locked up completely when I did the
second modprobe.  I've only seen that happen once, and haven't been able to
reproduce it.

Steps to reproduce:
Comment 1 Len Brown 2005-08-17 11:22:23 UTC
Does this also happen with acpi=off?
If so, this should probably be under the Power/cpufreq category,
but Bugzilla seems to malfunction when I try to update it.
Comment 2 Sanjoy Mahajan 2005-08-17 11:33:56 UTC
Not sure if it's good news (no need to fix Bugzilla right now) or bad news, but
it loads fine with acpi=off

Comment 3 Venkatesh Pallipadi 2005-10-23 17:04:50 UTC
Can you compile the kernel with CPU_FREQ_DEBUG enabled, pass
cpufreq.debug=7 as a kernel boot parameter, and get the dmesg output both when
speedstep-smi fails the first time and when it succeeds the second time? That
should help us narrow this down.

Thanks.
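
For readers wondering what `cpufreq.debug=7` selects, the value is a bitmask. The sketch below decodes it, assuming the bit assignments from the 2.6-era `include/linux/cpufreq.h` (1 = core, 2 = driver, 4 = governor). Those values are quoted from memory and should be checked against the kernel tree in use; the function name is hypothetical.

```python
# Assumed bit values (2.6-era include/linux/cpufreq.h); verify against your tree.
CPUFREQ_DEBUG_FLAGS = {
    1: "core",      # CPUFREQ_DEBUG_CORE
    2: "driver",    # CPUFREQ_DEBUG_DRIVER (speedstep-smi messages)
    4: "governor",  # CPUFREQ_DEBUG_GOVERNOR
}


def decode_cpufreq_debug(mask):
    """Return the names of the debug classes enabled by a cpufreq.debug mask."""
    return [name for bit, name in CPUFREQ_DEBUG_FLAGS.items() if mask & bit]
```

Under that assumption, a mask of 7 turns on all three classes, which is why it is the usual value to ask for in a bug report.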
Comment 5 Sanjoy Mahajan 2005-10-25 14:37:24 UTC
Created attachment 6387 [details]
dmesg on first loading speedstep-smi (which fails)
Comment 6 Sanjoy Mahajan 2005-10-25 14:38:41 UTC
Created attachment 6388 [details]
dmesg output on second load of speedstep-smi
Comment 7 Dominik Brodowski 2006-01-08 14:22:13 UTC
Created attachment 6972 [details]
let the system settle down a bit first

This bug is really strange... does this patch help, by chance?
Comment 8 Sanjoy Mahajan 2006-01-09 15:36:26 UTC
Same problem.  This time I waited until well after the reboot (and after
I had logged in via xdm and the emacsen had started up):

# modprobe -v speedstep-smi
insmod /lib/modules/2.6.15/kernel/arch/i386/kernel/cpu/cpufreq/speedstep-lib.ko
insmod /lib/modules/2.6.15/kernel/arch/i386/kernel/cpu/cpufreq/speedstep-smi.ko
FATAL: Error inserting speedstep_smi (/lib/modules/2.6.15/kernel/arch/i386/kernel/cpu/cpufreq/speedstep-smi.ko): No such device

# modprobe -v speedstep-smi
insmod /lib/modules/2.6.15/kernel/arch/i386/kernel/cpu/cpufreq/speedstep-smi.ko

2nd time's a charm.

The dmesg output is almost the same.  Here's the diff for the first
attempt (1st attempt without your patch vs 1st attempt with your patch):

18a19
> speedstep-smi: try 0, previous result 0, waiting...
25,29c26,32
< speedstep-smi: retry 1, previous result 2, waiting...
< speedstep-smi: retry 2, previous result 2, waiting...
< speedstep-smi: retry 3, previous result 2, waiting...
< speedstep-smi: retry 4, previous result 2, waiting...
< speedstep-smi: retry 5, previous result 2, waiting...
---
> speedstep-smi: try 0, previous result 0, waiting...
> speedstep-smi: try 1, previous result 2, waiting...
> speedstep-smi: try 2, previous result 2, waiting...
> speedstep-smi: try 3, previous result 2, waiting...
> speedstep-smi: try 4, previous result 2, waiting...
> speedstep-smi: try 5, previous result 2, waiting...
> speedstep-smi: try 6, previous result 2, waiting...
39d41
< 

And for the 2nd attempt (which works):

18a19
> speedstep-smi: try 0, previous result 0, waiting...
24a26
> speedstep-smi: try 0, previous result 0, waiting...
31,35c33,39
< speedstep-smi: retry 1, previous result 2, waiting...
< speedstep-smi: retry 2, previous result 2, waiting...
< speedstep-smi: retry 3, previous result 2, waiting...
< speedstep-smi: retry 4, previous result 2, waiting...
< speedstep-smi: retry 5, previous result 2, waiting...
---
> speedstep-smi: try 0, previous result 0, waiting...
> speedstep-smi: try 1, previous result 2, waiting...
> speedstep-smi: try 2, previous result 2, waiting...
> speedstep-smi: try 3, previous result 2, waiting...
> speedstep-smi: try 4, previous result 2, waiting...
> speedstep-smi: try 5, previous result 2, waiting...
> speedstep-smi: try 6, previous result 2, waiting...
43c47
< freq-table: setting show_table for cpu 0 to e49da600
---
> freq-table: setting show_table for cpu 0 to e4a055c0



-Sanjoy

`A society of sheep must in time beget a government of wolves.'
   - Bertrand de Jouvenal

Comment 9 Hiroshi Miura 2006-01-31 21:41:32 UTC
IBM has released some BIOS updates for the TP600X:
http://www-307.ibm.com/pc/support/site.wss/MIGR-4FYS2U.html
Which release do you use?
The current version is 1.11, which adds Windows XP support.

There is also a Microsoft document:
http://download.microsoft.com/download/5/7/7/577a5684-8a83-43ae-9272-ff260a9c20e2/Windows%20Native%20Processor%20Performance%20Control.doc
It defines the interface for Windows XP.

The same document also says:
FADT PSTATE_CNT Value
To allow the operating system to assume control of performance state transitions
from the BIOS, provide the proper control value in the PSTATE_CNT field of the
Fixed ACPI Description Table (FADT) at byte offset 55.  As described in the
Intel BIOS Writer
Comment 10 Sanjoy Mahajan 2006-01-31 22:19:26 UTC
I'm using BIOS release 1.11.  Here are a few bytes from
"od -v -Ad -N 60 -w1 -t x1 /proc/acpi/fadt":

0000054 a2
0000055 00
0000056 00
...
0000082 00
0000083 00
0000084 00
0000085 00
0000086 00

where the second group is in case the 'byte offset 55' was an offset
in hex (the offsets above are in decimal).  So should I modify the
fadt by hand (with seek and write)?  That seems a bit bold, so maybe
there is another way.
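
As a cross-check on the offsets, here is a minimal sketch that picks the relevant fields out of a raw FADT image. It assumes the standard ACPI FADT layout with decimal byte offsets: SMI_CMD as a 32-bit little-endian value at 48, S4BIOS_REQ at 54, and PSTATE_CNT at 55. The function name is hypothetical, and the sample buffer is made up except for bytes 54 to 56, which are copied from the od output above. A real image would come from /proc/acpi/fadt.

```python
import struct

# Decimal byte offsets in the FADT (assumed from the ACPI specification).
SMI_CMD_OFFSET = 48      # 32-bit little-endian SMI command port
S4BIOS_REQ_OFFSET = 54   # 1 byte
PSTATE_CNT_OFFSET = 55   # 1 byte: value written to SMI_CMD to take P-state control


def parse_fadt_pstate_fields(fadt):
    """Extract SMI_CMD, S4BIOS_REQ and PSTATE_CNT from raw FADT bytes."""
    if len(fadt) <= PSTATE_CNT_OFFSET:
        raise ValueError("FADT image too short")
    (smi_cmd,) = struct.unpack_from("<I", fadt, SMI_CMD_OFFSET)
    return {
        "smi_cmd": smi_cmd,
        "s4bios_req": fadt[S4BIOS_REQ_OFFSET],
        "pstate_cnt": fadt[PSTATE_CNT_OFFSET],
    }


# Sample image: all zeros except bytes 54..56 from the od output.
sample = bytearray(60)
sample[54:57] = bytes([0xA2, 0x00, 0x00])
fields = parse_fadt_pstate_fields(bytes(sample))
```

If this reading of the layout is right, the `a2` at offset 54 is S4BIOS_REQ and PSTATE_CNT at offset 55 is zero on this machine, so the table supplies no value to write to the SMI command port.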

-Sanjoy

`Never underestimate the evil of which men of power are capable.'
         --Bertrand Russell, _War Crimes in Vietnam_, chapter 1.

Comment 11 Venkatesh Pallipadi 2006-02-01 18:05:02 UTC
There is an acpi_processor_notify_smm() in drivers/acpi/processor_perflib.c that
does exactly this. Other drivers like speedstep-centrino use it.

Not sure whether speedstep-smi needs it as well. Doing something similar in
speedstep-smi is worth a try.
Comment 12 Sanjoy Mahajan 2006-02-01 20:43:25 UTC
I looked around speedstep-smi.c but didn't find an obvious place to
call acpi_processor_notify_smm(), since the speedstep-smi.c and
speedstep-centrino.c modules are so different.  Where would you put
the call?  I'll test and report.

Comment 13 Venkatesh Pallipadi 2006-02-02 17:02:14 UTC
Instead of calling notify_smm(), you can write the value directly to the port
by including asm/io.h and using outb(value, port) (or outw/outl, depending
on what needs to be written here). Doing it in speedstep_smi_ownership()
should work.

Comment 14 Sanjoy Mahajan 2006-02-04 22:59:05 UTC
I didn't have the courage to use outb() directly, since I didn't know how to
convert offset 55 in the FADT to the address that the (presumably generic)
outb() wants.  But I inserted a call to acpi_processor_notify_smm() near the
end of speedstep_smi_ownership() [diff attached].  It didn't change anything
that I could find.  The first insmod of speedstep-smi fails, and the second one
works.  The only change to the failing dmesg attached already is one more
debugging line I added for the notify_smm call:

 speedstep-smi: trying to obtain ownership with command 47534982 at port b2
 speedstep-smi: result is 0
+speedstep-smi: grabbing w/ acpi_processor_notify_smm: result -5
 speedstep-smi: trying to determine frequencies with command 47534982 at port b2
 speedstep-smi: result 47534982, low_freq 0, high_freq 4
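
To read the `result -5` in the added debug line, a small sketch: it assumes the usual kernel convention that a function such as acpi_processor_notify_smm() returns 0 on success or a negative errno value on failure. The helper name is hypothetical, and the mapping shown is the userspace errno table, which matches the kernel's numbering on Linux for these common codes.

```python
import errno
import os


def describe_kernel_result(result):
    """Translate a 0-or-negative-errno return value into a readable string."""
    if result >= 0:
        return "success"
    code = -result
    name = errno.errorcode.get(code, "unknown")
    return f"{name} ({os.strerror(code)})"
```

Under that convention, -5 corresponds to EIO, and the "No such device" text that modprobe prints on the first load corresponds to ENODEV (-19).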
Comment 15 Sanjoy Mahajan 2006-02-04 23:00:13 UTC
Created attachment 7247 [details]
patch to call notify_smm (but it didn't help)
Comment 16 Venkatesh Pallipadi 2006-03-24 09:29:49 UTC
Can you check the latest patch from Andrew in bug #5553?
Comment 17 Sanjoy Mahajan 2006-03-26 12:52:44 UTC
> Can you check the latest patch from Andrew in bug #5553.

It gives me a 

Freeing unused kernel memory: 228k freed
FATAL: kernel too old
kernel panic not syncing: Attempted to kill init!

I thought maybe I forgot to run lilo (hence the 'too old') so I reran
lilo but the error is the same.  This is the command line:

root=305 idebus=66 apm=off acpi=force pci=noacpi console=ttyS0,115200 console=tty0 acpi_sleep=s3_bios cpufreq.debug=7 acpi_dbg_level=0x10 acpi_dbg_layer=0x10

With these config lines:

$ cat /boot/config-2.6.16-rc5.904b0f361ebf | egrep -i 'cpufreq|speedstep'
# CPUFreq processor drivers
CONFIG_X86_ACPI_CPUFREQ=m
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
CONFIG_X86_SPEEDSTEP_ICH=m
CONFIG_X86_SPEEDSTEP_SMI=m
# CONFIG_X86_CPUFREQ_NFORCE2 is not set
# CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set
CONFIG_X86_SPEEDSTEP_LIB=m
# CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set

Comment 18 Venkatesh Pallipadi 2007-07-25 19:29:33 UTC
Can you recheck this with the latest kernel and reopen the bug if the problem persists?
Comment 19 Sanjoy Mahajan 2007-07-25 19:32:33 UTC
> Can you recheck this with latest kernel

Alas, I gave away the TP 600X so I don't have a speedstep_smi machine.
