Distribution: Debian testing Hardware Environment: TP 600X Software Environment: Problem Description: Running the attached C program will generate these errors, as well as drive the load up to 5 or 6: osl-0965 [2790] os_wait_semaphore : Failed to acquire semaphore[e3fce600|1|0], AE_TIME The errors seem correlated with 'acpi -t' hanging for a fraction of a second. The attached program repeatedly sets the polling frequency to 10, as fast as it can. It was the most reliable way to regenerate the semaphore errors, though they often happen using acpi or acpi -t, or when waking up from S3. See Bug#4967 (which also has an acpidmp). Steps to reproduce:
Created attachment 5522 [details] C program to generate semaphore error THM2 happens to be the name of one of the thermal zones in the TP600X (it has THM0, THM2, THM6, THM7). Probably different in other laptops.
It happens with -rc6 too, so I changed the kernel version accordingly. With -rc6, once the semaphore errors show up, acpi can hang for a few seconds. Here's one of the worse examples: $ time acpi Battery 1: charged, 86% Battery 2: discharging, 68%, 00:54:38 remaining real 0m3.428s user 0m0.001s sys 0m1.594s The system time is a lot, almost 50% of the actual time (spinlocking on a semaphore?).
Please try linux-2.6.13-rc6 with patch filed at bug 3851. Pleast test it with ec_burst=1 and ec_burst=0.
I get the semaphore errors even with that patch, booting either with no ec_burst option (which should be like ec_burst=0 ?) or with ec_burst=1. (This is with the preemptible kernel.)
This AE_TIME error is just a warning. Please ignore it. what about "time acpi"?
acpi still takes a while. Here's a typical 'time acpi' (similar for no ec_burst or for ec_burst=1): $ time acpi Battery 1: charged, 100% Battery 2: charged, 98% real 0m1.390s user 0m0.000s sys 0m1.360s -Sanjoy
please do: echo 0x10 > debug_layer echo 0x10 > debug_level and send out dmesg after run "time acpi"
> send out dmesg after run "time acpi" Even with those debug values, that produced no dmesgs. -Sanjoy
echo -n 0x4100000 > debug_layer echo -n 0xffffffff > debug_level
Hmm, acpi debugging seems a bit broken on this kernel, which may be why I'm not getting any dmesgs. No matter what I echo to debug_layer or debug_level, the values stay the same: # cat /proc/acpi/debug_layer Description Hex SET ACPI_UTILITIES 0x00000001 [*] ACPI_HARDWARE 0x00000002 [*] ACPI_EVENTS 0x00000004 [*] ACPI_TABLES 0x00000008 [*] ACPI_NAMESPACE 0x00000010 [*] ACPI_PARSER 0x00000020 [*] ACPI_DISPATCHER 0x00000040 [*] ACPI_EXECUTER 0x00000080 [*] ACPI_RESOURCES 0x00000100 [*] ACPI_CA_DEBUGGER 0x00000200 [*] ACPI_OS_SERVICES 0x00000400 [*] ACPI_CA_DISASSEMBLER 0x00000800 [*] ACPI_COMPILER 0x00001000 [*] ACPI_TOOLS 0x00002000 [*] ACPI_ALL_DRIVERS 0xFFFF0000 [*] -- debug_layer = 0xFFFF3FFF (* = enabled, - = partial) # cat /proc/acpi/debug_level Description Hex SET ACPI_LV_ERROR 0x00000001 [*] ACPI_LV_WARN 0x00000002 [*] ACPI_LV_INIT 0x00000004 [*] ACPI_LV_DEBUG_OBJECT 0x00000008 [*] ACPI_LV_INFO 0x00000010 [ ] ACPI_LV_INIT_NAMES 0x00000020 [ ] ACPI_LV_PARSE 0x00000040 [ ] ACPI_LV_LOAD 0x00000080 [ ] ACPI_LV_DISPATCH 0x00000100 [ ] ACPI_LV_EXEC 0x00000200 [ ] ACPI_LV_NAMES 0x00000400 [ ] ACPI_LV_OPREGION 0x00000800 [ ] ACPI_LV_BFIELD 0x00001000 [ ] ACPI_LV_TABLES 0x00002000 [ ] ACPI_LV_VALUES 0x00004000 [ ] ACPI_LV_OBJECTS 0x00008000 [ ] ACPI_LV_RESOURCES 0x00010000 [ ] ACPI_LV_USER_REQUESTS 0x00020000 [ ] ACPI_LV_PACKAGE 0x00040000 [ ] ACPI_LV_ALLOCATIONS 0x00100000 [ ] ACPI_LV_FUNCTIONS 0x00200000 [ ] ACPI_LV_OPTIMIZATIONS 0x00400000 [ ] ACPI_LV_MUTEX 0x01000000 [ ] ACPI_LV_THREADS 0x02000000 [ ] # In the debug_level output, it doesn't tell me the current state (it finishes the 'cat' prematurely and then gives me back my shell prompt, the '#' at the end of the last line of the output). I've noticed that problem a few times before, but I never had a problem changing the debug values. Here is /proc/cmdline: BOOT_IMAGE=+three-patches ro root=305 idebus=66 apm=off acpi=force pci=noacpi console=ttyS0,115200 console=tty0 acpi_sleep=s3_bios ec_burst=1 The 'three patches' in the BOOT_IMAGE are the S3 sleep hang workaround from shaohua.li@intel.com for bug #5037 (didn't work but that's another story), the specific-hotkey patch to deal with bug #5035, and the ec.c patch to deal with bug #5008 and #5049. -Sanjoy
Oh, I just found the problem (which may be another bug): echo -n 0x10 > /proc/acpi/debug_level does NOT change the settings, whereas echo 0x10 > /proc/acpi/debug_level works. Maybe some sscanf assuming there's a return at the end? I'll now redo the acpi test and send a dmesg. -Sanjoy
Strange, It works for me. My version is 2.6.13-rc5.
And now it's working for me again (except for the debug_level output being truncated, but that's a longstanding problem for me). It's very strange. Anyway back to the dmesgs... -Sanjoy
With 0x10 as the debug_level and debug_layer: Execute Method: [\_SB_.PCI0.ISA0.EC0_.BAT1._BST] (Node c156f248) Execute Method: [\_SB_.PCI0.ISA0.EC0_.BAT1._BIF] (Node c156f288) Execute Method: [\_SB_.PCI0.ISA0.EC0_.BAT0._BST] (Node c156f608) Execute Method: [\_SB_.PCI0.ISA0.EC0_.BAT0._BIF] (Node c156f648) -Sanjoy
Please try echo 0x100000 > debug_layer echo 0xffffffff > debug_level And post dmesg
> Please try > echo 0x100000 > debug_layer > echo 0xffffffff > debug_level Can't right now: Now the echo -n is not working again, and neither is echo (which probably wasn't working before, it's just that changing the setting started working between my testing 'echo -n' and 'echo'). Notice how in the transcript below debug_layer stays at 0x10 no matter what: # cat /proc/acpi/debug_layer Description Hex SET ACPI_UTILITIES 0x00000001 [ ] ACPI_HARDWARE 0x00000002 [ ] ACPI_EVENTS 0x00000004 [ ] ACPI_TABLES 0x00000008 [ ] ACPI_NAMESPACE 0x00000010 [*] ACPI_PARSER 0x00000020 [ ] ACPI_DISPATCHER 0x00000040 [ ] ACPI_EXECUTER 0x00000080 [ ] ACPI_RESOURCES 0x00000100 [ ] ACPI_CA_DEBUGGER 0x00000200 [ ] ACPI_OS_SERVICES 0x00000400 [ ] ACPI_CA_DISASSEMBLER 0x00000800 [ ] ACPI_COMPILER 0x00001000 [ ] ACPI_TOOLS 0x00002000 [ ] ACPI_ALL_DRIVERS 0xFFFF0000 [ ] -- debug_layer = 0x00000010 (* = enabled, - = partial) # echo -n 0x4100000 > debug_layer # cat /proc/acpi/debug_layer Description Hex SET ACPI_UTILITIES 0x00000001 [ ] ACPI_HARDWARE 0x00000002 [ ] ACPI_EVENTS 0x00000004 [ ] ACPI_TABLES 0x00000008 [ ] ACPI_NAMESPACE 0x00000010 [*] ACPI_PARSER 0x00000020 [ ] ACPI_DISPATCHER 0x00000040 [ ] ACPI_EXECUTER 0x00000080 [ ] ACPI_RESOURCES 0x00000100 [ ] ACPI_CA_DEBUGGER 0x00000200 [ ] ACPI_OS_SERVICES 0x00000400 [ ] ACPI_CA_DISASSEMBLER 0x00000800 [ ] ACPI_COMPILER 0x00001000 [ ] ACPI_TOOLS 0x00002000 [ ] ACPI_ALL_DRIVERS 0xFFFF0000 [ ] -- debug_layer = 0x00000010 (* = enabled, - = partial) # echo 0x4100000 > debug_layer # cat /proc/acpi/debug_layer Description Hex SET ACPI_UTILITIES 0x00000001 [ ] ACPI_HARDWARE 0x00000002 [ ] ACPI_EVENTS 0x00000004 [ ] ACPI_TABLES 0x00000008 [ ] ACPI_NAMESPACE 0x00000010 [*] ACPI_PARSER 0x00000020 [ ] ACPI_DISPATCHER 0x00000040 [ ] ACPI_EXECUTER 0x00000080 [ ] ACPI_RESOURCES 0x00000100 [ ] ACPI_CA_DEBUGGER 0x00000200 [ ] ACPI_OS_SERVICES 0x00000400 [ ] ACPI_CA_DISASSEMBLER 0x00000800 [ ] ACPI_COMPILER 0x00001000 [ ] ACPI_TOOLS 0x00002000 [ ] ACPI_ALL_DRIVERS 0xFFFF0000 [ ] -- debug_layer = 0x00000010 (* = enabled, - = partial)
Please file another bug for debug issue. Thanks.
It's Bug #5054
Created attachment 5619 [details] debugging output for 'time acpi' with ec_burst=1 Setting debug_level and debug_layer started working again, so I took the opportunity to get this debug output from running 'time acpi' (which took 13 seconds with all the debugging generated). I used this script (I set the thermal polling frequencies to 0 before running it so that those msgs wouldn't clutter the debug log, and then set them back to 10 at the end with the 'quiet-fan' script): #!/bin/bash echo 0x100000 > debug_layer echo 0xffffffff > debug_level cat /proc/acpi/debug_layer cat /proc/acpi/debug_level sleep 1 logger BEGIN ACPI DEBUGGING time acpi logger END ACPI DEBUGGING echo 0x10 > debug_layer echo 0x10 > debug_level quiet-fan
Please try this to collect timing info along with debug_layer=0x00100000 debug_level=0xffffffff --- ec.c.0 2005-08-12 23:53:17.000000000 +0800 +++ ec.c 2005-08-12 23:56:34.000000000 +0800 @@ -328,6 +328,7 @@ u8 address, u32 *data) { + printk("...jiffies = %lu \n", jiffies); if (acpi_ec_polling_mode) return acpi_ec_polling_read(ec, address, data); else @@ -339,6 +340,7 @@ u8 address, u8 data) { + printk("...jiffies = %lu \n", jiffies); if (acpi_ec_polling_mode) return acpi_ec_polling_write(ec, address, data); else @@ -607,6 +609,7 @@ union acpi_ec *ec, u32 *data) { + printk("...jiffies = %lu \n", jiffies); if (acpi_ec_polling_mode) return acpi_ec_polling_query(ec, data); else
Created attachment 5620 [details] debug msgs for 'time acpi' acpi debugging is working fine -- my stupdity (see Bug #5054). The attached log file is from running 'time acpi' with debug_layer=0x00100000 and debug_level=0xffffffff and with the ec jiffies patch.
Hi, the log file is not with debug_layer = 0x100000 Please retest. thanks.
Created attachment 5622 [details] debug msgs for 'time acpi' (2nd try) Sorry, my mistake. Here's another attempt. This time it should be right. The script I ran to collect the debugging said this before running 'time acpi', which took 13 seconds (real and system) with all the debugging: debug_layer = 0x00100000 (* = enabled, - = partial)
From the log, The jiffes intervals between two sequential ec operations (acpi_ec_read , acpi_ec_write)is about 40. And the log doesn't show anything abnormal. The root cause of the slowness in accessing battery status could be releated to the mutex--I2CM. Or, this is not acpi driver bug.
*** Bug 4967 has been marked as a duplicate of this bug. ***
Please build kernel 2.6.14-rc1 with acpi debug option disabled. boot kernel with ec_burst=1.
I tested on a sony T26 with kernel of acpi debug option enabled , and kernel of acpi debug option disabled. acpi debug option enabled: real: 0m0.013s user: 0m0.000s sys: 0m0.012s acpi debug option disabled: real: 0m0.003s user: 0m0.000s sys: 0m0.000s So, I believe you should disable acpi debug option. If it is still a issue, please reopen. Thanks.
*** This bug has been marked as a duplicate of 5121 ***
> So, I believe you should disable acpi debug option. I just started running 2.6.14 (with acpi debugging) and it is much faster than 2.6.13 (also with acpi debugging). With 2.6.14: real 0m0.041s user 0m0.004s sys 0m0.020s Which is more than fast enough. Whereas with 2.6.13, it would often hang for half a second or more. So I don't think acpi debugging was the problem, but I also don't know how 2.6.14 improved the situation.