Bug 1185
Summary: | Broken FAN detection prevents booting | ||
---|---|---|---|
Product: | ACPI | Reporter: | Martin Mokrejs (mmokrejs) |
Component: | Power-Fan | Assignee: | Shaohua (shaohua.li) |
Status: | CLOSED CODE_FIX | ||
Severity: | blocking | CC: | acpi-bugzilla, sziwan |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.4.21-pre7 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | dmidecode, acpidmp, dmesg, Dmesg with the oops | ||
Created attachment 815 [details]
dmidecode
I have downgraded to an older BIOS, but I was told by Karol Kozimor that this bug
appears there too. I'll add his email message to this bug report.
Created attachment 816 [details]
acpidmp
Created attachment 817 [details]
dmesg
From: Karol Kozimor <sziwan@hell.org.pl> To: "Brown, Len" <len.brown@intel.com> Cc: Martin Mokrejs <mmokrejs@natur.cuni.cz>, linux-kernel@vger.kernel.org, acpi-devel@lists.sourceforge.net Date: Thu, 4 Sep 2003 10:53:15 +0200 Subject: Re: [ACPI] RE: ACPI kernel crash with 2.4.22-pre7 on ASUS L3800C Thus wrote Brown, Len: > Martin, > Does this still happen with 2.4.22? > If yes, can I trouble you to drop the info into bugzilla so we can put > it in the queue? FYI, I just had it *after* boot, i.e. some 30 seconds after the swsusp resume (trace below), and _again_ _after_ I warm-rebooted the machine using SysRq+B. The subsequent warm-reboot went OK. Linux 2.4.21 + ACPI 20030619 ksymoops 2.4.9 on i686 2.4.21-xacs. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.21-xacs/ (default) -m /usr/src/linux/System.map (default) Warning: You did not tell me where to find symbol information. I will assume that the log matches the kernel and modules that are running right now and I'll use the default options above for symbol resolution. If the current kernel and/or modules do not match the log, you can get more accurate output by telling me the kernel version and where to find map, modules, ksyms etc. ksymoops -h explains the options. 
c01d7600 Oops: 0000 8139too mii snd-intel8x0 snd-pcm snd-timer snd-ac97-codec snd-page-alloc snd-mpu401-uart snd-rawmidi snd-seq-device snd soundcore ppp_deflate zlib_inflate zlib_deflate ppp_async ppp_generic slhc ptserial pctel sr_mod scsi_mod cdrom radeon agpgart asus_acpi mousedev hid input uhci usbcore ds yenta_socket pcmcia_core CPU: 0 EIP: 0010:[<c01d7600>] Tainted: P Z Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00210293 eax: 00000627 ebx: 872d3184 ecx: cff0fe08 edx: 00000000 esi: 872d3184 edi: cff0fe70 ebp: c01e6bcc esp: cff0fe10 ds: 0018 es: 0018 ss: 0018 Process keventd (pid: 2, stackpage=cff0f000) Stack: 00000000 c01d837d 872d3184 cff0fe48 cff0fe70 872d3184 872d3184 c01e6c6f 872d3184 c01e6bcc cff0fe70 872d3184 cff0fe74 cff0fea0 00010000 c02912cb c0291280 c01ee511 872d3184 cff0fe70 872d3184 cff0fea4 cff12e00 00000000 Call Trace: [<c01d837d>] [<c01e6c6f>] [<c01e6bcc>] [<c01ee511>] [<c01ee964>] [<c01eed14>] [<c01e70bc>] [<c01f2c73>] [<c01f2f10>] [<c01bdc09>] [<c0118d5c>] [<c011fecd>] [<c0105668>] Code: 80 3b aa 75 0b 89 d8 eb 09 8d b4 26 00 00 00 00 31 c0 5b c3 >>EIP; c01d7600 <acpi_ns_map_handle_to_node+1c/30> <===== >>ecx; cff0fe08 <_end+fbda4b0/124de708> >>edi; cff0fe70 <_end+fbda518/124de708> >>ebp; c01e6bcc <acpi_bus_data_handler+0/44> >>esp; cff0fe10 <_end+fbda4b8/124de708> Trace; c01d837d <acpi_get_data+39/6a> Trace; c01e6c6f <acpi_bus_get_device+5f/b4> Trace; c01e6bcc <acpi_bus_data_handler+0/44> Trace; c01ee511 <acpi_power_get_context+61/cc> Trace; c01ee964 <acpi_power_off_device+4c/1e0> Trace; c01eed14 <acpi_power_transition+100/15c> Trace; c01e70bc <acpi_bus_set_power+1b0/29c> Trace; c01f2c73 <acpi_thermal_active+d3/1cc> Trace; c01f2f10 <acpi_thermal_check+18c/2ac> Trace; c01bdc09 <acpi_os_execute_deferred+5d/7c> Trace; c0118d5c <__run_task_queue+50/5c> Trace; c011fecd <context_thread+121/1a0> Trace; c0105668 <arch_kernel_thread+28/38> Code; c01d7600 <acpi_ns_map_handle_to_node+1c/30> 00000000 <_EIP>: Code; c01d7600 
<acpi_ns_map_handle_to_node+1c/30> <===== 0: 80 3b aa cmpb $0xaa,(%ebx) <===== Code; c01d7603 <acpi_ns_map_handle_to_node+1f/30> 3: 75 0b jne 10 <_EIP+0x10> Code; c01d7605 <acpi_ns_map_handle_to_node+21/30> 5: 89 d8 mov %ebx,%eax Code; c01d7607 <acpi_ns_map_handle_to_node+23/30> 7: eb 09 jmp 12 <_EIP+0x12> Code; c01d7609 <acpi_ns_map_handle_to_node+25/30> 9: 8d b4 26 00 00 00 00 lea 0x0(%esi,1),%esi Code; c01d7610 <acpi_ns_map_handle_to_node+2c/30> 10: 31 c0 xor %eax,%eax Code; c01d7612 <acpi_ns_map_handle_to_node+2e/30> 12: 5b pop %ebx Code; c01d7613 <acpi_ns_map_handle_to_node+2f/30> 13: c3 ret 1 warning issued. Results may not be reliable. [the kernel is tainted by swsusp and pctel module, but it shouldn't really matter since this oops happens mainly at boot] I'll have yet to see if it still happens with 2.4.22. Best regards, -- Karol 'sziwan' Kozimor sziwan@hell.org.pl From: Karol Kozimor <sziwan@hell.org.pl> To: Martin MOKREJ Would you please turn on ACPI debug option and have kernel-2.6.0-test5 a try. Please attach dmesg of boot and dmesg of problem happening. Thanks a lot. Hi, I can't reproduce this using 2.6.0-test5-mm2, but that's not authoritative (I'm not really using 2.6 as my main kernel). Or, more specifically, there are errors (see below) reported, but no oops. Under 2.4 the oops mostly happens only when booting on battery, I can't remember having seen that when using AC (which doesn't necessarily mean it didn't occur at all, I could have disregarded it). Oddly enough, there are times when even on battery the oops cannot be triggered, this may depend on the temperature of the machine -- it has an active trip point at 40 degrees celsius and if it is passed during boot, the oops is likely. 
I'll attach the dmesg later on, the only output when the problem occurs (apart from the oops itself) are these lines:
acpi_power-0363 [842] acpi_power_transition : Error transitioning device [CFAN] to D3
acpi_bus-0496 [841] acpi_bus_set_power : Error transitioning device [CFAN] to D3
acpi_thermal-0567 [840] acpi_thermal_active : Unable to turn cooling device [c12d24a8] 'off'
(or D0, respectively)
To spice the problem up, the above lines were a result of a resume of a suspended kernel (swsusp), when the cooling would decrease the temperature below 40 (where the fan should turn off) -- what is odd, is that the machine was both suspended and resumed on AC... (no oops this time)
After the oops, the machine (at least mine) is quite unstable, but usable. The only problem is that the keyboard stops responding (sometimes SysRq works), but the Synaptics Touchpad works fine: /proc/interrupts shows nothing abnormal.
As for the debugging options: I have debug statements compiled in by default, but I have no clue on what to set to debug_* options. My only attempts resulted in getting megabytes of logs, but that's not what you want probably, right? Maybe, if those debug flags were somehow documented (hint hint)...
Can you attach the full dmesg from when the oops happens?
Can you add the code below to 'acpi_fan_add' in fan.c:
>printk("%d devices in D0,%d devices in D3\n", device->power.states[ACPI_STATE_D0].resources.count, device->power.states[ACPI_STATE_D3].resources.count);
and report what happens at boot?
Created attachment 943 [details] Dmesg with the oops This is the dmesg with the oops at the bottom. Note that the oops is trigerred manually, by switching cooling_mode to passive and back to active. Below is what ksymoops says (though its output no different than others): Unable to handle kernel paging request at virtual address 872c31c4 c01fd497 *pde = 00000000 Oops: 0000 CPU: 0 EIP: 0010:[<c01fd497>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010246 eax: 00000000 ebx: 872c31c4 ecx: cff0fdd8 edx: 00000006 esi: 872c31c4 edi: cff0fe34 ebp: c020c608 esp: cff0fde0 ds: 0018 es: 0018 ss: 0018 Process keventd (pid: 2, stackpage=cff0f000) Stack: 00001001 c01fe17f 872c31c4 cff0fe0c cff0fe34 872c31c4 cff0fe70 c020c686 872c31c4 c020c608 cff0fe34 00010000 c02cd715 c02cd6ca 00000050 cff0fe38 cff0fe38 872c31c4 c0213efe 872c31c4 cff0fe34 00000000 00800000 c02ce900 Call Trace: [<c01fe17f>] [<c020c686>] [<c020c608>] [<c0213efe>] [<c02142f5>] [<c0214651>] [<c020ca83>] [<c0208e00>] [<c0218342>] [<c02186bf>] [<c01e484d>] [<c011d02a>] [<c01256b3>] [<c0125580>] [<c0105000>] [<c01058ee>] [<c0125580>] Code: 80 3b aa 0f 44 c3 5b c3 a1 d4 6b 3d c0 eb f7 8b 44 24 04 c3 >>EIP; c01fd497 <acpi_ns_map_handle_to_node+17/26> <===== >>ecx; cff0fdd8 <_end+fb1a000/1241e288> >>edi; cff0fe34 <_end+fb1a05c/1241e288> >>ebp; c020c608 <acpi_bus_data_handler+0/39> >>esp; cff0fde0 <_end+fb1a008/1241e288> Trace; c01fe17f <acpi_get_data+38/5d> Trace; c020c686 <acpi_bus_get_device+45/ae> Trace; c020c608 <acpi_bus_data_handler+0/39> Trace; c0213efe <acpi_power_get_context+4a/ae> Trace; c02142f5 <acpi_power_off_device+4a/1a7> Trace; c0214651 <acpi_power_transition+113/13c> Trace; c020ca83 <acpi_bus_set_power+170/298> Trace; c0208e00 <acpi_ut_track_stack_ptr+1f/26> Trace; c0218342 <acpi_thermal_active+c4/190> Trace; c02186bf <acpi_thermal_check+29d/2ec> Trace; c01e484d <acpi_os_execute_deferred+39/75> Trace; c011d02a <__run_task_queue+5a/70> Trace; c01256b3 <context_thread+133/1d0> 
Trace; c0125580 <context_thread+0/1d0>
Trace; c0105000 <_stext+0/0>
Trace; c01058ee <arch_kernel_thread+2e/40>
Trace; c0125580 <context_thread+0/1d0>
Code; c01fd497 <acpi_ns_map_handle_to_node+17/26>
00000000 <_EIP>:
Code; c01fd497 <acpi_ns_map_handle_to_node+17/26> <=====
0: 80 3b aa cmpb $0xaa,(%ebx) <=====
Code; c01fd49a <acpi_ns_map_handle_to_node+1a/26>
3: 0f 44 c3 cmove %ebx,%eax
Code; c01fd49d <acpi_ns_map_handle_to_node+1d/26>
6: 5b pop %ebx
Code; c01fd49e <acpi_ns_map_handle_to_node+1e/26>
7: c3 ret
Code; c01fd49f <acpi_ns_map_handle_to_node+1f/26>
8: a1 d4 6b 3d c0 mov 0xc03d6bd4,%eax
Code; c01fd4a4 <acpi_ns_map_handle_to_node+24/26>
d: eb f7 jmp 6 <_EIP+0x6>
Code; c01fd4a6 <acpi_ns_convert_entry_to_handle+0/5>
f: 8b 44 24 04 mov 0x4(%esp,1),%eax
Code; c01fd4aa <acpi_ns_convert_entry_to_handle+4/5>
13: c3 ret
Can you directly turn the fan on or off, like this:
echo "0" >/proc/acpi/fan/CFAN/state
echo "3" >/proc/acpi/fan/CFAN/state
In the DSDT, I found the code below:
>Name (CFST, Zero)
>Method (_ON, 0, NotSerialized)
>{
> Store (One, CFST)
>}
>Method (_OFF, 0, NotSerialized)
>{
> Store (Zero, CFST)
>}
CFST isn't associated with any I/O ports, so _ON and _OFF can't control CFAN at
all. I guess it's a BIOS error, but that doesn't mean it's the cause of the oops.
No, echoing either 0 or 3 has no effect on the fan or on the system whatsoever, except for the already mentioned lines appearing in the logs.
BTW: A seemingly reliable way to reproduce the oops:
1. Boot
2. echo 1 > /proc/acpi/thermal_zone/THRM/cooling_mode [the fan is in D3 now]
3. echo 0 > /proc/acpi/fan/CFAN/state [error in the logs, no effect on the fan]
4. echo 3 > /proc/acpi/fan/CFAN/state [Oops]
OK, did you mean that directly turning the fan on or off still causes an error? If so, we can narrow the problem down: the CFAN device itself has an error. Is that true? Please test without the thermal zone, so that we can find the cause more easily. As I mentioned, your DSDT can't control the physical fan (hardware), that is, CFAN is a pseudo device. So I created a pseudo fan on my machine, but no error occurred.
I added the code you asked about. Funny thing is that even if the fan is off, this code will show that it's in D0, specifically:
1 devices in D0,0 devices in D3
ACPI: Fan [CFAN] (off)
1 devices in D0,0 devices in D3
ACPI: Processor [CPU0] (supports C1 C2, 8 throttling states)
ACPI: Thermal Zone [THRM] (34 C)
(note the temp is below the active trip point)
I'll try to boot without the thermal zone module, but it'll take a while, since my kernel tree got somehow corrupted. Anyway, I doubt the problem is in the thermal code: the oopses only happen when the system is about to change the state of the fan (i.e. by accessing CFAN device), which happens at boot (if temperature passes the trip point), and at other times when polling_frequency is set. Otherwise, the thermal code works well and the SLMT method does its job of handling the fan quite well indeed. It seems to me that the problem is indeed in the fan handling, and not in the thermal code, especially that the oops may be triggered without touching thermal-specific code at all (i.e. echo [03] > CFAN/state).
A tangential question: how to determine if a 0x80 thermal event is issued?
It would seem to me that the specific GPE (_L00) is executed at times (the system notices temperature changes and runs SLMT), but no such events are passed to the userspace (i.e. /proc/acpi/event)? I compiled the kernel with CONFIG_ACPI_FAN=m and CONFIG_ACPI_THERMAL=m. (2.4.22 with 20030918 ACPI patch) # modprobe fan # echo 3 > /proc/fan/CFAN/state # echo 0 > /proc/fan/CFAN/state [Oops] OTOH, if I modprobe thermal.o without fan.o loaded, the following appears: Unable to handle kernel paging request at virtual address 876c33c4 *pde = 00000000 c01ff71c Oops: 0000 CPU: 0 EIP: 0010:[<c01ff71c>] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010246 eax: 00000000 ebx: 876c33c4 ecx: cebddc94 edx: 00000006 esi: 876c33c4 edi: cebddcf0 ebp: c020e88c esp: cebddc9c ds: 0018 es: 0018 ss: 0018 Process modprobe.old (pid: 1008, stackpage=cebdd000) Stack: 00001001 c0200403 876c33c4 cebddcc8 cebddcf0 876c33c4 cebddd2c c020e90a 876c33c4 c020e88c cebddcf0 00010000 c02cb9c0 c02cb975 00000050 cebddcf4 cebddcf4 876c33c4 c0215cd2 876c33c4 cebddcf0 00000000 00800000 c02ccaf6 Call Trace: [<c0200403>] [<c020e90a>] [<c020e88c>] [<c0215cd2>] [<c02160c9>] [<c0216425>] [<c020ed07>] [<c020b100>] [<d28ebbb6>] [<d28ed41c>] [<d28ed1d6>] [<d28ebf33>] [<d28eca93>] [<d28ed44e>] [<d28ed1d6>] [<d28ed5d2>] [<d28eceaf>] [<d28ed67e>] [<d28ed1d6>] [<d28edd40>] [<c020f782>] [<d28edd40>] [<c020f900>] [<d28edd40>] [<d28edd48>] [<c020f847>] [<c020f32b>] [<d28edd40>] [<d28edd40>] [<c020fc12>] [<c020f847>] [<d28edd40>] [<d28ed09e>] [<d28ed0d0>] [<d28edd40>] [<d28ed6da>] [<d28ed1d6>] [<c0119368>] [<d28eb060>] [<d28eb060>] [<c01075df>] Code: 80 3b aa 0f 44 c3 5b c3 a1 f4 77 35 c0 eb f7 8b 44 24 04 c3 >>EIP; c01ff71c <acpi_ns_map_handle_to_node+17/26> <===== >>ecx; cebddc94 <_end+e8672bc/1249d688> >>edi; cebddcf0 <_end+e867318/1249d688> >>ebp; c020e88c <acpi_bus_data_handler+0/39> >>esp; cebddc9c <_end+e8672c4/1249d688> Trace; c0200403 <acpi_get_data+38/5d> Trace; c020e90a 
<acpi_bus_get_device+45/ae> Trace; c020e88c <acpi_bus_data_handler+0/39> Trace; c0215cd2 <acpi_power_get_context+4a/ae> Trace; c02160c9 <acpi_power_off_device+4a/1a7> Trace; c0216425 <acpi_power_transition+113/13c> Trace; c020ed07 <acpi_bus_set_power+170/298> Trace; c020b100 <acpi_ut_debug_print+75/9f> Trace; d28ebbb6 <[thermal]acpi_thermal_active+c4/190> Trace; d28ed41c <[thermal].text.end+247/a2b> Trace; d28ed1d6 <[thermal].text.end+1/a2b> Trace; d28ebf33 <[thermal]acpi_thermal_check+29d/2ec> Trace; d28eca93 <[thermal]acpi_thermal_add_fs+176/243> Trace; d28ed44e <[thermal].text.end+279/a2b> Trace; d28ed1d6 <[thermal].text.end+1/a2b> Trace; d28ed5d2 <[thermal].text.end+3fd/a2b> Trace; d28eceaf <[thermal]acpi_thermal_add+f9/1b9> Trace; d28ed67e <[thermal].text.end+4a9/a2b> Trace; d28ed1d6 <[thermal].text.end+1/a2b> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; c020f782 <acpi_bus_driver_init+6f/134> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; c020f900 <acpi_bus_attach+b9/138> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; d28edd48 <[thermal]acpi_thermal_driver+8/d4> Trace; c020f847 <acpi_bus_attach+0/138> Trace; c020f32b <acpi_bus_walk+b1/cc> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; c020fc12 <acpi_bus_register_driver+a4/dc> Trace; c020f847 <acpi_bus_attach+0/138> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; d28ed09e <[thermal]acpi_thermal_init+39/ac> Trace; d28ed0d0 <[thermal]acpi_thermal_init+6b/ac> Trace; d28edd40 <[thermal]acpi_thermal_driver+0/d4> Trace; d28ed6da <[thermal].text.end+505/a2b> Trace; d28ed1d6 <[thermal].text.end+1/a2b> Trace; c0119368 <sys_init_module+538/690> Trace; d28eb060 <[thermal]acpi_thermal_get_temperature+0/a4> Trace; d28eb060 <[thermal]acpi_thermal_get_temperature+0/a4> Trace; c01075df <system_call+33/38> Code; c01ff71c <acpi_ns_map_handle_to_node+17/26> 00000000 <_EIP>: Code; c01ff71c 
<acpi_ns_map_handle_to_node+17/26> <===== 0: 80 3b aa cmpb $0xaa,(%ebx) <===== Code; c01ff71f <acpi_ns_map_handle_to_node+1a/26> 3: 0f 44 c3 cmove %ebx,%eax Code; c01ff722 <acpi_ns_map_handle_to_node+1d/26> 6: 5b pop %ebx Code; c01ff723 <acpi_ns_map_handle_to_node+1e/26> 7: c3 ret Code; c01ff724 <acpi_ns_map_handle_to_node+1f/26> 8: a1 f4 77 35 c0 mov 0xc03577f4,%eax Code; c01ff729 <acpi_ns_map_handle_to_node+24/26> d: eb f7 jmp 6 <_EIP+0x6> Code; c01ff72b <acpi_ns_convert_entry_to_handle+0/5> f: 8b 44 24 04 mov 0x4(%esp,1),%eax Code; c01ff72f <acpi_ns_convert_entry_to_handle+4/5> 13: c3 ret using below code to gather some info, what will it happen?
--- power.c 2003-08-29 08:49:20.000000000 +0800
+++ power.c.new 2003-10-09 17:07:18.000000000 +0800
@@ -326,6 +326,7 @@
cl = &device->power.states[device->power.state].resources;
tl = &device->power.states[state].resources;
+ printk("Change device [%s] from D%c to D%c,power resources number:current %d,target %d\n", device->pnp.bus_id, '0'+device->power.state, '0'+state,cl->count,tl->count);
device->power.state = ACPI_STATE_UNKNOWN;
if (!cl->count && !tl->count) {
I guess the number is wrong.
Using the above code: # cd /proc/acpi/fan/CFAN # cat state status: on # echo 3 > state Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0 acpi_power-0366 [32] acpi_power_transition : Error transitioning device [CFAN] to D3 acpi_bus-0496 [31] acpi_bus_set_power : Error transitioning device [CFAN] to D3 [OK so far] [switched to passive cooling] # cat state status: off # echo 0 > state [as above] # echo 3 > state Change device [CFAN] from D/ to D3,power resources ^I^I^Inumber:current 659317645,target 0 [Oops] Weird, huh? Especially the "D/" thing (that's how syslog states, at least). Oct 10 19:16:54 vrapenec kernel: spurious 8259A interrupt: IRQ7. Oct 10 19:17:13 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 Oct 10 19:17:13 vrapenec kernel: acpi_power-0365 [36] acpi_power_transition : Error transitioning device [CFAN] to D3 Oct 10 19:17:13 vrapenec kernel: acpi_bus-0496 [35] acpi_bus_set_power : Error transitioning device [CFAN] to D3 Oct 10 19:17:24 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Oct 10 19:17:27 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 Oct 10 19:17:27 vrapenec kernel: acpi_power-0365 [38] acpi_power_transition : Error transitioning device [CFAN] to D3 Oct 10 19:17:27 vrapenec kernel: acpi_bus-0496 [37] acpi_bus_set_power : Error transitioning device [CFAN] to D3 Oct 10 19:17:57 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Oct 10 19:18:17 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 Oct 10 19:18:58 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Oct 10 19:18:58 vrapenec kernel: acpi_power-0365 [48] acpi_power_transition : Error transitioning device [CFAN] to D0 Oct 10 19:18:58 vrapenec kernel: acpi_bus-0496 [47] acpi_bus_set_power : 
Error transitioning device [CFAN] to D0 Oct 10 19:19:00 vrapenec kernel: Change device [CFAN] from D/ to D3,power resources number:current 1111804189,target 0 Tested with 2.4.23-pre5-acpi-20030918 with the printk patch as requested below During those "echo $number > " steps I got only oopses in system, my shell got killed. Unfortunately, I decided to connect to net, and that raised my FAN to max RPM and locked computer. ctrl+alt+del worked, kill/initd during init 6 complained that some processes are locked. Do you need the stacktrace resolved? The machine does not reboot sometimes, I get instead: Power down. host/usb-uhci.c: interrupt, status 20, frame #0 host/usb-uhci.c: Host controller halted, trying to restart. .... and nothing happens. Holding 5s the power button helps on L3800C.
I did few more tests. Here's some older output from 2.4.23-pre5-acpi-20030918, which I thought did not get stored on the disk as the computer locked ... but we were lucky: spurious 8259A interrupt: IRQ7. 
Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 acpi_power-0365 [36] acpi_power_transition : Error transitioning device [CFAN] to D3 acpi_bus-0496 [35] acpi_bus_set_power : Error transitioning device [CFAN] to D3 Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 acpi_power-0365 [38] acpi_power_transition : Error transitioning device [CFAN] to D3 acpi_bus-0496 [37] acpi_bus_set_power : Error transitioning device [CFAN] to D3 Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state Unable to handle kernel paging request at virtual address 43860570 *pde = 00000000 Oops: 0000 CPU: 0 EIP: 0010:[<c01eef28>] Not tainted EFLAGS: 00010246 eax: 00000000 ebx: 43860570 ecx: f1b7fe24 edx: 00000006 esi: 43860570 edi: f1b7fe80 ebp: c01fe098 esp: f1b7fe2c ds: 0018 es: 0018 ss: 0018 Process bash (pid: 2233, stackpage=f1b7f000) Stack: 00001001 c01efc0f 43860570 f1b7fe58 f1b7fe80 43860570 f1b7febc c01fe116 43860570 c01fe098 f1b7fe80 00010000 c036e679 c036e62e 00000050 f1b7fe84 f1b7fe84 43860570 c0205a3a 43860570 f1b7fe80 00000000 00800000 c036f877 Call Trace: [<c01efc0f>] [<c01fe116>] [<c01fe098>] [<c0205a3a>] [<c0205e31>] [<c02061c5>] [<c01fe513>] [<c01fa900>] [<c020339c>] [<c015d120>] [<c013ad73>] [<c010749f>] Code: 80 3b aa 0f 44 c3 5b c3 a1 d4 d0 43 c0 eb f7 8b 44 24 04 c3 This is vrapenec.gsf.de.gsf.de (Linux i686 2.4.23-pre5) 19:19:01 vrapenec.gsf.de login: spurious 8259A interrupt: IRQ7. 
Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 acpi_power-0365 [36] acpi_power_transition : Error transitioning device [CFAN] to D3 acpi_bus-0496 [35] acpi_bus_set_power : Error transitioning device [CFAN] to D3 Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 acpi_power-0365 [38] acpi_power_transition : Error transitioning device [CFAN] to D3 acpi_bus-0496 [37] acpi_bus_set_power : Error transitioning device [CFAN] to D3 Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 acpi_power-0365 [48] acpi_power_transition : Error transitioning device [CFAN] to D0 acpi_bus-0496 [47] acpi_bus_set_power : Error transitioning device [CFAN] to D0 Change device [CFAN] from D/ to D3,power resources number:current 1111804189,target 0 Unable to handle kernel paging request at virtual address 43860570 printing eip: c01eef28 *pde = 00000000 Oops: 0000 CPU: 0 EIP: 0010:[<c01eef28>] Not tainted EFLAGS: 00010246 eax: 00000000 ebx: 43860570 ecx: f1b7fe24 edx: 00000006 esi: 43860570 edi: f1b7fe80 ebp: c01fe098 esp: f1b7fe2c ds: 0018 es: 0018 ss: 0018 Process bash (pid: 2233, stackpage=f1b7f000) Stack: 00001001 c01efc0f 43860570 f1b7fe58 f1b7fe80 43860570 f1b7febc c01fe116 43860570 c01fe098 f1b7fe80 00010000 c036e679 c036e62e 00000050 f1b7fe84 f1b7fe84 43860570 c0205a3a 43860570 f1b7fe80 00000000 00800000 c036f877 Call Trace: [<c01efc0f>] [<c01fe116>] [<c01fe098>] [<c0205a3a>] [<c0205e31>] [<c02061c5>] [<c01fe513>] [<c01fa900>] [<c020339c>] [<c015d120>] [<c013ad73>] [<c010749f>] Code: 80 3b aa 0f 44 c3 5b c3 a1 d4 d0 43 c0 eb f7 8b 44 24 04 c3 And here is 2.4.23-pre7: vrapenec root # tail -f /var/log/kern.log & [1] 2016 Oct 13 00:26:39 vrapenec kernel: [drm] AGP 0.99 Aperture @ 
0xe0000000 256MB Oct 13 00:26:39 vrapenec kernel: [drm] Initialized radeon 1.7.0 20020828 on minor 0 Oct 13 00:26:39 vrapenec kernel: ISO 9660 Extensions: Microsoft Joliet Level 1 Oct 13 00:26:39 vrapenec kernel: ISOFS: changing to secondary root Oct 13 00:26:39 vrapenec kernel: kjournald starting. Commit interval 5 seconds Oct 13 00:26:39 vrapenec kernel: EXT3-fs warning: maximal mount count reached, running e2fsck is recommended Oct 13 00:26:39 vrapenec kernel: EXT3 FS 2.4-0.9.19, 19 August 2002 on ide0(3,3), internal journal Oct 13 00:26:39 vrapenec kernel: EXT3-fs: recovery complete. Oct 13 00:26:39 vrapenec kernel: EXT3-fs: mounted filesystem with ordered data mode. Oct 13 00:26:39 vrapenec kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state vrapenec root # Oct 13 00:30:19 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 Oct 13 00:30:19 vrapenec kernel: acpi_power-0365 [30] acpi_power_transition : Error transitioning device [CFAN] to D3 Oct 13 00:30:19 vrapenec kernel: acpi_bus-0496 [29] acpi_bus_set_power : Error transitioning device [CFAN] to D3 echo 0 > /proc/acpi/fan/CFAN/state vrapenec root # Oct 13 00:30:35 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1 echo 3 > /proc/acpi/fan/CFAN/state vrapenec root # Oct 13 00:30:42 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0 Oct 13 00:30:42 vrapenec kernel: acpi_power-0365 [32] acpi_power_transition : Error transitioning device [CFAN] to D3 Oct 13 00:30:42 vrapenec kernel: acpi_bus-0496 [31] acpi_bus_set_power : Error transitioning device [CFAN] to D3 echo 1 > /proc/acpi/thermal_zone/THRM/cooling_mode vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state vrapenec root # Oct 13 00:30:54 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 
1
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:30:58 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0
echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:31:14 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1
Oct 13 00:31:14 vrapenec kernel: acpi_power-0365 [40] acpi_power_transition : Error transitioning device [CFAN] to D0
Oct 13 00:31:14 vrapenec kernel: acpi_bus-0496 [39] acpi_bus_set_power : Error transitioning device [CFAN] to D0
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:31:20 vrapenec kernel: Change device [CFAN] from D/ to D3,power resources number:current 0,target 0
Oct 13 00:31:20 vrapenec kernel: acpi_power-0365 [42] acpi_power_transition : Error transitioning device [CFAN] to D3
Oct 13 00:31:20 vrapenec kernel: acpi_bus-0496 [41] acpi_bus_set_power : Error transitioning device [CFAN] to D3
echo 0 > /proc/acpi/thermal_zone/THRM/cooling_mode
vrapenec root # Oct 13 00:31:39 vrapenec kernel: Change device [CFAN] from D/ to D0,power resources number:current 0,target 1
echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:32:16 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:32:31 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:32:41 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:32:46 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1
echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 1 > /proc/acpi/thermal_zone/THRM/cooling_mode
vrapenec root # Oct 13 00:32:52 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:33:03 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1
echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 1 > /proc/acpi/thermal_zone/THRM/cooling_mode
vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:33:32 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 0 > /proc/acpi/thermal_zone/THRM/cooling_mode
vrapenec root # Oct 13 00:33:44 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:34:05 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0
echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 1 > /proc/acpi/thermal_zone/THRM/cooling_mode
vrapenec root # echo 1 > /proc/acpi/thermal_zone/THRM/cooling_mode
vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:34:36 vrapenec kernel: Change device [CFAN] from D3 to D0,power resources number:current 0,target 1
echo 0 > /proc/acpi/fan/CFAN/state
vrapenec root # echo 3 > /proc/acpi/fan/CFAN/state
vrapenec root # Oct 13 00:34:40 vrapenec kernel: Change device [CFAN] from D0 to D3,power resources number:current 1,target 0

My conclusion is that the state gets set to "D/" only sometimes, when acpi_power-0365 and acpi_bus-0496 report an error. Often the echo commands change the state, but the acpi_power and acpi_bus error paths are somehow not hit, and therefore no error occurs. I have not yet managed to get an oops on this 2.4.23-pre7.

Right, ACPI_STATE_UNKNOWN + '0' = '/'.
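The "D/" in the log lines can be reproduced with a few lines of C. This is a minimal sketch, assuming ACPI_STATE_UNKNOWN is the 8-bit value 0xFF and that the debug message prints the state as a character computed from `state + '0'`; `state_to_char` and `check_state_chars` are hypothetical helpers for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the unknown state is the 8-bit value 0xFF. */
#define ACPI_STATE_D0      ((uint8_t)0)
#define ACPI_STATE_D3      ((uint8_t)3)
#define ACPI_STATE_UNKNOWN ((uint8_t)0xFF)

/* Hypothetical helper: turns a power state into the character after "D". */
char state_to_char(uint8_t state)
{
    /* 0xFF + 0x30 = 0x12F; keeping only the low 8 bits leaves 0x2F, i.e. '/'. */
    return (char)(uint8_t)(state + '0');
}

/* Hypothetical self-check covering the states seen in the log. */
void check_state_chars(void)
{
    assert(state_to_char(ACPI_STATE_D0) == '0');
    assert(state_to_char(ACPI_STATE_D3) == '3');
    assert(state_to_char(ACPI_STATE_UNKNOWN) == '/');
}
```

So every "D/" in the log would mean that device->power.state held the unknown value when the message was printed.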
>Unable to handle kernel paging request at virtual address 43860570
43860570 is below c0000000, so it sounds like fan
device->power.states[ACPI_STATE_D0].resources.handles contains bad values.
But the handles must have been correct at first, because we use them when
initializing the fan. I guess they are changed incorrectly after the fan is
initialized. Try not loading the fan and thermal zone modules and controlling
the power resource PRCF directly. What happens then?
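The "below c0000000" argument can be written as a simple range check. This is a sketch under the assumption of the default i386 3G/1G split, where kernel virtual addresses start at 0xC0000000; `looks_like_kernel_pointer` and `assert_handle_sane` are hypothetical names, and the addresses in the usage note are the ones quoted in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default i386 split, kernel mappings begin at 0xC0000000. */
#define PAGE_OFFSET_I386 UINT32_C(0xC0000000)

/* Hypothetical helper: true if addr could be a kernel virtual address. */
bool looks_like_kernel_pointer(uint32_t addr)
{
    return addr >= PAGE_OFFSET_I386;
}

/* Hypothetical debug check to run before a handle is dereferenced. */
void assert_handle_sane(uint32_t handle)
{
    assert(looks_like_kernel_pointer(handle));
}
```

With this check, c12d21a8 (the handle printed in the working transitions) passes, while 43860570 and 872c33cc (the faulting addresses) do not.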
Controlling PRCF on a kernel without the thermal and fan modules loaded produces no visible output (i.e. neither any trace in the logs nor a change in PRCF/state). I'll be compiling 2.4.22 with 20031002 ACPI to see if it makes any difference (so far, 20030918 oopses).

I applied the changes Martin proposed. I'm still using the 20030918 code, in case it matters. As usual, hand-triggering the oops doesn't work if cooling_mode is active, so I switch to passive later on.

Oh, BTW: why does the fan state change internally (in the code), even though the state file still says "on"? I.e. if the fan is in D0, echo 3 > state won't turn it off (it produces "error transitioning from D0 to D3"), but a subsequent echo 0 > state produces "error transitioning from D3 to D0" (those are the first two changes in the log below), as if the fan were actually off, which is obviously wrong. Here are the guts:

Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
acpi_power-0368 [32] acpi_power_transition : Error transitioning device [CFAN] to D3
acpi_bus-0496 [31] acpi_bus_set_power : Error transitioning device [CFAN] to D3
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
acpi_power-0368 [34] acpi_power_transition : Error transitioning device [CFAN] to D3
acpi_bus-0496 [33] acpi_bus_set_power : Error transitioning device [CFAN] to D3
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
acpi_power-0368 [38] acpi_power_transition : Error transitioning device [CFAN] to D3
acpi_bus-0496 [37] acpi_bus_set_power : Error transitioning device [CFAN] to D3
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
acpi_power-0368 [49] acpi_power_transition : Error transitioning device [CFAN] to D0
acpi_bus-0496 [48] acpi_bus_set_power : Error transitioning device [CFAN] to D0
Change device [CFAN] from D/ to D0,power resources ^I^I^Inumber:current 659448717,target 1
Target device's handle:c12d21a8
Current device's handle:872c33cc
Unable to handle kernel paging request at virtual address 872c33cc
*pde = 00000000
c02009bc
Oops: 0000
CPU: 0
EIP: 0010:[<c02009bc>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 00000000 ebx: 872c33cc ecx: c133de10 edx: 00000006
esi: 872c33cc edi: c133de6c ebp: c020fb2c esp: c133de18
ds: 0018 es: 0018 ss: 0018
Process bash (pid: 1216, stackpage=c133d000)
Stack: 00001001 c02016a3 872c33cc c133de44 c133de6c 872c33cc c133dea8 c020fbaa
872c33cc c020fb2c c133de6c 00010000 c02cf380 c02cf335 00000050 c133de70
c133de70 872c33cc c02174ce 872c33cc c133de6c 00000000 00800000 c02d0588
Call Trace: [<c02016a3>] [<c020fbaa>] [<c020fb2c>] [<c02174ce>] [<c02178c5>] [<c0217bcb>] [<c0217c70>] [<c020ffa7>] [<c020c400>] [<c0214e30>] [<c0157b72>] [<c016a0f8>] [<c0147f5d>] [<c01078af>]
Code: 80 3b aa 0f 44 c3 5b c3 a1 f4 b7 35 c0 eb f7 8b 44 24 04 c3

>>EIP; c02009bc <acpi_ns_map_handle_to_node+17/26> <=====
>>ecx; c133de10 <_end+fc3438/12499688>
>>edi; c133de6c <_end+fc3494/12499688>
>>ebp; c020fb2c <acpi_bus_data_handler+0/39>
>>esp; c133de18 <_end+fc3440/12499688>

Trace; c02016a3 <acpi_get_data+38/5d>
Trace; c020fbaa <acpi_bus_get_device+45/ae>
Trace; c020fb2c <acpi_bus_data_handler+0/39>
Trace; c02174ce <acpi_power_get_context+4a/ae>
Trace; c02178c5 <acpi_power_off_device+4a/1a7>
Trace; c0217bcb <acpi_power_transition+bd/1a5>
Trace; c0217c70 <acpi_power_transition+162/1a5>
Trace; c020ffa7 <acpi_bus_set_power+170/298>
Trace; c020c400 <acpi_ut_trace+a/2c>
Trace; c0214e30 <acpi_fan_write_state+b1/de>
Trace; c0157b72 <dupfd+52/70>
Trace; c016a0f8 <proc_file_write+98/e0>
Trace; c0147f5d <sys_write+ad/1e0>
Trace; c01078af <system_call+33/38>

Code; c02009bc <acpi_ns_map_handle_to_node+17/26>
00000000 <_EIP>:
Code; c02009bc <acpi_ns_map_handle_to_node+17/26> <=====
0: 80 3b aa cmpb $0xaa,(%ebx) <=====
Code; c02009bf <acpi_ns_map_handle_to_node+1a/26>
3: 0f 44 c3 cmove %ebx,%eax
Code; c02009c2 <acpi_ns_map_handle_to_node+1d/26>
6: 5b pop %ebx
Code; c02009c3 <acpi_ns_map_handle_to_node+1e/26>
7: c3 ret
Code; c02009c4 <acpi_ns_map_handle_to_node+1f/26>
8: a1 f4 b7 35 c0 mov 0xc035b7f4,%eax
Code; c02009c9 <acpi_ns_map_handle_to_node+24/26>
d: eb f7 jmp 6 <_EIP+0x6>
Code; c02009cb <acpi_ns_convert_entry_to_handle+0/5>
f: 8b 44 24 04 mov 0x4(%esp,1),%eax
Code; c02009cf <acpi_ns_convert_entry_to_handle+4/5>
13: c3 ret

I've just tested without thermal.o:
1) If thermal.o is not present, it's not possible to trigger the oops (one needs to switch the fan off, and this is done either when the temperature drops below 40C, which is next to impossible during normal operation, or when cooling_mode is switched).
2) If thermal.o was loaded, cooling_mode was switched to passive, and the module immediately unloaded: the oops occurs in the same situation (see log below).
FYI: the output in the middle of the log comes from switching the cooling_mode, and not from manually trying to switch the fan. Therefore, no errors seem to occur.
[echo n > fan/CFAN/state]
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
acpi_power-0368 [34] acpi_power_transition : Error transitioning device [CFAN] to D3
acpi_bus-0496 [33] acpi_bus_set_power : Error transitioning device [CFAN] to D3
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
acpi_power-0368 [36] acpi_power_transition : Error transitioning device [CFAN] to D3
acpi_bus-0496 [35] acpi_bus_set_power : Error transitioning device [CFAN] to D3
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
ACPI: Thermal Zone [THRM] (49 C)
[echo n > thermal_zone/THRM/cooling_mode]
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
Change device [CFAN] from D0 to D3,power resources ^I^I^Inumber:current 1,target 0
Current device's handle:c12d21a8
[echo n > fan/CFAN/state]
Change device [CFAN] from D3 to D0,power resources ^I^I^Inumber:current 0,target 1
Target device's handle:c12d21a8
acpi_power-0368 [61] acpi_power_transition : Error transitioning device [CFAN] to D0
acpi_bus-0496 [60] acpi_bus_set_power : Error transitioning device [CFAN] to D0
Change device [CFAN] from D/ to D3,power resources ^I^I^Inumber:current 659448717,target 0
Current device's handle:876c33cc
[Oops]

The patch below should get rid of the oops, but I guess the errors will still occur.

--- power.c 2003-09-11 10:48:13.000000000 +0800
+++ power.c.new 2003-10-16 11:37:49.000000000 +0800
@@ -326,8 +326,6 @@
 	cl = &device->power.states[device->power.state].resources;
 	tl = &device->power.states[state].resources;
 
-	device->power.state = ACPI_STATE_UNKNOWN;
-
 	if (!cl->count && !tl->count) {
 		result = -ENODEV;
 		goto end;
@@ -345,8 +343,6 @@
 		goto end;
 	}
 
-	device->power.state = state;
-
 	/*
 	 * Then we dereference all power resources used in the current list.
 	 */
@@ -356,7 +352,9 @@
 		goto end;
 	}
 
+	device->power.state = state;
 end:
+	/*TBD:if error occurs, should we let devices have original state?*/
 	if (result)
 		ACPI_DEBUG_PRINT((ACPI_DB_WARN,
 			"Error transitioning device [%s] to D%d\n",

The oops does not occur when your patch is applied. The errors are understandable, since the appropriate ASL methods do not work as the spec expects. Also, the error returned is -8, as Martin has already stated. Thanks for your work.
FYI: A funny thing I noticed: after some time, the errors stop being reported. I'll try to take a look into that.

The patch has been merged. I'd like to close this bug. If needed, we can reopen it. |
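The idea behind the patch, as I read it, is to record the new power state only after the whole transition succeeded, so that a failed transition can no longer leave ACPI_STATE_UNKNOWN behind to be used as an index into power.states[] on the next call. The following is a toy sketch of that pattern, not the kernel code: `struct toy_device`, `toy_transition`, and the `hw_result` parameter are hypothetical, and the explicit bounds check is an extra safeguard that the patch itself does not contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATE_D0      0
#define STATE_D3      3
#define STATE_COUNT   4     /* D0..D3 */
#define STATE_UNKNOWN 0xFF  /* assumption: same value as ACPI_STATE_UNKNOWN */

/* Hypothetical stand-in for the per-device power bookkeeping. */
struct toy_device {
    uint8_t state;                    /* last state that was really reached */
    int resource_count[STATE_COUNT];  /* power resources used by each state */
};

/*
 * hw_result simulates what the power-resource methods returned
 * (0 = success, negative = error such as the -8 seen in this report).
 */
int toy_transition(struct toy_device *dev, uint8_t target, int hw_result)
{
    assert(dev != NULL);

    /* Extra safeguard (not in the patch): never index with an unknown state. */
    if (dev->state >= STATE_COUNT || target >= STATE_COUNT)
        return -1;

    /* Nothing to switch in either state, like the -ENODEV path. */
    if (!dev->resource_count[dev->state] && !dev->resource_count[target])
        return -2;

    /* On failure the recorded state keeps its previous value. */
    if (hw_result)
        return hw_result;

    /* Commit only after every step succeeded. */
    dev->state = target;
    return 0;
}
```

In this sketch a failed call such as `toy_transition(&dev, STATE_D3, -8)` returns -8 and leaves `dev.state` at D0, which matches the behaviour reported after the patch: the errors remain, the oops does not.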
Distribution: Gentoo
Hardware Environment: ASUS L3800C, BIOS 1.21
Software Environment:
Problem Description:
The machine starts to boot. At the point where ACPI prints the status of the CPU, battery, fan, etc., it prints a long message on one line, but it immediately scrolls away. If I remember correctly, the machine locks up somewhere after init.

> From: Martin Mokrejs [mailto:mmokrejs@natur.cuni.cz]
> Sent: Thursday, August 21, 2003 12:15 PM
> To: linux-kernel@vger.kernel.org
> Subject: ACPI kernel crash with 2.4.22-pre7 on ASUS L3800C
>
>
> Hi,
> I observe time to time on cold boot hang of my laptop. I remember to see
> such hangs at least since 2.4.21-pre3. Here's my latest running kernel:
>
> # ksymoops --system-map=/boot/System.map-2.4.22-pre7 --vmlinux=/usr/src/linux-2.4.22-pre7/vmlinux ./cr
> ksymoops 2.4.9 on i686 2.4.22-pre7. Options used
> -v /usr/src/linux-2.4.22-pre7/vmlinux (specified)
> -k /proc/ksyms (default)
> -l /proc/modules (default)
> -o /lib/modules/2.4.22-pre7/ (default)
> -m /boot/System.map-2.4.22-pre7 (specified)
>
> EFLAGS: 00010246
> eax: 00000000 ebx: 638a05f0 ecx: 00000000 edx: 00000006
> esi: 638a05f0 edi: f7ebddd0 ebp: f7ebdd78 esp: f7ebdd74
> ds: 0018 es: 0018 ss: 0018
> Process keventd: (pid 2, stackpage=f7ebd000)
> Stack: f7ebdda4 f7ebdd90 c01ede68 638a05f0 f7ebdda4 f7ebddd0 638a05f0 f7ebddc0
> c01fc0f2 638a05f0 c01fc072 f7ebddd0 00010000 c0337755 c033770a 00000050
> f7ebddd4 f7ebddd4 638a05f0 f7ebddf0 c02037c2 638a05f0 f7ebddd0 00000000
> Call Trace: [<c01ede68>] [<c01fc0f2>] [<c01fc072>] [<c02037c2>] [<c01f8a2a>]
> [<c0203b8d>] [<c0203ee7>] [<c01fc4c7>] [<c01f8a00>] [<c0207aed>] [<c0207e5e>]
> [<c0208b60>] [<c01dce5a>] [<c01d4f7d>] [<c011ff0a>] [<c01282e5>] [<c01281b0>]
> [<c0105000>] [<c01057ee>] [<c01281b0>]
> Code: 80 3b aa 0f 44 c3 5b 5d c3 a1 d4 55 40 c0 eb f6 55 89 e5 8b
> Using defaults from ksymoops -t elf32-i386 -a i386
>
>
> >>edi; f7ebddd0 <_end+37a9304c/3a4f02dc>
> >>ebp; f7ebdd78 <_end+37a92ff4/3a4f02dc>
> >>esp; f7ebdd74 <_end+37a92ff0/3a4f02dc>
>
> Trace; c01ede68 <acpi_get_data+34/60>
> Trace; c01fc0f2 <acpi_bus_get_device+45/a9>
> Trace; c01fc072 <acpi_bus_data_handler+0/3b>
> Trace; c02037c2 <acpi_power_get_context+46/a6>
> Trace; c01f8a2a <acpi_ut_trace+29/2b>
> Trace; c0203b8d <acpi_power_off_device+46/19d>
> Trace; c0203ee7 <acpi_power_transition+111/138>
> Trace; c01fc4c7 <acpi_bus_set_power+15f/273>
> Trace; c01f8a00 <acpi_ut_debug_print_raw+29/2a>
> Trace; c0207aed <acpi_thermal_active+bf/187>
> Trace; c0207e5e <acpi_thermal_check+295/2e2>
> Trace; c0208b60 <acpi_thermal_notify+a6/105>
> Trace; c01dce5a <acpi_ev_notify_dispatch+54/7a>
> Trace; c01d4f7d <acpi_os_execute_deferred+3a/6c>
> Trace; c011ff0a <__run_task_queue+5a/70>
> Trace; c01282e5 <context_thread+135/1d0>
> Trace; c01281b0 <context_thread+0/1d0>
> Trace; c0105000 <_stext+0/0>
> Trace; c01057ee <arch_kernel_thread+2e/40>
> Trace; c01281b0 <context_thread+0/1d0>

Steps to reproduce: