(originally reported at linux-acpi@vger.kernel.org [2], filing a bug as suggested)

Hello,

I've been experiencing a kernel panic with a NULL pointer dereference in acpi_ns_check_object_type since kernel 3.4 on a MacPro machine. By building as much of ACPI as possible as modules, I was able to get the system running and postpone the error until running 'modprobe acpi-cpufreq', which now results in an oops rather than a panic. The log is attached as error.log.

By bisecting Linus' tree between 3.3 and 3.4, I found the guilty commit 6a99b1c94d053b3420eaa4a4bc8b2883dd90a2f9 "ACPICA: Object repair code: Support to add Package wrappers" [1]. However, this patch does not directly touch the functions in the stack trace.

Next I created a kdump of the oops and looked around with gdb.
- In acpi_ns_check_package(), the null pointer is in the parameter return_object_ptr, which is dereferenced when initializing the variable return_object.
- The calling function acpi_ns_check_package_list() is in the 'case ACPI_PTYPE2_COUNT:' part; the passed null pointer is in the sub_elements variable.
- Before the switch, sub_elements is initialized like this:
      sub_elements = sub_package->package.elements
  Interestingly, in the crashdump, sub_elements is null, but sub_package->package.elements is non-null.

I've added some printk's and verified that the call
      status = acpi_ns_check_object_type(data, &sub_package, ACPI_RTYPE_PACKAGE, i);
makes sub_package->package.elements become non-null, but sub_elements was already initialized before this call and remains null.

The above led me to create the attached patch, which simply moves the initialization of sub_elements after the sub_package check. I think it's this check that results in the Integer to Package conversion/wrapping.
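The ordering problem described above can be modeled outside the kernel. The following is only a minimal sketch, not the ACPICA code: struct obj, check_and_wrap() and the two elements_read_* helpers are hypothetical names invented for illustration, standing in for the operand object, the type check that wraps an Integer in a Package, and the two possible orderings of the sub_elements initialization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an ACPI operand object (NOT the real
 * union acpi_operand_object). */
struct obj {
    int is_package;          /* 1 for a Package, 0 for an Integer */
    size_t count;            /* number of elements if a Package */
    struct obj **elements;   /* NULL for non-package objects */
    long value;              /* payload if an Integer */
};

/* Toy version of the repair step: if *slot is not a package, replace it
 * with a newly allocated one-element package wrapping the original. */
int check_and_wrap(struct obj **slot)
{
    struct obj *orig;
    struct obj *pkg;

    assert(slot != NULL && *slot != NULL);
    orig = *slot;
    if (orig->is_package)
        return 0;

    pkg = calloc(1, sizeof(*pkg));
    if (pkg == NULL)
        return -1;
    pkg->elements = calloc(1, sizeof(*pkg->elements));
    if (pkg->elements == NULL) {
        free(pkg);
        return -1;
    }
    pkg->is_package = 1;
    pkg->count = 1;
    pkg->elements[0] = orig;
    *slot = pkg;             /* the caller's object is replaced here */
    return 0;
}

/* Ordering as in the report: the element pointer is read first, then the
 * check replaces the object, so the saved pointer is stale (NULL). */
struct obj **elements_read_before_check(struct obj **slot)
{
    struct obj **sub_elements = (*slot)->elements;

    if (check_and_wrap(slot) != 0)
        return NULL;
    return sub_elements;
}

/* Ordering as in the attached patch: check first, then read the element
 * pointer from whatever object the slot holds now. */
struct obj **elements_read_after_check(struct obj **slot)
{
    if (check_and_wrap(slot) != 0)
        return NULL;
    return (*slot)->elements;
}
```

With an Integer in the slot, the first helper hands back the NULL it read before the wrap even though the slot now holds a package with a valid element array; the second returns the wrapper's array. This only illustrates the stale-pointer ordering; it says nothing about whether the wrapped result is the correct repair.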
After this patch, the null pointer dereference is gone, but the debug output of ACPI (acpi.debug_layer=0xffffffff acpi.debug_level=0x00000008) shows that something is probably still wrong:

[ 1.353677] nsrepair-0681 [4294967287] ns_wrap_with_package : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
[ 1.353869] nsrepair-0681 [4294967287] ns_wrap_with_package : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
[ 1.354059] ACPI Warning: For \_PR_.CPU0._PSD: Return Sub-Package[0] is too small - found 1 elements, expected 5 (20120320/nspredef-905)
[ 1.354253] ACPI: Invalid package argument
[ 1.354322] ACPI: Invalid _PSD data
... (the same for other CPUx)

In comparison, a 3.3 kernel with the same acpi debug options shows only stuff like:

[ 1.494238] nsrepair-0728 [4294967287] ns_repair_package_list: \_PR_.CPU0._PSD: Repaired incorrectly formed Package
[ 1.494449] nsrepair-0728 [4294967287] ns_repair_package_list: \_PR_.CPU2._PSD: Repaired incorrectly formed Package
[ 1.494657] nsrepair-0728 [4294967287] ns_repair_package_list: \_PR_.CPU4._PSD: Repaired incorrectly formed Package
... (the same for other CPUx)

Since I don't know much about this subsystem, I figured that I should just report my findings at this point. The patched system is usable, but I guess it's not a complete fix.

I also attach the output of acpidump. I hope I didn't forget anything important; please ask for more information if needed.
Thanks,
Vlastimil Babka

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6a99b1c94d053b3420eaa4a4bc8b2883dd90a2f9
[2] http://marc.info/?t=134121902100001&r=1&w=2

error.log

Jun 29 13:50:01 macpro kernel: [ 334.597947] nsrepair-0681 [4294967287] ns_wrap_with_package : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
Jun 29 13:50:01 macpro kernel: [ 334.597951] nsrepair-0681 [4294967287] ns_wrap_with_package : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
Jun 29 13:50:01 macpro kernel: [ 334.597958] BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 29 13:50:01 macpro kernel: [ 334.597972] IP: [<ffffffff812da10e>] acpi_ns_check_object_type+0x1a/0x1d2
Jun 29 13:50:01 macpro kernel: [ 334.597985] PGD 36a625067 PUD 36aa2d067 PMD 0
Jun 29 13:50:01 macpro kernel: [ 334.597995] Oops: 0000 [#1] PREEMPT SMP
Jun 29 13:50:01 macpro kernel: [ 334.598004] CPU 0
Jun 29 13:50:01 macpro kernel: [ 334.598007] Modules linked in: acpi_cpufreq(+) mperf thermal fan battery acpi_ipmi ipmi_msghandler ac coretemp btusb bluetooth ioatdma snd_hda_codec_realtek snd_hda_intel firewire_ohci snd_hda_codec firewire_core i7core_edac i2c_i801 applesmc processor edac_core dca snd_hwdep shpchp rtc_cmos button
Jun 29 13:50:01 macpro kernel: [ 334.598075]
Jun 29 13:50:01 macpro kernel: [ 334.598079] Pid: 8683, comm: modprobe Not tainted 3.3.0+ #22 Apple Inc. MacPro4,1/Mac-F221BEC8
Jun 29 13:50:01 macpro kernel: [ 334.598091] RIP: 0010:[<ffffffff812da10e>] [<ffffffff812da10e>] acpi_ns_check_object_type+0x1a/0x1d2
Jun 29 13:50:01 macpro kernel: [ 334.598102] RSP: 0018:ffff88036a75bb58 EFLAGS: 00010292
Jun 29 13:50:01 macpro kernel: [ 334.598107] RAX: ffff8803717c1ee8 RBX: ffff88036bb79500 RCX: 0000000000000000
Jun 29 13:50:01 macpro kernel: [ 334.598113] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88036bb79500
Jun 29 13:50:01 macpro kernel: [ 334.598119] RBP: ffff88036a75bbd8 R08: 0000000000000000 R09: 0000000000000000
Jun 29 13:50:01 macpro kernel: [ 334.598126] R10: 0000000000000000 R11: 0a7463656a626f20 R12: ffff88036bb79500
Jun 29 13:50:01 macpro kernel: [ 334.598132] R13: 0000000000000000 R14: 0000000000000002 R15: 0000000000000000
Jun 29 13:50:01 macpro kernel: [ 334.598138] FS: 00007fa70776e700(0000) GS:ffff88037fc00000(0000) knlGS:0000000000000000
Jun 29 13:50:01 macpro kernel: [ 334.598146] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 29 13:50:01 macpro kernel: [ 334.598151] CR2: 0000000000000000 CR3: 000000036fa87000 CR4: 00000000000006f0
Jun 29 13:50:01 macpro kernel: [ 334.598157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 29 13:50:01 macpro kernel: [ 334.598163] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 29 13:50:01 macpro kernel: [ 334.598173] Process modprobe (pid: 8683, threadinfo ffff88036a75a000, task ffff880369005a40)
Jun 29 13:50:01 macpro kernel: [ 334.598180] Stack:
Jun 29 13:50:01 macpro kernel: [ 334.598183] 000000000000001e 000000000000001e 7865206874697720 5020646574636570
Jun 29 13:50:01 macpro kernel: [ 334.598196] ffff88036a75bb88 ffffffff812b1c0e ffff88036a75bc18 ffffffff812e695f
Jun 29 13:50:01 macpro kernel: [ 334.598208] ffff88036a75bbb8 ffffffff00000030 ffff88036a75bc38 ffffffff8184632e
Jun 29 13:50:01 macpro kernel: [ 334.598221] Call Trace:
Jun 29 13:50:01 macpro kernel: [ 334.598227] [<ffffffff812b1c0e>] ? acpi_os_vprintf+0x2b/0x2d
Jun 29 13:50:01 macpro kernel: [ 334.598234] [<ffffffff812e695f>] ? acpi_debug_print+0xf1/0x100
Jun 29 13:50:01 macpro kernel: [ 334.598241] [<ffffffff812da4b0>] acpi_ns_check_package_list+0x157/0x21a
Jun 29 13:50:01 macpro kernel: [ 334.598249] [<ffffffff812daa7c>] acpi_ns_check_predefined_names+0x3dd/0x48d
Jun 29 13:50:01 macpro kernel: [ 334.598256] [<ffffffff812b2373>] ? acpi_os_signal_semaphore+0x5f/0x6f
Jun 29 13:50:01 macpro kernel: [ 334.598263] [<ffffffff812d8886>] acpi_ns_evaluate+0x32e/0x3b7
Jun 29 13:50:01 macpro kernel: [ 334.598271] [<ffffffff810fd77f>] ? kmem_cache_alloc+0x8f/0xb0
Jun 29 13:50:01 macpro kernel: [ 334.598278] [<ffffffff812dcc06>] acpi_evaluate_object+0x1ec/0x34e
Jun 29 13:50:01 macpro kernel: [ 334.598286] [<ffffffff810d80eb>] ? pcpu_alloc+0x90b/0xa10
Jun 29 13:50:01 macpro kernel: [ 334.598295] [<ffffffffa007157f>] acpi_processor_preregister_performance+0x10e/0x458 [processor]
Jun 29 13:50:01 macpro kernel: [ 334.598304] [<ffffffff810bc9ad>] ? jump_label_module_notify+0x7d/0x200
Jun 29 13:50:01 macpro kernel: [ 334.598312] [<ffffffffa0226000>] ? 0xffffffffa0225fff
Jun 29 13:50:01 macpro kernel: [ 334.598319] [<ffffffffa0226082>] acpi_cpufreq_init+0x82/0xa4 [acpi_cpufreq]
Jun 29 13:50:01 macpro kernel: [ 334.598742] [<ffffffff810001ca>] do_one_initcall+0x3a/0x160
Jun 29 13:50:01 macpro kernel: [ 334.599278] [<ffffffff810855c6>] sys_init_module+0xa16/0x1bc0
Jun 29 13:50:01 macpro kernel: [ 334.599811] [<ffffffff81631be2>] system_call_fastpath+0x16/0x1b
Jun 29 13:50:01 macpro kernel: [ 334.600337] Code: 00 e8 c9 c8 00 00 31 c0 5a 5b 41 5c 41 5d 5d c3 90 55 48 89 e5 41 57 41 56 41 89 d6 41 55 41 89 cd 41 54 53 48 89 fb 48 83 ec 58 <4c> 8b 26 4d 85 e4 75 13 48 89 f1 44 89 ea 44 89 f6 e8 08 0a 00
Jun 29 13:50:01 macpro kernel: [ 334.600970] RIP [<ffffffff812da10e>] acpi_ns_check_object_type+0x1a/0x1d2
Jun 29 13:50:01 macpro kernel: [ 334.601559] RSP <ffff88036a75bb58>
Jun 29 13:50:01 macpro kernel: [ 334.602130] CR2: 0000000000000000
Jun 29 13:50:01 macpro kernel: [ 334.602838] ---[ end trace 217f289557e3f0cd ]---
Created attachment 74631 [details]
Patch that removes the null pointer dereference

As said in the original comment, this removes the null pointer problem, but debug output shows the repaired data is still malformed.
Created attachment 74641 [details] acpidump from the affected machine
The \_PR_.CPU0._PSD method is indeed ill-formed in the original ASL/AML. There is also a problem with the ACPICA attempt to repair the return value.

Original code:

    Method (_PSD, 0, NotSerialized)  // _PSD: Power State Dependencies
    {
        Return (Package (0x05)
        {
            0x05,
            0x00,
            0x00,
            0xFD,
            0x08
        })
    }

_PSD is defined to be a "package of packages", thus the correct code should be:

    Method (_PSD, 0, NotSerialized)  // _PSD: Power State Dependencies
    {
        Return (Package (0x01)
        {
            Package (0x05)
            {
                0x05,
                0x00,
                0x00,
                0xFD,
                0x08
            }
        })
    }
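To make the "package of packages" requirement concrete, here is a small sketch of a shape check. It is not ACPICA code: struct node, enum kind and psd_shape_ok() are hypothetical names, and the minimum sub-package length of 5 is taken from the "expected 5" warning quoted earlier in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object model for illustration only. */
enum kind { KIND_INTEGER, KIND_PACKAGE };

struct node {
    enum kind kind;
    size_t count;                        /* element count if a Package */
    const struct node *const *elements;  /* element array if a Package */
    unsigned long value;                 /* payload if an Integer */
};

/* Assumed minimum, from the "expected 5" warning in the log. */
#define PSD_SUBPACKAGE_MIN 5

/* Returns 1 if ret is a Package whose elements are all Packages holding
 * at least PSD_SUBPACKAGE_MIN Integers, 0 otherwise. */
int psd_shape_ok(const struct node *ret)
{
    size_t i, j;

    if (ret == NULL || ret->kind != KIND_PACKAGE || ret->count == 0)
        return 0;
    /* A non-empty package must carry an element array. */
    assert(ret->elements != NULL);

    for (i = 0; i < ret->count; i++) {
        const struct node *sub = ret->elements[i];

        if (sub == NULL || sub->kind != KIND_PACKAGE)
            return 0;                    /* flat form fails here */
        if (sub->count < PSD_SUBPACKAGE_MIN || sub->elements == NULL)
            return 0;                    /* "too small" case */
        for (j = 0; j < sub->count; j++) {
            if (sub->elements[j] == NULL ||
                sub->elements[j]->kind != KIND_INTEGER)
                return 0;
        }
    }
    return 1;
}
```

Under this model, the flat five-Integer package from the original ASL is rejected because its first element is an Integer rather than a Package, while the same five Integers nested inside a one-element outer package are accepted.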
The suggested patch does not work correctly: it generates additional/different warnings and errors, does not return the correct value, and also results in some memory leaks. We will work on fixing the problem.

Under the AcpiExec utility (with the suggested patch applied):

- ex \_PR_.CPU0._PSD
Executing \_PR_.CPU0._PSD
nsrepair-0817 [02] NsWrapWithPackage : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
nsrepair-0817 [02] NsWrapWithPackage : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
ACPI Warning: For \_PR_.CPU0._PSD: Return Sub-Package[0] is too small - found 1 elements, expected 5 (20120620/nspredef-1021)
Outstanding: 0x6 allocations after execution
Execution of \_PR_.CPU0._PSD returned object 00035DC8 Buflen 20
[Package] Contains 1 Elements:
[Integer] = 0000000000000005

- q
00545278 Length 0x0008 utobject-269 [Not a Descriptor]
00545030 Length 0x002C utcache-422 [Operand] Package RefCount 0x0001
005451F8 Length 0x002C utcache-422 [Operand] Integer RefCount 0x0001
00545178 Length 0x002C utcache-422 [Operand] Integer RefCount 0x0001
00545108 Length 0x0018 dsobject-516 [Not a Descriptor]
006543A8 Length 0x002C utcache-422 [Operand] Integer RefCount 0x0001
004E5570 Length 0x002C utcache-422 [Operand] Integer RefCount 0x0001
00654130 Length 0x002C utcache-422 [Operand] Package RefCount 0x0001
ACPI Error: 8(0x8) Outstanding allocations (20120620/uttrack-776)
Created attachment 74791 [details]
patch from Bob

Also posted at: http://marc.info/?l=linux-acpi&m=134136758030378&w=2
*** Bug 44181 has been marked as a duplicate of this bug. ***
I have added the patch to our latest master (3.5-rc4) and 12.2 kernel branches (3.4). This will automatically trigger a rebuild, and a built kernel package for testing can be retrieved in an hour or so from here:
http://download.opensuse.org/repositories/Kernel:/openSUSE-12.2/standard/x86_64/
or here:
http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64/

Best check the date of the rpms and double check after downloading, via:
rpm -qp --changelog kernel-xy.rpm |less
that this changelog is included:
Date: Wed Jul 4 10:24:59 2012 +0200
Fix NULL pointer derference in acpi_ns_check_object_type() (kernel bug 44171).
Thanks, patch works fine here.
Is there a way to get it for a 32-bit kernel?
http://download.opensuse.org/repositories/Kernel:/openSUSE-12.2/standard/i686 > But this one (date) does not have the patch yet: kernel-pae-3.4.4-4.1.i686.rpm 04-Jul-2012 09:30
> get it for 32 bits kernel?
Not sure, but one or another kernel version/flavor currently has build issues.

I guess Vlastimil's test is enough to see the patch in mainline soon? This one should get a:
CC: stable@vger.kernel.org
tag, and distributions should then get it automatically rather quickly.
The patch is available for the 32-bit kernel, and it works fine.
(In reply to comment #11)
> I guess Vlastimil's test is enough to see the patch mainline soon?
> This one should get a:
> CC: stable@vger.kernel.org
> tag and distributions should get it automatically rather quickly.

Should I add this CC myself?
Patch in comment #5 applied for 3.5-rc, with CC to 3.4-stable.
shipped in 3.5-rc7
shipped in 3.4.6
closed.

commit 46befd6b38d802dfc5998e7d7938854578b45d9d
Author: Bob Moore <robert.moore@intel.com>
Date: Wed Jul 4 10:02:32 2012 +0800

    ACPICA: Fix possible fault in return package object repair code