Bug 44171 - BUG: unable to handle kernel NULL pointer dereference at acpi_ns_check_object_type
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: All Linux
Importance: P1 high
Assignee: acpi_acpica-core@kernel-bugs.osdl.org
URL:
Keywords:
Duplicates: 44181
Depends on:
Blocks:
 
Reported: 2012-07-03 09:07 UTC by Vlastimil Babka
Modified: 2012-07-25 03:07 UTC
CC List: 4 users

See Also:
Kernel Version: 3.4
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Patch that removes the null pointer dereference (1.03 KB, patch)
2012-07-03 09:11 UTC, Vlastimil Babka
acpidump from the affected machine (366.62 KB, text/plain)
2012-07-03 09:13 UTC, Vlastimil Babka
patch from Bob (1.22 KB, patch)
2012-07-04 02:05 UTC, Lin Ming

Description Vlastimil Babka 2012-07-03 09:07:59 UTC
(originally reported at linux-acpi@vger.kernel.org [2], filing a bug as suggested)

Hello,

I've been experiencing a kernel panic with a NULL pointer dereference in
acpi_ns_check_object_type since kernel 3.4 on a MacPro machine.

By recompiling as much of ACPI as possible as modules, I was able to get
the system running and postpone the error until doing 'modprobe
acpi-cpufreq', which now results in an oops, not a panic. The log is attached
as error.log.

By bisecting Linus' tree between 3.3 and 3.4, I found the guilty commit
6a99b1c94d053b3420eaa4a4bc8b2883dd90a2f9
"ACPICA: Object repair code: Support to add Package wrappers" [1]
However this patch does not directly touch the functions in the stack trace.

Next I created a kdump of the oops, and looked around with gdb.
- In acpi_ns_check_package(), the null pointer is in the parameter
return_object_ptr, which is dereferenced when initializing the variable
return_object.
- The calling function acpi_ns_check_package_list() is in the 'case
ACPI_PTYPE2_COUNT:' part, the passed null pointer is in the sub_elements
variable.
- Before the switch, sub_elements is initialized like this:

  sub_elements = sub_package->package.elements

  interestingly, in the crashdump, sub_elements is null, but
  sub_package->package.elements is non-null

I've added some printk's and verified that the call of
 status = acpi_ns_check_object_type(data, &sub_package,
				   ACPI_RTYPE_PACKAGE, i);

 makes sub_package->package.elements become non-null, but sub_elements
 was already initialized before this call and remains null.

The above led me to create the attached patch which simply moves the
initialization of sub_elements after the sub_package check. I think it's
this check that results in the Integer to Package conversion/wrapping.
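
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not ACPICA code: struct object, check_and_wrap() and the two accessor functions are hypothetical stand-ins, assuming only that the check may replace an Integer with a newly allocated one-element Package and update the caller's pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for an ACPI operand object (hypothetical type). */
struct object {
    int is_package;            /* 1 = package, 0 = integer */
    long value;                /* meaningful when is_package == 0 */
    size_t count;              /* meaningful when is_package == 1 */
    struct object **elements;  /* NULL for an integer */
};

/*
 * Hypothetical repair step: if *obj_ptr is an integer, wrap it in a new
 * one-element package and update the caller's pointer to the wrapper.
 * Returns 0 on success, -1 on allocation failure.
 */
int check_and_wrap(struct object **obj_ptr)
{
    assert(obj_ptr != NULL && *obj_ptr != NULL);
    if ((*obj_ptr)->is_package)
        return 0;

    struct object *pkg = calloc(1, sizeof(*pkg));
    if (pkg == NULL)
        return -1;
    pkg->elements = calloc(1, sizeof(*pkg->elements));
    if (pkg->elements == NULL) {
        free(pkg);
        return -1;
    }
    pkg->is_package = 1;
    pkg->count = 1;
    pkg->elements[0] = *obj_ptr;
    *obj_ptr = pkg;
    return 0;
}

/* Buggy order: the element pointer is read before the check runs. */
struct object **elements_cached_before_check(struct object **sub)
{
    struct object **elems = (*sub)->elements;  /* NULL for an integer */
    if (check_and_wrap(sub) != 0)
        return NULL;
    return elems;  /* stale: still the value read before the repair */
}

/* Fixed order: the element pointer is read after the check runs. */
struct object **elements_read_after_check(struct object **sub)
{
    if (check_and_wrap(sub) != 0)
        return NULL;
    return (*sub)->elements;  /* points into the repaired package */
}
```

With an Integer as input, the first accessor returns NULL even though the object has been wrapped, which matches what the crash dump showed; the second returns the wrapper's element array. The sketch leaves freeing the wrapper to the caller.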

After this patch, the null pointer dereference is gone, but the debug
output of ACPI (acpi.debug_layer=0xffffffff acpi.debug_level=0x00000008)
shows that something is probably still wrong:

[    1.353677] nsrepair-0681 [4294967287] ns_wrap_with_package  :
\_PR_.CPU0._PSD: Wrapped Integer with expected Package object
[    1.353869] nsrepair-0681 [4294967287] ns_wrap_with_package  :
\_PR_.CPU0._PSD: Wrapped Integer with expected Package object
[    1.354059] ACPI Warning: For \_PR_.CPU0._PSD: Return Sub-Package[0]
is too small - found 1 elements, expected 5 (20120320/nspredef-905)
[    1.354253] ACPI: Invalid package argument
[    1.354322] ACPI: Invalid _PSD data
... (the same for other CPUx)


In comparison, 3.3 kernel with same acpi debug options shows only stuff
like:
[    1.494238] nsrepair-0728 [4294967287] ns_repair_package_list:
\_PR_.CPU0._PSD: Repaired incorrectly formed Package
[    1.494449] nsrepair-0728 [4294967287] ns_repair_package_list:
\_PR_.CPU2._PSD: Repaired incorrectly formed Package
[    1.494657] nsrepair-0728 [4294967287] ns_repair_package_list:
\_PR_.CPU4._PSD: Repaired incorrectly formed Package
... (the same for other CPUx)

Since I don't know much about this subsystem, I figured that I should
just report my findings at this point. The patched system is usable, but
I guess it's not a complete fix.

I also attach the output of acpidump. I hope I didn't forget anything
important, please ask for more information if needed.

Thanks,
Vlastimil Babka

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6a99b1c94d053b3420eaa4a4bc8b2883dd90a2f9

[2] http://marc.info/?t=134121902100001&r=1&w=2

error.log

Jun 29 13:50:01 macpro kernel: [  334.597947] nsrepair-0681 [4294967287] ns_wrap_with_package  : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
Jun 29 13:50:01 macpro kernel: [  334.597951] nsrepair-0681 [4294967287] ns_wrap_with_package  : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
Jun 29 13:50:01 macpro kernel: [  334.597958] BUG: unable to handle kernel NULL pointer dereference at           (null)
Jun 29 13:50:01 macpro kernel: [  334.597972] IP: [<ffffffff812da10e>] acpi_ns_check_object_type+0x1a/0x1d2
Jun 29 13:50:01 macpro kernel: [  334.597985] PGD 36a625067 PUD 36aa2d067 PMD 0 
Jun 29 13:50:01 macpro kernel: [  334.597995] Oops: 0000 [#1] PREEMPT SMP 
Jun 29 13:50:01 macpro kernel: [  334.598004] CPU 0 
Jun 29 13:50:01 macpro kernel: [  334.598007] Modules linked in: acpi_cpufreq(+) mperf thermal fan battery acpi_ipmi ipmi_msghandler ac coretemp btusb bluetooth ioatdma snd_hda_codec_realtek snd_hda_intel firewire_ohci snd_hda_codec firewire_core i7core_edac i2c_i801 applesmc processor edac_core dca snd_hwdep shpchp rtc_cmos button
Jun 29 13:50:01 macpro kernel: [  334.598075] 
Jun 29 13:50:01 macpro kernel: [  334.598079] Pid: 8683, comm: modprobe Not tainted 3.3.0+ #22 Apple Inc. MacPro4,1/Mac-F221BEC8
Jun 29 13:50:01 macpro kernel: [  334.598091] RIP: 0010:[<ffffffff812da10e>]  [<ffffffff812da10e>] acpi_ns_check_object_type+0x1a/0x1d2
Jun 29 13:50:01 macpro kernel: [  334.598102] RSP: 0018:ffff88036a75bb58  EFLAGS: 00010292
Jun 29 13:50:01 macpro kernel: [  334.598107] RAX: ffff8803717c1ee8 RBX: ffff88036bb79500 RCX: 0000000000000000
Jun 29 13:50:01 macpro kernel: [  334.598113] RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff88036bb79500
Jun 29 13:50:01 macpro kernel: [  334.598119] RBP: ffff88036a75bbd8 R08: 0000000000000000 R09: 0000000000000000
Jun 29 13:50:01 macpro kernel: [  334.598126] R10: 0000000000000000 R11: 0a7463656a626f20 R12: ffff88036bb79500
Jun 29 13:50:01 macpro kernel: [  334.598132] R13: 0000000000000000 R14: 0000000000000002 R15: 0000000000000000
Jun 29 13:50:01 macpro kernel: [  334.598138] FS:  00007fa70776e700(0000) GS:ffff88037fc00000(0000) knlGS:0000000000000000
Jun 29 13:50:01 macpro kernel: [  334.598146] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 29 13:50:01 macpro kernel: [  334.598151] CR2: 0000000000000000 CR3: 000000036fa87000 CR4: 00000000000006f0
Jun 29 13:50:01 macpro kernel: [  334.598157] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 29 13:50:01 macpro kernel: [  334.598163] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 29 13:50:01 macpro kernel: [  334.598173] Process modprobe (pid: 8683, threadinfo ffff88036a75a000, task ffff880369005a40)
Jun 29 13:50:01 macpro kernel: [  334.598180] Stack:
Jun 29 13:50:01 macpro kernel: [  334.598183]  000000000000001e 000000000000001e 7865206874697720 5020646574636570
Jun 29 13:50:01 macpro kernel: [  334.598196]  ffff88036a75bb88 ffffffff812b1c0e ffff88036a75bc18 ffffffff812e695f
Jun 29 13:50:01 macpro kernel: [  334.598208]  ffff88036a75bbb8 ffffffff00000030 ffff88036a75bc38 ffffffff8184632e
Jun 29 13:50:01 macpro kernel: [  334.598221] Call Trace:
Jun 29 13:50:01 macpro kernel: [  334.598227]  [<ffffffff812b1c0e>] ? acpi_os_vprintf+0x2b/0x2d
Jun 29 13:50:01 macpro kernel: [  334.598234]  [<ffffffff812e695f>] ? acpi_debug_print+0xf1/0x100
Jun 29 13:50:01 macpro kernel: [  334.598241]  [<ffffffff812da4b0>] acpi_ns_check_package_list+0x157/0x21a
Jun 29 13:50:01 macpro kernel: [  334.598249]  [<ffffffff812daa7c>] acpi_ns_check_predefined_names+0x3dd/0x48d
Jun 29 13:50:01 macpro kernel: [  334.598256]  [<ffffffff812b2373>] ? acpi_os_signal_semaphore+0x5f/0x6f
Jun 29 13:50:01 macpro kernel: [  334.598263]  [<ffffffff812d8886>] acpi_ns_evaluate+0x32e/0x3b7
Jun 29 13:50:01 macpro kernel: [  334.598271]  [<ffffffff810fd77f>] ? kmem_cache_alloc+0x8f/0xb0
Jun 29 13:50:01 macpro kernel: [  334.598278]  [<ffffffff812dcc06>] acpi_evaluate_object+0x1ec/0x34e
Jun 29 13:50:01 macpro kernel: [  334.598286]  [<ffffffff810d80eb>] ? pcpu_alloc+0x90b/0xa10
Jun 29 13:50:01 macpro kernel: [  334.598295]  [<ffffffffa007157f>] acpi_processor_preregister_performance+0x10e/0x458 [processor]
Jun 29 13:50:01 macpro kernel: [  334.598304]  [<ffffffff810bc9ad>] ? jump_label_module_notify+0x7d/0x200
Jun 29 13:50:01 macpro kernel: [  334.598312]  [<ffffffffa0226000>] ? 0xffffffffa0225fff
Jun 29 13:50:01 macpro kernel: [  334.598319]  [<ffffffffa0226082>] acpi_cpufreq_init+0x82/0xa4 [acpi_cpufreq]
Jun 29 13:50:01 macpro kernel: [  334.598742]  [<ffffffff810001ca>] do_one_initcall+0x3a/0x160
Jun 29 13:50:01 macpro kernel: [  334.599278]  [<ffffffff810855c6>] sys_init_module+0xa16/0x1bc0
Jun 29 13:50:01 macpro kernel: [  334.599811]  [<ffffffff81631be2>] system_call_fastpath+0x16/0x1b
Jun 29 13:50:01 macpro kernel: [  334.600337] Code: 00 e8 c9 c8 00 00 31 c0 5a 5b 41 5c 41 5d 5d c3 90 55 48 89 e5 41 57 41 56 41 89 d6 41 55 41 89 cd 41 54 53 48 89 fb 48 83 ec 58 <4c> 8b 26 4d 85 e4 75 13 48 89 f1 44 89 ea 44 89 f6 e8 08 0a 00 
Jun 29 13:50:01 macpro kernel: [  334.600970] RIP  [<ffffffff812da10e>] acpi_ns_check_object_type+0x1a/0x1d2
Jun 29 13:50:01 macpro kernel: [  334.601559]  RSP <ffff88036a75bb58>
Jun 29 13:50:01 macpro kernel: [  334.602130] CR2: 0000000000000000
Jun 29 13:50:01 macpro kernel: [  334.602838] ---[ end trace 217f289557e3f0cd ]---
Comment 1 Vlastimil Babka 2012-07-03 09:11:51 UTC
Created attachment 74631
Patch that removes the null pointer dereference

As said in the original comment, this removes the null pointer problem, but debug output shows the repaired data is still malformed.
Comment 2 Vlastimil Babka 2012-07-03 09:13:05 UTC
Created attachment 74641
acpidump from the affected machine
Comment 3 Robert Moore 2012-07-03 16:18:50 UTC
The \_PR_.CPU0._PSD method is indeed ill-formed in the original ASL/AML. And, there is a problem with the ACPICA attempt to repair the return value.

Original code:

Method (_PSD, 0, NotSerialized)  // _PSD: Power State Dependencies
{
    Return (Package (0x05)
    {
        0x05, 
        0x00, 
        0x00, 
        0xFD, 
        0x08
    })
}

_PSD is defined to be a "package of packages", thus, the correct code should be:

Method (_PSD, 0, NotSerialized)  // _PSD: Power State Dependencies
{
    Return (Package (0x01)
    {
        Package (0x05)
        {
            0x05, 
            0x00, 
            0x00, 
            0xFD, 
            0x08
        }
    })
}
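
To make the expected shape concrete, here is a rough sketch in C of how a consumer might decode one sub-package once the return value is correctly formed. The field names follow the order of a _PSD entry in the ACPI specification as I understand it (NumEntries, Revision, Domain, CoordType, NumProcessors); struct psd_entry and psd_decode() are illustrative names, not part of ACPICA or the Linux kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PSD_FIELDS 5  /* a sub-package is expected to hold five integers */

/* Illustrative view of one _PSD sub-package (names are assumptions). */
struct psd_entry {
    uint64_t num_entries;
    uint64_t revision;
    uint64_t domain;
    uint64_t coord_type;
    uint64_t num_processors;
};

/*
 * Decode the integers of one sub-package into named fields.
 * Returns 0 on success, or -1 if the sub-package does not have exactly
 * five elements or its first element does not say so.
 */
int psd_decode(const uint64_t *vals, size_t count, struct psd_entry *out)
{
    assert(vals != NULL && out != NULL);
    if (count != PSD_FIELDS || vals[0] != PSD_FIELDS)
        return -1;

    out->num_entries    = vals[0];
    out->revision       = vals[1];
    out->domain         = vals[2];
    out->coord_type     = vals[3];
    out->num_processors = vals[4];
    return 0;
}
```

Fed the five values from the listing above (0x05, 0x00, 0x00, 0xFD, 0x08), the decode succeeds. Fed a one-element sub-package, which is what wrapping only the first Integer produces, it is rejected, in line with the "found 1 elements, expected 5" warning.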
Comment 4 Robert Moore 2012-07-03 16:31:23 UTC
The suggested patch does not work correctly, since it generates additional/different warnings and errors, does not return the correct value, and also results in some memory leaks. We will work on fixing the problem.

Under AcpiExec utility (with suggested patch applied):

- ex \_PR_.CPU0._PSD
Executing \_PR_.CPU0._PSD
nsrepair-0817 [02] NsWrapWithPackage     : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
nsrepair-0817 [02] NsWrapWithPackage     : \_PR_.CPU0._PSD: Wrapped Integer with expected Package object
ACPI Warning: For \_PR_.CPU0._PSD: Return Sub-Package[0] is too small - found 1 elements, expected 5 (20120620/nspredef-1021)
Outstanding: 0x6 allocations after execution
Execution of \_PR_.CPU0._PSD returned object 00035DC8 Buflen 20
  [Package] Contains 1 Elements:
    [Integer] = 0000000000000005
- q
00545278 Length 0x0008  utobject-269 [Not a Descriptor]
00545030 Length 0x002C   utcache-422 [Operand]      Package  RefCount 0x0001
005451F8 Length 0x002C   utcache-422 [Operand]      Integer  RefCount 0x0001
00545178 Length 0x002C   utcache-422 [Operand]      Integer  RefCount 0x0001
00545108 Length 0x0018  dsobject-516 [Not a Descriptor]
006543A8 Length 0x002C   utcache-422 [Operand]      Integer  RefCount 0x0001
004E5570 Length 0x002C   utcache-422 [Operand]      Integer  RefCount 0x0001
00654130 Length 0x002C   utcache-422 [Operand]      Package  RefCount 0x0001
ACPI Error: 8(0x8) Outstanding allocations (20120620/uttrack-776)
Comment 5 Lin Ming 2012-07-04 02:05:11 UTC
Created attachment 74791
patch from Bob

Also posted at:
http://marc.info/?l=linux-acpi&m=134136758030378&w=2
Comment 6 Thomas Renninger 2012-07-04 08:05:35 UTC
*** Bug 44181 has been marked as a duplicate of this bug. ***
Comment 7 Thomas Renninger 2012-07-04 08:41:34 UTC
I have added the patch to our latest master (3.5-rc4) and 12.2 kernel branches (3.4).
This will automatically trigger a rebuild and a built kernel package for testing can be retrieved in an hour or so from here:
http://download.opensuse.org/repositories/Kernel:/openSUSE-12.2/standard/x86_64/
or here:
http://download.opensuse.org/repositories/Kernel:/HEAD/standard/x86_64/

Best check for the date of the rpms and double check after downloading via:

rpm -qp --changelog kernel-xy.rpm |less

that this changelog is included:
Date:   Wed Jul 4 10:24:59 2012 +0200

    Fix NULL pointer derference in acpi_ns_check_object_type()
    (kernel bug 44171).
Comment 8 Vlastimil Babka 2012-07-04 09:44:52 UTC
Thanks, patch works fine here.
Comment 9 marc collin 2012-07-04 09:56:31 UTC
Is there a way to get it for a 32-bit kernel?
Comment 10 Thomas Renninger 2012-07-04 11:26:50 UTC
http://download.opensuse.org/repositories/Kernel:/openSUSE-12.2/standard/i686
> But this one (date) does not have the patch yet:
kernel-pae-3.4.4-4.1.i686.rpm                 04-Jul-2012 09:30
Comment 11 Thomas Renninger 2012-07-04 11:53:03 UTC
> get it for 32 bits kernel?
Not sure; one kernel version/flavor or another currently has build issues.
I guess Vlastimil's test is enough to get the patch into mainline soon?
This one should get a:
CC: stable@vger.kernel.org
tag, and distributions should then get it automatically rather quickly.
Comment 12 marc collin 2012-07-05 10:44:41 UTC
The patch is available for the 32-bit kernel and works fine.
Comment 13 Vlastimil Babka 2012-07-08 22:05:55 UTC
(In reply to comment #11)
> I guess Vlastimil's test is enough to see the patch mainline soon?
> This one should get a:
> CC: stable@vger.kernel.org
> tag and distributions should get it automatically rather quickly.

Should I add this CC myself?
Comment 14 Len Brown 2012-07-14 15:40:26 UTC
patch in comment #5 applied for 3.5-rc, cc 3.4-stable
Comment 15 Len Brown 2012-07-25 03:07:49 UTC
shipped in 3.5-rc7
shipped in 3.4.6

closed.


commit 46befd6b38d802dfc5998e7d7938854578b45d9d
Author: Bob Moore <robert.moore@intel.com>
Date:   Wed Jul 4 10:02:32 2012 +0800

    ACPICA: Fix possible fault in return package object repair code
