Bug 77131 - acpi reset register implemented in system memory cannot be mapped from interrupt context
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: x86-64 Linux
Importance: P1 high
Assignee: David Box
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-05-29 19:39 UTC by Randy Wright
Modified: 2014-09-23 02:11 UTC

See Also:
Kernel Version: v3.15-rc7 and earlier
Subsystem:
Regression: No
Bisected commit-id:


Attachments
[PATCH] if appropriate create a virtual mapping for acpi reset register (4.04 KB, patch)
2014-05-29 19:39 UTC, Randy Wright
Use acpi_os_map_generic_address to pre-map the reset register (1.72 KB, patch)
2014-06-03 22:20 UTC, Randy Wright

Description Randy Wright 2014-05-29 19:39:05 UTC
Created attachment 137701
[PATCH] if appropriate create a virtual mapping for acpi reset register

This issue was observed on a prototype system whose ACPI reset register is implemented in system memory, i.e. ACPI_ADR_SPACE_SYSTEM_MEMORY.  When reset is invoked from interrupt context, the kernel hits a BUG and, depending on the exact kernel version and the values of certain sysctl tunables, may enter a loop that recursively attempts the reset.

Steps to Reproduce:
Here is an example triggered by NMI. The same may be observed by any call to panic or native_machine_emergency_restart with tunables properly conditioned:

# turn off kdump so reset will occur
service boot.kdump stop

# change tunables to make nmi cause panic and panic cause reset
sysctl kernel.panic_on_io_nmi=1 kernel.panic=10 kernel.printk=9

# unload any module that might intercept nmi, for example
modprobe -r hpwdt

# now externally trigger an nmi

Actual Results:
# the interesting part begins after the line containing ACPI MEMORY or I/O RESET_REG.

[  296.236977] NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0.
[  296.244170] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E 3.15.0-rc7 #103
[  296.252405] Hardware name: HP Prototype
[  296.262267] task: ffffffff81a13460 ti: ffffffff81a00000 task.ti: ffffffff81a00000
[  296.270502] RIP: 0010:[<ffffffff812ea0eb>]  [<ffffffff812ea0eb>] intel_idle+0xbb/0x140
[  296.279234] RSP: 0018:ffffffff81a01e28  EFLAGS: 00000046
[  296.285078] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[  296.292928] RDX: 0000000000000000 RSI: ffffffff81a01fd8 RDI: 0000000000000000
[  296.300778] RBP: ffffffff81a01e58 R08: 000000000002c641 R09: 00000000001c34d1
[  296.308630] R10: 000000451062b316 R11: 0000000000004506 R12: 0000000000000004
[  296.316480] R13: 0000000000000020 R14: 0000000000000004 R15: 0000000000000004
[  296.324332] FS:  0000000000000000(0000) GS:ffff88207fa00000(0000) knlGS:0000000000000000
[  296.333233] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  296.339554] CR2: ffffffffff600400 CR3: 0000000001a0e000 CR4: 00000000001407f0
[  296.347406] Stack:
[  296.349615]  ffffffff81a01e58 00000000810b6eaf ffffffff81a01ea8 ffffe8c080201f00
[  296.357791]  ffffffff81aae820 00000044aa9e8f4b ffffffff81a01ea8 ffffffff813e2252
[  296.365967]  0000000000000001 000000002a29ff70 ffffffff81a01fd8 ffffe8c080201f00
[  296.374143] Call Trace:
[  296.376837]  [<ffffffff813e2252>] cpuidle_enter_state+0x42/0xd0
[  296.383350]  [<ffffffff813e22f2>] cpuidle_enter+0x12/0x20
[  296.389298]  [<ffffffff81096641>] cpuidle_idle_call+0x101/0x1c0
[  296.395813]  [<ffffffff810968d5>] cpu_idle_loop+0x185/0x1a0
[  296.401945]  [<ffffffff8109690e>] cpu_startup_entry+0x1e/0x20
[  296.408272]  [<ffffffff814c4412>] rest_init+0x72/0x80
[  296.413839]  [<ffffffff81b211ed>] start_kernel+0x35d/0x364
[  296.419876]  [<ffffffff81b20cae>] ? repair_env_string+0x5b/0x5b
[  296.426394]  [<ffffffff814ca5f6>] ? memblock_reserve+0x49/0x4e
[  296.432815]  [<ffffffff81b205ad>] x86_64_start_reservations+0x2a/0x2c
[  296.439905]  [<ffffffff81b206f0>] x86_64_start_kernel+0x141/0x148
[  296.446609] Code: 31 d2 65 48 8b 34 25 40 b8 00 00 48 89 d1 48 8d 86 38 e0 ff ff 0f 01 c8 48 8b 86 38 e0 ff ff a8 08 75 08 b1 01 4c 89 e8 0f 01 c9 <65> 48 8b 04 25 40 b8 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 
[  296.467933] Kernel panic - not syncing: NMI IOCK error: Not continuing
[  296.475116] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E 3.15.0-rc7 #103
[  296.483350] Hardware name: HP Prototype
[  296.493208]  0000000000000071 ffff88207fa07df8 ffffffff814cbd14 0000000000000003
[  296.501385]  ffffffff817c3708 ffff88207fa07e78 ffffffff814cba9b ffffffff00000008
[  296.509561]  ffff88207fa07e88 ffff88207fa07e28 00000000000013a9 0000000000000000
[  296.517736] Call Trace:
[  296.520423]  <NMI>  [<ffffffff814cbd14>] dump_stack+0x49/0x5d
[  296.526768]  [<ffffffff814cba9b>] panic+0xb6/0x1e5
[  296.532045]  [<ffffffff814d10cd>] io_check_error+0x9d/0xa0
[  296.538083]  [<ffffffff814d119c>] default_do_nmi+0xcc/0x200
[  296.544215]  [<ffffffff814d1360>] do_nmi+0x90/0xe0
[  296.549488]  [<ffffffff814d0667>] end_repeat_nmi+0x1e/0x2e
[  296.555527]  [<ffffffff812ea0eb>] ? intel_idle+0xbb/0x140
[  296.561467]  [<ffffffff812ea0eb>] ? intel_idle+0xbb/0x140
[  296.567409]  [<ffffffff812ea0eb>] ? intel_idle+0xbb/0x140
[  296.573347]  <<EOE>>  [<ffffffff813e2252>] cpuidle_enter_state+0x42/0xd0
[  296.580740]  [<ffffffff813e22f2>] cpuidle_enter+0x12/0x20
[  296.586679]  [<ffffffff81096641>] cpuidle_idle_call+0x101/0x1c0
[  296.593192]  [<ffffffff810968d5>] cpu_idle_loop+0x185/0x1a0
[  296.599324]  [<ffffffff8109690e>] cpu_startup_entry+0x1e/0x20
[  296.605646]  [<ffffffff814c4412>] rest_init+0x72/0x80
[  296.611205]  [<ffffffff81b211ed>] start_kernel+0x35d/0x364
[  296.617241]  [<ffffffff81b20cae>] ? repair_env_string+0x5b/0x5b
[  296.623757]  [<ffffffff814ca5f6>] ? memblock_reserve+0x49/0x4e
[  296.630177]  [<ffffffff81b205ad>] x86_64_start_reservations+0x2a/0x2c
[  296.637264]  [<ffffffff81b206f0>] x86_64_start_kernel+0x141/0x148
[  296.645127] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[  296.664050] Rebooting in 10 seconds..
[  306.601201] ACPI MEMORY or I/O RESET_REG.
[  306.605859] ------------[ cut here ]------------
[  306.610940] kernel BUG at mm/vmalloc.c:1319!
[  306.615637] invalid opcode: 0000 [#1] SMP 
[  306.620160] Modules linked in: mptctl(E) mptbase(E) af_packet(E) cpufreq_conservative(E) cpufreq_userspace(E) cpufreq_powersave(E) fuse(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) loop(E) ipv6(E) iTCO_wdt(E) iTCO_vendor_support(E) ixgbe(E) ptp(E) pps_core(E) lpc_ich(E) ioatdma(E) mdio(E) ehci_pci(E) dca(E) mfd_core(E) hpilo(E) ses(E) sg(E) pcspkr(E) enclosure(E) ipmi_si(E) acpi_cpufreq(E) ipmi_msghandler(E) rtc_cmos(E) button(E) ext3(E) jbd(E) mbcache(E) dm_service_time(E) dm_queue_length(E) dm_round_robin(E) dm_multipath(E) mgag200(E) ttm(E) drm_kms_helper(E) drm(E) i2c_algo_bit(E) sysimgblt(E) sysfillrect(E) i2c_core(E) syscopyarea(E) sd_mod(E) crc_t10dif(E) crct10dif_common(E) uhci_hcd(E) ehci_hcd(E) qla2xxx(E) scsi_transport_fc(E) scsi_tgt(E) usbcore(E) usb_common(E) processor(E) thermal_sys(E) hwmon(E) scsi_dh_emc(E) scsi_dh_rdac(E) scsi_dh_hp_sw(E) scsi_dh_alua(E) scsi_dh(E) scsi_mod(E) dm_snapshot(E) dm_bufio(E) dm_mod(E) [last unloaded: hpwdt]
[  306.714102] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E 3.15.0-rc7 #103
[  306.722336] Hardware name: HP Prototype
[  306.732194] task: ffffffff81a13460 ti: ffffffff81a00000 task.ti: ffffffff81a00000
[  306.740428] RIP: 0010:[<ffffffff81155b21>]  [<ffffffff81155b21>] __get_vm_area_node+0x141/0x150
[  306.750020] RSP: 0018:ffff88207fa07b88  EFLAGS: 00010006
[  306.755864] RAX: 0000000080110000 RBX: 0000000000000010 RCX: ffffc90000000000
[  306.763714] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000001000
[  306.771565] RBP: ffff88207fa07bd8 R08: ffffe8ffffffffff R09: 00000000000000d0
[  306.779417] R10: ffff88207150de00 R11: 0000000000000001 R12: 0000000000000001
[  306.787267] R13: 0000000000000001 R14: 00000000ffffffff R15: 00000fe0e2140000
[  306.795119] FS:  0000000000000000(0000) GS:ffff88207fa00000(0000) knlGS:0000000000000000
[  306.804022] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  306.810342] CR2: ffffffffff600400 CR3: 0000000001a0e000 CR4: 00000000001407f0
[  306.818193] Stack:
[  306.820402]  ffff88207fa07c40 0000000000000000 ffffe8ffffffffff ffffc90000000000
[  306.828580]  ffff88207fa07bf8 0000000000000010 00000fe0e2140028 0000000000001000
[  306.836758]  0000000000000001 00000fe0e2140000 ffff88207fa07bf8 ffffffff81155b6b
[  306.844934] Call Trace:
[  306.847621]  <NMI> 
[  306.849734]  [<ffffffff81155b6b>] get_vm_area_caller+0x3b/0x40
[  306.856374]  [<ffffffff812eadb6>] ? acpi_os_write_memory+0x73/0xcd
[  306.863180]  [<ffffffff81041483>] __ioremap_caller+0x263/0x3a0
[  306.869601]  [<ffffffff8109b1da>] ? up+0x2a/0x50
[  306.874681]  [<ffffffff812eadb6>] ? acpi_os_write_memory+0x73/0xcd
[  306.881482]  [<ffffffff810415ef>] ioremap_cache+0xf/0x20
[  306.887329]  [<ffffffff812eadb6>] acpi_os_write_memory+0x73/0xcd
[  306.893942]  [<ffffffff81313e99>] acpi_hw_write+0x47/0xd1
[  306.899886]  [<ffffffff814cbc9b>] ? printk+0x48/0x4a
[  306.905350]  [<ffffffff81315523>] acpi_reset+0x93/0xbc
[  306.911005]  [<ffffffff812ebe30>] acpi_reboot+0xb8/0xc0
[  306.916764]  [<ffffffff8102ed8a>] native_machine_emergency_restart+0x19a/0x220
[  306.924716]  [<ffffffff8102e9e4>] machine_emergency_restart+0x14/0x20
[  306.931809]  [<ffffffff8107ad53>] emergency_restart+0x13/0x20
[  306.938133]  [<ffffffff814cbb6e>] panic+0x189/0x1e5
[  306.943500]  [<ffffffff814d10cd>] io_check_error+0x9d/0xa0
[  306.949536]  [<ffffffff814d119c>] default_do_nmi+0xcc/0x200
[  306.955669]  [<ffffffff814d1360>] do_nmi+0x90/0xe0
[  306.960942]  [<ffffffff814d0667>] end_repeat_nmi+0x1e/0x2e
[  306.966979]  [<ffffffff812ea0eb>] ? intel_idle+0xbb/0x140
[  306.972920]  [<ffffffff812ea0eb>] ? intel_idle+0xbb/0x140
[  306.978862]  [<ffffffff812ea0eb>] ? intel_idle+0xbb/0x140
[  306.984802]  <<EOE>> 
[  306.987108]  [<ffffffff813e2252>] cpuidle_enter_state+0x42/0xd0
[  306.993836]  [<ffffffff813e22f2>] cpuidle_enter+0x12/0x20
[  306.999778]  [<ffffffff81096641>] cpuidle_idle_call+0x101/0x1c0
[  307.006291]  [<ffffffff810968d5>] cpu_idle_loop+0x185/0x1a0
[  307.012423]  [<ffffffff8109690e>] cpu_startup_entry+0x1e/0x20
[  307.018746]  [<ffffffff814c4412>] rest_init+0x72/0x80
[  307.024307]  [<ffffffff81b211ed>] start_kernel+0x35d/0x364
[  307.030344]  [<ffffffff81b20cae>] ? repair_env_string+0x5b/0x5b
[  307.036860]  [<ffffffff814ca5f6>] ? memblock_reserve+0x49/0x4e
[  307.043281]  [<ffffffff81b205ad>] x86_64_start_reservations+0x2a/0x2c
[  307.050370]  [<ffffffff81b206f0>] x86_64_start_kernel+0x141/0x148
[  307.057073] Code: 05 7c 09 cb 00 01 48 89 d8 4c 8b 65 e0 48 8b 5d d8 4c 8b 6d e8 4c 8b 75 f0 4c 8b 7d f8 c9 c3 48 89 df e8 63 8f 01 00 31 db eb db <0f> 0b eb fe 66 66 2e 0f 1f 84 00 00 00 00 00 55 41 b9 ff ff ff 
[  307.078390] RIP  [<ffffffff81155b21>] __get_vm_area_node+0x141/0x150
[  307.085393]  RSP <ffff88207fa07b88>
[  307.089229] ---[ end trace 9e8a787fe880d053 ]---

I reported this initially in the acpica.org tracker at 
    https://bugs.acpica.org/show_bug.cgi?id=1089
The response to that initial report asked me to file the issue in the kernel.org tracker.

I am attaching a patch that I constructed to avoid this issue by creating a virtual mapping for the reset register during ACPI initialization.  This makes acpi_reset safe to call from interrupt context.
Comment 1 David Box 2014-05-29 22:09:51 UTC
Hi Randy,

Your patch includes a call to arch_reserve_mem_area() which can't be used in acpica code since it's Linux specific. All code in drivers/acpi/acpica is OS independent, taken mostly as is into Linux and other OSes.

Are you sure you need to make this call? Is the reset register address not in an area of memory that's already reserved?

Dave
Comment 2 Randy Wright 2014-06-03 15:21:29 UTC
Hi Dave, 

I will research your question carefully. My current belief is that the region containing the reset register should already be reserved, and that arch_reserve_mem_area would simply categorize it more precisely.

In the particular configuration of the prototype I am using, the reset register is on a page starting at 0xfe062141000.  The EFI memmap shows this as type 11 EFI_MEMORY_MAPPED_IO, and it appears in the initial E820 map as reserved:

[    0.000000] BIOS-e820: [mem 0x00000fe060000000-0x00000fe073ffffff] reserved

After adding the arch_reserve_mem_area call, it is more accurately recorded in the map as ACPI data but it does seem that "reserved" really should have been okay:

[    0.000000] modified: [mem 0x00000fe060000000-0x00000fe062140027] reserved
[    0.000000] modified: [mem 0x00000fe062140028-0x00000fe06214002f] ACPI data
[    0.000000] modified: [mem 0x00000fe062140030-0x00000fe073ffffff] reserved
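
The three "modified" lines above are what results from carving an 8-byte window out of the single reserved range. As a rough illustration of that arithmetic only, here is a minimal user-space sketch. The names `mem_range` and `split_range` are hypothetical and are not kernel APIs, and the 8-byte length is inferred from the bounds printed in the log, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive physical address range with a type label, like one e820 line. */
struct mem_range {
    uint64_t start;
    uint64_t end;          /* inclusive, as printed in the boot log */
    const char *type;
};

/*
 * Carve [sub_start, sub_start + sub_len - 1] out of `whole` and relabel it.
 * Writes up to three ranges to `out` and returns how many were written.
 * Hypothetical helper for illustration; not a kernel function.
 */
static int split_range(struct mem_range whole, uint64_t sub_start,
                       uint64_t sub_len, const char *sub_type,
                       struct mem_range out[3])
{
    uint64_t sub_end = sub_start + sub_len - 1;
    int n = 0;

    /* The carved-out window must be non-empty and lie inside `whole`. */
    assert(sub_len > 0 && sub_end >= sub_start);
    assert(sub_start >= whole.start && sub_end <= whole.end);

    if (sub_start > whole.start)   /* part left below the window */
        out[n++] = (struct mem_range){ whole.start, sub_start - 1, whole.type };
    out[n++] = (struct mem_range){ sub_start, sub_end, sub_type };
    if (sub_end < whole.end)       /* part left above the window */
        out[n++] = (struct mem_range){ sub_end + 1, whole.end, whole.type };
    return n;
}
```

Calling `split_range` with the reserved range 0xfe060000000-0xfe073ffffff, a window start of 0xfe062140028 and a length of 8 yields three ranges with the same bounds as the log lines above.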

The check_early_ioremap_leak diagnostic appeared when I made two changes at once to an initial version of the same patch idea.  The initial version of the patch ran later, called from acpi_init.  I moved it to be called earlier, from acpi_tb_setup_fadt_registers, and at the same time brought the patch forward from an earlier stable kernel into the latest 3.15 rc kernel.  So perhaps in moving it I did something else that triggered the diagnostic.  I will research more today.
Comment 3 Randy Wright 2014-06-03 22:20:52 UTC
Created attachment 138021
Use acpi_os_map_generic_address to pre-map the reset register

As I further researched the necessity of using arch_reserve_mem_area, I became convinced there was no point in trying to do the allocation when the early ioremap allocator is in use.  By performing the mapping slightly later in the boot sequence, in acpi_os_initialize, the early allocator is avoided, so of course there is no early ioremap leak diagnostic.  As an additional benefit, the patch logically fits very well with the existing calls to acpi_os_map_generic_address already made in acpi_os_initialize.
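
The idea of mapping once at initialization and only looking the mapping up later can be modeled in user space. The sketch below is not the attached patch and does not use kernel APIs: `premap_register`, `premap_lookup`, `reg_write8` and the fixed-size table are hypothetical names invented for illustration, and an ordinary byte buffer stands in for the ioremapped register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PREMAPPED 4

/* One physical window mapped ahead of time (hypothetical model type). */
struct premapped {
    uint64_t phys;
    size_t   len;
    uint8_t *virt;   /* stands in for a pointer returned by a real mapping */
};

static struct premapped premap_table[MAX_PREMAPPED];
static int premap_count;

/* Model of "map once during init", where creating a mapping is allowed. */
static bool premap_register(uint64_t phys, size_t len, uint8_t *backing)
{
    assert(backing != NULL);
    if (premap_count == MAX_PREMAPPED || len == 0)
        return false;
    premap_table[premap_count++] = (struct premapped){ phys, len, backing };
    return true;
}

/* Lookup only: nothing is allocated, so it suits the modeled atomic path. */
static uint8_t *premap_lookup(uint64_t phys, size_t len)
{
    for (int i = 0; i < premap_count; i++) {
        const struct premapped *m = &premap_table[i];

        if (phys >= m->phys && len <= m->len &&
            phys - m->phys <= m->len - len)
            return m->virt + (phys - m->phys);
    }
    return NULL;
}

/*
 * Model of the register write. A false return stands for the case where
 * a fresh mapping would be needed, which is the path that hit the BUG in
 * the trace when taken from interrupt context.
 */
static bool reg_write8(uint64_t phys, uint8_t value)
{
    uint8_t *p = premap_lookup(phys, 1);

    if (!p)
        return false;
    *p = value;
    return true;
}
```

In this model, a write to an address that was never registered fails, while the same write succeeds once `premap_register` has been called for that window during setup. Any register value passed to `reg_write8` here is arbitrary and not the real reset value.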
Comment 4 Rafael J. Wysocki 2014-06-05 21:22:22 UTC
Patch: https://patchwork.kernel.org/patch/4295151/
