Bug 87971

Summary: PowerEdge R220 boot fail in 32-bit OS unless acpi=off
Product: ACPI Reporter: sialnije
Component: ACPICA-Core Assignee: Lv Zheng (lv.zheng)
Status: CLOSED CODE_FIX    
Severity: high CC: paulepanter, rui.zhang
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 3.13.0 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: acpidump, iomem, fwts report
64-bit acpidump
debug patch output
output of objdump -S exregion.o
added printk in acpi_ex_system_memory_space_handler()
terminal output with printk added in acpi_ex_system_memory_space_handler
added more debug printk
[PATCH] ACPICA: Utilities: split IO address types from data type models.
ACPICA: Utilities: split IO address types from data type models.
ACPICA: Utilities: split IO address types from data type
[DBG PATCH 1/8] Divergences: reduce index divergences for 32-bit PAE fix.
[DBG PATCH 2/8] ACPICA: Tables: Change acpi_find_root_pointer() to use acpi_physical_address.
[DBG PATCH 3/8] ACPICA: Unix: Cleanup to use ACPI_TO_INTEGER() to calc page offset.
[DBG PATCH 4/8] ACPICA: Executer: Cleanup to remove an unnecessary conversion.
[DBG PATCH 5/8] ACPICA: Utilities: Cleanup to enforce ACPI_PHYSADDR_TO_PTR()/ACPI_PTR_TO_PHYSADDR().
[DBG PATCH 6/8] ACPICA: Utilities: Cleanup to convert physical address printing formats.
[DBG PATCH 7/8] ACPICA: Utilities: Cleanup to remove useless ACPI_PRINTF/FORMAT_xxx helpers.
[DBG PATCH 8/8] ACPICA: Utilities: split IO address types from data type models.
[DBG 3.16.7 PATCH 7/8] ACPICA: Utilities: Cleanup to remove useless ACPI_PRINTF/FORMAT_xxx helpers.

Description sialnije 2014-11-10 04:45:36 UTC
Created attachment 157101 [details]
acpidump, iomem,  fwts report

R220 is a relatively new platform. 64-bit OSes work fine on this box.
Unfortunately our product line is locked to 32-bit, and a 32-bit OS just
can't boot on this box. Kernel versions I have tried: 3.0.23, 3.10.52,
3.13.0. All recent versions of Debian and CentOS running 2.6.x also
crash the same way. Besides acpi=off, I have tried:
apci_osi=! acpi_osi=Linux
pci=noacpi
Didn't help.

Currently I have Ubuntu 14.04 desktop installed on this box. When I
tried to run acpidump version 20140214, it gave this error:
--------------------------------------------
Cannot open directory - /sys/firmware/acpi/tables/dynamic
Could not get ACPI tables, AE_NOT_FOUND
--------------------------------------------

The attached acpidump was collected with acpidump version 20100513-3
when Debian was installed with kernel version 3.10.52. I hope the output
is still valid. I have also included the output from fwts, in case it is helpful.

The backtrace captured on serial port:
[    0.125965] BUG: unable to handle kernel paging request at 6241fff4
[    0.132979] IP: [<c138576e>] acpi_ex_system_memory_space_handler+0x19a/0x1ed
[    0.140859] *pdpt = 0000000000000000 *pde = 0000000000000000
[    0.147287] Oops: 0000 [#1] SMP 
[    0.150905] Modules linked in: 
[    0.154319] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-32-generic #57-u
[    0.162763] Hardware name: Dell Inc. PowerEdge R220/081N4V, BIOS 1.1.4 05/064
[    0.171109] task: f74d0000 ti: f749e000 task.ti: f749e000
[    0.177131] EIP: 0060:[<c138576e>] EFLAGS: 00010246 CPU: 0
[    0.183249] EIP is at acpi_ex_system_memory_space_handler+0x19a/0x1ed
[    0.190433] EAX: 6241fff4 EBX: 00000020 ECX: 0000000c EDX: f74024a0
[    0.197422] ESI: 00000000 EDI: f749fbc8 EBP: f749fae4 ESP: f749faa8
[    0.204411]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[    0.210431] CR0: 80050033 CR2: 6241fff4 CR3: 01a99000 CR4: 001407f0
[    0.217420] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    0.224409] DR6: fffe0ff0 DR7: 00000400
[    0.228684] Stack:
[    0.230924]  f74d0000 f749fab4 c136e190 c137fb6b 00000014 c137fb6b 00000000 0
[    0.239635]  00000000 0000000c 00000000 fffffff4 f7439060 f748bb70 c13855d4 4
[    0.248342]  c137f6c2 f749fbc8 00000000 f7753140 00000000 ffffffff f749fc2c 0
[    0.257052] Call Trace:
[    0.259780]  [<c136e190>] ? acpi_os_signal_semaphore+0x1f/0x2c
[    0.266292]  [<c137fb6b>] ? acpi_ev_system_memory_region_setup+0x60/0x7e
[    0.273773]  [<c137fb6b>] ? acpi_ev_system_memory_region_setup+0x60/0x7e
[    0.281247]  [<c13855d4>] ? acpi_ex_do_logical_op+0x15f/0x15f
[    0.287656]  [<c137f6c2>] acpi_ev_address_space_dispatch+0x19c/0x1ee
[    0.294745]  [<c1382721>] acpi_ex_access_region+0x1e0/0x269
[    0.300960]  [<c1382b7a>] acpi_ex_field_datum_io+0x108/0x1ce
[    0.307273]  [<c1382cbe>] acpi_ex_write_with_update_rule+0x7e/0xf3
[    0.314167]  [<c1382721>] ? acpi_ex_access_region+0x1e0/0x269
[    0.320576]  [<c1382a5a>] acpi_ex_insert_into_field+0x2b0/0x2c8
[    0.327180]  [<c138251e>] acpi_ex_write_data_to_field+0x18e/0x1b1
[    0.333978]  [<c1393842>] ? acpi_ut_allocate_object_desc_dbg+0x3a/0x66
[    0.341263]  [<c138648f>] acpi_ex_store_object_to_node+0x9f/0xb2
[    0.347962]  [<c138655c>] acpi_ex_store+0xba/0x216
[    0.353304]  [<c1383c29>] acpi_ex_opcode_1A_1T_1R+0x426/0x53a
[    0.359712]  [<c13860d3>] ? acpi_ex_resolve_operands+0x1db/0x4c2
[    0.366414]  [<c137c7f5>] acpi_ds_exec_end_op+0xc4/0x3a5
[    0.372340]  [<c138cc62>] ? acpi_ps_get_next_arg+0x300/0x35f
[    0.378651]  [<c138de21>] ? acpi_ps_append_arg+0x1e/0x7f
[    0.384577]  [<c138d158>] acpi_ps_parse_loop+0x497/0x4d3
[    0.390501]  [<c137adfc>] ? acpi_ds_call_control_method+0xe8/0x154
[    0.397394]  [<c138daaf>] acpi_ps_parse_aml+0x8a/0x23c
[    0.403125]  [<c138e1fc>] acpi_ps_execute_method+0x19e/0x23c
[    0.409437]  [<c13892b2>] acpi_ns_evaluate+0x1b8/0x243
[    0.415169]  [<c13920b2>] ? acpi_ut_remove_reference+0x2a/0x2d
[    0.421675]  [<c139294c>] ? acpi_ut_execute_STA+0x46/0x4e
[    0.427697]  [<c13895fb>] acpi_ns_init_one_device+0x7a/0x9f
[    0.433914]  [<c138b74f>] acpi_ns_walk_namespace+0xb9/0x16b
[    0.440130]  [<c1389140>] ? acpi_ns_evaluate+0x46/0x243
[    0.445959]  [<c13897ce>] acpi_ns_initialize_devices+0xed/0x13b
[    0.452562]  [<c1389581>] ? acpi_ns_init_one_object+0xf4/0xf4
[    0.458974]  [<c19f3545>] ? acpi_sleep_proc_init+0x2e/0x2e
[    0.465094]  [<c19f4c8b>] acpi_initialize_objects+0x34/0x47
[    0.471311]  [<c19f35cd>] acpi_init+0x88/0x257
[    0.476266]  [<c140c9c4>] ? __class_create+0x44/0x70
[    0.481799]  [<c19f1fd5>] ? rio_bus_init+0x2a/0x2a
[    0.487143]  [<c19f2041>] ? fbmem_init+0x6c/0x96
[    0.492292]  [<c1002122>] do_one_initcall+0xd2/0x190
[    0.497831]  [<c11d183b>] ? __proc_create+0x9b/0xd0
[    0.503271]  [<c1073168>] ? parameq+0x18/0x70
[    0.508135]  [<c19b4500>] ? do_early_param+0x54/0x78
[    0.513671]  [<c10733f9>] ? parse_args+0x239/0x420
[    0.519014]  [<c1090eff>] ? __wake_up+0x3f/0x50
[    0.524066]  [<c19b4c46>] kernel_init_freeable+0x14f/0x1e6
[    0.530184]  [<c19b4524>] ? do_early_param+0x78/0x78
[    0.535722]  [<c1647060>] kernel_init+0x10/0x100
[    0.540871]  [<c165ef37>] ret_from_kernel_thread+0x1b/0x28
[    0.546991]  [<c1647050>] ? rest_init+0x70/0x70
[    0.552045]  [<c13c3e73>] ? tty_devnum+0x3/0x20
[    0.557097] Code: 00 00 74 22 77 0a 83 fb 08 75 69 0f b6 00 eb 1d 83 fb 20 73
[    0.578613] EIP: [<c138576e>] acpi_ex_system_memory_space_handler+0x19a/0x1e8
[    0.588627] CR2: 000000006241fff4
[    0.592324] ---[ end trace b9a45f0f2891d2c1 ]---
[    0.597484] Kernel panic - not syncing: Attempted to kill init! exitcode=0x09
[    0.597484]
Comment 1 Lv Zheng 2014-12-08 05:58:09 UTC
(In reply to sialnije from comment #0)
> Created attachment 157101 [details]
> acpidump, iomem,  fwts report
> 
> R220 is a relatively new platform. 64-bit OS'es work fine with this box.
> Unfortunately our product line is locked to 32-bit and 32 bit OS just
> can't boot in this box. Kernel versions I have tried: 3.0.23, 3.10.52,
> 3.13.0. All recent versions of Debian and CentOS running 2.6.x also
> crash the same way. Besides acpi=off, I have tried:
> apci_osi=! acpi_osi=Linux
> pci=noacpi
> Didn't help.
> 
> Currently I have installed Ubuntu 14.0.4 desktop in this box. When I 
> tried to run acpidump version 20140214, it gave this error:
> --------------------------------------------
> Cannot open directory - /sys/firmware/acpi/tables/dynamic
> Could not get ACPI tables, AE_NOT_FOUND
> --------------------------------------------

The folder is not there because ACPI is disabled by acpi=off.
This isn't a critical error, as acpidump can still dump some tables from /dev/mem.
Did you obtain the acpidump output?

But IMO, since your report is related to a 32-bit/64-bit compliance issue,
you should boot into a working 64-bit ACPI-enabled kernel and run acpidump there.
Please do this and upload the output.

> The acpidump attached was collected with acpidump version 20100513-3
> when Debian was installed with kernel version 3.10.52. I hope the output
> is still valid.

Its output is confusing.
We cannot tell whether a table has been customized or not,
which prevents us from root-causing issues.
That version also has bugs that prevent some tables from being dumped.

I can check it later.

> Also include output from fwts, in case it may be helpful.
> 
> The backtrace captured on serial port:

This sounds like a duplicate bug.
Could you run the test mentioned in bug 79501?

Thanks and best regards
-Lv
Comment 2 sialnije 2014-12-11 21:27:47 UTC
Created attachment 160411 [details]
64-bit acpidump

Installed Ubuntu 14.04 64-bit server and ran:
$ acpidump > acpidump.out

Still working on building a 32-bit kernel to collect debug info.
Comment 3 sialnije 2014-12-15 17:50:12 UTC
(In reply to Lv Zheng from comment #1)

> 
> The folder is not there as ACPI is disabled due to acpi=off.
> This isn't a critical error.
> As acpidump still can dump some tables from /dev/mem.
> Did you obtain the acpidump output?

Tried this command:
 acpidump > /tmp/acpidump.out

It was done on a 32-bit Ubuntu Desktop 14.04.1 with acpi=off.
The file /tmp/acpidump.out is just an empty file.
Not sure if "dump some tables from /dev/mem" meant I have to use the
"-a" switch to specify table addresses. If it is indeed what you meant,
I would have to use info from the acpidump obtained from 64-bit.
Are the table addresses for 64-bit and 32-bit the same?
Comment 4 sialnije 2014-12-15 18:03:26 UTC
Created attachment 160711 [details]
debug patch output

Added the following line to osl.c as indicated in bug 79501 attachment 150891 [details]:

printk(KERN_ERR PREFIX "acpi_map: phys=%#018Lx size=%#018Lx virt=%#018Lx\n", pg_off, pg_sz, virt);

Then built and installed the patched kernel and booted the system without
the acpi=off option. The printk statement does not show up on the serial terminal,
only 32 lines of header followed by the kernel oops and the backtrace.
Maybe acpi_os_map_memory() never gets invoked?!?

NB: used the kernel option
earlyprintk=serial,ttyS0,115200

Without the earlyprintk option I would only get the backtrace. The 32 lines
of header would not show.
Comment 5 Lv Zheng 2014-12-22 06:02:52 UTC
(In reply to sialnije from comment #3)
> (In reply to Lv Zheng from comment #1)
> 
> > 
> > The folder is not there as ACPI is disabled due to acpi=off.
> > This isn't a critical error.
> > As acpidump still can dump some tables from /dev/mem.
> > Did you obtain the acpidump output?
> 
> Tried this command:
>  acpidump > /tmp/acpidump.out
> 
> It was done on a 32-bit Ubuntu Desktop 14.04.1 with acpi=off.
> The file /tmp/acpidump.out is just an empty file.
> Not sure if "dump some tables from /dev/mem" meant I have to use the
> "-a" switch to specify table addresses. If it is indeed what you meant,
> I would have to use info from the acpidump obtained from 64-bit.
> Are the table addresses for 64-bit and 32-bit the same?

This sounds like a bug in acpidump.
The RSDP value comes from EFI (/sys/firmware/efi/systab), not from the ACPI subsystem.

If you want to try further, it is not "-a", it's "-r" that you can use to specify an RSDP address without using "/sys/firmware/efi/systab".
According to your 64-bit outputs, you can try:
acpidump -r 0x00000000000FE020

Thanks and best regards
-Lv
Comment 6 Lv Zheng 2014-12-22 07:11:18 UTC
(In reply to sialnije from comment #4)
> Created attachment 160711 [details]
> debug patch output
> 
> Added the following line to osl.c as indicated in bug 79501 attachment
> 150891 [details]:
> 
> printk(KERN_ERR PREFIX "acpi_map: phys=%#018Lx size=%#018Lx virt=%#018Lx\n",
> pg_off, pg_sz, virt);
> 
> Then built and installed the patched kernel and boot the system without
> acpi=off option. The printk statement does not show up in the serial
> terminal,
> only 32 lines of header followed by the kernel oops and the backtrace.
> May be acpi_os_map_memory() never get invoked?!?
> 
> NB: used the kernel option
> earlyprintk=serial,ttyS0,115200
> 
> Without the earlyprintk option I would only get the backtrace. The 32 lines
> of header would not show.

It's weird.
The IP is in acpi_ex_system_memory_space_handler().
If acpi_os_map_memory() hasn't been invoked, maybe the bug is in acpi_ut_short_divide().
Could you add some printk() lines before/after acpi_ut_short_divide() to catch it?

Thanks and best regards
-Lv
Comment 7 Lv Zheng 2014-12-22 07:31:09 UTC
Please ignore the previous reply.
You can just upload an output of the following command:
objdump -S drivers/acpi/acpica/exregion.o.

Thanks and best regards
-Lv
Comment 8 sialnije 2014-12-24 00:31:06 UTC
Created attachment 161731 [details]
output of objdump -S exregion.o
Comment 9 Lv Zheng 2014-12-24 01:36:30 UTC
(In reply to sialnije from comment #8)
> Created attachment 161731 [details]
> output of objdump -S exregion.o

It's this ">" marked line:
			*value = (u64)ACPI_GET32(logical_addr_ptr);
>19a:   8b 00                   mov    (%eax),%eax
 19c:	89 07                	mov    %eax,(%edi)
 19e:	c7 47 04 00 00 00 00 	movl   $0x0,0x4(%edi)
 1a5:	eb 3c                	jmp    1e3 <acpi_ex_system_memory_space_handler+0x1e3>
		}
		break;

It seems that accessing the mapped logical address triggers the page fault.
It sounds like we've mapped a zapped 64-bit address and are trying to access the wrongly mapped logical address.
IMO, you should have seen acpi_os_map_memory() logs before this.
We need to know which mapped physical address triggered this error.
I'll check later why it is not logged.

Thanks and best regards
-Lv
Comment 10 sialnije 2014-12-30 17:38:43 UTC
Created attachment 162131 [details]
added printk in acpi_ex_system_memory_space_handler()

Added two printk statements in exregion.c
acpi_ex_system_memory_space_handler() to try to flush out why
acpi_os_map_memory() is not invoked. See lines 196 and 206 in
the attachment.
With this change, the debug print you added in acpi_os_map_memory()
shows up a few times. Unfortunately, when the bad address 6241fff4
shows up, none of the printk statements are hit.
Comment 11 sialnije 2014-12-30 17:42:45 UTC
Created attachment 162141 [details]
terminal output with printk added in acpi_ex_system_memory_space_handler
Comment 12 sialnije 2014-12-30 23:32:43 UTC
Created attachment 162151 [details]
added more debug printk

Added more printk statements in acpi_os_map_memory() and
acpi_ex_system_memory_space_handler().

When the bad address appeared, the following code in acpi_os_map_memory()
decided mapping already exists:

	/* Check if there's a suitable mapping already. */
	map = acpi_map_lookup(phys, size);
	if (map) {
		map->refcount++;
		goto out;
	}

This was the first and only time acpi_map_lookup() returned non null.
Comment 13 sialnije 2014-12-31 21:48:52 UTC
It seems acpi_map_lookup() has an integer overflow problem in 32-bit builds.
The code is this:
--------------------------------------------
if (map->phys <= phys  &&
    phys + size <= map->phys + map->size)
  return map;
---------------------------------------------

The address that caused the crash: phys=0xFFFFFFF4, size=0xC
0xFFFFFFF4 + 0xC wraps around to zero in 32-bit arithmetic.

I think this function has to compute the two sums as u64 before comparing them.
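
The wraparound can be reproduced in isolation. The following is a minimal sketch, not the kernel's code: struct sketch_map, covers_u32() and covers_u64() are hypothetical names, and any existing mapping passed to them is made up for illustration. Only the failing request (phys=0xFFFFFFF4, size=0xC) comes from this report.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an existing ACPI memory mapping (32-bit fields). */
struct sketch_map {
	uint32_t phys;
	uint32_t size;
};

/* Same comparison as in the snippet above, evaluated in 32-bit arithmetic. */
static bool covers_u32(const struct sketch_map *map, uint32_t phys, uint32_t size)
{
	/* phys + size can wrap past 0xFFFFFFFF and then compares as a tiny value. */
	return map->phys <= phys &&
	       (uint32_t)(phys + size) <= (uint32_t)(map->phys + map->size);
}

/* Same comparison with the operands widened to 64 bits before adding. */
static bool covers_u64(const struct sketch_map *map, uint32_t phys, uint32_t size)
{
	return map->phys <= phys &&
	       (uint64_t)phys + size <= (uint64_t)map->phys + map->size;
}
```

With an assumed existing mapping of { .phys = 0x000F0000, .size = 0x1000 }, covers_u32() reports the request at 0xFFFFFFF4 as already mapped, because the wrapped sum is 0, while covers_u64() rejects it. Note that the widening has to happen before the addition; casting an already wrapped 32-bit sum to u64 would not help.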
Comment 14 Lv Zheng 2015-01-04 02:16:11 UTC
(In reply to sialnije from comment #13)
> Seems function acpi_map_lookup() has a math overflow problem in 32-bit
> builds.
> The code is this:
> --------------------------------------------
> if (map->phys <= phys  &&
>     phys + size <= map->phys + map->size)
>   return map;
> ---------------------------------------------
> 
> The address that cause the crash: phy=0xFFFFFFF4, size=0xC
> 0xFFFFFFF4 + 0xc wraps around to zero.
> 
> I think this function has to cast the two sums to u64 before compare.

Great! Thanks for the analysis. :)

Then this seems to be the known issue.

In include/acpi/actypes.h:
acpi_physical_address and acpi_size can be u32 for 32-bit builds; they are determined by ACPI_MACHINE_WIDTH.
In include/acpi/platform/aclinux.h:
The ACPI_MACHINE_WIDTH is BITS_PER_LONG.
So on a physically 64-bit system that is configured as a 32-bit compliant one, ACPICA simply defines physical addresses as 32-bit.

To fix it, we need to be careful because ACPICA also depends on the ACPI_MACHINE_WIDTH definition to follow the Linux LP64 compliant design.
The actypes.h needs to be cleaned up, splitting the Linux LP64 compliant stuff apart from the physical address.
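
One way to picture such a split is sketched below. This is only an illustration under stated assumptions, not the contents of actypes.h or of any patch: the sketch_ type names and the SKETCH_ macros are hypothetical. The idea is that sizes keep following the machine width, while physical addresses default to 64 bits, with an opt-out for machines that really have only 32-bit physical addressing.

```c
#include <stdint.h>

/* Hypothetical stand-in for ACPI_MACHINE_WIDTH; 32 models a 32-bit kernel build. */
#ifndef SKETCH_MACHINE_WIDTH
#define SKETCH_MACHINE_WIDTH 32
#endif

/* Data-model type: keeps following the native word size. */
#if SKETCH_MACHINE_WIDTH == 64
typedef uint64_t sketch_size;
#else
typedef uint32_t sketch_size;
#endif

/*
 * IO address type: decoupled from the data model. It is 64-bit by default;
 * defining the hypothetical SKETCH_32BIT_PHYSADDR opts back into 32 bits.
 */
#ifdef SKETCH_32BIT_PHYSADDR
typedef uint32_t sketch_physical_address;
#else
typedef uint64_t sketch_physical_address;
#endif

/* End of a region, computed in the address type so it does not wrap at 4 GiB. */
static sketch_physical_address sketch_region_end(sketch_physical_address phys,
						  sketch_size size)
{
	return phys + size;
}
```

With the defaults above, sketch_region_end(0xFFFFFFF4, 0xC) yields 0x100000000 instead of wrapping to zero, even though sketch_size is still 32 bits wide.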

Thanks and best regards
-Lv
Comment 15 Lv Zheng 2015-01-04 06:01:58 UTC
Created attachment 162371 [details]
[PATCH] ACPICA: Utilities: split IO address types from data type models.

This is a fix for the known issue.
Please apply this patch and try again.
Comment 16 sialnije 2015-01-06 18:57:53 UTC
(In reply to Lv Zheng from comment #15)
> Created attachment 162371 [details]
> [PATCH] ACPICA: Utilities: split IO address types from data type models.
> 
> This is an fix for the known issue.
> Please apply this patch and try again.

The patch works! Thanks a lot for help.
Comment 17 Lv Zheng 2015-01-07 06:21:22 UTC
OK.
Marking it as resolved.
Thanks for reporting and testing.
Comment 18 Lv Zheng 2015-01-12 08:38:12 UTC
Created attachment 163281 [details]
ACPICA: Utilities: split IO address types from data type models.

This might be the final version.
It includes code to fix ACPICA upstream compilation errors caused by -Wint-to-pointer-cast.
Please help to confirm again.

Thanks in advance.
Comment 19 Paul Menzel 2015-01-14 22:56:23 UTC
Comment on attachment 163281 [details]
ACPICA: Utilities: split IO address types from data type models.

> From 83c6016492ce82a0bae81186d25c07c2a27fe685 Mon Sep 17 00:00:00 2001
> From: Lv Zheng <lv.zheng@intel.com>
> Data: Tue, 13 Jan 2015 00:19:44 +0800

Dat*e*

> From: Lv Zheng <lv.zheng@intel.com>
> Subject: [DBG PATCH] ACPICA: ACPICA: Utilities: split IO address types from
> data type models.

Just one ACPICA. Why tag it as a debug patch?

> It is reported that on a physically 64-bit addressed machine, 32-bit kernel
> can trigger crashes in accessing the memory regions that are beyond the
> 32-bit boundary.
>
> This patch fixes this gap by always defining IO addresses as 64-bit, and
> allows OSPMs to optimize it for a real 32-bit machine to reduce the size of
> the internal objects.

Fix this gap by always defining IO addresses as 64-bit, and allow OSPMs to optimize it for a real 32-bit machine to reduce the size of the internal objects.

> After this modification, new Linux kernel warnings can be seen:
>  drivers/acpi/acpica/exfldio.c: In function 'acpi_ex_access_region':
>  drivers/acpi/acpica/exfldio.c:265:2: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/hwvalid.c: In function 'acpi_hw_validate_io_request':
>  drivers/acpi/acpica/hwvalid.c:145:2: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/hwvalid.c:145:2: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/hwvalid.c:153:3: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/hwvalid.c:183:5: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/nsdump.c: In function 'acpi_ns_dump_one_object':
>  drivers/acpi/acpica/nsdump.c:277:12: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/tbdata.c: In function 'acpi_tb_acquire_table':
>  drivers/acpi/acpica/tbdata.c:117:7: warning: cast to pointer from integer of
>  different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/tbdata.c: In function 'acpi_tb_acquire_temp_table':
>  drivers/acpi/acpica/tbdata.c:217:18: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/tbinstal.c: In function 'acpi_tb_install_fixed_table':
>  drivers/acpi/acpica/tbinstal.c:190:3: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/tbinstal.c: In function
>  'acpi_tb_install_standard_table':
>  drivers/acpi/acpica/tbinstal.c:249:3: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/tbinstal.c:261:3: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/tbinstal.c: In function 'acpi_tb_uninstall_table':
>  drivers/acpi/acpica/tbinstal.c:520:3: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c: In function 'acpi_ut_add_address_range':
>  drivers/acpi/acpica/utaddress.c:109:2: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c:109:2: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c: In function 'acpi_ut_remove_address_range':
>  drivers/acpi/acpica/utaddress.c:162:4: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c:162:4: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c: In function 'acpi_ut_check_address_range':
>  drivers/acpi/acpica/utaddress.c:247:5: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c:247:5: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c:247:5: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]
>  drivers/acpi/acpica/utaddress.c:247:5: warning: cast to pointer from integer
>  of different size [-Wint-to-pointer-cast]

Add empty line.

> Such warnings will be compilation errors for ACPICA because ACPICA is
> compiled with "-Wall -Wint-to-pointer-cast" specified.
> So this patch also fixes such compilation warnings inside of ACPICA.

So also fix these compilation warnings inside of ACPICA.

> For an address read from the ACPI table or the namespace, since the BIOS
> running on top of 64-bit machine need also support 32-bit OSPMs, we can

need*s* *to*

> safely convert the addresses using ACPI_PHYSADDR_TO_PTR() and
> ACPI_PTR_TO_PHYSADDR() macros, but for other IO addresses, these macros are
> not safe. This patch also converts the macros into ACPI_PHYSADDR_TO_TABLE()
> and ACPI_TABLE_TO_PHYSADDR() to enfoce its usage with ACPI_TABLE pointers.

enfo*r*ce

> The only none table conversion can be seen in osunixmap.c where the pointer
> is a userspace mapped address. We convert it to be ACPI_TO_INTEGER() as it
> is actually used to calc a virtual address which should be acpi_size
> bounded.
> 
> When the physical addresses are used for printing, we carefully fix new
> compilation errors by:
> 1. For table addresses, we use ACPI_PHYSADDR_TO_TABLE() and
>    ACPI_TABLE_TO_PHYSADDR() to convert the variables.
> 2. For IO addresses, since the physical address may exceed 32-bit address
>    range, we use %8.8X%8.8X (see ACPI_FORMAT_UINT64()) to convert the %p
>    formats. Note that, by enabling ACPICA internal acpi_ut_printf() support,
>    we can further reduce all ACPI_FORMAT_UINT64() formats with standard
>    %llx formats. But as it is not enabled currently for binaries other than
>    EFI ports, we should still follow the rule to use ACPI_FORMAT_UINT64()
>    to achieve the portability.
>
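
The %8.8X%8.8X convention mentioned in item 2 can be sketched as follows. This is an illustrative stand-alone helper, not ACPICA code: sketch_format_phys() is a hypothetical name, and standard snprintf() stands in for ACPICA's own print routines.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit physical address as two zero-padded 32-bit hex halves,
 * so that no 64-bit printf length modifier is required.
 * buf should hold at least 17 bytes (16 hex digits plus the terminating NUL).
 */
static int sketch_format_phys(char *buf, size_t len, uint64_t phys)
{
	unsigned int hi = (unsigned int)(phys >> 32);
	unsigned int lo = (unsigned int)(phys & 0xFFFFFFFFu);

	return snprintf(buf, len, "%8.8X%8.8X", hi, lo);
}
```

Splitting the value into halves keeps the format string portable to printf implementations without %llx support, at the cost of passing two arguments per address.
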
> For iasl, it will enforce acpi_physical_address as 32-bit on 32-bit
> platforms, we need to define ACPI_FORCE_32BIT_PHYSADDR for it in acenv.h.
>
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=87971
> Reference: https://bugzilla.kernel.org/show_bug.cgi?id=79501
> Reported-and-tested-by: Paul Menzel <paulepartner@users.sourceforge.net>

paulepanter@users.sourceforge.net

> Reported-and-tested-by: <sialnije@gmail.com>
> Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Comment 20 Lv Zheng 2015-01-15 00:38:31 UTC
Hi, Paul

Well. Many issues in the patch description...
Fortunately, it was not used for upstreaming.

(In reply to Paul Menzel from comment #19)
> Comment on attachment 163281 [details]
> > Data: Tue, 13 Jan 2015 00:19:44 +0800
> 
> Dat*e*

It's automatically generated by ACPICA release scripts.
So this is a bug in the script which I haven't noticed previously.
Thanks for pointing it out.

> > Subject: [DBG PATCH] ACPICA: ACPICA: Utilities: split IO address types from
> data type models.
> 
> Just one ACPICA. Why tag it as a debug patch?

The "ACPICA" is automatically appended by the release script.
I've fixed this in the patch that is used for upstreaming:
https://github.com/zetalog/acpica/commit/26c7d230

"DBG" is our policy, we use it for bugzilla posted patches which might not be the final upstreaming one.
You won't see it in the upstreaming patch.
I was disallowed to post any code here unless it was marked as "DBG".
I'm still in the toughest days of my life.

> > This patch fixes this gap by always defining IO addresses as 64-bit, and
> > allows OSPMs to optimize it for a real 32-bit machine to reduce the size of
> > the internal objects.
> 
> Fix this gap by always defining IO addresses as 64-bit, and allow OSPMs to
> optimize it for a real 32-bit machine to reduce the size of the internal
> objects.
> 
> > So this patch also fixes such compilation warnings inside of ACPICA.
> 
> So also fix these compilation warnings inside of ACPICA.
> 
> > For an address read from the ACPI table or the namespace, since the BIOS
> > running on top of 64-bit machine need also support 32-bit OSPMs, we can
> 
> need*s* *to*
> 
> > and ACPI_TABLE_TO_PHYSADDR() to enfoce its usage with ACPI_TABLE pointers.
> 
> enfo*r*ce

The above wording problems seem to have already been fixed in the upstreaming one.

> > Reported-and-tested-by: Paul Menzel <paulepartner@users.sourceforge.net>
> 
> paulepanter@users.sourceforge.net

My bad sorry!!
I'll correct it.

Besides, can this patch still fix the issue?

Thanks and best regards
-Lv
Comment 21 Lv Zheng 2015-01-15 01:30:53 UTC
Hi

(In reply to sialnije from comment #16)
> (In reply to Lv Zheng from comment #15)
> > Created attachment 162371 [details]
> > [PATCH] ACPICA: Utilities: split IO address types from data type models.
> > 
> > This is an fix for the known issue.
> > Please apply this patch and try again.
> 
> The patch works! Thanks a lot for help.

I couldn't find your full name on the internet.
This seems to be the original thread where you first detected this issue:
http://en.community.dell.com/support-forums/servers/f/956/t/19605110
There is still no signature in it.

This seems to be another post from you:
http://openssl.6102.n7.nabble.com/cannot-password-protect-key-file-in-FIPS-mode-td42975.html

So is it OK to use Reported-and-tested-by: Sial Nije <sialnije@gmail.com>?

Thanks
-Lv
Comment 22 Lv Zheng 2015-01-16 00:07:09 UTC
*** Bug 79501 has been marked as a duplicate of this bug. ***
Comment 23 Paul Menzel 2015-01-18 11:32:34 UTC
Unfortunately I am unable to apply the patch to Linux 3.16.7.

$ LANG=C git am /tmp/acpi1.patch
Applying: ACPICA: ACPICA: Utilities: split IO address types from data type models.
error: drivers/acpi/acpica/exconfig.c : does not exist in index
error: drivers/acpi/acpica/exfldio.c  : does not exist in index
error: drivers/acpi/acpica/exregion.c : does not exist in index
error: drivers/acpi/acpica/hwvalid.c  : does not exist in index
error: drivers/acpi/acpica/nsdump.c : does not exist in index
error: drivers/acpi/acpica/tbdata.c : does not exist in index
error: drivers/acpi/acpica/tbinstal.c : does not exist in index
error: drivers/acpi/acpica/tbutils.c  : does not exist in index
error: drivers/acpi/acpica/tbxfload.c : does not exist in index
error: drivers/acpi/acpica/utaddress.c  : does not exist in index
error: include/acpi/actypes.h : does not exist in index
error: include/acpi/platform/acenv.h  : does not exist in index
error: tools/power/acpi/os_specific/service_layers/osunixmap.c  : does not exist in index
Patch failed at 0001 ACPICA: ACPICA: Utilities: split IO address types from data type models.
The copy of the patch that failed is found in:
   /home/paul/src/linux/.git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Comment 24 Paul Menzel 2015-01-18 13:16:58 UTC
(In reply to Paul Menzel from comment #23)
> Unfortunately I am unable to apply the patch to Linux 3.16.7.
> 
> $ LANG=C git am /tmp/acpi1.patch
> Applying: ACPICA: ACPICA: Utilities: split IO address types from data type
> models.
> error: drivers/acpi/acpica/exconfig.c : does not exist in index
> error: drivers/acpi/acpica/exfldio.c  : does not exist in index
> error: drivers/acpi/acpica/exregion.c : does not exist in index
> error: drivers/acpi/acpica/hwvalid.c  : does not exist in index
> error: drivers/acpi/acpica/nsdump.c : does not exist in index
> error: drivers/acpi/acpica/tbdata.c : does not exist in index
> error: drivers/acpi/acpica/tbinstal.c : does not exist in index
> error: drivers/acpi/acpica/tbutils.c  : does not exist in index
> error: drivers/acpi/acpica/tbxfload.c : does not exist in index
> error: drivers/acpi/acpica/utaddress.c  : does not exist in index
> error: include/acpi/actypes.h : does not exist in index
> error: include/acpi/platform/acenv.h  : does not exist in index
> error: tools/power/acpi/os_specific/service_layers/osunixmap.c  : does not
> exist in index
> Patch failed at 0001 ACPICA: ACPICA: Utilities: split IO address types from
> data type models.
> The copy of the patch that failed is found in:
>    /home/paul/src/linux/.git/rebase-apply/patch
> When you have resolved this problem, run "git am --continue".
> If you prefer to skip this patch, run "git am --skip" instead.
> To restore the original branch and stop patching, run "git am --abort".

The raw output [1] appends spaces to the filenames for whatever reason.

Lv, could you please send me the `git format-patch -1` file by email or attach it here as a plain text file? I have no idea what I am doing wrong, but it has taken too long already.

[1] https://bugzilla.kernel.org/attachment.cgi?id=163281&action=diff&context=patch&collapsed=&headers=1&format=raw
Comment 25 Lv Zheng 2015-01-19 02:46:19 UTC
OK, I'll post the updated patch here.

Thanks
-Lv
Comment 26 Lv Zheng 2015-01-19 03:27:39 UTC
Created attachment 163751 [details]
ACPICA: Utilities: split IO address types from data type

This is the updated patch.
I extracted the GitHub commit, applied it on top of the linux-pm/linux-next branch, and formatted it as a patch.

Thanks and best regards
-Lv
Comment 27 sialnije 2015-01-20 06:43:11 UTC
(In reply to Lv Zheng from comment #21)

> 
> So is it OK to use Reported-and-tested-by: Sial Nije <sialnije@gmail.com>?
> 

Sial Nije is correct.
Sorry, I haven't checked my email for a while.
Comment 28 Lv Zheng 2015-01-21 06:07:18 UTC
Created attachment 164031 [details]
[DBG PATCH 1/8] Divergences: reduce index divergences for 32-bit PAE fix.
Comment 29 Lv Zheng 2015-01-21 06:07:58 UTC
Created attachment 164041 [details]
[DBG PATCH 2/8] ACPICA: Tables: Change acpi_find_root_pointer() to use acpi_physical_address.
Comment 30 Lv Zheng 2015-01-21 06:08:30 UTC
Created attachment 164051 [details]
[DBG PATCH 3/8] ACPICA: Unix: Cleanup to use ACPI_TO_INTEGER() to calc page offset.
Comment 31 Lv Zheng 2015-01-21 06:09:21 UTC
Created attachment 164061 [details]
[DBG PATCH 4/8] ACPICA: Executer: Cleanup to remove an unnecessary conversion.
Comment 32 Lv Zheng 2015-01-21 06:09:57 UTC
Created attachment 164071 [details]
[DBG PATCH 5/8] ACPICA: Utilities: Cleanup to enforce ACPI_PHYSADDR_TO_PTR()/ACPI_PTR_TO_PHYSADDR().
Comment 33 Lv Zheng 2015-01-21 06:10:41 UTC
Created attachment 164081 [details]
[DBG PATCH 6/8] ACPICA: Utilities: Cleanup to convert physical address printing formats.
Comment 34 Lv Zheng 2015-01-21 06:13:27 UTC
Created attachment 164091 [details]
[DBG PATCH 7/8] ACPICA: Utilities: Cleanup to remove useless ACPI_PRINTF/FORMAT_xxx helpers.
Comment 35 Lv Zheng 2015-01-21 06:14:04 UTC
Created attachment 164101 [details]
[DBG PATCH 8/8] ACPICA: Utilities: split IO address types from data type models.
Comment 36 Lv Zheng 2015-01-21 06:15:42 UTC
Hi,

(In reply to sialnije from comment #27)
> (In reply to Lv Zheng from comment #21)
> 
> > 
> > So is it OK to use Reported-and-tested-by: Sial Nije <sialnije@gmail.com>?
> > 
> 
> Sial Nije is correct.
> Sorry, I haven't checked my email for a while.

I corrected it in attachment 164101 [details].
Could you also help to confirm this patch?

Thanks for your feedback.
Comment 37 Lv Zheng 2015-01-21 06:19:06 UTC
Created attachment 164111 [details]
[DBG 3.16.7 PATCH 7/8] ACPICA: Utilities: Cleanup to remove useless ACPI_PRINTF/FORMAT_xxx helpers.

The 3.16.7 version of PATCH 7.
Comment 38 Lv Zheng 2015-01-21 06:24:31 UTC
Hi, Paul

It seems attachment 163751 [details] is not suitable as stable material,
so you won't see it merged for Linux 3.16.7.
I have split the patch.
The whole series is now:
 attachment 164031 [details]
 attachment 164041 [details]
 attachment 164051 [details]
 attachment 164061 [details]
 attachment 164071 [details]
 attachment 164081 [details]
-attachment 164091 [details]
 attachment 164101 [details]

They are still for the linux-pm/linux-next branch.

If you want to try the whole series on top of 3.16.7, you can use this series:

 attachment 164031 [details]
 attachment 164041 [details]
 attachment 164051 [details]
 attachment 164061 [details]
 attachment 164071 [details]
 attachment 164081 [details]
+attachment 164111 [details]
*attachment 164101 [details]

I think you can also just try attachment 164101 [details] on its own.
Without the other fixes, Linux 3.16.7 will only emit warnings during compilation, so the rest of the patches might not be merged for 3.16.7.

Thanks and best regards
-Lv

(In reply to Paul Menzel from comment #24)
> (In reply to Paul Menzel from comment #23)
> > Unfortunately I am unable to apply the patch to Linux 3.16.7.
> 
> The raw output [1] appends spaces to the filenames for whatever reason.
> 
> Lv, could you please send me the `git format-patch -1` file by email or
> attach it here as a plain text file? I have no idea what I am doing wrong,
> but it has taken too long already.
> 
> [1]
> https://bugzilla.kernel.org/attachment.cgi?id=163281&action=diff&context=patch&collapsed=&headers=1&format=raw
Comment 39 Paul Menzel 2015-01-21 09:15:49 UTC
Lv, after taking the changes to `actypes.h` out and applying them manually (no idea why it didn’t work), I was able to use `git am` and successfully build Linux 3.16.7. I can confirm that it still works and X starts correctly.

Thanks a lot for your help!
Comment 40 Lv Zheng 2015-01-22 00:15:10 UTC
Hi, Paul

(In reply to Paul Menzel from comment #39)
> Lv, after taking the changes to `actypes.h` out and applying them manually
> (no idea why it didn’t work), I was able to use `git am` and successfully
> build Linux 3.16.7. I can confirm that it still works and X starts correctly.
> 

Thanks for the testing.
If I understand correctly, you are talking about this hunk:

@@ -518,6 +518,9 @@ typedef u64 acpi_integer;
 #define ACPI_TO_POINTER(i)              ACPI_ADD_PTR (void, (void *) NULL,(acpi_size) i)
 #define ACPI_TO_INTEGER(p)              ACPI_PTR_DIFF (p, (void *) NULL)
 #define ACPI_OFFSET(d, f)               ACPI_PTR_DIFF (&(((d *) 0)->f), (void *) NULL)
+
+/* Physical table address conversions */
+
 #define ACPI_PHYSADDR_TO_PTR(i)         ACPI_TO_POINTER(i)
 #define ACPI_PTR_TO_PHYSADDR(i)         ACPI_TO_INTEGER(i)
 
It doesn't seem to belong in this bug fix series.
It should instead be included in another patch that renames the macros.
Thus it is also not stable material.
I'll remove it from this series.
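
For readers unfamiliar with these macros, here is a rough sketch of what such pointer/integer conversions amount to. It is not the ACPICA implementation: the `SKETCH_*` names and `sketch_size` are hypothetical, `sketch_size` is assumed to be pointer-sized, and plain casts are used in place of the pointer arithmetic from NULL shown in the hunk above.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the macros in the hunk; names are made up. */
typedef uintptr_t sketch_size; /* assumed to be a pointer-sized integer */

#define SKETCH_TO_POINTER(i)       ((void *)(sketch_size)(i))
#define SKETCH_TO_INTEGER(p)       ((sketch_size)(p))

/* The physical-address variants are plain aliases of the generic ones. */
#define SKETCH_PHYSADDR_TO_PTR(i)  SKETCH_TO_POINTER(i)
#define SKETCH_PTR_TO_PHYSADDR(p)  SKETCH_TO_INTEGER(p)

/* Return 1 if a pointer survives conversion to an integer and back. */
int round_trips(void *p)
{
    sketch_size addr = SKETCH_PTR_TO_PHYSADDR(p);
    return SKETCH_PHYSADDR_TO_PTR(addr) == p;
}
```

In this sketch the cast to the pointer-sized type would narrow a 64-bit integer on a 32-bit build, which is why such conversions only make sense for addresses that really are pointers.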

> Thanks a lot for your help!

You are welcome.

Best regards
-Lv
Comment 41 Lv Zheng 2015-05-18 02:16:08 UTC
Patch upstreamed:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=2b87601

Closing...