Kernel 3.1-rc3 / 3.1-rc4 doesn't boot on my Jetway NC9C-550-LF board with Intel Atom N550. Same config that works for 3.0 plus defaults for the new options. Here's the captured crash:

[...]
[ 7.027487] pci 0000:00:1f.2: BAR 5: assigned [mem 0xfed98400-0xfed987ff]
[ 7.108769] pci 0000:00:1f.2: BAR 5: set to [mem 0xfed98400-0xfed987ff] (PCI address [0xfed98400-0xfed987ff])
[ 7.227488] pci 0000:00:1c.3: BAR 14: assigned [mem 0xff400000-0xff6fffff]
[ 7.309808] pci 0000:00:1c.3: BAR 15: assigned [mem 0xff700000-0xff8fffff 64bit pref]
[ 7.403544] pci 0000:00:1c.2: BAR 14: assigned [mem 0xff900000-0xffafffff]
[ 7.485843] ------------[ cut here ]------------
[ 7.495817] kernel BUG at kernel/resource.c:499!
[ 7.495817] invalid opcode: 0000 [#1] SMP
[ 7.495817] Modules linked in:
[ 7.495817]
[ 7.495817] Pid: 1, comm: swapper Not tainted 3.1.0-rc4-hafanek+ #15 To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M.
[ 7.495817] EIP: 0060:[<c102fa46>] EFLAGS: 00010282 CPU: 1
[ 7.495817] EIP is at reallocate_resource+0xc9/0xe0
[ 7.495817] EAX: f654bd20 EBX: f5c14b4c ECX: f654bd20 EDX: f5c14b4c
[ 7.495817] ESI: f5c5fe64 EDI: f5c14b68 EBP: 00000000 ESP: f5c5fe44
[ 7.495817] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 7.495817] Process swapper (pid: 1, ti=f5c5e000 task=f5c60000 task.ti=f5c5e000)
[ 7.495817] Stack:
[ 7.495817] f5d48038 80000000 802fffff f5d64052 00182201 f5d48038 f5c129a8 00000000
[ 7.495817] f5c14b4c f5d48038 f5c14b4c 00182001 c102faae f5c5fe80 c119534c 80000000
[ 7.495817] ffffffff 00100000 c119534c f5c14800 00000006 fffffff4 f5c14b4c c112d68f
[ 7.495817] Call Trace:
[ 7.495817] [<c102faae>] ? allocate_resource+0x51/0xa2
[ 7.495817] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 7.495817] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 7.495817] [<c112d68f>] ? pci_bus_alloc_resource+0x5f/0x87
[ 7.495817] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 7.495817] [<c1133261>] ? _pci_assign_resource+0x99/0x101
[ 7.495817] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 7.495817] [<c1133526>] ? pci_reassign_resource+0x4e/0x84
[ 7.495817] [<c113a1b8>] ? __assign_resources_sorted+0x119/0x19b
[ 7.495817] [<c121d0da>] ? __pci_bus_assign_resources+0x3e/0xb7
[ 7.495817] [<c1229306>] ? printk+0xe/0x11
[ 7.495817] [<c135596e>] ? pci_assign_unassigned_resources+0x7a/0x1b0
[ 7.495817] [<c10a4ec9>] ? kfree+0x3e/0xba
[ 7.495817] [<c1131bef>] ? pci_get_subsys+0x48/0x50
[ 7.495817] [<c1131bef>] ? pci_get_subsys+0x48/0x50
[ 7.495817] [<c135cdeb>] ? pcibios_allocate_bus_resources+0x80/0x80
[ 7.495817] [<c135ce49>] ? pcibios_assign_resources+0x5e/0x62
[ 7.495817] [<c1001066>] ? do_one_initcall+0x66/0x104
[ 7.495817] [<c133a752>] ? kernel_init+0x9f/0x111
[ 7.495817] [<c133a6b3>] ? start_kernel+0x306/0x306
[ 7.495817] [<c122f23e>] ? kernel_thread_helper+0x6/0xd
[ 7.495817] Code: 89 10 c7 43 10 00 00 00 00 eb 05 8d 42 14 eb e3 8d 74 24 04 b9 07 00 00 00 89 df 89 da f3 a5 8b 04 24 e8 06 f7 ff ff 85 c0 74 07 <0f> 0b bd f0 ff ff ff e8 69 f9 ff ff 8d 64 24 20 89 e8 5b 5e 5f
[ 7.495817] EIP: [<c102fa46>] reallocate_resource+0xc9/0xe0 SS:ESP 0068:f5c5fe44
[ 10.271091] ---[ end trace 44593438a59a9533 ]---
[ 10.326314] Kernel panic - not syncing: Attempted to kill init!
[ 10.397196] Pid: 1, comm: swapper Tainted: G D 3.1.0-rc4-hafanek+ #15
[ 10.483653] Call Trace:
[ 10.512892] [<c1229217>] ? panic+0x4d/0x12e
[ 10.563963] [<c102cdc0>] ? do_exit+0x70/0x62c
[ 10.617146] [<c1002553>] ? do_bounds+0x4c/0x4c
[ 10.671341] [<c102b968>] ? kmsg_dump+0x35/0xad
[ 10.725523] [<c1002553>] ? do_bounds+0x4c/0x4c
[ 10.779710] [<c10044e8>] ? oops_end+0x72/0x75
[ 10.832854] [<c10025bd>] ? do_invalid_op+0x6a/0x77
[ 10.891238] [<c102fa46>] ? reallocate_resource+0xc9/0xe0
[ 10.955844] [<c104a922>] ? tick_dev_program_event+0x1e/0xfd
[ 11.023540] [<c104aa15>] ? tick_program_event+0x14/0x17
[ 11.087086] [<c104237b>] ? hrtimer_interrupt+0x120/0x1b9
[ 11.151710] [<c102f934>] ? __find_resource+0xd9/0x122
[ 11.213187] [<c122ea47>] ? error_code+0x67/0x6c
[ 11.268406] [<c102fa46>] ? reallocate_resource+0xc9/0xe0
[ 11.332992] [<c102faae>] ? allocate_resource+0x51/0xa2
[ 11.395540] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 11.458053] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 11.520559] [<c112d68f>] ? pci_bus_alloc_resource+0x5f/0x87
[ 11.588262] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 11.650807] [<c1133261>] ? _pci_assign_resource+0x99/0x101
[ 11.717484] [<c119534c>] ? free_dmar_iommu+0x1f7/0x1f7
[ 11.779983] [<c1133526>] ? pci_reassign_resource+0x4e/0x84
[ 11.846653] [<c113a1b8>] ? __assign_resources_sorted+0x119/0x19b
[ 11.919594] [<c121d0da>] ? __pci_bus_assign_resources+0x3e/0xb7
[ 11.991478] [<c1229306>] ? printk+0xe/0x11
[ 12.041489] [<c135596e>] ? pci_assign_unassigned_resources+0x7a/0x1b0
[ 12.119602] [<c10a4ec9>] ? kfree+0x3e/0xba
[ 12.169658] [<c1131bef>] ? pci_get_subsys+0x48/0x50
[ 12.229055] [<c1131bef>] ? pci_get_subsys+0x48/0x50
[ 12.288444] [<c135cdeb>] ? pcibios_allocate_bus_resources+0x80/0x80
[ 12.364471] [<c135ce49>] ? pcibios_assign_resources+0x5e/0x62
[ 12.434283] [<c1001066>] ? do_one_initcall+0x66/0x104
[ 12.495772] [<c133a752>] ? kernel_init+0x9f/0x111
[ 12.553069] [<c133a6b3>] ? start_kernel+0x306/0x306
[ 12.612449] [<c122f23e>] ? kernel_thread_helper+0x6/0xd

That's it.
Bisection found the culprit:

| From 2bbc6942273b5b3097bd265d82227bdd84b351b2 Mon Sep 17 00:00:00 2001
| From: Ram Pai <linuxram@us.ibm.com>
| Date: Mon, 25 Jul 2011 13:08:39 -0700
| Subject: [PATCH] PCI : ability to relocate assigned pci-resources
Created attachment 71052: Boot log of a working kernel. Git revision just before the breaking commit.
Created attachment 71062: Boot log of a crashing kernel. Git rev 2bbc694 - first commit that doesn't boot.
Created attachment 71292: lspci -vv output
Created attachment 71642: Boot log of a working kernel. Increased verbosity with ignore_loglevel.
Created attachment 71652: Boot log of a crashing kernel. Increased verbosity with ignore_loglevel.
[I sent this in email earlier, intending it to be attached in bugzilla, but that didn't work.]

I see two things wrong so far.

1) I think we are reassigning PCI resources when we shouldn't.

  pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfed8ffff]
  pci_root PNP0A08:00: host bridge window [mem 0x00000000-0xffffffff]
  pci_root PNP0A08:00: host bridge window expanded to [mem 0x00000000-0xffffffff]; [mem 0x00000000-0xffffffff] ignored
  pci 0000:00:1c.1: address space collision: [mem 0xfde00000-0xfdefffff 64bit pref] conflicts with PCI Bus 0000:00 [mem 0xf0000000-0xfed8ffff]
  pci 0000:00:1c.2: address space collision: [mem 0xfdf00000-0xfdffffff 64bit pref] conflicts with PCI Bus 0000:00 [mem 0xf0000000-0xfed8ffff]
  ...

These "collisions" are not actually collisions -- [mem 0xfde00000-0xfdefffff 64bit pref] is a perfectly legal assignment inside the [mem 0xf0000000-0xfed8ffff] host bridge window.

The supposed host bridge window [mem 0x00000000-0xffffffff] is clearly bogus, but we don't handle it well in Linux. The kernel resource code doesn't allow overlaps at the same level, so we have a hack that coalesces those overlapping host bridge windows. That coalescing leads to these "collisions," which in turn cause unnecessary PCI resource reassignments.

2) The reassignment fails when it shouldn't. It looks like when we fail, we're assigning more space to the 1c.3 and 1c.2 bridge windows than we did before, but beyond that, I think Ram will have more insight than I do right now.
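To make the range arithmetic behind point 1 easy to check, here is a minimal Python sketch. It is not kernel code: the Range type and the contains, overlaps, and coalesce helpers are hypothetical names made up for illustration, and they only model inclusive [start-end] ranges the way the log prints them. The addresses are copied from the log lines quoted above.

```python
from typing import NamedTuple


class Range(NamedTuple):
    """Inclusive address range, like a [mem start-end] entry in the log."""
    start: int
    end: int

    def size(self) -> int:
        # Both ends are inclusive, hence the +1.
        return self.end - self.start + 1


def contains(outer: Range, inner: Range) -> bool:
    """True if inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def overlaps(a: Range, b: Range) -> bool:
    """True if the two inclusive ranges share at least one address."""
    return a.start <= b.end and b.start <= a.end


def coalesce(a: Range, b: Range) -> Range:
    """Merge two overlapping ranges into the smallest range covering both.

    Hypothetical helper: only the arithmetic of merging, not the
    kernel's actual handling of host bridge windows.
    """
    if not overlaps(a, b):
        raise ValueError("ranges do not overlap")
    return Range(min(a.start, b.start), max(a.end, b.end))


# Values taken from the log lines in this report.
host_window = Range(0xF0000000, 0xFED8FFFF)
bogus_window = Range(0x00000000, 0xFFFFFFFF)
bridge_1c1 = Range(0xFDE00000, 0xFDEFFFFF)
bridge_1c2 = Range(0xFDF00000, 0xFDFFFFFF)

merged = coalesce(host_window, bogus_window)

print("1c.1 inside host window:", contains(host_window, bridge_1c1))
print("1c.2 inside host window:", contains(host_window, bridge_1c2))
print("windows overlap:", overlaps(host_window, bogus_window))
print("merged window: [mem 0x%08x-0x%08x]" % (merged.start, merged.end))
print("1c.1 window size: 0x%x bytes" % bridge_1c1.size())
```

Both bridge windows fall entirely inside the first host bridge window, which is why the reported conflicts look spurious, and merging the two host bridge windows gives the full 32-bit range that the "expanded to" message reports.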