Bug 202511 - amdgpu fails to load saying "Could not allocate 8192 bytes percpu data"
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-02-05 18:06 UTC by Michael A. Leonetti
Modified: 2019-02-26 23:54 UTC
CC List: 5 users

See Also:
Kernel Version: 4.20.6
Tree: Mainline
Regression: No


Attachments
crashing dmesg (65.35 KB, text/plain)
2019-02-05 18:06 UTC, Michael A. Leonetti
4.17.19 working amdgpu dmesg (67.37 KB, text/plain)
2019-02-06 01:18 UTC, Michael A. Leonetti
bisect log (2.76 KB, text/plain)
2019-02-06 23:06 UTC, Michael A. Leonetti
bisect log 2 (1.64 KB, text/plain)
2019-02-12 02:20 UTC, Michael A. Leonetti
per-cpu alloc debug patch (595 bytes, patch)
2019-02-14 19:00 UTC, Bjorn Helgaas
dmesg 4.18.0 with debugs (122.03 KB, text/plain)
2019-02-20 20:48 UTC, Michael A. Leonetti
per-cpu alloc info (broken) (3.53 KB, text/plain)
2019-02-21 19:59 UTC, Bjorn Helgaas
4.17.19 config (116.67 KB, text/x-mpsub)
2019-02-21 20:05 UTC, Michael A. Leonetti
4.18 config (117.27 KB, text/x-mpsub)
2019-02-21 20:06 UTC, Michael A. Leonetti
Actual 4.17.19 config (116.54 KB, text/plain)
2019-02-21 21:57 UTC, Michael A. Leonetti
Print module reserved percpu allocations (520 bytes, patch)
2019-02-25 20:25 UTC, Barret Rhoden
fb: switching to amdgpudrmfb from EFI VGA freeze 4.19.23 (467.77 KB, image/png)
2019-02-26 16:20 UTC, Michael A. Leonetti

Description Michael A. Leonetti 2019-02-05 18:06:26 UTC
Created attachment 281007 [details]
crashing dmesg

amdgpu works on 4.17.19 (which is what I use), but every 4.19.x and 4.20.x kernel I have tried has this issue. I built amdgpu as a module; when the kernel tries to load it (or I modprobe it by hand), I get:

[    4.376629] amdgpu: Could not allocate 8192 bytes percpu data
[    4.377316] ath10k_pci 0000:02:00.0: board_file api 2 bmi_id N/A crc32 8aedfa4a
[    4.382128] percpu: allocation failed, size=8192 align=4096 atomic=0, alloc from reserved chunk failed
[    4.382133] CPU: 1 PID: 2620 Comm: systemd-udevd Not tainted 4.20.6-gentoo #1
[    4.382135] Hardware name: Acer Aspire A315-41/Metapod_RR, BIOS V1.11 10/30/2018
[    4.382137] Call Trace:
[    4.382148]  dump_stack+0x46/0x5b
[    4.382154]  pcpu_alloc+0x56e/0x590
[    4.382160]  ? find_module_all+0x4c/0x80
[    4.382165]  load_module+0xb51/0x1e80
[    4.382170]  ? wait_woken+0x80/0x80
[    4.382175]  __se_sys_finit_module+0xe0/0xf0
[    4.382180]  do_syscall_64+0x4a/0x100
[    4.382185]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[    4.382189] RIP: 0033:0x7fb64cd77fb9
[    4.382193] Code: 00 00 00 75 05 48 83 c4 18 c3 e8 42 a5 01 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9f 4e 2c 00 f7 d8 64 89 01 48
[    4.382196] RSP: 002b:00007ffe5fe76eb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    4.382200] RAX: ffffffffffffffda RBX: 00005615a81b2c10 RCX: 00007fb64cd77fb9
[    4.382202] RDX: 0000000000000000 RSI: 00007fb64dd3c1c5 RDI: 000000000000000f
[    4.382204] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[    4.382206] R10: 000000000000000f R11: 0000000000000246 R12: 00005615a81d0290
[    4.382209] R13: 00007fb64dd3c1c5 R14: 0000000000020000 R15: 0000000003938700


And it won't load.
Comment 1 Michael A. Leonetti 2019-02-06 01:18:21 UTC
Created attachment 281011 [details]
4.17.19 working amdgpu dmesg

For reference, here is my dmesg from 4.17.19, where modprobe amdgpu succeeds.
Comment 2 Alex Deucher 2019-02-06 01:19:29 UTC
Can you bisect?
Comment 3 Michael A. Leonetti 2019-02-06 23:06:24 UTC
Created attachment 281033 [details]
bisect log

Here is the bisect log as requested.
Comment 4 Alex Deucher 2019-02-06 23:23:16 UTC
Looks like it was caused by a pci core change.  Adding Bjorn.
Comment 5 Bjorn Helgaas 2019-02-06 23:56:18 UTC
The bisect log in comment #3 shows 3a3869f1c443 ("Merge tag 'pci-v4.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci") as the first bad commit.  That's a large merge commit so I can't pick out anything useful.  The previous good commit was 3036bc45364f ("Merge tag 'media/v4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media").

I'm not a bisect expert, but there must be a way to bisect between those two commits to zero in on something specific.  Is this any help?

05:49:21 ~/linux (junk)$ git bisect start
05:49:47 ~/linux (junk|BISECTING)$ git bisect good 3036bc45364f98515a2c446d7fac2c34dcfbeff4
05:50:01 ~/linux (junk|BISECTING)$ git bisect bad 3a3869f1c443383ef8354ffa0e5fb8df65d8b549
Bisecting: 88 revisions left to test after this (roughly 6 steps)
[13fbadcd512c225c907d6e8147fb48a88114bf03] Merge branch 'pci/sparc'
Comment 6 Christian König 2019-02-07 07:42:32 UTC
Since nobody else seems to be running into this problem: Are you sure that you are correctly building and installing a complete kernel?

That looks like something special with your installation/setup.
Comment 7 Michael A. Leonetti 2019-02-07 16:46:20 UTC
It could be special. If so, I'd love to know what I'm doing wrong. So far all of the 4.17 kernels I've tried work, and Windows works on the laptop. I'll see if I can get an Ubuntu LiveCD working.
Comment 8 Bjorn Helgaas 2019-02-07 17:33:19 UTC
Even if this is "special", e.g., some strange kconfig or build/install issue, this is a really painful issue to debug.  It would be tremendous if we could fail more gracefully, with some hint about what went wrong and how the user could fix it.  But it's hard to know how we could improve that until we understand what's going on here.
Comment 9 Christian König 2019-02-07 19:17:04 UTC
Yeah, I would also be really interested in what the root cause is.

As far as I know we don't have any large per CPU data in amdgpu, so no idea how this can happen.
Comment 10 Michael A. Leonetti 2019-02-08 14:06:48 UTC
I'd love to get it settled too (for obvious reasons). I guess a good place to start is the bisect that Bjorn Helgaas requested?
Comment 11 Bjorn Helgaas 2019-02-08 14:15:21 UTC
Yes.  I don't have any better ideas.  I'm not sure why git bisect wasn't smart enough to narrow it down further by itself.  Sorry, I know bisecting is painful.
Comment 12 Michael A. Leonetti 2019-02-08 14:17:13 UTC
Rather, I appreciate the help.

I'll start bisecting after work and post a reply as soon as I have the result.
Comment 13 Michael A. Leonetti 2019-02-12 02:20:29 UTC
Created attachment 281107 [details]
bisect log 2

I'm not sure if this is helpful. Basically all of the builds that I did in the bisect I marked "good" because they worked.
Comment 14 Michael A. Leonetti 2019-02-14 13:52:39 UTC
Is there anything else I can try? Should I do the first bisect again?
Comment 15 Bjorn Helgaas 2019-02-14 19:00:41 UTC
Created attachment 281143 [details]
per-cpu alloc debug patch

I think bisecting again would be a poor use of your time unless somebody knows a smarter way to do it (I don't).

I suspect amdgpu might be an innocent bystander.  I don't have any good ideas, so here's a lame one:

  - Apply this patch.
  - Figure out how to turn on pr_debug (see include/linux/printk.h and compare with your config -- you might need to enable CONFIG_DYNAMIC_DEBUG and then figure out what boot parameter enables the output).  This is to turn on the debug output in load_module().  Actually, the simpler way is to change those pr_debug() calls to pr_info().
  - Collect the dmesg log (probably large).
Comment 16 Michael A. Leonetti 2019-02-20 20:48:04 UTC
Created attachment 281247 [details]
dmesg 4.18.0 with debugs

Did I do this correctly? I see a lot more call trace information in this file.
Comment 17 Bjorn Helgaas 2019-02-21 19:59:52 UTC
Created attachment 281259 [details]
per-cpu alloc info (broken)

FWIW, I pulled out the per-cpu alloc info and what looks like the caller for each (attached).  The big chunks look like IOMMU and iptables stuff.

This is from the 4.18 kernel with the problem.  Maybe we'd learn something by collecting the same info from the working 4.17 kernel and comparing?

I wonder if there's some config difference that could be relevant?  Could you attach the .config files for 4.17 and 4.18?
Comment 18 Michael A. Leonetti 2019-02-21 20:05:55 UTC
Created attachment 281261 [details]
4.17.19 config

Absolutely. Let me post the two configs then.
Comment 19 Michael A. Leonetti 2019-02-21 20:06:42 UTC
Created attachment 281263 [details]
4.18 config

Would you like me to build the 4.17.19 kernel with the print options also?
Comment 20 Michael A. Leonetti 2019-02-21 21:57:22 UTC
Created attachment 281269 [details]
Actual 4.17.19 config

This is the actual 4.17.19 config that I used to make the 4.18 config. The other one I uploaded was not the correct one.
Comment 21 Bjorn Helgaas 2019-02-21 22:21:50 UTC
Thanks!  Unless somebody has a better idea, I would try building 4.17.19 with the print options.

Config differences that might conceivably be related:

--- config-4.17.txt     2019-02-21 16:03:59.976990680 -0600
+++ config-4.18 2019-02-21 16:03:40.404831576 -0600
+CONFIG_IPV6_SEG6_BPF=y
+CONFIG_NF_TPROXY_IPV4=y
+CONFIG_NF_TPROXY_IPV6=y

You could try turning these off in 4.18 to see if that makes a difference.
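
A minimal sketch of how a comparison like the one above could be scripted, for anyone repeating it on their own configs. The helper names (`enabled_options`, `newly_enabled`) are hypothetical, and the two config strings are toy excerpts rather than the attached files; only the three added option names come from the diff above.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* names set to y or m in a kernel .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments ("# CONFIG_FOO is not set").
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if name.startswith("CONFIG_") and value in ("y", "m"):
            enabled.add(name)
    return enabled


def newly_enabled(old_text, new_text):
    """Options enabled in the new config that were not enabled in the old one."""
    return sorted(enabled_options(new_text) - enabled_options(old_text))


# Toy excerpts (assumption): the real files are the attached 4.17.19 and 4.18 configs.
old_cfg = "CONFIG_IPV6=y\n# CONFIG_EXAMPLE_OPTION is not set\n"
new_cfg = (
    "CONFIG_IPV6=y\n"
    "CONFIG_IPV6_SEG6_BPF=y\n"
    "CONFIG_NF_TPROXY_IPV4=y\n"
    "CONFIG_NF_TPROXY_IPV6=y\n"
)

added = newly_enabled(old_cfg, new_cfg)
for name in added:
    print("+" + name + "=y/m")
```

This only reports options that became enabled; options that changed value in other ways (for example numeric settings) would need a plain textual diff.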
Comment 22 Barret Rhoden 2019-02-22 15:34:09 UTC
As far as the bisection goes, from the original bisect report, both of the commits that were merged were good.  i.e. 3a3869f1 merged two good commits:  3036bc45364f and 488ad6d3678b.  Only when the commits were combined was the system bad.

Given it looks like a failure to do a percpu alloc, that makes sense - both branches could have had some change that when combined exhausted a resource.

From the error messages, these are 'reserved' percpu allocs:

[    4.176816] percpu: allocation failed, size=8192 align=4096 atomic=0, alloc from reserved chunk failed

The reserved space is rather small.  From the early output, we can see it's only 8KB (the r8192):

[    0.000000] percpu: Embedded 54 pages/cpu @(____ptrval____) s181144 r8192 d31848 u262144

The alloc that failed was 8192, which is the entire reserved space, so my guess is that there was another alloc already, such that there wasn't enough for the 8192 alloc.
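
The arithmetic above can be checked directly from the two log lines. The following is a rough sketch, not kernel code: the function names are hypothetical, and only the `r` field is identified in this thread (as the reserved size); reading `s`, `d` and `u` as static, dynamic and unit sizes is an assumption.

```python
import re

BOOT_LINE = ("[    0.000000] percpu: Embedded 54 pages/cpu @(____ptrval____) "
             "s181144 r8192 d31848 u262144")
FAIL_LINE = ("[    4.176816] percpu: allocation failed, size=8192 align=4096 "
             "atomic=0, alloc from reserved chunk failed")


def parse_percpu_layout(line):
    """Extract the s/r/d/u byte counts from the 'percpu: Embedded ...' line."""
    m = re.search(r"\bs(\d+) r(\d+) d(\d+) u(\d+)", line)
    if m is None:
        raise ValueError("not a percpu layout line")
    return dict(zip("srdu", map(int, m.groups())))


def parse_alloc_failure(line):
    """Extract (size, align) from a 'percpu: allocation failed' line."""
    m = re.search(r"allocation failed, size=(\d+) align=(\d+)", line)
    if m is None:
        raise ValueError("not a percpu allocation failure line")
    return int(m.group(1)), int(m.group(2))


layout = parse_percpu_layout(BOOT_LINE)
size, align = parse_alloc_failure(FAIL_LINE)

# Bytes of reserved space left over if this request were the only one.
headroom = layout["r"] - size
print("reserved:", layout["r"], "requested:", size, "headroom:", headroom)
```

With zero headroom, any earlier reserved allocation, however small, is enough to make the 8192-byte request fail.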

It seems a little odd that there isn't enough reserved percpu space - or rather that someone is grabbing more space than they should.  These reserved allocs are only made by modules, and then only by modules that use percpu data (if I'm reading kernel/module.c right).  The default amount of 8192 is PERCPU_MODULE_RESERVE, which hasn't changed in years.

I'd be curious who else is making reserved per_cpu allocations, regardless of failure.  In kernel/module.c L652, we only print on failure.  If you print regardless of failure, particularly the mod name, then we might know who the other one is.  Maybe there are a bunch of benign small allocs, or maybe there's another 8192 out there.

Regardless, I'd guess the main culprits are the amdkfd and drm modules, each asking for 8192 out of a total 8192.  I built with your 4.18 config from the merge commit.  Both amdkfd and drm have percpu sections, e.g.:

$ objdump -h drivers/gpu/drm/amd/amdkfd/amdkfd.ko

.data..percpu 00002000  0000000000000000  0000000000000000  00022000  2**12

That matches the size (8192) and alignment (2^12) that we saw in the allocation message.
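
As a small illustration of that match, here is a sketch that decodes the section line shown above into a byte size and alignment. The function name is hypothetical, and the parser assumes the column layout exactly as quoted in this comment (section name first, hex size second, `2**N` alignment last).

```python
SECTION_LINE = (".data..percpu 00002000  0000000000000000  "
                "0000000000000000  00022000  2**12")


def parse_section_header(line):
    """Return (name, size_in_bytes, alignment_in_bytes) for one section line."""
    fields = line.split()
    name = fields[0]
    size = int(fields[1], 16)            # section size is printed in hex
    base, _, exponent = fields[-1].partition("**")
    alignment = int(base) ** int(exponent)
    return name, size, alignment


name, size, alignment = parse_section_header(SECTION_LINE)
print(name, size, alignment)
```

The decoded values (8192 bytes, 4096-byte alignment) are the same `size=8192 align=4096` pair reported in the allocation failure.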

drm.ko has something similar for its section.

Looking at amdkfd.ko (objdump -D), it has:

Disassembly of section .data..percpu:
0000000000000000 <kfd_processes_srcu_srcu_data>:

and drm.ko has: 

Disassembly of section .data..percpu:
0000000000000000 <drm_unplug_srcu_srcu_data>:

That looks like these two:

drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);

Those SRCU macros expand to a DEFINE_PER_CPU(struct srcu_data), though the struct doesn't look huge.  Not sure why it blows up to 8192 for its percpu section.

The amdkfd one was added in 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional"), and the DRM one was added in bee330f3d672 ("drm: Use srcu to protect drm_device.unplugged").  Both are relatively recent.

It doesn't look like there are a lot of drivers that use SRCU with those macros, so maybe that's something drm and amdkfd shouldn't be doing?  Either that, or maybe there's something wrong that causes the SRCU percpu structure to get so large?

Also, if these are the culprits, it's not clear why they would have been working before the bisection point.  If one of them succeeded, then the other should have failed (given they both try to alloc 8192 out of a total 8192 reservation).  So maybe I'm missing something.
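
To make the "8192 out of 8192" reasoning concrete, here is a toy model of a fixed-size reserved area. It is only a sketch under loud assumptions: the `ReservedChunk` class is hypothetical, it uses simple bump allocation with no freeing, and it does not reflect how the kernel's percpu allocator actually tracks space.

```python
class ReservedChunk:
    """Toy fixed-size region with aligned bump allocation (no freeing)."""

    def __init__(self, size):
        self.size = size
        self.offset = 0  # next free byte

    def alloc(self, size, align):
        # Round the current offset up to the requested alignment.
        start = (self.offset + align - 1) // align * align
        if start + size > self.size:
            return None  # does not fit in what is left
        self.offset = start + size
        return start


# Two modules each asking for the whole reservation: only the first succeeds.
chunk = ReservedChunk(8192)
first = chunk.alloc(8192, 4096)
second = chunk.alloc(8192, 4096)
print("first:", first, "second:", second)

# Even a tiny earlier allocation is enough to block an 8192-byte request.
chunk2 = ReservedChunk(8192)
small = chunk2.alloc(64, 8)
large = chunk2.alloc(8192, 4096)
print("small:", small, "large:", large)
```

In this model the second full-size request can never succeed, which is why two modules with 8192-byte percpu sections both loading before the bisection point would be surprising.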
Comment 23 Michael A. Leonetti 2019-02-22 15:57:33 UTC
(In reply to Bjorn Helgaas from comment #21)
> Thanks!  Unless somebody has a better idea, I would try building 4.17.19
> with the print options.
> 
> Config differences that might conceivably be related:
> 
> --- config-4.17.txt     2019-02-21 16:03:59.976990680 -0600
> +++ config-4.18 2019-02-21 16:03:40.404831576 -0600
> +CONFIG_IPV6_SEG6_BPF=y
> +CONFIG_NF_TPROXY_IPV4=y
> +CONFIG_NF_TPROXY_IPV6=y
> 
> You could try turning these off in 4.18 to see if that makes a difference.

It didn't end up doing anything. Would you like me to post my dmesg output or .config?
Comment 24 Barret Rhoden 2019-02-25 20:25:37 UTC
Created attachment 281345 [details]
Print module reserved percpu allocations

Hi Michael -

Can you apply this patch and report back the dmesg output?  Ideally, I'd like to see this for the commit that fails (3a3869f1) as well as its two parents (3036bc45364f and 488ad6d3678b).

If you're just trying to get a working system, then maybe build with CONFIG_DRM and CONFIG_DRM_AMDGPU stuff as 'Y' instead of as modules.

It might be that the right fix would be for the DRM and AMD code to not use static SRCU objects.
Comment 25 Michael A. Leonetti 2019-02-26 16:20:27 UTC
Created attachment 281359 [details]
fb: switching to amdgpudrmfb from EFI VGA freeze 4.19.23

Barret,

I will apply the patch and post the output for the commits as you asked.

I have tried a couple of times to compile AMDGPU and DRM into the kernel and it freezes after "fb: switching to amdgpudrmfb from EFI VGA" and doesn't give any errors for me to see. I've also attached the output from kernel 4.19.23.
Comment 26 Michel Dänzer 2019-02-26 16:34:46 UTC
When the amdgpu driver is built into the kernel, so must be all the microcode files under /lib/firmware/amdgpu/ that it needs with your GPU.
Comment 27 Michael A. Leonetti 2019-02-26 16:59:27 UTC
(In reply to Michel Dänzer from comment #26)
> When the amdgpu driver is built into the kernel, so must be all the
> microcode files under /lib/firmware/amdgpu/ that it needs with your GPU.

Thanks for that! I had to include the raven* firmware files. I had only included the vega* firmware files.
Comment 28 vladimir gerasimov 2019-02-26 17:31:59 UTC
Comment on attachment 281269 [details]
Actual 4.17.19 config

There is a workaround that resolves this bug (for me at least): disabling the CONFIG_X86_VSMP option.
The hint for this solution came from: https://bugzilla.kernel.org/show_bug.cgi?id=201339
Comment 29 Barret Rhoden 2019-02-26 18:41:16 UTC
Ah, nice find.  That other bug (201339) looks like the root cause of this one.
Comment 30 Michael A. Leonetti 2019-02-26 23:54:55 UTC
Can confirm, disabling CONFIG_X86_VSMP corrects the issue.
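
For anyone checking whether their own build is affected, a minimal sketch of looking up one option in a .config file's text. The function name `option_state` is hypothetical, and the sample strings are toy inputs, not the configs attached to this bug.

```python
def option_state(config_text, option):
    """Return the value of a CONFIG_* option, or None if it is not set."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None  # absent, or present only as "# CONFIG_... is not set"


# Toy inputs (assumption): in practice, read the text of your kernel's .config.
affected = "CONFIG_SMP=y\nCONFIG_X86_VSMP=y\n"
worked_around = "CONFIG_SMP=y\n# CONFIG_X86_VSMP is not set\n"

print(option_state(affected, "CONFIG_X86_VSMP"))
print(option_state(worked_around, "CONFIG_X86_VSMP"))
```

A result of `y` means the option is enabled and the workaround from comment 28 has not been applied.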
