Bug 42608

Summary: vmalloc: allocation failure oops
Product: Drivers
Reporter: Conrad Kostecki (ck+kernelbugzilla)
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: RESOLVED CODE_FIX
Severity: normal
CC: akpm, alan, and, bugzilla, candysnell, florian, rusty
Priority: P1
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:

Description Conrad Kostecki 2012-01-19 16:32:12 UTC
Hi!

OS: Gentoo Linux
Kernel: 3.2.1 (Gentoo Sources)

Since kernel 3.2.0 I have been getting this oops in my dmesg log:

vmalloc: allocation failure: 0 bytes
modprobe: page allocation failure: order:0, mode:0xd2
Pid: 6271, comm: modprobe Tainted: G           O 3.2.1-gentoo #1
Call Trace:
 [<ffffffff8108ef45>] ? 0xffffffff8108ef45
 [<ffffffff810b3678>] ? 0xffffffff810b3678
 [<ffffffff81071726>] ? 0xffffffff81071726
 [<ffffffff810b3616>] ? 0xffffffff810b3616
 [<ffffffff810717ff>] ? 0xffffffff810717ff
 [<ffffffff8101e562>] ? 0xffffffff8101e562
 [<ffffffff81071726>] ? 0xffffffff81071726
 [<ffffffff81071726>] ? 0xffffffff81071726
 [<ffffffff810722e0>] ? 0xffffffff810722e0
 [<ffffffff81072ae8>] ? 0xffffffff81072ae8
 [<ffffffff8142dc52>] ? 0xffffffff8142dc52
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
CPU    4: hi:    0, btch:   1 usd:   0
CPU    5: hi:    0, btch:   1 usd:   0
CPU    6: hi:    0, btch:   1 usd:   0
CPU    7: hi:    0, btch:   1 usd:   0
DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd: 119
CPU    1: hi:  186, btch:  31 usd: 115
CPU    2: hi:  186, btch:  31 usd: 138
CPU    3: hi:  186, btch:  31 usd:  64
CPU    4: hi:  186, btch:  31 usd:  39
CPU    5: hi:  186, btch:  31 usd:  92
CPU    6: hi:  186, btch:  31 usd:  73
CPU    7: hi:  186, btch:  31 usd: 131
active_anon:638 inactive_anon:14 isolated_anon:0
 active_file:1433 inactive_file:3548 isolated_file:0
 unevictable:0 dirty:11 writeback:0 unstable:0
 free:500215 slab_reclaimable:1131 slab_unreclaimable:2982
 mapped:536 shmem:25 pagetables:141 bounce:0
DMA free:15908kB min:684kB low:852kB high:1024kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15668kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 2000 2000 2000
DMA32 free:1984952kB min:89424kB low:111780kB high:134136kB active_anon:2552kB inactive_anon:56kB active_file:5732kB inactive_file:14192kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2048128kB mlocked:0kB dirty:44kB writeback:0kB mapped:2144kB shmem:100kB slab_reclaimable:4524kB slab_unreclaimable:11912kB kernel_stack:224kB pagetables:564kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15908kB
DMA32: 64*4kB 233*8kB 139*16kB 103*32kB 75*64kB 36*128kB 13*256kB 1*512kB 2*1024kB 2*2048kB 478*4096kB = 1984920kB
4987 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 1048572kB
Total swap = 1048572kB
524272 pages RAM
10806 pages reserved
4836 pages shared
9198 pages non-shared
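
As a cross-check of the free-list lines in the dump above, the "count*size" terms can be summed per zone. This is only a sketch; `buddy_total_kb` is a made-up helper name, not part of any kernel tool, and it assumes the line format shown in this report.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the "count*sizekB" terms of one free-list line from the dump,
 * e.g. "DMA: 1*4kB 0*8kB ... = 15908kB". Parsing stops at '='.
 * Returns the total in kB, or -1 if a term is malformed.
 * Hypothetical helper for illustration only.
 */
long buddy_total_kb(const char *line)
{
    const char *p;
    long total = 0;

    assert(line != NULL);
    p = strchr(line, ':');          /* skip the zone name, if present */
    p = p ? p + 1 : line;

    while (*p != '\0' && *p != '=') {
        char *end;
        long count, size;

        while (*p == ' ')
            p++;
        if (*p == '\0' || *p == '=')
            break;

        count = strtol(p, &end, 10);        /* number of free blocks */
        if (end == p || *end != '*')
            return -1;
        p = end + 1;

        size = strtol(p, &end, 10);         /* block size in kB */
        if (end == p || strncmp(end, "kB", 2) != 0)
            return -1;
        p = end + 2;

        total += count * size;
    }
    return total;
}
```

Fed the two lines from the dump, this gives 15908 for DMA and 1984920 for DMA32, matching the printed totals. Nearly 2GB was free when the warning fired, so the "0 bytes" failure is not a real out-of-memory condition.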
Comment 1 Andrew Morton 2012-01-20 23:27:56 UTC
gack, why doesn't the backtrace show the kernel symbols?

Please check that CONFIG_KALLSYMS is enabled, then retry?
Comment 2 and 2012-04-21 11:26:50 UTC
Hi Andrew,
Confirmed for 3.2.14.
This allocation failure appears only _without_ CONFIG_KALLSYMS=y.

"http://forums.gentoo.org/viewtopic-t-908032.html"
"http://openelec.tv/forum/90-miscellaneous/25154-solved-udevd-page-allocation-failure"

Slim-kernel users disable CONFIG_KALLSYMS to save some bytes. :)
Comment 3 and 2012-04-26 19:22:44 UTC
The regression must be between v3.1 and v3.2-rc1.
The following modules are compiled:
CONFIG_CFG80211=m
CONFIG_MAC80211=m
CONFIG_SCSI_WAIT_SCAN=m
CONFIG_ATH_COMMON=m
CONFIG_ATH9K_HW=m
CONFIG_ATH9K_COMMON=m
CONFIG_ATH9K=m

It can't be a modprobe binary/userland problem. Cross-testing with the used components as modules and the wlan stuff built into the kernel doesn't show this error.

I stripped down the modules further. Even with this module config the error occurs:
CONFIG_SCSI_WAIT_SCAN=m
CONFIG_ATH_COMMON=m
CONFIG_ATH9K_HW=m
CONFIG_ATH9K_COMMON=m
CONFIG_ATH9K=m

So the error must be in the ath9k driver code.
Comment 4 and 2012-06-18 22:51:28 UTC
The same happens with kernel v3.4.3.

For now I am no longer using any ath9k modules.
That got rid of this allocation failure oops, and I replaced the wireless-regdb/crda userland rubbish by enabling CONFIG_CFG80211_INTERNAL_REGDB.
The initramfs is small and clean again, and "iw reg set" also works afterwards in Linux userland.
Comment 5 nzqr 2012-07-20 14:57:39 UTC
The same has been happening since the 3.2-rc's, and I'm _not_ using the ath9k module; it happens with a bunch of other modules.
Comment 6 Alan 2012-08-30 13:54:01 UTC
nzqr: Please attach the dmesg trace after it occurs. Without that, nothing can be debugged.
Comment 7 Andrew Morton 2012-08-30 19:45:45 UTC
Yes, this is very simple: some driver is asking vmalloc() to allocate zero bytes, which is daft.  We just need to find out which driver is doing this and for that, we need some sort of backtrace.

We could probably work it out if we just knew the arguments to that modprobe invocation, sigh.
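
The requested size can be read straight from the first line of the warning, which helps tell a zero-byte request apart from a genuine out-of-memory failure. The sketch below assumes the message format shown in the description, with no timestamp prefix; `vmalloc_failure_size` is a made-up helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Extract the requested size from a line of the form
 * "vmalloc: allocation failure: N bytes".
 * Returns 1 and stores N in *bytes on a match, 0 otherwise.
 * A stored value of 0 means the caller asked for zero bytes.
 * Hypothetical helper for illustration only.
 */
int vmalloc_failure_size(const char *line, unsigned long *bytes)
{
    assert(line != NULL && bytes != NULL);
    return sscanf(line, "vmalloc: allocation failure: %lu bytes", bytes) == 1;
}
```

It only identifies the size of the request. It cannot tell which driver or module made it, which is why the backtrace is still needed.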
Comment 8 Steve Graham 2012-12-09 15:51:15 UTC
I have fixed this on my system by following a forum post:
https://bbs.archlinux.org/viewtopic.php?pid=1142013#p1142013

Basically, with CONFIG_KALLSYMS=n, some modules have an init_size of zero, but kernel/module.c still attempts to do the vmalloc of zero bytes which triggers the error and the dump.

I'm no expert, but it seems to me that init_size == 0 is probably OK for a module? On my system, I was getting dumps from about half a dozen disparate module insertions from boot or inserting a peripheral.

Obviously, the workaround is to set CONFIG_KALLSYMS=y!
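
To make the mechanism concrete, here is a small userspace model of the path described above. It is only a sketch: `model_vmalloc`, `struct model_module` and `load_unguarded` are invented for illustration and are not the real kernel/module.c or mm/vmalloc.c code. The only behaviour taken from this report is that a zero-byte request fails and logs a warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts how often the model allocator "warned"; stands in for the dmesg dump. */
static int model_warnings;

/* Stand-in for vmalloc(): a zero-byte request fails and is reported. */
static void *model_vmalloc(size_t size)
{
    if (size == 0) {
        model_warnings++;       /* "vmalloc: allocation failure: 0 bytes" */
        return NULL;
    }
    return malloc(size);
}

/* Hypothetical, simplified module layout: two regions and their sizes. */
struct model_module {
    size_t core_size;
    size_t init_size;           /* may be 0, e.g. with CONFIG_KALLSYMS=n */
    void *core;
    void *init;
};

/*
 * Unguarded load: always asks the allocator for the init region, even
 * when init_size is 0. Returns 0 on success, -1 if a non-empty region
 * could not be allocated.
 */
static int load_unguarded(struct model_module *mod)
{
    assert(mod != NULL);

    mod->core = model_vmalloc(mod->core_size);
    if (mod->core == NULL && mod->core_size != 0)
        return -1;

    mod->init = model_vmalloc(mod->init_size);
    if (mod->init == NULL && mod->init_size != 0) {
        free(mod->core);
        mod->core = NULL;
        return -1;
    }
    return 0;
}
```

In this model a module with init_size == 0 still loads successfully, but the allocator is called with 0 and the warning counter goes up by one. That matches what I see: the modules work, and each insertion leaves a dump in dmesg.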
Comment 9 Rusty Russell 2012-12-10 03:07:48 UTC
The warning is harmless, but I've hoisted the old 0-len test into kernel/module.c.

The don't-call-vmalloc-with-0-size patch was removed from various archs in:

commit d0a21265dfb5fa8ae54e90d0fb6d1c215b10a28a
Author: David Rientjes <rientjes@google.com>
Date:   Thu Jan 13 15:46:02 2011 -0800

    mm: unify module_alloc code for vmalloc
    
    Four architectures (arm, mips, sparc, x86) use __vmalloc_area() for
    module_init().  Much of the code is duplicated and can be generalized in a
    globally accessible function, __vmalloc_node_range().
    
    __vmalloc_node() now calls into __vmalloc_node_range() with a range of
    [VMALLOC_START, VMALLOC_END) for functionally equivalent behavior.
    
    Each architecture may then use __vmalloc_node_range() directly to remove
    the duplication of code.

Then you signed-off a "cleanup" which caused the spurious warning in:

commit de7d2b567d040e3b67fe7121945982f14343213d
Author: Joe Perches <joe@perches.com>
Date:   Mon Oct 31 17:08:48 2011 -0700

    mm/vmalloc.c: report more vmalloc failures
    
    Some vmalloc failure paths do not report OOM conditions.
    
    Add warn_alloc_failed, which also does a dump_stack, to those failure
    paths.
    
    This allows more site specific vmalloc failure logging message printks to
    be removed.
    
    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Comment 10 Florian Mickler 2012-12-22 09:25:06 UTC
A patch referencing this bug report has been merged in Linux v3.8-rc1:

commit 82fab442f5322b016f72891c0db2436c6a6c20b7
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Tue Dec 11 09:38:33 2012 +1030

    modules: don't hand 0 to vmalloc.
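
The idea behind the fix, as described in comment 9, is to test for a zero length before calling the allocator. A minimal userspace sketch of such a guard follows; `model_alloc` and `alloc_region` are invented names, and this is not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of zero-byte requests that reached the allocator (each would warn). */
static int zero_requests;

/* Stand-in for vmalloc(): rejects, and records, zero-byte requests. */
static void *model_alloc(size_t size)
{
    if (size == 0) {
        zero_requests++;
        return NULL;
    }
    return malloc(size);
}

/*
 * Guarded allocation of one module region. An empty region is valid and
 * is represented by NULL without ever calling the allocator.
 * Returns 0 on success, -1 on a genuine allocation failure.
 */
static int alloc_region(size_t size, void **out)
{
    assert(out != NULL);

    if (size == 0) {
        *out = NULL;            /* nothing to allocate, nothing to warn about */
        return 0;
    }
    *out = model_alloc(size);
    return *out != NULL ? 0 : -1;
}
```

With the guard in place, a zero-sized region never reaches the allocator, so the spurious warning cannot be triggered, while real allocation failures for non-empty regions are still reported.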