Bug 199961

Summary: BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1342
Product: File System
Reporter: Martin Peres (martin.peres)
Component: ext4
Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED UNREPRODUCIBLE
Severity: normal
CC: lakshminarayana.vudum
Priority: P1
Hardware: All
OS: Linux
Kernel Version: Linux: 4.17.0-rc7
Subsystem:
Regression: No
Bisected commit-id:

Description Martin Peres 2018-06-07 09:59:08 UTC
We got the following backtrace in our CI system while executing a test that is not file-system intensive. This means the issue will probably be hard to reproduce, but I wanted to share it with you anyway, as the trace might already contain all the information you need.

[   43.666366] BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1342
[   43.666550] in_atomic(): 0, irqs_disabled(): 1, pid: 233, name: systemd-journal
[   43.666576] 4 locks held by systemd-journal/233:
[   43.666578]  #0: 00000000f4e8f811 (&mm->mmap_sem){++++}, at: __do_page_fault+0x116/0x590
[   43.666593]  #1: 000000008a5afe3b (sb_pagefaults){.+.+}, at: ext4_page_mkwrite+0x56/0x4f0
[   43.666607]  #2: 0000000037ac1e86 (&ei->i_mmap_sem){++++}, at: ext4_page_mkwrite+0x6a/0x4f0
[   43.666618]  #3: 00000000f4e8f811 (&mm->mmap_sem){++++}, at: __do_page_fault+0x116/0x590
[   43.666629] irq event stamp: 956418
[   43.666634] hardirqs last  enabled at (956417): [<ffffffff811174b4>] current_kernel_time64+0x94/0xb0
[   43.666638] hardirqs last disabled at (956418): [<ffffffff811f6709>] __slab_alloc.isra.27.constprop.33+0x19/0x70
[   43.666643] softirqs last  enabled at (944462): [<ffffffff818ae1ea>] unix_sock_destructor+0x4a/0xb0
[   43.666646] softirqs last disabled at (944460): [<ffffffff818ae1ea>] unix_sock_destructor+0x4a/0xb0
[   43.666650] CPU: 0 PID: 233 Comm: systemd-journal Tainted: G     U            4.17.0-rc7-CI-CI_DRM_4286+ #1
[   43.666652] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
[   43.666654] Call Trace:
[   43.666660]  dump_stack+0x67/0x9b
[   43.666665]  ___might_sleep+0x167/0x250
[   43.666670]  __do_page_fault+0x133/0x590
[   43.666679]  page_fault+0x1e/0x30
[   43.666682] RIP: 0010:deactivate_slab.isra.26+0x1bb/0x8d0
[   43.666684] RSP: 0000:ffffc900003c7980 EFLAGS: 00010086
[   43.666688] RAX: 0000000080000000 RBX: 00000000001653f1 RCX: 0000000000000001
[   43.666690] RDX: 0000000080000001 RSI: 0000000000000070 RDI: 00000000ffffffff
[   43.666692] RBP: ffffc900003c7a80 R08: ffff88041fa26c00 R09: ffff880400123ae0
[   43.666695] R10: ffffc900003c7aa0 R11: 0000000000000000 R12: ffffea0010004800
[   43.666697] R13: ffff880400122268 R14: ffff88040e179a40 R15: 0000000180250017
[   43.666709]  ? __kernel_text_address+0x9/0x30
[   43.666714]  ? __save_stack_trace+0x8d/0xf0
[   43.666723]  ? alloc_buffer_head+0x18/0x80
[   43.666729]  ? set_track+0x90/0x140
[   43.666732]  ? init_object+0x66/0x80
[   43.666737]  ? ___slab_alloc.constprop.34+0x232/0x3e0
[   43.666740]  ___slab_alloc.constprop.34+0x232/0x3e0
[   43.666743]  ? alloc_buffer_head+0x18/0x80
[   43.666747]  ? __lock_acquire+0x3c8/0x1b50
[   43.666755]  ? __lock_acquire+0x3c8/0x1b50
[   43.666760]  ? alloc_buffer_head+0x18/0x80
[   43.666764]  ? __slab_alloc.isra.27.constprop.33+0x3d/0x70
[   43.666767]  __slab_alloc.isra.27.constprop.33+0x3d/0x70
[   43.666771]  ? alloc_buffer_head+0x18/0x80
[   43.666774]  kmem_cache_alloc+0x234/0x2c0
[   43.666779]  alloc_buffer_head+0x18/0x80
[   43.666782]  alloc_page_buffers+0x84/0xd0
[   43.666788]  create_empty_buffers+0x14/0x100
[   43.666792]  create_page_buffers+0x47/0x50
[   43.666795]  __block_write_begin_int+0x89/0x590
[   43.666800]  ? ext4_inode_attach_jinode.part.18+0xa0/0xa0
[   43.666807]  ? ext4_inode_attach_jinode.part.18+0xa0/0xa0
[   43.666811]  block_page_mkwrite+0xab/0xf0
[   43.666817]  ext4_page_mkwrite+0x3d4/0x4f0
[   43.666825]  do_page_mkwrite+0x2c/0xa0
[   43.666829]  do_wp_page+0x1fc/0x4c0
[   43.666835]  __handle_mm_fault+0x7d6/0xe30
[   43.666847]  handle_mm_fault+0x196/0x3a0
[   43.666852]  __do_page_fault+0x295/0x590
[   43.666858]  ? page_fault+0x8/0x30
[   43.666862]  page_fault+0x1e/0x30
[   43.666865] RIP: 0033:0x7fc2bf099b00
[   43.666867] RSP: 002b:00007ffee8318390 EFLAGS: 00010246
[   43.666871] RAX: 00007fc2b63b9518 RBX: 000055ce8ec61e80 RCX: 00007fc2bf163694
[   43.666873] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007ffee8318338
[   43.666875] RBP: 0000000000000024 R08: 00000000002da7c0 R09: 0000000001c66518
[   43.666877] R10: 0000000000000000 R11: 0000000000000000 R12: 000055ce8ec51230
[   43.666879] R13: 00007ffee83183c8 R14: 0000000004012be0 R15: 00007ffee8318538
[   43.666892] BUG: unable to handle kernel paging request at 0000000000165461
[   43.666908] Oops: 0000 [#1] PREEMPT SMP PTI

Kernel logs (boot): https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4286/shard-hsw1/boot18.log
Kernel logs (when running the tests): https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4286/shard-hsw1/dmesg18.log

Kernel config: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4286/kernel.config.bz2

Hope this helps!

Comment 1 Lakshminarayana Vudum 2018-09-07 07:48:51 UTC
This issue occurred only once, three months ago, and has not been seen since.

Comment 2 Lakshminarayana Vudum 2018-09-07 07:59:49 UTC
This bug can be closed, as the logs can no longer be found at the links above.