Bug 200647

Summary: get_unused_fd_flags causes kernel crash
Product: Memory Management
Component: Page Allocator
Reporter: Richard Zhang (zhang.zijian)
Assignee: Andrew Morton (akpm)
Status: NEW
Severity: normal
Priority: P1
CC: sureeju, zhang.zijian
Hardware: x86-64
OS: Linux
Kernel Version: 4.14.41
Subsystem:
Regression: No
Bisected commit-id:
Attachments: ko demo

Description Richard Zhang 2018-07-25 09:21:19 UTC
Created attachment 277493: ko demo

Calling 'get_unused_fd_flags' from a kthread causes a kernel crash.
It works fine on 4.1, but on the reported kernel it crashes after 64 fds have been obtained.
It also crashes on Ubuntu 14.04/16.04/18.04 and CentOS 7.5, and the crash messages are almost the same.

The crash message on CentOS 7.5 is shown below:
[   93.216084] start fd 61
[   93.316064] start fd 62
[   93.416084] start fd 63
[   93.521024] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   93.521111] IP: [<ffffffff9c4c4dbe>] __wake_up_common+0x2e/0x90
[   93.521172] PGD 0 
[   93.521197] Oops: 0000 [#1] SMP 
[   93.521233] Modules linked in: test(OE) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink sunrpc kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd sg ppdev pcspkr virtio_balloon parport_pc parport i2c_piix4 joydev ip_tables xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi virtio_console virtio_net cirrus drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm ata_piix serio_raw libata virtio_pci virtio_ring i2c_core
[   93.522026]  virtio floppy dm_mirror dm_region_hash dm_log dm_mod
[   93.522089] CPU: 2 PID: 1820 Comm: test_fd Kdump: loaded Tainted: G           OE  ------------   3.10.0-862.3.3.el7.x86_64 #1
[   93.522172] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[   93.522258] task: ffff8e92b9431fa0 ti: ffff8e94247a0000 task.ti: ffff8e94247a0000
[   93.522314] RIP: 0010:[<ffffffff9c4c4dbe>]  [<ffffffff9c4c4dbe>] __wake_up_common+0x2e/0x90
[   93.522382] RSP: 0018:ffff8e94247a2d18  EFLAGS: 00010086
[   93.522424] RAX: 0000000000000000 RBX: ffffffff9d09daa0 RCX: 0000000000000000
[   93.522477] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffff9d09daa0
[   93.522530] RBP: ffff8e94247a2d50 R08: 0000000000000000 R09: ffff8e92b95dfda8
[   93.522584] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9d09daa8
[   93.522637] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000003
[   93.522691] FS:  0000000000000000(0000) GS:ffff8e9434e80000(0000) knlGS:0000000000000000
[   93.522751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   93.522796] CR2: 0000000000000000 CR3: 000000017c686000 CR4: 00000000000207e0
[   93.522854] Call Trace:
[   93.522884]  [<ffffffff9c4c7d39>] __wake_up+0x39/0x50
[   93.522931]  [<ffffffff9c63a771>] expand_files+0x131/0x250
[   93.522979]  [<ffffffff9cb1182c>] ? schedule_timeout+0x17c/0x2c0
[   93.523027]  [<ffffffff9c63b017>] __alloc_fd+0x47/0x170
[   93.523071]  [<ffffffff9c63b170>] get_unused_fd_flags+0x30/0x40
[   93.523120]  [<ffffffffc07ea12a>] test_fd+0x12a/0x1c0 [test]
[   93.523178]  [<ffffffffc07ea000>] ? 0xffffffffc07e9fff
[   93.523220]  [<ffffffff9c4bb161>] kthread+0xd1/0xe0
[   93.523262]  [<ffffffff9c4bb090>] ? insert_kthread_work+0x40/0x40
[   93.523312]  [<ffffffff9cb20677>] ret_from_fork_nospec_begin+0x21/0x21
[   93.523363]  [<ffffffff9c4bb090>] ? insert_kthread_work+0x40/0x40
[   93.523409] Code: 66 90 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 49 89 fc 49 83 c4 08 53 48 83 ec 10 48 8b 47 08 89 55 cc 4c 89 45 d0 <48> 8b 08 49 39 c4 48 8d 78 e8 4c 8d 69 e8 75 08 eb 3b 4c 89 ef 
[   93.523756] RIP  [<ffffffff9c4c4dbe>] __wake_up_common+0x2e/0x90
[   93.523806]  RSP <ffff8e94247a2d18>
[   93.523835] CR2: 0000000000000000

Comment 1 Shuriyc Chu 2019-01-08 04:04:21 UTC
This issue has existed since CentOS 7.5 (3.10.0-862); CentOS 7.4 (3.10.0-693.21.1) is OK.
Root cause: the 'resize_wait' member of init_files is not initialized before being used.

Patch:
--- linux-4.14.91/fs/file.c     2018-12-29 20:39:11.000000000 +0800
+++ linux-4.14.91/fs/file.c.new 2019-01-08 11:55:44.297048053 +0800
@@ -462,6 +462,7 @@
                .full_fds_bits  = init_files.full_fds_bits_init,
        },
        .file_lock      = __SPIN_LOCK_UNLOCKED(init_files.file_lock),
+       .resize_wait    = __WAIT_QUEUE_HEAD_INITIALIZER(init_files.resize_wait),
 };

 static unsigned int find_next_fd(struct fdtable *fdt, unsigned int start)
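
For illustration only, here is a minimal userspace sketch of why a zero-filled wait queue head faults where an initialized one does not. It is not kernel code: struct wait_queue_head, WAIT_QUEUE_HEAD_INIT and wake_up_all_sim() are simplified, hypothetical stand-ins (the spinlock is omitted), and the walk returns -1 at the point where a real list traversal would dereference NULL. It assumes the trace above is the wake-up in expand_files() reading a NULL list pointer, which is consistent with RAX and CR2 both being 0.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's circular doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, simplified wait queue head (no spinlock). */
struct wait_queue_head {
    struct list_head head;
};

/*
 * Same idea as the list part of __WAIT_QUEUE_HEAD_INITIALIZER:
 * an empty queue is a list head that points back at itself.
 */
#define WAIT_QUEUE_HEAD_INIT(name) \
    { .head = { &(name).head, &(name).head } }

/*
 * Simplified walk over the waiters (assumption: the real wake-up
 * path iterates the list in a comparable way).
 * Returns the number of entries visited, or -1 where following the
 * list would mean dereferencing a NULL pointer.
 */
static int wake_up_all_sim(struct wait_queue_head *wq)
{
    int visited = 0;
    struct list_head *pos;

    assert(wq != NULL);

    pos = wq->head.next;
    if (pos == NULL)
        return -1;      /* zero-filled head: next was never set up */

    for (; pos != &wq->head; pos = pos->next)
        visited++;
    return visited;
}

/* Static storage and no initializer: both pointers start out NULL. */
static struct wait_queue_head zeroed_wq;

/* Initialized the way the patch initializes resize_wait. */
static struct wait_queue_head ready_wq = WAIT_QUEUE_HEAD_INIT(ready_wq);
```

With these definitions, wake_up_all_sim(&zeroed_wq) returns -1, while wake_up_all_sim(&ready_wq) returns 0 because the queue is empty and there is nothing to wake.
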
Comment 2 Andrew Morton 2019-01-08 21:46:34 UTC
Thanks.  I queued the patch.

Please send me a Signed-off-by: as per Documentation/process/submitting-patches.rst, section 11.