Bug 107361 - BUG: unable to handle kernel NULL pointer dereference when mounting/umounting vfat in 4.3.0, worked in 4.2.4
Status: RESOLVED INVALID
Alias: None
Product: File System
Classification: Unclassified
Component: FAT/VFAT/MSDOS
Hardware: All Linux
Importance: P1 normal
Assignee: OGAWA Hirofumi
Reported: 2015-11-06 10:55 UTC by Mads
Modified: 2015-11-06 19:52 UTC
Kernel Version: 4.3.0
Regression: No

Description Mads 2015-11-06 10:55:09 UTC
After mounting /boot (vfat) I can't list the files in it, and I get a kernel BUG when I try to umount it.

exai ~ # mount /boot
exai ~ # sync
exai ~ # mount
/dev/sda3 on / type btrfs (rw,noatime,nobarrier,compress=lzo,ssd,discard,space_cache,subvolid=258,subvol=/root)
devtmpfs on /dev type devtmpfs (rw,relatime,size=4044040k,nr_inodes=1011010,mode=755)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
efivarfs on /sys/firmware/efi/efivars type efivarfs (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=21,pgrp=1,timeout=0,minproto=5,maxproto=5,direct)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tmpfs on /tmp type tmpfs (rw)
mqueue on /dev/mqueue type mqueue (rw,relatime)
hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)
/dev/sda3 on /usr/portage type btrfs (rw,noatime,nobarrier,compress=lzo,ssd,discard,space_cache,subvolid=261,subvol=/portage)
/dev/sda3 on /home type btrfs (rw,noatime,nobarrier,compress=lzo,ssd,discard,space_cache,subvolid=260,subvol=/home)
/dev/sda3 on /var type btrfs (rw,noatime,nobarrier,compress=lzo,ssd,discard,space_cache,subvolid=259,subvol=/var)
tmpfs on /run/user/106 type tmpfs (rw,nosuid,nodev,relatime,size=808900k,mode=700,uid=106,gid=995)
tmpfs on /run/user/0 type tmpfs (rw,nosuid,nodev,relatime,size=808900k,mode=700)
/dev/sda1 on /boot type vfat (rw,noatime,fmask=0022,dmask=0022,codepage=865,iocharset=utf8,shortname=mixed,errors=remount-ro)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=808900k,mode=700,uid=1000,gid=1000)
exai ~ # ls -l /boot
ls: cannot open directory /boot: No such device or address
exai ~ # umount /boot/
Killed
exai ~ # dmesg | tail -50
[   47.959725] cfg80211:   (5150000 KHz - 5250000 KHz @ 80000 KHz, 200000 KHz AUTO), (N/A, 2000 mBm), (N/A)
[   47.959726] cfg80211:   (5250000 KHz - 5350000 KHz @ 80000 KHz, 200000 KHz AUTO), (N/A, 2000 mBm), (0 s)
[   47.959727] cfg80211:   (5470000 KHz - 5725000 KHz @ 160000 KHz), (N/A, 2698 mBm), (0 s)
[   47.959728] cfg80211:   (57000000 KHz - 66000000 KHz @ 2160000 KHz), (N/A, 4000 mBm), (N/A)
[  101.965931] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  101.966053] IP: [<ffffffff8110219e>] truncate_inode_pages_range+0x1e/0x6a0
[  101.966152] PGD 838e7067 PUD 6c8db067 PMD 0 
[  101.966222] Oops: 0000 [#1] PREEMPT SMP 
[  101.966300] Modules linked in: iwlmvm iwlwifi vfat fat uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev x86_pkg_temp_thermal coretemp kvm_intel kvm microcode i2c_i801 iTCO_wdt xhci_pci xhci_hcd ideapad_laptop sparse_keymap int3403_thermal int3402_thermal processor_thermal_device int340x_thermal_zone intel_soc_dts_iosf int3400_thermal iosf_mbi acpi_thermal_rel intel_smartconnect efivarfs
[  101.967059] CPU: 0 PID: 1311 Comm: umount Not tainted 4.3.0-gentoo #1
[  101.967151] Hardware name: LENOVO 20266/Yoga2, BIOS 76CN42WW 03/02/2015
[  101.967206] task: ffff880087a23000 ti: ffff88006c92c000 task.ti: ffff88006c92c000
[  101.967269] RIP: 0010:[<ffffffff8110219e>]  [<ffffffff8110219e>] truncate_inode_pages_range+0x1e/0x6a0
[  101.967354] RSP: 0018:ffff88006c92fcd0  EFLAGS: 00010282
[  101.967395] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 9e37fffffffc0001
[  101.967453] RDX: ffffffffffffffff RSI: 0000000000000000 RDI: ffff88008897c770
[  101.967512] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  101.967571] R10: ffff88008897c718 R11: 0000000000000000 R12: ffffffffa03468c0
[  101.967630] R13: ffff88006c930000 R14: ffff8802532bd438 R15: ffff88008897c690
[  101.967689] FS:  00007fabc7f61780(0000) GS:ffff88025f200000(0000) knlGS:0000000000000000
[  101.967757] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  101.967802] CR2: 0000000000000028 CR3: 000000006c8df000 CR4: 00000000001406f0
[  101.967880] Stack:
[  101.967897]  ffff88008897c770 0000000000000000 ffff880087a23000 0000000000000000
[  101.967966]  ffffffff81100678 0000000000000000 ffffffff810fefd6 ffff88006c92fe58
[  101.968034]  00ffffff00000000 00000002900e19c0 ffffffff810fd640 ffff8802540b8248
[  101.968102] Call Trace:
[  101.968117]  [<ffffffff81100678>] ? pagevec_lookup_tag+0x18/0x20
[  101.968167]  [<ffffffff810fefd6>] ? write_cache_pages+0xe6/0x390
[  101.968215]  [<ffffffff810fd640>] ? domain_dirty_limits+0xe0/0xe0
[  101.968266]  [<ffffffff81088273>] ? finish_task_switch+0x53/0x180
[  101.968316]  [<ffffffff810f54f6>] ? find_get_pages_tag+0x126/0x160
[  101.968366]  [<ffffffff8116bc02>] ? __inode_wait_for_writeback+0x62/0xb0
[  101.968422]  [<ffffffff8109c420>] ? autoremove_wake_function+0x30/0x30
[  101.968478]  [<ffffffffa03435a0>] ? fat_evict_inode+0x10/0x50 [fat]
[  101.968530]  [<ffffffff8115ffa3>] ? evict+0xb3/0x180
[  101.968567]  [<ffffffff8116009d>] ? dispose_list+0x2d/0x40
[  101.968611]  [<ffffffff81160e3a>] ? evict_inodes+0x13a/0x150
[  101.968656]  [<ffffffff81148e15>] ? generic_shutdown_super+0x35/0xe0
[  101.968707]  [<ffffffff8114914c>] ? kill_block_super+0x1c/0x60
[  101.968754]  [<ffffffff81149264>] ? deactivate_locked_super+0x34/0x60
[  101.968806]  [<ffffffff81163db6>] ? cleanup_mnt+0x36/0x80
[  101.968860]  [<ffffffff81082a7f>] ? task_work_run+0x6f/0x90
[  101.968917]  [<ffffffff810013f5>] ? prepare_exit_to_usermode+0x95/0xd0
[  101.968971]  [<ffffffff8175066f>] ? int_ret_from_sys_call+0x25/0x8f
[  101.969021] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54 55 48 89 f5 53 48 89 d3 48 81 ec 10 01 00 00 48 8b 07 48 89 3c 24 <48> 8b 40 28 8b 80 08 04 00 00 85 c0 78 05 e8 cf 19 04 00 48 8b 
[  101.969295] RIP  [<ffffffff8110219e>] truncate_inode_pages_range+0x1e/0x6a0
[  101.969355]  RSP <ffff88006c92fcd0>
[  101.969377] CR2: 0000000000000028
[  101.990401] ---[ end trace a5cb453620b7ad23 ]---
exai ~ #
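
The "Code:" line in the oops above marks the first byte of the faulting instruction with angle brackets. As a minimal sketch, the byte offset of that marker can be pulled out like this. The helper name faulting_offset is made up for illustration; the kernel tree's scripts/decodecode does the full job, including disassembly.

```python
def faulting_offset(code_line):
    """Return (offset, byte) for the byte marked <..> in an oops 'Code:' line."""
    # Drop everything up to and including the "Code:" label, then split into bytes.
    tokens = code_line.split(":", 1)[-1].split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return i, int(tok[1:-1], 16)
    raise ValueError("no marked byte found")


# The Code: line copied from the oops in this report.
CODE = (
    "Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54 "
    "55 48 89 f5 53 48 89 d3 48 81 ec 10 01 00 00 48 8b 07 48 89 3c 24 "
    "<48> 8b 40 28 8b 80 08 04 00 00 85 c0 78 05 e8 cf 19 04 00 48 8b"
)

offset, byte = faulting_offset(CODE)
print(hex(offset), hex(byte))  # prints: 0x2b 0x48
```

The offset is relative to the start of the dumped bytes, not to the start of the function, so it is only useful for lining the marker up with a disassembly of the same byte string.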
Comment 1 OGAWA Hirofumi 2015-11-06 19:36:56 UTC
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54 55 48 89 f5 53 48 89 d3 48 81 ec 10 01 00 00 48 8b 07 48 89 3c 24 <48> 8b 40 28 8b 80 08 04 00 00 85 c0 78 05 e8 cf 19 04 00 48 8b 

Disassembly of the oops code:

   0:	ff                   	(bad)  
   1:	ff c3                	inc    %ebx
   3:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
   a:	00 00 00 
   d:	41 57                	push   %r15
   f:	41 56                	push   %r14
  11:	41 55                	push   %r13
  13:	41 54                	push   %r12
  15:	55                   	push   %rbp
  16:	48 89 f5             	mov    %rsi,%rbp
  19:	53                   	push   %rbx
  1a:	48 89 d3             	mov    %rdx,%rbx
  1d:	48 81 ec 10 01 00 00 	sub    $0x110,%rsp
  24:	48 8b 07             	mov    (%rdi),%rax
  27:	48 89 3c 24          	mov    %rdi,(%rsp)
  2b:	48 8b 40 28          	mov    0x28(%rax),%rax
  2f:	8b 80 08 04 00 00    	mov    0x408(%rax),%eax
  35:	85 c0                	test   %eax,%eax
  37:	78 05                	js     0x3e
  39:	e8 cf 19 04 00       	callq  0x41a0d
  3e:	48                   	rex.W
  3f:	8b                   	.byte 0x8b
  40:	a0                   	.byte 0xa0

24: %rdi would be mapping
    %rax would be mapping->host
2b: 0x28(%rax) == mapping->host->i_sb
2f: 0x408(%rax) == mapping->host->i_sb->cleancache_poolid

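The marked faulting instruction is the one at 2b, RAX is 0 and CR2 is 0x28, so it seems mapping->host == NULL here and the load of host->i_sb is what faulted.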
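
As a cross-check, here is a small sketch that asks which pointer in the chain would have to be NULL for a load to fault at the address in CR2. The offsets (0x0, 0x28, 0x408) are the ones read off the disassembly above and hold only for this particular kernel build; the table and the helper name null_candidates are illustrative, not kernel code.

```python
# (pointer being dereferenced, field loaded, offset of that field)
# Offsets come from the instructions at 24, 2b and 2f in the listing above.
CHAIN = [
    ("mapping", "host", 0x00),                            # 24: mov (%rdi),%rax
    ("mapping->host", "i_sb", 0x28),                      # 2b: mov 0x28(%rax),%rax
    ("mapping->host->i_sb", "cleancache_poolid", 0x408),  # 2f: mov 0x408(%rax),%eax
]


def null_candidates(cr2):
    """Return the pointers that, if NULL, would make their load fault at cr2."""
    # A NULL base pointer means the load touches address 0 + field offset.
    return [base for base, _field, off in CHAIN if off == cr2]


print(null_candidates(0x28))  # prints: ['mapping->host']
```

With CR2 = 0x28 and RAX = 0 in the register dump, the load that fits is the one at 2b, which reads i_sb through a NULL mapping->host. A NULL i_sb would instead have faulted at 0x408 on the next instruction.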

There is no change to FAT in v4.2..v4.3, so this is likely a bug in some other
part of the kernel. It might be memory corruption, a race, or something similar.

Could you report this to LKML?
Comment 2 Mads 2015-11-06 19:52:36 UTC
Ok, thanks for checking it out!
