Bug 47731

Summary: Null pointer dereference at strchr
Product: File System Reporter: Brad Figg (brad.figg)
Component: ext4Assignee: fs_ext4 (fs_ext4)
Status: RESOLVED CODE_FIX    
Severity: normal CC: florian, tytso
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.5.4 Subsystem:
Regression: No Bisected commit-id:

Description Brad Figg 2012-09-19 21:50:26 UTC
This is a 3.5.0-15.20 Ubuntu kernel. 

I was running xfstests and got this oops. I am not sure which test caused it but the last test being run was 234.


Nov  1 02:52:47 rizzo kernel: [11454.238205] BUG: unable to handle kernel NULL pointer dereference at 000000e5
Nov  1 02:52:47 rizzo kernel: [11454.249565] IP: [<c12d367f>] strchr+0xf/0x30
Nov  1 02:52:47 rizzo kernel: [11454.260460] *pdpt = 0000000036dd7001 *pde = 0000000000000000
Nov  1 02:52:47 rizzo kernel: [11454.271359] Oops: 0000 [#1] SMP
Nov  1 02:52:47 rizzo kernel: [11454.282094] Modules linked in: quota_v2 quota_tree nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_recent nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables bluetooth pppoe pppox irda crc_ccitt atm p8022 psnap llc p8023 ax25 btrfs zlib_deflate libcrc32c xfs ext2 dm_multipath scsi_dh vesafb coretemp kvm_intel kvm joydev i7core_edac edac_core gpio_ich microcode lpc_ich dcdbas mac_hid acpi_power_meter lp parport hid_generic usbhid hid mpt2sas bnx2 scsi_transport_sas raid_class [last unloaded: tipc]
Nov  1 02:52:47 rizzo kernel: [11454.314893]
Nov  1 02:52:47 rizzo kernel: [11454.314895] Pid: 32681, comm: setquota Tainted: G           O 3.5.0-15-generic #20-Ubuntu Dell Inc. PowerEdge R310/05XKKK
Nov  1 02:52:47 rizzo kernel: [11454.314898] EIP: 0060:[<c12d367f>] EFLAGS: 00210246 CPU: 3
Nov  1 02:52:47 rizzo kernel: [11454.314900] EIP is at strchr+0xf/0x30
Nov  1 02:52:47 rizzo kernel: [11454.314901] EAX: 00003d3d EBX: c15ebbf0 ECX: 00280000 EDX: 0000003d
Nov  1 02:52:47 rizzo kernel: [11454.314902] ESI: 000000e5 EDI: d250fc00 EBP: ef2e3eac ESP: ef2e3ea8
Nov  1 02:52:47 rizzo kernel: [11454.314903]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Nov  1 02:52:47 rizzo kernel: [11454.314904] CR0: 8005003b CR2: 000000e5 CR3: 291c4000 CR4: 000007f0
Nov  1 02:52:47 rizzo kernel: [11454.314905] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Nov  1 02:52:47 rizzo kernel: [11454.314906] DR6: ffff0ff0 DR7: 00000400
Nov  1 02:52:47 rizzo kernel: [11454.314907] Process setquota (pid: 32681, ti=ef2e2000 task=c2943f70 task.ti=ef2e2000)
Nov  1 02:52:47 rizzo kernel: [11454.314908] Stack:
Nov  1 02:52:47 rizzo kernel: [11454.314908]  0000002d ef2e3ef8 c11e35df d2a67de0 c17925fd 0000002c c1791a80 c1161fc4
Nov  1 02:52:47 rizzo kernel: [11454.314913]  8c3ac810 0000002c e9e77400 8c02c010 ef2e3eec d250d800 00000000 2ca67de0
Nov  1 02:52:47 rizzo kernel: [11454.314917]  d2a67de0 d2a67de0 c11e3930 f6c62a90 ef2e3f00 c11e3942 ef2e3f20 c1184e39
Nov  1 02:52:47 rizzo kernel: [11454.314921] Call Trace:
Nov  1 02:52:47 rizzo kernel: [11454.314923]  [<c11e35df>] _ext4_show_options+0x37f/0x6d0
Nov  1 02:52:47 rizzo kernel: [11454.314927]  [<c1161fc4>] ? d_path+0xb4/0xd0
Nov  1 02:52:47 rizzo kernel: [11454.314930]  [<c11e3930>] ? _ext4_show_options+0x6d0/0x6d0
Nov  1 02:52:47 rizzo kernel: [11454.314932]  [<c11e3942>] ext4_show_options+0x12/0x20
Nov  1 02:52:47 rizzo kernel: [11454.314934]  [<c1184e39>] show_vfsmnt+0xc9/0x110
Nov  1 02:52:47 rizzo kernel: [11454.314936]  [<c11684f0>] m_show+0x10/0x20
Nov  1 02:52:47 rizzo kernel: [11454.314939]  [<c116c1c4>] seq_read+0x244/0x3c0
Nov  1 02:52:47 rizzo kernel: [11454.314941]  [<c116bf80>] ? seq_lseek+0x150/0x150
Nov  1 02:52:47 rizzo kernel: [11454.314943]  [<c114f159>] vfs_read+0x89/0x160
Nov  1 02:52:47 rizzo kernel: [11454.314946]  [<c116bf80>] ? seq_lseek+0x150/0x150
Nov  1 02:52:47 rizzo kernel: [11454.314948]  [<c114f26d>] sys_read+0x3d/0x70
Nov  1 02:52:47 rizzo kernel: [11454.314950]  [<c15cf65f>] sysenter_do_call+0x12/0x28
Nov  1 02:52:47 rizzo kernel: [11454.314954] Code: 19 c0 0c 01 8b 75 f8 8b 7d fc 89 ec 5d c3 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 55 89 e5 56 66 66 66 66 90 89 c6 89 d0 88 c4 <ac> 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 8d
Nov  1 02:52:47 rizzo kernel: [11454.314979] EIP: [<c12d367f>] strchr+0xf/0x30 SS:ESP 0068:ef2e3ea8
Nov  1 02:52:47 rizzo kernel: [11454.314981] CR2: 00000000000000e5
Comment 1 Theodore Tso 2012-09-20 01:21:10 UTC
Is this repeatable?   I'm having a lot of trouble figuring out the stack trace, since strchr isn't used anywhere in the code path involved with printing the ext4 mount options while returning the contents of /proc/mounts (which is obviously what is going on).

The only strchr that looks even vaguely possible is the one used by seq_escape() in fs/seq_file, which is called by mangle() and which is called by show_vfsmnt() in fs/namespace.c.   But the stack trace doesn't look consistent with that, either.   Part of the problem is between the inline functions, and the function pointers used by show_vfsmnt, the stack dump in the OOPS message can be quite misleading.
Comment 2 Brad Figg 2012-09-20 02:12:11 UTC
I'm trying but so far I have not been able to reproduce. As soon as I can, I'll add a comment here.
Comment 3 Brad Figg 2012-09-21 04:21:34 UTC
After many, many runs of xfstests I have encountered this same oops.


Sep 20 23:46:19 rizzo jenkins: run xfstest 234
Sep 20 23:46:20 rizzo kernel: [46589.656011] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: acl,user_xattr,usrquota,grpquota
Sep 20 23:46:20 rizzo kernel: [46589.674369] EXT4-fs (sdb2): re-mounted. Opts: (null)
Sep 20 23:46:20 rizzo kernel: [46589.682388] EXT4-fs (sdb2): re-mounted. Opts: (null)
Sep 20 23:46:20 rizzo kernel: [46590.133061] BUG: unable to handle kernel NULL pointer dereference at 0000001a
Sep 20 23:46:20 rizzo kernel: [46590.133482] IP: [<c12d908f>] strchr+0xf/0x30
Sep 20 23:46:20 rizzo kernel: [46590.133728] *pdpt = 0000000026347001 *pde = 0000000000000000 
Sep 20 23:46:20 rizzo kernel: [46590.134069] Oops: 0000 [#1] SMP 
Sep 20 23:46:20 rizzo kernel: [46590.134262] Modules linked in: quota_v2 quota_tree dm_multipath scsi_dh vesafb coretemp kvm_intel kvm joydev gpio_ich microcode i7core_edac hid_generic edac_core dcdbas lpc_ich lp mac_hid acpi_power_meter parport usbhid hid mpt2sas scsi_transport_sas raid_class bnx2
Sep 20 23:46:20 rizzo kernel: [46590.135782] 
Sep 20 23:46:20 rizzo kernel: [46590.135861] Pid: 23450, comm: setquota Not tainted 3.5.0-15-generic #22~precise1-Ubuntu Dell Inc. PowerEdge R310/05XKKK
Sep 20 23:46:20 rizzo kernel: [46590.136498] EIP: 0060:[<c12d908f>] EFLAGS: 00010246 CPU: 3
Sep 20 23:46:20 rizzo kernel: [46590.136805] EIP is at strchr+0xf/0x30
Sep 20 23:46:20 rizzo kernel: [46590.137007] EAX: 00003d3d EBX: c1603be4 ECX: 00180000 EDX: 0000003d
Sep 20 23:46:20 rizzo kernel: [46590.137362] ESI: 0000001a EDI: 00380800 EBP: da22beac ESP: da22bea8
Sep 20 23:46:20 rizzo kernel: [46590.137718]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Sep 20 23:46:20 rizzo kernel: [46590.138023] CR0: 8005003b CR2: 0000001a CR3: 130a0000 CR4: 000007f0
Sep 20 23:46:20 rizzo kernel: [46590.138376] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Sep 20 23:46:20 rizzo kernel: [46590.138733] DR6: ffff0ff0 DR7: 00000400
Sep 20 23:46:20 rizzo kernel: [46590.138969] Process setquota (pid: 23450, ti=da22a000 task=f6582610 task.ti=da22a000)
Sep 20 23:46:20 rizzo kernel: [46590.139473] Stack:
Sep 20 23:46:20 rizzo kernel: [46590.139592]  0000002c da22bef4 c11e8057 f779b480 c17b8975 0000002c c17b8597 00000034
Sep 20 23:46:20 rizzo kernel: [46590.140123]  0000002c ca024400 eeb29400 8c02c010 eeb2d000 00000000 2cd6d558 f779b480
Sep 20 23:46:20 rizzo kernel: [46590.149809]  eeb2d000 f779b480 00000000 da22befc c11e85e2 da22bf1c c1189e66 f6d6d550
Sep 20 23:46:20 rizzo kernel: [46590.169372] Call Trace:
Sep 20 23:46:20 rizzo kernel: [46590.178912]  [<c11e8057>] _ext4_show_options+0xf7/0x670
Sep 20 23:46:20 rizzo kernel: [46590.188682]  [<c11e85e2>] ext4_show_options+0x12/0x20
Sep 20 23:46:20 rizzo kernel: [46590.198181]  [<c1189e66>] show_vfsmnt+0xc6/0x110
Sep 20 23:46:20 rizzo kernel: [46590.207743]  [<c116ca70>] m_show+0x10/0x20
Sep 20 23:46:20 rizzo kernel: [46590.216361]  [<c11709e3>] seq_read+0x203/0x380
Sep 20 23:46:20 rizzo kernel: [46590.225585]  [<c115343c>] vfs_read+0x8c/0x160
Sep 20 23:46:20 rizzo kernel: [46590.234509]  [<c11707e0>] ? seq_lseek+0x170/0x170
Sep 20 23:46:20 rizzo kernel: [46590.243111]  [<c115354d>] sys_read+0x3d/0x70
Sep 20 23:46:20 rizzo kernel: [46590.251658]  [<c15e759f>] sysenter_do_call+0x12/0x28
Sep 20 23:46:20 rizzo kernel: [46590.259959] Code: 19 c0 0c 01 8b 75 f8 8b 7d fc 89 ec 5d c3 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 55 89 e5 56 66 66 66 66 90 89 c6 89 d0 88 c4 <ac> 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 8d 
Sep 20 23:46:20 rizzo kernel: [46590.286078] EIP: [<c12d908f>] strchr+0xf/0x30 SS:ESP 0068:da22bea8
Sep 20 23:46:20 rizzo kernel: [46590.294730] CR2: 000000000000001a
Sep 20 23:46:20 rizzo kernel: [46590.316231] ---[ end trace d04680d177f087ac ]---
Comment 4 Brad Figg 2012-09-21 04:42:32 UTC
$ uname -a
Linux rizzo 3.5.0-15-generic #22~precise1-Ubuntu SMP Wed Sep 19 22:07:48 UTC 2012 i686 i686 i386 GNU/Linux



$ sudo parted /dev/sdb print
Model: SEAGATE ST3300657SS (scsi)
Disk /dev/sdb: 300GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name  Flags
 1      1049kB  21.0GB  21.0GB  ext4         p1
 2      21.0GB  41.9GB  21.0GB  ext4         p2
 3      41.9GB  62.9GB  21.0GB  ext4         p3
 4      62.9GB  83.9GB  21.0GB  ext4         p4
 5      83.9GB  84.4GB  523MB                p5
 6      84.4GB  84.9GB  523MB                p6


$ sudo TEST_DEV=/dev/sdb1 TEST_DIR=/mnt/xfstests_test_dir SCRATCH_MNT=/mnt/xfstests_scratch_mnt SCRATCH_DEV=/dev/sdb2 ./check -g auto -x dangerous
Comment 5 Theodore Tso 2012-09-24 03:01:05 UTC
Herton Ronaldo Krzesinski has submitted a fix to this problem.

http://patchwork.ozlabs.org/patch/186011/
Comment 6 Florian Mickler 2012-10-15 21:18:33 UTC
A patch referencing this bug report has been merged in Linux v3.7-rc1:

commit 50df9fd55e4271e89a7adf3b1172083dd0ca199d
Author: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Date:   Sun Sep 23 22:49:12 2012 -0400

    ext4: fix crash when accessing /proc/mounts concurrently