Bug 12549 - kernel crash on using btrfs
Summary: kernel crash on using btrfs
Status: RESOLVED OBSOLETE
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Chris Mason
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-27 01:32 UTC by Andrey Sipin
Modified: 2012-10-30 15:49 UTC (History)
3 users (show)

See Also:
Kernel Version: 2.6.29-r2
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Andrey Sipin 2009-01-27 01:32:22 UTC
Latest working kernel version: 2.6.29-r2
Earliest failing kernel version: 2.6.29-r2
Distribution: Gentoo
Hardware Environment: Pentium 4m, 
sipin saa # lspci
00:00.0 Host bridge: Silicon Integrated Systems [SiS] 645xx (rev 51)
00:01.0 PCI bridge: Silicon Integrated Systems [SiS] SiS AGP Port (virtual PCI-to-PCI bridge)
00:02.0 ISA bridge: Silicon Integrated Systems [SiS] SiS963 [MuTIOL Media IO] (rev 25)
00:02.1 SMBus: Silicon Integrated Systems [SiS] SiS961/2 SMBus Controller
00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE]
00:02.6 Modem: Silicon Integrated Systems [SiS] AC'97 Modem Controller (rev a0)
00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] AC'97 Sound Controller (rev a0)
00:03.0 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 0f)
00:03.1 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 0f)
00:03.3 USB Controller: Silicon Integrated Systems [SiS] USB 2.0 Controller
00:04.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 PCI Fast Ethernet (rev 90)
00:0a.0 CardBus bridge: Ricoh Co Ltd RL5c475 (rev b8)
00:0a.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C551 IEEE 1394 Controller
00:0c.0 Network controller: Broadcom Corporation BCM4306 802.11b/g Wireless LAN Controller (rev 03)
01:00.0 VGA compatible controller: ATI Technologies Inc RV350 [Mobility Radeon 9600 M10]

Software Environment: glibc 2.9
Problem Description: crash when copying or deleting many little files (portage tree) on btrfs (v 0.17)

[   86.026081] ------------[ cut here ]------------
[   86.026672] kernel BUG at fs/btrfs/extent-tree.c:3106!
[   86.026672] invalid opcode: 0000 [#1]
[   86.026672] last sysfs file: /sys/devices/pci0000:00/0000:00:0c.0/ssb0:0/firmware/ssb0:0/loading
[   86.026672] Modules linked in: radeon drm snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device ircomm_tty ircomm p4_clockmod speedstep_lib msr cpufreq_ondemand freq_table snd_pcm_oss snd_mixer_oss b43 rng_core mac80211 irtty_sir sir_dev firewire_ohci firewire_core ssb irda crc_itu_t crc_ccitt yenta_socket pcmcia ohci1394 rsrc_nonstatic ehci_hcd ohci_hcd sr_mod pcmcia_core wbsd ppdev ieee1394 sis_agp mmc_core agpgart cdrom parport_pc parport battery [last unloaded: microcode]
[   86.026672]
[   86.026672] Pid: 2605, comm: rm Not tainted (2.6.29-rc2 #1) A2C/S
[   86.026672] EIP: 0060:[<c02dffe5>] EFLAGS: 00010257 CPU: 0
[   86.026672] EIP is at __btrfs_reserve_extent+0x33d/0x43a
[   86.026672] EAX: f6ae4ca8 EBX: f6978d80 ECX: ffffffff EDX: 00000001
[   86.026672] ESI: f6ae4ca0 EDI: f6ae4ca0 EBP: f6ae4ca8 ESP: f3e4a7f4
[   86.026672]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[   86.026672] Process rm (pid: 2605, ti=f3e4a000 task=f713c000 task.ti=f3e4a000)
[   86.026672] Stack:
[   86.026672]  c053fdd0 01c00000 00000000 11de0000 00000000 10e6a000 00000000 00f66000
[   86.026672]  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000024
[   86.026672]  f6bd6064 f69a5c00 f6f0c030 f6bd6000 00000000 00000002 00000000 f69a5c00
[   86.026672] Call Trace:
[   86.026672]  [<c02e0181>] btrfs_alloc_extent+0x9f/0x14d
[   86.026672]  [<c02e02f9>] btrfs_alloc_free_block+0xca/0x101
[   86.026672]  [<c02cce80>] __btrfs_cow_block+0x249/0x944
[   86.026672]  [<c02e6450>] verify_parent_transid+0x4e/0x179
[   86.026672]  [<c02cdbe0>] btrfs_cow_block+0x156/0x1f8
[   86.026672]  [<c02d2bdc>] btrfs_search_slot+0x1aa/0x825
[   86.026672]  [<c0300f58>] clear_state_bit+0x7b/0x119
[   86.026672]  [<c0300f58>] clear_state_bit+0x7b/0x119
[   86.026672]  [<c0301175>] clear_extent_bit+0x17f/0x31c
[   86.026672]  [<c02d52f8>] lookup_extent_backref+0x4e/0x116
[   86.026672]  [<c02d7e24>] free_extents+0xcb/0x78a
[   86.026672]  [<c02d885a>] del_pending_extents+0x377/0x3bf
[   86.026672]  [<c02dbbac>] __btrfs_inc_extent_ref+0x224/0x23d
[   86.026672]  [<c02d76d6>] btrfs_inc_ref+0x263/0x416
[   86.026672]  [<c02db988>] __btrfs_inc_extent_ref+0x0/0x23d
[   86.026672]  [<c02ccfff>] __btrfs_cow_block+0x3c8/0x944
[   86.026672]  [<c02cdbe0>] btrfs_cow_block+0x156/0x1f8
[   86.026672]  [<c02d2bdc>] btrfs_search_slot+0x1aa/0x825
[   86.026672]  [<c032f5e1>] chksum_update+0xb/0x12
[   86.026672]  [<c032701e>] shash_update_unaligned+0x79/0x84
[   86.026672]  [<c021a4b0>] update_curr+0x65/0xa8
[   86.026672]  [<c02d37b7>] btrfs_insert_empty_items+0x75/0x40a
[   86.026672]  [<c021893b>] dequeue_task+0x68/0xce
[   86.026672]  [<c04a1b8d>] schedule+0x166/0x391
[   86.026672]  [<c030fc7a>] btrfs_insert_orphan_item+0x6a/0x80
[   86.026672]  [<c02ea75b>] btrfs_orphan_add+0x4c/0x53
[   86.026672]  [<c02ed6f3>] btrfs_rmdir+0xaa/0x138
[   86.026672]  [<c0270c5a>] vfs_rmdir+0x57/0x8f
[   86.026672]  [<c0272d8e>] do_rmdir+0xb5/0xc6
[   86.026672]  [<c0202dc5>] sysenter_do_call+0x12/0x25
[   86.026672] Code: d0 fd 53 c0 e8 d5 d5 f3 ff 8b 54 24 64 8b 4c 24 68 89 d8 e8 a8 5b 03 00 8b 36 8b 06 0f 18 00 90 39 f7 75 90 89 e8 e8 19 eb f4 ff <0f> 0b eb fe 8b 54 24 70 39 54 24 68 0f 82 e2 fe ff ff 77 0e 8b
[   86.026672] EIP: [<c02dffe5>] __btrfs_reserve_extent+0x33d/0x43a SS:ESP 0068:f3e4a7f4
[   86.076172] ---[ end trace 1b03314584f2ece8 ]---


Steps to reproduce:
Comment 1 Chris Mason 2009-01-27 06:40:13 UTC
Josef is chasing this one, we should have a patch this week.
Comment 2 Valentin QUEQUET 2009-02-16 07:01:23 UTC
This bug #12549 might be coalesced with bug #12555.

Sincerely,
Valentin QUEQUET
Comment 3 Luca Bonissi 2009-03-04 05:33:19 UTC
I reproduced the bug with 2.6.29-rc6. Step to reproduce:

- Create 1GB btrfs volume
- Mount it with -o compress
- Copy 2 times compiled kernel tree
- Copy 1 time uncompile kernel tree (disk usage about 71%)
- Try to read/copy the all files in btrfs filesystem to another location

Now I'm trying if the bug exist by copying only 1 time the compiled kernel tree (disk usage about 30%).

[73829.662671] ------------[ cut here ]------------
[73829.662678] kernel BUG at fs/btrfs/extent-tree.c:3190!
[73829.662684] invalid opcode: 0000 [#1] 
[73829.662690] last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/resource
[73829.662696] Modules linked in: fschmd fscpos i2c_dev ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss rfcomm l2cap bluetooth nfsd lockd auth_rpcgss sunrpc exportfs ipv6 nls_iso8859_1 nls_cp437 vfat fat lp pcspkr psmouse fuse sg usb_storage usblp sbp2 snd_sbawe snd_opl3_lib snd_sb16_dsp snd_sb16_csp snd_sb_common snd_mpu401_uart ns558 evdev container parport_pc parport rtc_cmos rtc_core rtc_lib snd_emu10k1 snd_rawmidi serio_raw thermal snd_ac97_codec ac97_bus processor button thermal_sys hwmon snd_pcm snd_seq_device snd_timer snd_page_alloc snd_util_mem 8139too mii snd_hwdep snd ohci1394 soundcore i2c_piix4 emu10k1_gp ehci_hcd gameport i2c_core intel_agp ieee1394 uhci_hcd shpchp agpgart
[73829.662813] 
[73829.662820] Pid: 3929, comm: cp Tainted: G        W  (2.6.29-rc6se #2) 
[73829.662826] EIP: 0060:[<c027e302>] EFLAGS: 00010257 CPU: 0
[73829.662847] EIP is at __btrfs_reserve_extent+0x2c2/0x410
[73829.662853] EAX: dd8c4ac8 EBX: dd8c4ac0 ECX: ffffffff EDX: 00000001
[73829.662859] ESI: dd8c4a40 EDI: dd8c4ac8 EBP: c0f897f0 ESP: c0f89790
[73829.662866]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
[73829.662873] Process cp (pid: 3929, ti=c0f88000 task=c7a22360 task.ti=c0f88000)
[73829.662878] Stack:
[73829.662882]  c0598654 27400000 00000000 03200000 00000000 03110000 00000000 000ef000
[73829.662893]  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000024
[73829.662905]  00000000 dd944c00 c4ff2060 00000024 00000000 00000002 00000000 dd944c00
[73829.662918] Call Trace:
[73829.662924]  [<c0281428>] ? btrfs_alloc_extent+0x78/0x120
[73829.662936]  [<c0281578>] ? btrfs_alloc_free_block+0xa8/0xf0
[73829.662947]  [<c027549e>] ? __btrfs_cow_block+0x21e/0x960
[73829.662956]  [<c01607ea>] ? free_hot_page+0xa/0x10
[73829.662970]  [<c02a8479>] ? map_extent_buffer+0xb9/0xc0
[73829.662987]  [<c0163412>] ? activate_page+0x132/0x150
[73829.663000]  [<c0276228>] ? btrfs_cow_block+0x178/0x240
[73829.663008]  [<c0279477>] ? btrfs_search_slot+0x197/0x8d0
[73829.663017]  [<c02bdbbe>] ? btrfs_remove_free_space_lock+0x1e/0x30
[73829.663034]  [<c027a121>] ? btrfs_insert_empty_items+0x61/0x410
[73829.663044]  [<c028113a>] ? __btrfs_alloc_reserved_extent+0xca/0x340
[73829.663054]  [<c028147f>] ? btrfs_alloc_extent+0xcf/0x120
[73829.663064]  [<c0281578>] ? btrfs_alloc_free_block+0xa8/0xf0
[73829.663074]  [<c027549e>] ? __btrfs_cow_block+0x21e/0x960
[73829.663082]  [<c02aa37d>] ? unlock_extent+0x3d/0x50
[73829.663092]  [<c02aa8ef>] ? __extent_read_full_page+0x55f/0x860
[73829.663101]  [<c02a8479>] ? map_extent_buffer+0xb9/0xc0
[73829.663110]  [<c0276228>] ? btrfs_cow_block+0x178/0x240
[73829.663119]  [<c0279477>] ? btrfs_search_slot+0x197/0x8d0
[73829.663128]  [<c028ae36>] ? btrfs_lookup_inode+0x36/0xb0
[73829.663141]  [<c0294cb5>] ? btrfs_update_inode+0x45/0xd0
[73829.663156]  [<c0295d1e>] ? btrfs_dirty_inode+0x4e/0x70
[73829.663164]  [<c01992bc>] ? __mark_inode_dirty+0x2c/0x150
[73829.663178]  [<c016307c>] ? ondemand_readahead+0xbc/0x150
[73829.663186]  [<c0191cfb>] ? touch_atime+0xbb/0xf0
[73829.663199]  [<c015d454>] ? generic_file_aio_read+0x564/0x5e0
[73829.663208]  [<c024f1e6>] ? jfs_open+0x16/0xa0
[73829.663218]  [<c018090c>] ? do_sync_read+0xcc/0x110
[73829.663238]  [<c01354d0>] ? autoremove_wake_function+0x0/0x40
[73829.663248]  [<c02cadbf>] ? security_file_permission+0xf/0x20
[73829.663267]  [<c01809a7>] ? rw_verify_area+0x57/0xc0
[73829.663275]  [<c0181114>] ? vfs_read+0x94/0x130
[73829.663284]  [<c0180840>] ? do_sync_read+0x0/0x110
[73829.663292]  [<c018148d>] ? sys_read+0x3d/0x70
[73829.663300]  [<c0103ada>] ? syscall_call+0x7/0xb
[73829.663313] Code: 4d 0c e8 d2 f1 03 00 8b b6 80 00 00 00 83 c6 80 8b 86 80 00 00 00 0f 18 00 90 8d 86 80 
00 00 00 39 d8 75 85 89 f8 e8 0e a9 eb ff <0f> 0b eb fe 8b 55 e4 8b 82 24 01 00 00 8b 50 20 c7 44 24 10 00 
[73829.663373] EIP: [<c027e302>] __btrfs_reserve_extent+0x2c2/0x410 SS:ESP 0068:c0f89790
[73829.663389] ---[ end trace 3f5ff6f07f3b1174 ]---
Comment 4 Josef Bacik 2009-03-04 08:08:59 UTC
Do you happen to have the messages that came before the ---[ cut here ]--- part of that?  I'd like to see if its a true -ENOSPC or not.  Either way thank you for the reproducer, I will add it to my test script.
Comment 5 Luca Bonissi 2009-03-06 07:30:34 UTC
I tried 2.6.29-rc7, same problem:

[  126.544527] btrfs: use compression
[  166.548940] btrfs: disabling barriers on dev /dev/loop0
[  627.983608] btrfs searching for 4096 bytes, num_bytes 4096, loop 2, allowed_alloc 1
[  627.983614] btrfs allocation failed flags 36, wanted 4096
[  627.983617] space_info has 65536 free, is full
[  627.983621] space_info total=161021952, pinned=4755456, delalloc=0, may_use=0, used=156200960
[  627.983625] block group 29360128 has 53673984 bytes, 50446336 used 3162112 pinned 0 reserved
[  627.983629] 0 blocks of free space at or bigger than bytes is
[  627.983633] block group 297730048 has 53673984 bytes, 52871168 used 802816 pinned 0 reserved
[  627.983636] 0 blocks of free space at or bigger than bytes is
[  627.983640] block group 566099968 has 53673984 bytes, 52883456 used 790528 pinned 0 reserved
[  627.983643] 0 blocks of free space at or bigger than bytes is
[  627.983683] ------------[ cut here ]------------
[  627.983688] kernel BUG at fs/btrfs/extent-tree.c:3412!
[  627.983691] invalid opcode: 0000 [#1] SMP


Detailed step to reproduce:
- Create 1GB file system
  + First case:
    - Mount with -o compress option
    - untar 4 times (in different directories) linux-2.6.28.tar.bz2 (usage 50%)
    - go to the root mount point, do "tar cf - . | cat > /dev/null"
  + Second case:
    - Mount normally (without compression)
    - untar 2 times (in different directories) linux-2.6.28.tar.bz2 (usage 65%)
    - go to the root mount point, do "tar cf - . | cat > /dev/null"
  + Third case:
    - Mount with -o compress option
    - untar 6 times (in different directories) linux-2.6.28.tar.bz2 (usage up
      to 77%, then kernel BUG).

When reading files with tar, the used space grows by 40-50 MB and then it returns to the original size (it do this many times).
Comment 6 Alan 2012-10-30 15:49:54 UTC
If this is still seen on modern kernels then please re-open/update

Note You need to log in before you can comment on or make changes to this bug.