Bug 28232

Summary: Kernel panics with 2.6.38 (rc1, rc2, rc3, rc4) and the lzo compression of btrfs
Product: File System
Reporter: Juan Francisco Cantero Hurtado (iam)
Component: btrfs
Assignee: fs_btrfs (fs_btrfs)
Status: CLOSED CODE_FIX
Severity: high
CC: chris.mason, florian, iam, lizf, maciej.rutecki, rjw
Priority: P1
Hardware: i386
OS: Linux
Kernel Version: 2.6.38-rc4
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Output of /proc/version
Output of ver_linux
Output of /proc/cpuinfo
Output of /proc/modules
Output of /proc/ioports
Output of /proc/iomem
Output of lspci
Output of /proc/scsi/scsi
Oops

Description Juan Francisco Cantero Hurtado 2011-02-05 04:44:44 UTC
I get a lot of panics with rc1, rc2 and rc3 of kernel 2.6.38. I have a few btrfs partitions holding uncompressed data as well as data compressed with lzo and zlib. Before 2.6.38, I used zlib compression for months without any problem. The bug occurs both with and without intensive I/O activity.

"lzo1x_decompress_safe" and "last sysfs file: /sys/devices/system/cpu/cpu3/cache" always appear.
Comment 1 Juan Francisco Cantero Hurtado 2011-02-05 04:50:24 UTC
Created attachment 46352 [details]
Output of /proc/version
Comment 2 Juan Francisco Cantero Hurtado 2011-02-05 04:52:19 UTC
Created attachment 46362 [details]
Output of ver_linux
Comment 3 Juan Francisco Cantero Hurtado 2011-02-05 04:53:55 UTC
Created attachment 46372 [details]
Output of /proc/cpuinfo
Comment 4 Juan Francisco Cantero Hurtado 2011-02-05 04:55:22 UTC
Created attachment 46382 [details]
Output of /proc/modules
Comment 5 Juan Francisco Cantero Hurtado 2011-02-05 04:56:36 UTC
Created attachment 46392 [details]
Output of /proc/ioports
Comment 6 Juan Francisco Cantero Hurtado 2011-02-05 04:57:51 UTC
Created attachment 46402 [details]
Output of /proc/iomem
Comment 7 Juan Francisco Cantero Hurtado 2011-02-05 05:01:24 UTC
Created attachment 46412 [details]
Output of lspci
Comment 8 Juan Francisco Cantero Hurtado 2011-02-05 05:02:44 UTC
Created attachment 46422 [details]
Output of /proc/scsi/scsi
Comment 9 Juan Francisco Cantero Hurtado 2011-02-08 20:56:07 UTC
The same problem with 2.6.38-rc4.
Comment 10 Juan Francisco Cantero Hurtado 2011-02-11 03:43:51 UTC
I've tried many kernel configurations and the bug persists. I've also tested this Fedora kernel ( http://koji.fedoraproject.org/koji/buildinfo?buildID=218287 ) and hit the same problem.

I really need this bug fixed; the computer is unusable with it. If you need any kind of testing or more information, let me know.
Comment 11 Juan Francisco Cantero Hurtado 2011-02-12 00:45:22 UTC
I've tried the x86 and amd64 versions of the Ubuntu 11.04 alpha-2 live CD ( http://cdimage.ubuntu.com/releases/natty/alpha-2/ ). Both ship kernel 2.6.38-rc1.

The amd64 version works without problems. The x86 version fails with the same oops.
Comment 12 Juan Francisco Cantero Hurtado 2011-02-12 23:23:12 UTC
I've removed the regression label from this bug. I originally thought the problem also affected zlib compression, but the bug is only in the lzo code (new in 2.6.38).
Comment 13 Juan Francisco Cantero Hurtado 2011-02-12 23:41:59 UTC
For people who have used lzo compression and now can't access their data:
- Run an amd64 system or an amd64 live CD. Never an x86 system.
- Mount the partition or subvolume holding the inaccessible data without lzo compression activated.
- Mount another partition or subvolume without lzo compression activated.
- Copy all data from the old partition or subvolume to the other partition or subvolume.
- Remove all data from the old partition or subvolume.
- Now copy the data back to the old partition or subvolume.
- Sync the filesystems or reboot.
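
The steps above could be written down roughly as the following dry-run sketch. It only builds the list of commands and runs nothing. The function name, the device paths and the mount points are hypothetical placeholders, and the choice of `cp -a` and `find ... -delete` (GNU tools) is an assumption of this sketch, not something stated in this report. It has not been tested against real data and does not account for snapshots or nested subvolumes.

```python
def recovery_plan(old_dev, old_mnt, spare_dev, spare_mnt, machine):
    """Return the commands for the procedure above as a list of argv lists.

    Nothing is executed here; the caller decides whether and how to run them.
    All arguments are illustrative placeholders supplied by the caller.
    """
    # The procedure says to use an amd64 system, never a 32-bit x86 one.
    if machine not in ("x86_64", "amd64"):
        raise RuntimeError("use an amd64 system or live CD, not " + machine)
    return [
        # Mount both filesystems with no compress option at all.
        ["mount", "-t", "btrfs", old_dev, old_mnt],
        ["mount", "-t", "btrfs", spare_dev, spare_mnt],
        # Copy everything out of the old filesystem, preserving attributes.
        ["cp", "-a", old_mnt + "/.", spare_mnt + "/"],
        # Empty the old filesystem but keep the mount point itself.
        ["find", old_mnt, "-mindepth", "1", "-delete"],
        # Copy the data back so it is rewritten without lzo compression.
        ["cp", "-a", spare_mnt + "/.", old_mnt + "/"],
        # Flush everything to disk.
        ["sync"],
    ]
```

A caller could pass `platform.machine()` as `machine` and print the returned commands for review before running any of them by hand as root.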
Comment 14 Chris Mason 2011-02-13 23:26:51 UTC
Could you please post the oops? I'm unable to reproduce this locally.
Comment 15 Juan Francisco Cantero Hurtado 2011-02-13 23:39:52 UTC
I'm sorry. I forgot the most important file :) .
Comment 16 Juan Francisco Cantero Hurtado 2011-02-13 23:40:41 UTC
Created attachment 47712 [details]
Oops
Comment 17 Li Zefan 2011-02-14 01:48:48 UTC
Thanks for the report! I'll look into it.
Comment 18 Li Zefan 2011-02-16 01:44:12 UTC
I managed to trigger the bug after running a test script for about 10 hours, and I think I know what the cause is.
Comment 19 Juan Francisco Cantero Hurtado 2011-02-17 23:56:04 UTC
I think your patch[1] fixes the bug. I've tested kernel 2.6.38-rc5 plus the patch under intensive disk activity for more than 24 hours and everything works perfectly. Thank you so much for your work :) . You can close the bug.

1.- http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08469.html
Comment 20 Florian Mickler 2011-03-30 21:32:48 UTC
The fix has been merged into v2.6.38-rc7: 

commit ca9b688c1c9a21635cfc8af8b68565b154185196
Author: Li Zefan <lizf@cn.fujitsu.com>
Date:   Wed Feb 16 06:06:41 2011 +0000

    Btrfs: Avoid accessing unmapped kernel address