Bug 7258

Summary: XFS on dm_crypt Oops
Product: File System Reporter: KatagiriWayou (kpr-ee)
Component: XFS Assignee: Christophe Saout (christophe)
Status: CLOSED CODE_FIX    
Severity: high CC: christophe, cw, protasnb
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.18 and 2.6.17-1.2187_FC5 and 2.6.15-1.2054_FC5 Subsystem:
Regression: --- Bisected commit-id:

Description KatagiriWayou 2006-10-03 19:06:18 UTC
Most recent kernel where this bug did not occur: 2.6.9
Distribution: Fedoracore5
Hardware Environment:
------[lspci output]--------------------------------------------------------- 
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a1)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a2)
00:01.2 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:06.0 PCI bridge: nVidia Corporation Unknown device 0370 (rev a2)
00:06.1 Audio device: nVidia Corporation MCP55 High Definition Audio (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a2)
00:0a.0 PCI bridge: nVidia Corporation Unknown device 0376 (rev a2)
00:0b.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a2)
00:0c.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a2)
00:0d.0 PCI bridge: nVidia Corporation Unknown device 0378 (rev a2)
00:0e.0 PCI bridge: nVidia Corporation Unknown device 0375 (rev a2)
00:0f.0 PCI bridge: nVidia Corporation Unknown device 0377 (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM 
Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] 
Miscellaneous Control
01:07.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet 
Controller (Copper) (rev 02)
01:08.0 VGA compatible unclassified device: Texas Instruments TVP4020 
[Permedia 2] (rev 01)
02:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid 
II Controller (rev 01)
04:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid 
II Controller (rev 01)
05:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid 
II Controller (rev 01)
----[CPU]------------------------------------------------------------------
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 79
model name      : AMD Athlon(tm) 64 Processor 3500+
stepping        : 2
cpu MHz         : 1000.000
cache size      : 512 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca 
cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx
 mmxext fxsr_opt rdtscp lm 3dnowext 3dnow up pni cx16 lahf_lm svm cr8_legacy
bogomips        : 2010.91
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
-----------------------------------------------------------------------------
Software Environment:
 ----------------lsmod------------------------------------------------------
aes                    60352  1
dm_crypt               46224  1
xfs                   558936  2
ipv6                  426848  12
ppdev                  43528  0
autofs4                59144  1
hidp                   84096  2
l2cap                  92544  5 hidp
bluetooth             124932  2 hidp,l2cap
sunrpc                211912  1
ip_nat_ftp             37504  0
ip_conntrack_ftp       42448  1 ip_nat_ftp
ip_conntrack_netbios_ns    36736  0
xt_limit               36608  2
xt_tcpudp              37120  30
iptable_filter         36864  1
ipt_MASQUERADE         37632  1
iptable_nat            41988  1
ip_nat                 55084  3 ip_nat_ftp,ipt_MASQUERADE,iptable_nat
ip_conntrack           96164  6 ip_nat_ftp,ip_conntrack_ftp,ip_conntrack_netbios_ns,ipt_MASQUERADE,iptable_nat,ip_nat
nfnetlink              41800  2 ip_nat,ip_conntrack
ip_tables              57184  2 iptable_filter,iptable_nat
x_tables               52616  5 xt_limit,xt_tcpudp,ipt_MASQUERADE,iptable_nat,ip_tables
raid456               155424  1
xor                    39568  1 raid456
raid0                  41472  1
video                  53512  0
button                 41632  0
battery                45064  0
asus_acpi              52516  0
ac                     39688  0
lp                     49232  0
parport_pc             64936  1
parport                77708  3 ppdev,lp,parport_pc
ohci_hcd               55812  0
ehci_hcd               67848  0
floppy                100424  0
sg                     72360  0
serio_raw              41732  0
snd_hda_intel          54300  0
snd_hda_codec         224128  1 snd_hda_intel
snd_seq_dummy          37892  0
snd_seq_oss            70656  0
snd_seq_midi_event     42496  1 snd_seq_oss
snd_seq                97184  5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event
e1000                 160576  0
snd_seq_device         43028  3 snd_seq_dummy,snd_seq_oss,snd_seq
sata_sil24             50820  8
snd_pcm_oss            82176  0
forcedeth              77828  0
snd_mixer_oss          52224  1 snd_pcm_oss
snd_pcm               125704  3 snd_hda_intel,snd_hda_codec,snd_pcm_oss
pcspkr                 37120  0
snd_timer              60680  2 snd_seq,snd_pcm
snd                   102440  9 snd_hda_intel,snd_hda_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore              45344  1 snd
i2c_nforce2            42240  0
i2c_core               58880  1 i2c_nforce2
snd_page_alloc         44560  2 snd_hda_intel,snd_pcm
dm_snapshot            50640  0
dm_zero                35328  0
dm_mirror              56576  0
dm_mod                 99536  9 dm_crypt,dm_snapshot,dm_zero,dm_mirror
raid1                  57600  1
ext3                  177296  2
jbd                    99440  1 ext3
sata_nv                46468  14
libata                142880  2 sata_sil24,sata_nv
sd_mod                 55808  34
scsi_mod              191056  3 sg,libata,sd_mod
------------------------------------------------------------------------------
Problem Description:
The kernel prints an Oops message when an XFS filesystem is used on top of dm_crypt.
With 2.6.15 and 2.6.18 an Oops is output.
With 2.6.17 the system ends in a kernel panic.

Filesystem layout:
[Disk x 6]--[sata_nv]-------------+--Raid6--dm_crypt--XFS
[Disk x 6]--[sii3132(sata_sil24)]-+

I copied files (total size about 1 TB) to the XFS filesystem with "rsync" or "cp".
The problem occurs about once every 10 hours.

The Oops message follows:
------------------------------------------------------------------------------

Oct  3 07:27:21 alice kernel: Unable to handle kernel NULL pointer dereference 
at 0000000000000000 RIP:
Oct  3 07:27:21 alice kernel: <ffffffff80121b7d>{page_to_pfn+0}
Oct  3 07:27:21 alice kernel: PGD 7462c067 PUD 745e0067 PMD 747e0067 PTE 0
Oct  3 07:27:21 alice kernel: Oops: 0000 [1] SMP
Oct  3 07:27:21 alice kernel: last sysfs 
file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Oct  3 07:27:21 alice kernel: CPU 0
Oct  3 07:27:21 alice kernel: Modules linked in: aes dm_crypt raid5 nls_utf8 hfsplus xfs exportfs ipv6 ppdev autofs4 i2c_dev i2c_core hidp l2cap bluetooth vmnet(U) vmmon(U) sunrpc ip_nat_ftp ip_conntrack_ftp ip_conntrack_netbios_ns xt_limit xt_tcpudp iptable_filter ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables dm_mirror dm_mod raid6 xor raid0 video button battery ac lp parport_pc parport floppy nvram sg ehci_hcd ohci_hcd sata_sil24 e1000 snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm forcedeth snd_timer snd soundcore snd_page_alloc raid1 ext3 jbd sata_nv libata sd_mod scsi_mod
Oct  3 07:27:21 alice kernel: Pid: 6567, comm: pdflush Tainted: P      2.6.15-1.2054_FC5.root #1
Oct  3 07:27:21 alice kernel: RIP: 0010:[<ffffffff80121b7d>] <ffffffff80121b7d>
{page_to_pfn+0}
Oct  3 07:27:21 alice kernel: RSP: 0018:ffff8100366ad760  EFLAGS: 00010293
Oct  3 07:27:21 alice kernel: RAX: 0000000000000000 RBX: ffff81007ec18540 RCX: 
0000000000000000
Oct  3 07:27:21 alice kernel: RDX: 0000000000000016 RSI: ffff81007ec18540 RDI: 
0000000000000000
Oct  3 07:27:21 alice kernel: RBP: ffff81002abec400 R08: 0000000000000000 R09: 
ffff81003d6fa7a0
Oct  3 07:27:21 alice kernel: R10: 0000000000000000 R11: ffffffff80338fbb R12: 
ffff81007ec18540
Oct  3 07:27:21 alice kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 
ffff8100033b4ce8
Oct  3 07:27:21 alice kernel: FS:  00002accf750dd10(0000) GS:ffffffff8050b000
(0000) knlGS:00000000f7f9a6b0
Oct  3 07:27:21 alice kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
Oct  3 07:27:21 alice kernel: CR2: 0000000000000000 CR3: 00000000746c2000 CR4: 
00000000000006e0
Oct  3 07:27:21 alice kernel: Process pdflush (pid: 6567, threadinfo 
ffff8100366ac000, task ffff81003d6fa7a0)
Oct  3 07:27:21 alice kernel: Stack: ffffffff801f30c0 ffff81003d6fa7a0 
0000000000006c80 0000000000000000
Oct  3 07:27:21 alice kernel:        ffff810000000000 0000000000000001 
ffff81007ec18540 ffff8100033b4ce8
Oct  3 07:27:21 alice kernel:        ffff810027c32cc0 ffff81001acd9728
Oct  3 07:27:21 alice kernel: Call Trace: <ffffffff801f30c0>
{blk_recount_segments+126}
Oct  3 07:27:21 alice kernel:        <ffffffff80183c00>{__bio_clone+113} 
<ffffffff80183c4f>{bio_clone+53}
Oct  3 07:27:21 alice kernel:        <ffffffff885e673a>
{:dm_crypt:crypt_map+205} <ffffffff883002c9>{:dm_mod:__map_bio+71}
Oct  3 07:27:21 alice kernel:        <ffffffff88300cea>
{:dm_mod:__split_bio+381} <ffffffff8830151a>{:dm_mod:dm_request+337}
Oct  3 07:27:21 alice kernel:        <ffffffff801f3f52>
{generic_make_request+365} <ffffffff801f573c>{submit_bio+186}
Oct  3 07:27:21 alice kernel:        <ffffffff80183691>{__bio_add_page+393} 
<ffffffff8858345a>{:xfs:xfs_submit_ioend_bio+30}
Oct  3 07:27:21 alice kernel:        <ffffffff88583ea2>
{:xfs:xfs_page_state_convert+2623}
Oct  3 07:27:21 alice kernel:        <ffffffff88584218>
{:xfs:linvfs_writepage+167} <ffffffff801a0028>{mpage_writepages+462}
Oct  3 07:27:21 alice kernel:        <ffffffff88584171>
{:xfs:linvfs_writepage+0} <ffffffff80160f40>{do_writepages+46}
Oct  3 07:27:21 alice kernel:        <ffffffff8019e937>
{__writeback_single_inode+449} <ffffffff8019edf6>{sync_sb_inodes+472}
Oct  3 07:27:21 alice kernel:        <ffffffff80145afb>
{keventd_create_kthread+0} <ffffffff8019f3a5>{writeback_inodes+149}
Oct  3 07:27:21 alice kernel:        <ffffffff8016127a>
{background_writeout+112} <ffffffff8016189a>{pdflush+0}
Oct  3 07:27:21 alice kernel:        <ffffffff80161a0e>{pdflush+372} 
<ffffffff8016120a>{background_writeout+0}
Oct  3 07:27:21 alice kernel:        <ffffffff80145de0>{kthread+254} 
<ffffffff8010b8e6>{child_rip+8}
Oct  3 07:27:21 alice kernel:        <ffffffff80145afb>
{keventd_create_kthread+0} <ffffffff80338fbb>{thread_return+0}
Oct  3 07:27:21 alice kernel:        <ffffffff80145ce2>{kthread+0} 
<ffffffff8010b8de>{child_rip+0}
Oct  3 07:27:21 alice kernel:
Oct  3 07:27:21 alice kernel: Code: 48 8b 07 48 c1 e8 38 48 8b 04 c5 40 41 53 
80 48 2b b8 20 0a
Oct  3 07:27:21 alice kernel: RIP <ffffffff80121b7d>{page_to_pfn+0} RSP 
<ffff8100366ad760>
Oct  3 07:27:21 alice kernel: CR2: 0000000000000000


Steps to reproduce:
Comment 1 Pavel 2006-11-22 11:23:42 UTC
I have the same problem.

device structure:

/dev/md0		: software RAID 5
/dev/mapper/md0-aes	: (software RAID 5) with dm_crypt
/dev/mapper/lnvg-store	: ((software RAID 5) with dm_crypt) 

(((software RAID 5) with dm_crypt) LVM) XFS

Distribution: Gentoo
Hardware Environment:
------[lspci output]--------------------------------------------------------- 
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation Unknown device 0050 (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio
Controller (rev a2)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron]
HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM
Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron]
Miscellaneous Control
01:00.0 VGA compatible controller: ATI Technologies Inc Unknown device 5b63
01:00.1 Display controller: ATI Technologies Inc Unknown device 5b73
05:06.0 CardBus bridge: Ricoh Co Ltd RL5c475 (rev 80)
05:07.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc)
SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
05:08.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc)
SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
----[CPU]------------------------------------------------------------------
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 47
model name      : AMD Athlon(tm) 64 Processor 3200+
stepping        : 2
cpu MHz         : 2015.031
cache size      : 512 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow
pni lahf_lm
bogomips        : 4032.47
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
----------------lsmod--------------------------------------------------------
Module                  Size  Used by
aes_x86_64             27688  0
sha256                 10816  0
dm_crypt               11664  0
iptable_nat             8388  1
ip_nat                 17452  1 iptable_nat
ip_conntrack           48356  2 iptable_nat,ip_nat
iptable_filter          4864  1
ip_tables              19112  2 iptable_nat,iptable_filter
x_tables               14792  2 iptable_nat,ip_tables
pppoe                  14848  2
pppox                   5200  1 pppoe
ppp_async              11136  0
ppp_generic            21728  7 pppoe,pppox,ppp_async
slhc                    7936  1 ppp_generic
crc_ccitt               4160  1 ppp_async
i2c_nforce2             8896  0
it87                   23844  0
hwmon_vid               4544  1 it87
i2c_isa                 6656  1 it87
i2c_core               19904  3 i2c_nforce2,it87,i2c_isa
dm_snapshot            15032  0
dm_mirror              18560  0
dm_mod                 48464  2 dm_snapshot,dm_mirror
The Oops message follows:
------------------------------------------------------------------------------

Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
<ffffffff802520dc>{page_to_pfn+0}
PGD 2ace1067 PUD cb85067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: aes_x86_64 sha256 dm_crypt capability commoncap iptable_nat
ip_nat ip_conntrack iptable_filter ip_tables x_tables pppoe pppox ppp_async
ppp_generic slhc crc_ccitt serial_cs pcmcia firmware_class yenta_socket
rsrc_nonstatic pcmcia_core i2c_nforce2 it87 hwmon_vid i2c_isa i2c_core
dm_snapshot dm_mirror dm_mod
Pid: 30349, comm: pdflush Not tainted 2.6.17-gentoo-r7 #32
RIP: 0010:[<ffffffff802520dc>] <ffffffff802520dc>{page_to_pfn+0}
RSP: 0018:ffff810021af56b0  EFLAGS: 00010297
RAX: 0000000000000000 RBX: ffff810033ff54c0 RCX: 0000000000000000
RDX: 000000000000000d RSI: ffff810033ff54c0 RDI: 0000000000000000
RBP: ffff8100276b2c00 R08: 0000000000000000 R09: ffff81003f00f850
R10: 0000000000000282 R11: ffff81003f6c7de8 R12: ffff810033ff54c0
R13: 0000000000000000 R14: 0000000000000000 R15: ffff81003f7e5350
FS:  00002ae66af90f30(0000) GS:ffffffff8076d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000026fff000 CR4: 00000000000006e0
Process pdflush (pid: 30349, threadinfo ffff810021af4000, task ffff81003f00f850)
Stack: ffffffff8038a880 ffff81003f00f850 0000000000001840 0000000000000000
       ffff810000000001 0000000000000001 ffff810033ff54c0 ffff81003f7e5350
       ffff81001fb3e1c0 ffff8100276b1d28
Call Trace: <ffffffff8038a880>{blk_recount_segments+126}
       <ffffffff80275d52>{__bio_clone+113} <ffffffff80275ea3>{bio_clone+53}
       <ffffffff8809e731>{:dm_crypt:crypt_map+205}
<ffffffff880022b6>{:dm_mod:__map_bio+71}
       <ffffffff88002b3c>{:dm_mod:__split_bio+370}
<ffffffff8038a919>{blk_recount_segments+279}
       <ffffffff88003349>{:dm_mod:dm_request+257}
<ffffffff8038b716>{generic_make_request+342}
       <ffffffff880022b6>{:dm_mod:__map_bio+71}
<ffffffff88002b3c>{:dm_mod:__split_bio+370}
       <ffffffff8039755d>{__up_read+19} <ffffffff88003349>{:dm_mod:dm_request+257}
       <ffffffff8038b716>{generic_make_request+342}
<ffffffff8038d3ae>{submit_bio+184}
       <ffffffff80275c30>{__bio_add_page+340}
<ffffffff80376de3>{xfs_submit_ioend_bio+30}
       <ffffffff8037781b>{xfs_page_state_convert+2607}
<ffffffff80377b8a>{xfs_vm_writepage+167}
       <ffffffff802916d0>{mpage_writepages+437}
<ffffffff80377ae3>{xfs_vm_writepage+0}
       <ffffffff80254234>{do_writepages+41}
<ffffffff8028ff61>{__writeback_single_inode+436}
       <ffffffff88002963>{:dm_mod:dm_any_congested+56}
<ffffffff880043a3>{:dm_mod:dm_table_any_congested+70}
       <ffffffff802903b3>{sync_sb_inodes+469}
<ffffffff80240bfb>{keventd_create_kthread+0}
       <ffffffff80290923>{writeback_inodes+125}
<ffffffff8025454a>{background_writeout+118}
       <ffffffff80254b5a>{pdflush+0} <ffffffff80254c9c>{pdflush+322}
       <ffffffff802544d4>{background_writeout+0} <ffffffff80240eb6>{kthread+212}
       <ffffffff8020a4da>{child_rip+8} <ffffffff80240bfb>{keventd_create_kthread+0}
       <ffffffff80240de2>{kthread+0} <ffffffff8020a4d2>{child_rip+0}

Code: 48 8b 07 48 c1 e8 3a 48 8b 14 c5 a0 66 77 80 48 b8 b7 6d db
RIP <ffffffff802520dc>{page_to_pfn+0} RSP <ffff810021af56b0>
CR2: 0000000000000000
Comment 2 Dominik Sandjaja 2006-11-23 06:11:25 UTC
Does this also happen on non-XFS filesystems?
Comment 3 KatagiriWayou 2006-11-23 08:59:24 UTC
No, this happens only on XFS.
I tried JFS, and the problem does not happen there.
Comment 4 Christophe Saout 2006-12-03 15:13:55 UTC
We have finally found out why there are sometimes strange problems with
dm-crypt. If readaheads are cancelled by the underlying block device, the
buffer/page cache could be populated with bogus data. Unfortunately this
showed up very rarely and under strange, hard-to-reproduce circumstances,
and apparently only on top of software raid5.

Anyway, that bug is fixed in 2.6.19 and will be in 2.6.18.6 (missed .5 by some
hours).

The patch can be found here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=116503133222152&w=2

The problem would typically show up as metadata corruption, which not all
filesystems can gracefully handle (i.e. without oopsing).
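
To make the failure mode above concrete, here is a small userspace model in
Python. It is only a sketch and not the actual patch: the thread does not say
how a cancelled readahead is reported upward, so the model assumes it arrives
as a completion with no error code but without an "up to date" flag. All
function names and the XOR "cipher" are hypothetical.

```python
from typing import Optional

EIO = 5  # errno value for an I/O error


def toy_decrypt(data: bytes, key: int = 0x5A) -> bytes:
    """Stand-in for real decryption: XOR every byte with a fixed key."""
    return bytes(b ^ key for b in data)


def endio_error_only(error: int, uptodate: bool, buf: bytes) -> Optional[bytes]:
    """Trusts the error code alone (assumed faulty behaviour).

    A cancelled readahead with error == 0 gets "decrypted" even though
    the buffer was never filled, so garbage reaches the cache.
    """
    if error:
        return None
    return toy_decrypt(buf)


def endio_checked(error: int, uptodate: bool, buf: bytes) -> Optional[bytes]:
    """Treats "not up to date" as a failure even when error == 0."""
    if not uptodate and not error:
        error = -EIO
    if error:
        return None
    return toy_decrypt(buf)
```

In this model a cancelled readahead is the call `endio_error_only(0, False, bytes(4))`,
which hands back four bytes of decrypted garbage, while `endio_checked` with the
same arguments returns `None` so nothing is cached.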
Comment 5 Christophe Saout 2006-12-14 16:28:41 UTC
Yikes. The page_to_pfn used with sparse memory barfs when hitting freed pages
while cloning writeout bios under memory pressure. bio_clone shouldn't look at
these pages anyway.

I see four possibilities:

- don't set bv_page to NULL (ugly)
- before freeing the pages change bi_idx (atomically?) so that nobody ever looks
at the freed bv_page? (strange)
- implement own bio_clone (ugly)
- Since bio_clone doesn't share the bv array any more, another possibility would
be to not use bio_clone at all and go with bio_set_alloc all the way.

Probably the last solution.
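
For readers following along, the second option in the list (move bi_idx
before freeing) can be illustrated with a userspace model in Python. This is
a sketch under assumptions, not kernel code: the classes only borrow the
kernel field names, the segment count is simplified to runs of consecutive
page frame numbers, and `free_done_pages` and `make_bio` are hypothetical
helpers.

```python
from dataclasses import dataclass
from typing import List, Optional


class NullPageError(Exception):
    """Stands in for the kernel NULL pointer dereference."""


@dataclass
class Page:
    pfn: int


@dataclass
class BioVec:
    bv_page: Optional[Page]


@dataclass
class Bio:
    bi_io_vec: List[BioVec]
    bi_idx: int = 0


def page_to_pfn(page: Optional[Page]) -> int:
    if page is None:
        raise NullPageError("page_to_pfn called on a freed (NULL) bv_page")
    return page.pfn


def recount_segments(bio: Bio) -> int:
    """Count runs of consecutive pfns, starting at bi_idx."""
    segments = 0
    prev = None
    for vec in bio.bi_io_vec[bio.bi_idx:]:
        pfn = page_to_pfn(vec.bv_page)
        if prev is None or pfn != prev + 1:
            segments += 1
        prev = pfn
    return segments


def bio_clone(bio: Bio) -> Bio:
    """Copy the vector array (not shared) and recount the clone's segments."""
    clone = Bio([BioVec(v.bv_page) for v in bio.bi_io_vec], bio.bi_idx)
    recount_segments(clone)
    return clone


def free_done_pages(bio: Bio, count: int, advance_idx: bool) -> None:
    """Free `count` pages from bi_idx on, optionally moving bi_idx first."""
    start = bio.bi_idx
    if advance_idx:
        bio.bi_idx = start + count
    for vec in bio.bi_io_vec[start:start + count]:
        vec.bv_page = None


def make_bio(pfns: List[int]) -> Bio:
    return Bio([BioVec(Page(p)) for p in pfns])
```

With `free_done_pages(bio, 2, advance_idx=False)` a later `bio_clone(bio)`
raises `NullPageError`, which is the model's version of the Oops. With
`advance_idx=True` the clone never looks at the freed entries.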
Comment 6 Natalie Protasevich 2008-03-30 14:44:32 UTC
What is the status of this bug? Has it been resolved?