Most recent kernel where this bug did not occur: 2.6.9 Distribution: Fedoracore5 Hardware Environment: ------[lspci output]--------------------------------------------------------- 00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a1) 00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a2) 00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a2) 00:01.2 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2) 00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1) 00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2) 00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1) 00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2) 00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2) 00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2) 00:06.0 PCI bridge: nVidia Corporation Unknown device 0370 (rev a2) 00:06.1 Audio device: nVidia Corporation MCP55 High Definition Audio (rev a2) 00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a2) 00:0a.0 PCI bridge: nVidia Corporation Unknown device 0376 (rev a2) 00:0b.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a2) 00:0c.0 PCI bridge: nVidia Corporation Unknown device 0374 (rev a2) 00:0d.0 PCI bridge: nVidia Corporation Unknown device 0378 (rev a2) 00:0e.0 PCI bridge: nVidia Corporation Unknown device 0375 (rev a2) 00:0f.0 PCI bridge: nVidia Corporation Unknown device 0377 (rev a2) 00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration 00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map 00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller 00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control 01:07.0 Ethernet controller: Intel Corporation 82544EI Gigabit Ethernet Controller (Copper) (rev 02) 01:08.0 VGA compatible unclassified 
device: Texas Instruments TVP4020 [Permedia 2] (rev 01) 02:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01) 04:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01) 05:00.0 Mass storage controller: Silicon Image, Inc. SiI 3132 Serial ATA Raid II Controller (rev 01) ----[CPU]------------------------------------------------------------------ processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 79 model name : AMD Athlon(tm) 64 Processor 3500+ stepping : 2 cpu MHz : 1000.000 cache size : 512 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow up pni cx16 lahf_lm svm cr8_legacy bogomips : 2010.91 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc ----------------------------------------------------------------------------- Software Environment: ----------------lsmod------------------------------------------------------ aes 60352 1 dm_crypt 46224 1 xfs 558936 2 ipv6 426848 12 ppdev 43528 0 autofs4 59144 1 hidp 84096 2 l2cap 92544 5 hidp bluetooth 124932 2 hidp,l2cap sunrpc 211912 1 ip_nat_ftp 37504 0 ip_conntrack_ftp 42448 1 ip_nat_ftp ip_conntrack_netbios_ns 36736 0 xt_limit 36608 2 xt_tcpudp 37120 30 iptable_filter 36864 1 ipt_MASQUERADE 37632 1 iptable_nat 41988 1 ip_nat 55084 3 ip_nat_ftp,ipt_MASQUERADE,iptable_nat ip_conntrack 96164 6 ip_nat_ftp,ip_conntrack_ftp,ip_conntrack_netbios_ns,ipt_MASQUERADE,iptable_nat, ip_nat nfnetlink 41800 2 ip_nat,ip_conntrack ip_tables 57184 2 iptable_filter,iptable_nat x_tables 52616 5 xt_limit,xt_tcpudp,ipt_MASQUERADE,iptable_nat,ip_tables raid456 155424 1 xor 39568 1 raid456 raid0 41472 1 video 53512 0 button 41632 0 battery 45064 0 asus_acpi 52516 0 ac 39688 0 lp 
49232 0 parport_pc 64936 1 parport 77708 3 ppdev,lp,parport_pc ohci_hcd 55812 0 ehci_hcd 67848 0 floppy 100424 0 sg 72360 0 serio_raw 41732 0 snd_hda_intel 54300 0 snd_hda_codec 224128 1 snd_hda_intel snd_seq_dummy 37892 0 snd_seq_oss 70656 0 snd_seq_midi_event 42496 1 snd_seq_oss snd_seq 97184 5 snd_seq_dummy,snd_seq_oss,snd_seq_midi_event e1000 160576 0 snd_seq_device 43028 3 snd_seq_dummy,snd_seq_oss,snd_seq sata_sil24 50820 8 snd_pcm_oss 82176 0 forcedeth 77828 0 snd_mixer_oss 52224 1 snd_pcm_oss snd_pcm 125704 3 snd_hda_intel,snd_hda_codec,snd_pcm_oss pcspkr 37120 0 snd_timer 60680 2 snd_seq,snd_pcm snd 102440 9 snd_hda_intel,snd_hda_codec,snd_seq_oss,snd_seq,snd_seq_device,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer soundcore 45344 1 snd i2c_nforce2 42240 0 i2c_core 58880 1 i2c_nforce2 snd_page_alloc 44560 2 snd_hda_intel,snd_pcm dm_snapshot 50640 0 dm_zero 35328 0 dm_mirror 56576 0 dm_mod 99536 9 dm_crypt,dm_snapshot,dm_zero,dm_mirror raid1 57600 1 ext3 177296 2 jbd 99440 1 ext3 sata_nv 46468 14 libata 142880 2 sata_sil24,sata_nv sd_mod 55808 34 scsi_mod 191056 3 sg,libata,sd_mod
------------------------------------------------------------------------------
Problem Description:
The kernel printed an Oops message while using an XFS filesystem on top of dm_crypt. With 2.6.15 and 2.6.18 an Oops was printed. With 2.6.17 the system ended in a kernel panic.

Filesystem layout:
[Disk x 6]--[sata_nv]-------------+--Raid6--dm_crypt--XFS
[Disk x 6]--[sii3132(sata_sil24)]-+

I copied files (about 1 TB in total) to the XFS filesystem with "rsync" or "cp". The problem occurs about once every 10 hours.
The Oops message is following ------------------------------------------------------------------------------ Oct 3 07:27:21 alice kernel: Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: Oct 3 07:27:21 alice kernel: <ffffffff80121b7d>{page_to_pfn+0} Oct 3 07:27:21 alice kernel: PGD 7462c067 PUD 745e0067 PMD 747e0067 PTE 0 Oct 3 07:27:21 alice kernel: Oops: 0000 [1] SMP Oct 3 07:27:21 alice kernel: last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed Oct 3 07:27:21 alice kernel: CPU 0 Oct 3 07:27:21 alice kernel: Modules linked in: aes dm_crypt raid5 nls_utf8 hfsplus xfs exportfs ipv6 ppdev autofs4 i2c_dev i 2c_core hidp l2cap bluetooth vmnet(U) vmmon(U) sunrpc ip_nat_ftp ip_conntrack_ftp ip_conntrack_netbios_ns xt_limit xt_tcpudp i ptable_filter ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink ip_tables x_tables dm_mirror dm_mod raid6 xor raid0 vid eo button battery ac lp parport_pc parport floppy nvram sg ehci_hcd ohci_hcd sata_sil24 e1000 snd_hda_intel snd_hda_codec snd_ seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm forcedeth snd_timer snd soun dcore snd_page_alloc raid1 ext3 jbd sata_nv libata sd_mod scsi_mod Oct 3 07:27:21 alice kernel: Pid: 6567, comm: pdflush Tainted: P 2.6.15- 1.2054_FC5.root #1 Oct 3 07:27:21 alice kernel: RIP: 0010:[<ffffffff80121b7d>] <ffffffff80121b7d> {page_to_pfn+0} Oct 3 07:27:21 alice kernel: RSP: 0018:ffff8100366ad760 EFLAGS: 00010293 Oct 3 07:27:21 alice kernel: RAX: 0000000000000000 RBX: ffff81007ec18540 RCX: 0000000000000000 Oct 3 07:27:21 alice kernel: RDX: 0000000000000016 RSI: ffff81007ec18540 RDI: 0000000000000000 Oct 3 07:27:21 alice kernel: RBP: ffff81002abec400 R08: 0000000000000000 R09: ffff81003d6fa7a0 Oct 3 07:27:21 alice kernel: R10: 0000000000000000 R11: ffffffff80338fbb R12: ffff81007ec18540 Oct 3 07:27:21 alice kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff8100033b4ce8 Oct 3 07:27:21 alice kernel: FS: 
00002accf750dd10(0000) GS:ffffffff8050b000 (0000) knlGS:00000000f7f9a6b0 Oct 3 07:27:21 alice kernel: CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b Oct 3 07:27:21 alice kernel: CR2: 0000000000000000 CR3: 00000000746c2000 CR4: 00000000000006e0 Oct 3 07:27:21 alice kernel: Process pdflush (pid: 6567, threadinfo ffff8100366ac000, task ffff81003d6fa7a0) Oct 3 07:27:21 alice kernel: Stack: ffffffff801f30c0 ffff81003d6fa7a0 0000000000006c80 0000000000000000 Oct 3 07:27:21 alice kernel: ffff810000000000 0000000000000001 ffff81007ec18540 ffff8100033b4ce8 Oct 3 07:27:21 alice kernel: ffff810027c32cc0 ffff81001acd9728 Oct 3 07:27:21 alice kernel: Call Trace: <ffffffff801f30c0> {blk_recount_segments+126} Oct 3 07:27:21 alice kernel: <ffffffff80183c00>{__bio_clone+113} <ffffffff80183c4f>{bio_clone+53} Oct 3 07:27:21 alice kernel: <ffffffff885e673a> {:dm_crypt:crypt_map+205} <ffffffff883002c9>{:dm_mod:__map_bio+71} Oct 3 07:27:21 alice kernel: <ffffffff88300cea> {:dm_mod:__split_bio+381} <ffffffff8830151a>{:dm_mod:dm_request+337} Oct 3 07:27:21 alice kernel: <ffffffff801f3f52> {generic_make_request+365} <ffffffff801f573c>{submit_bio+186} Oct 3 07:27:21 alice kernel: <ffffffff80183691>{__bio_add_page+393} <ffffffff8858345a>{:xfs:xfs_submit_ioend_bio+30} Oct 3 07:27:21 alice kernel: <ffffffff88583ea2> {:xfs:xfs_page_state_convert+2623} Oct 3 07:27:21 alice kernel: <ffffffff88584218> {:xfs:linvfs_writepage+167} <ffffffff801a0028>{mpage_writepages+462} Oct 3 07:27:21 alice kernel: <ffffffff88584171> {:xfs:linvfs_writepage+0} <ffffffff80160f40>{do_writepages+46} Oct 3 07:27:21 alice kernel: <ffffffff8019e937> {__writeback_single_inode+449} <ffffffff8019edf6>{sync_sb_inodes+472} Oct 3 07:27:21 alice kernel: <ffffffff80145afb> {keventd_create_kthread+0} <ffffffff8019f3a5>{writeback_inodes+149} Oct 3 07:27:21 alice kernel: <ffffffff8016127a> {background_writeout+112} <ffffffff8016189a>{pdflush+0} Oct 3 07:27:21 alice kernel: <ffffffff80161a0e>{pdflush+372} 
<ffffffff8016120a>{background_writeout+0} Oct 3 07:27:21 alice kernel: <ffffffff80145de0>{kthread+254} <ffffffff8010b8e6>{child_rip+8} Oct 3 07:27:21 alice kernel: <ffffffff80145afb> {keventd_create_kthread+0} <ffffffff80338fbb>{thread_return+0} Oct 3 07:27:21 alice kernel: <ffffffff80145ce2>{kthread+0} <ffffffff8010b8de>{child_rip+0} Oct 3 07:27:21 alice kernel: Oct 3 07:27:21 alice kernel: Code: 48 8b 07 48 c1 e8 38 48 8b 04 c5 40 41 53 80 48 2b b8 20 0a Oct 3 07:27:21 alice kernel: RIP <ffffffff80121b7d>{page_to_pfn+0} RSP <ffff8100366ad760> Oct 3 07:27:21 alice kernel: CR2: 0000000000000000 Steps to reproduce:
I have the same problem. device structure: /dev/md0 : software RAID 5 /dev/mapper/md0-aes : (software RAID 5) with dm_crypt /dev/mapper/lnvg-store : ((software RAID 5) with dm_crypt) (((software RAID 5) with dm_crypt) LVM) XFS Distribution: Gentoo Hardware Environment: ------[lspci output]--------------------------------------------------------- 00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3) 00:01.0 ISA bridge: nVidia Corporation Unknown device 0050 (rev a3) 00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2) 00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2) 00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3) 00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio Controller (rev a2) 00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2) 00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3) 00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3) 00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2) 00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3) 00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3) 00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration 00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map 00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller 00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control 01:00.0 VGA compatible controller: ATI Technologies Inc Unknown device 5b63 01:00.1 Display controller: ATI Technologies Inc Unknown device 5b73 05:06.0 CardBus bridge: Ricoh Co Ltd RL5c475 (rev 80) 05:07.0 RAID 
bus controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02) 05:08.0 RAID bus controller: Silicon Image, Inc. (formerly CMD Technology Inc) SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02) ----[CPU]------------------------------------------------------------------ processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 47 model name : AMD Athlon(tm) 64 Processor 3200+ stepping : 2 cpu MHz : 2015.031 cache size : 512 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm bogomips : 4032.47 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc ----------------lsmod-------------------------------------------------------- Module Size Used by aes_x86_64 27688 0 sha256 10816 0 dm_crypt 11664 0 iptable_nat 8388 1 ip_nat 17452 1 iptable_nat ip_conntrack 48356 2 iptable_nat,ip_nat iptable_filter 4864 1 ip_tables 19112 2 iptable_nat,iptable_filter x_tables 14792 2 iptable_nat,ip_tables pppoe 14848 2 pppox 5200 1 pppoe ppp_async 11136 0 ppp_generic 21728 7 pppoe,pppox,ppp_async slhc 7936 1 ppp_generic crc_ccitt 4160 1 ppp_async i2c_nforce2 8896 0 it87 23844 0 hwmon_vid 4544 1 it87 i2c_isa 6656 1 it87 i2c_core 19904 3 i2c_nforce2,it87,i2c_isa dm_snapshot 15032 0 dm_mirror 18560 0 dm_mod 48464 2 dm_snapshot,dm_mirror The Oops message is following ------------------------------------------------------------------------------ Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: <ffffffff802520dc>{page_to_pfn+0} PGD 2ace1067 PUD cb85067 PMD 0 Oops: 0000 [1] SMP CPU 0 Modules linked in: aes_x86_64 sha256 dm_crypt capability commoncap iptable_nat ip_nat ip_conntrack iptable_filter ip_tables x_tables pppoe pppox 
ppp_async ppp_generic slhc crc_ccitt serial_cs pcmcia firmware_class yenta_socket rsrc_nonstatic pcmcia_core i2c_nforce2 it87 hwmon_vid i2c_isa i2c_core dm_snapshot dm_mirror dm_mod Pid: 30349, comm: pdflush Not tainted 2.6.17-gentoo-r7 #32 RIP: 0010:[<ffffffff802520dc>] <ffffffff802520dc>{page_to_pfn+0} RSP: 0018:ffff810021af56b0 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff810033ff54c0 RCX: 0000000000000000 RDX: 000000000000000d RSI: ffff810033ff54c0 RDI: 0000000000000000 RBP: ffff8100276b2c00 R08: 0000000000000000 R09: ffff81003f00f850 R10: 0000000000000282 R11: ffff81003f6c7de8 R12: ffff810033ff54c0 R13: 0000000000000000 R14: 0000000000000000 R15: ffff81003f7e5350 FS: 00002ae66af90f30(0000) GS:ffffffff8076d000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000026fff000 CR4: 00000000000006e0 Process pdflush (pid: 30349, threadinfo ffff810021af4000, task ffff81003f00f850) Stack: ffffffff8038a880 ffff81003f00f850 0000000000001840 0000000000000000 ffff810000000001 0000000000000001 ffff810033ff54c0 ffff81003f7e5350 ffff81001fb3e1c0 ffff8100276b1d28 Call Trace: <ffffffff8038a880>{blk_recount_segments+126} <ffffffff80275d52>{__bio_clone+113} <ffffffff80275ea3>{bio_clone+53} <ffffffff8809e731>{:dm_crypt:crypt_map+205} <ffffffff880022b6>{:dm_mod:__map_bio+71} <ffffffff88002b3c>{:dm_mod:__split_bio+370} <ffffffff8038a919>{blk_recount_segments+279} <ffffffff88003349>{:dm_mod:dm_request+257} <ffffffff8038b716>{generic_make_request+342} <ffffffff880022b6>{:dm_mod:__map_bio+71} <ffffffff88002b3c>{:dm_mod:__split_bio+370} <ffffffff8039755d>{__up_read+19} <ffffffff88003349>{:dm_mod:dm_request+257} <ffffffff8038b716>{generic_make_request+342} <ffffffff8038d3ae>{submit_bio+184} <ffffffff80275c30>{__bio_add_page+340} <ffffffff80376de3>{xfs_submit_ioend_bio+30} <ffffffff8037781b>{xfs_page_state_convert+2607} <ffffffff80377b8a>{xfs_vm_writepage+167} <ffffffff802916d0>{mpage_writepages+437} 
<ffffffff80377ae3>{xfs_vm_writepage+0} <ffffffff80254234>{do_writepages+41} <ffffffff8028ff61>{__writeback_single_inode+436} <ffffffff88002963>{:dm_mod:dm_any_congested+56} <ffffffff880043a3>{:dm_mod:dm_table_any_congested+70} <ffffffff802903b3>{sync_sb_inodes+469} <ffffffff80240bfb>{keventd_create_kthread+0} <ffffffff80290923>{writeback_inodes+125} <ffffffff8025454a>{background_writeout+118} <ffffffff80254b5a>{pdflush+0} <ffffffff80254c9c>{pdflush+322} <ffffffff802544d4>{background_writeout+0} <ffffffff80240eb6>{kthread+212} <ffffffff8020a4da>{child_rip+8} <ffffffff80240bfb>{keventd_create_kthread+0} <ffffffff80240de2>{kthread+0} <ffffffff8020a4d2>{child_rip+0} Code: 48 8b 07 48 c1 e8 3a 48 8b 14 c5 a0 66 77 80 48 b8 b7 6d db RIP <ffffffff802520dc>{page_to_pfn+0} RSP <ffff810021af56b0> CR2: 0000000000000000
Does this also happen on non-XFS-filesystems?
No, this happens only on XFS. I tried JFS, and the problem did not occur there.
We have finally found out why there are sometimes strange problems with dm-crypt. If readaheads are cancelled by the underlying block device, the buffer/page cache could be populated with bogus data. Unfortunately, this showed up very rarely and under strange, hard-to-reproduce circumstances, and apparently only on top of software raid5. Anyway, that bug is fixed in 2.6.19 and the fix will be in 2.6.18.6 (it missed .5 by some hours). The patch can be found here: http://marc.theaimsgroup.com/?l=linux-kernel&m=116503133222152&w=2 The problem would typically show up as metadata corruption, which not all filesystems can handle gracefully (i.e. without oopsing).
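To make the failure mode above concrete, here is a toy userspace model in Python, not kernel code. It assumes (this is not stated in the comment itself) that a cancelled readahead comes back with no error code but with its "up to date" flag cleared, so a completion handler that looks only at the error code accepts whatever happens to be in the buffer. The names `Completion`, `accept_trusting_error_only` and `accept_checking_flag` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Completion:
    error: int       # 0 models "no error code reported"
    uptodate: bool   # False models a read that was cancelled before data arrived
    data: bytes      # whatever is in the buffer at completion time


def accept_trusting_error_only(c: Completion) -> bool:
    """Treat the read as good whenever no error code was reported."""
    return c.error == 0


def accept_checking_flag(c: Completion) -> bool:
    """Treat the read as good only if it also claims to be up to date."""
    return c.error == 0 and c.uptodate


# Assumed shape of a cancelled readahead: no error, flag cleared, stale buffer.
cancelled = Completion(error=0, uptodate=False, data=b"stale bytes")

if __name__ == "__main__":
    print(accept_trusting_error_only(cancelled))
    print(accept_checking_flag(cancelled))
```

In this model the first check lets the stale buffer through to the cache, while the second rejects it. Whether the real patch takes this exact form should be checked against the linked posting.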
Yikes. The page_to_pfn used with sparse memory barfs when hitting freed pages while cloning writeout bios under memory pressure. bio_clone shouldn't look at these pages anyway. I see four possibilities:
- don't set bv_page to NULL (ugly)
- before freeing the pages change bi_idx (atomically?) so that nobody ever looks at the freed bv_page? (strange)
- implement own bio_clone (ugly)
- Since bio_clone doesn't share the bv array any more, another possibility would be to not use bio_clone at all and go with bio_set_alloc all the way.
Probably the last solution.
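The difference between walking the whole vector array and building a fresh request can be sketched with a toy userspace model in Python. This is not kernel code: `Vec`, `Bio`, `recount_all` and `fresh_bio` are hypothetical names, `page=None` stands in for a bv_page that was set to NULL, and the model assumes the index (standing in for bi_idx) has already been advanced past the freed entries, which in the list above is a separate proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vec:
    page: Optional[str]  # None models a bv_page that was freed and set to NULL


@dataclass
class Bio:
    vecs: List[Vec] = field(default_factory=list)
    idx: int = 0  # models bi_idx: index of the first entry not yet completed


def recount_all(bio: Bio) -> int:
    """Walk every entry, including completed ones, and fail on a freed page."""
    count = 0
    for vec in bio.vecs:
        if vec.page is None:
            raise RuntimeError("looked at a freed page")
        count += 1
    return count


def fresh_bio(src: Bio) -> Bio:
    """Build a new bio from the entries at or after src.idx only."""
    return Bio(vecs=[Vec(v.page) for v in src.vecs[src.idx:]], idx=0)


def demo() -> int:
    # The first two entries are done and their pages freed; idx points past them.
    src = Bio(vecs=[Vec(None), Vec(None), Vec("p2"), Vec("p3")], idx=2)
    try:
        recount_all(src)
    except RuntimeError:
        pass  # the whole-array walk trips over the freed pages
    return recount_all(fresh_bio(src))


if __name__ == "__main__":
    print(demo())  # prints 2
```

The whole-array walk fails as soon as it meets a freed entry, whereas the freshly built request never contains one, which is the appeal of not cloning at all.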
What is the status of this bug? Has it been resolved?