Distribution: debian-amd64/unstable

Jul 15 18:42:03 localhost kernel: [ 1068.607624] BUG at fs/jfs/jfs_dmap.c:2722 assert(bsz < le32_to_cpu(tp->dmt_nleafs))
Jul 15 18:42:03 localhost kernel: [ 1068.607840] ----------- [cut here ] --------- [please bite here ] ---------
Jul 15 18:42:03 localhost kernel: [ 1068.607845] Kernel BUG at "fs/jfs/jfs_dmap.c":2722
Jul 15 18:42:03 localhost kernel: [ 1068.607848] invalid operand: 0000 [1] PREEMPT SMP
Jul 15 18:42:03 localhost kernel: [ 1068.607851] CPU 1
Jul 15 18:42:03 localhost kernel: [ 1068.607854] Modules linked in: e1000 amd64_agp ndiswrapper hpfs fglrx snd_emu10k1_synth snd_emux_synth snd_seq_virmidi snd_seq_midi_emul snd_emu10k1 snd_ac97_codec snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_seq_device snd_hwdep snd_util_mem snd_pcm_oss snd_pcm snd_page_alloc snd_mixer_oss snd_rtctimer snd_timer pktcdvd
Jul 15 18:42:03 localhost kernel: [ 1068.607871] Pid: 5221, comm: unrar Tainted: P 2.6.12.2.64
Jul 15 18:42:03 localhost kernel: [ 1068.607875] RIP: 0010:[dbBackSplit+183/336] <ffffffff8024b067>{dbBackSplit+183}
Jul 15 18:42:03 localhost kernel: [ 1068.607886] RSP: 0018:ffff8100b7263868 EFLAGS: 00010296
Jul 15 18:42:03 localhost kernel: [ 1068.607889] RAX: 000000000000005d RBX: 0000000000000400 RCX: 00000000c0000100
Jul 15 18:42:03 localhost kernel: [ 1068.607893] RDX: 0000000000000000 RSI: ffff8100b76fb650 RDI: ffff810003d35560
Jul 15 18:42:03 localhost kernel: [ 1068.607897] RBP: 0000000000000000 R08: ffff8100b7262000 R09: 00000000ffffffff
Jul 15 18:42:03 localhost kernel: [ 1068.607900] R10: 0000000000000000 R11: 0000000000000002 R12: ffff8100b1ded000
Jul 15 18:42:03 localhost kernel: [ 1068.607905] R13: ffff8100b1ded166 R14: ffff8100b1ded458 R15: 00000000000002f2
Jul 15 18:42:03 localhost kernel: [ 1068.607909] FS: 00002aaaab2718e0(0000) GS:ffffffff80671d80(0000) knlGS:00000000556d5580
Jul 15 18:42:03 localhost kernel: [ 1068.607913] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jul 15 18:42:03 localhost kernel: [ 1068.607916] CR2: 00002aaaab8f8000 CR3: 00000000b1ae7000 CR4: 00000000000006e0
Jul 15 18:42:03 localhost kernel: [ 1068.607921] Process unrar (pid: 5221, threadinfo ffff8100b7262000, task ffff8100b76fb650)
Jul 15 18:42:03 localhost kernel: [ 1068.607923] Stack: 0000000e00000002 00000000000002f2 ffff8100b1ded000 ffff8100cbd41000
Jul 15 18:42:03 localhost kernel: [ 1068.607930] 0000000000000000 0000000000000447 ffff8100bad4fb38 ffffffff8024babc
Jul 15 18:42:03 localhost kernel: [ 1068.607936] 0000000000de48e8 0000000100000001
Jul 15 18:42:03 localhost kernel: [ 1068.607939] Call Trace:<ffffffff8024babc>{dbAdjCtl+348} <ffffffff8024bc78>{dbAllocDmap+88}
Jul 15 18:42:03 localhost kernel: [ 1068.607955] <ffffffff8024c9b2>{dbAlloc+514} <ffffffff802540af>{extAlloc+495}
Jul 15 18:42:03 localhost kernel: [ 1068.607966] <ffffffff8023e57a>{jfs_get_blocks+362} <ffffffff8017d1f3>{nobh_prepare_write+179}
Jul 15 18:42:03 localhost kernel: [ 1068.607983] <ffffffff8023e640>{jfs_get_block+0} <ffffffff802e39a3>{radix_tree_node_alloc+19}
Jul 15 18:42:03 localhost kernel: [ 1068.607995] <ffffffff801575ab>{generic_file_buffered_write+635}
Jul 15 18:42:03 localhost kernel: [ 1068.608005] <ffffffff80157d5f>{__generic_file_aio_write_nolock+895}
Jul 15 18:42:03 localhost kernel: [ 1068.608018] <ffffffff8015ff49>{__pagevec_lru_add_active+233} <ffffffff80157f3e>{__generic_file_write_nolock+158}
Jul 15 18:42:03 localhost kernel: [ 1068.608032] <ffffffff8016a389>{vma_merge+377} <ffffffff80148740>{autoremove_wake_function+0}
Jul 15 18:42:03 localhost kernel: [ 1068.608047] <ffffffff8016b001>{do_mmap_pgoff+1505} <ffffffff801580a5>{generic_file_write+101}
Jul 15 18:42:03 localhost kernel: [ 1068.608056] <ffffffff801799ea>{vfs_write+186} <ffffffff80179f83>{sys_write+83}
Jul 15 18:42:03 localhost kernel: [ 1068.608068] <ffffffff8010dad2>{system_call+126}
Jul 15 18:42:03 localhost kernel: [ 1068.608078]
Jul 15 18:42:03 localhost kernel: [ 1068.608079] Code: 0f 0b 70 9f 4c 80 ff ff ff ff a2 0a 89 ee 31 de 48 63 c6 41
Jul 15 18:42:03 localhost kernel: [ 1068.608089] RIP <ffffffff8024b067>{dbBackSplit+183} RSP <ffff8100b7263868>

I've lost a number of files upon fsck.jfs after this happened.
The assert() statement is the wrong thing to do here. The code somehow stumbled on some data in the block map that is inconsistent. The right thing would be to return -EIO and mark the superblock dirty, so that fsck rebuilds the block map. What kind of activity was going on? Were large files being written? Were lots of files being written in parallel? Were there any other errors that might have triggered an I/O error?
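To illustrate the idea of returning -EIO instead of asserting, here is a minimal userspace sketch in C. It is not the JFS code and not the eventual patch: `demo_tree`, `demo_sb`, `demo_mark_dirty`, `demo_back_split` and `demo_alloc` are hypothetical stand-ins, and the `dirty` flag only models "mark the superblock dirty so that fsck runs".

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the map tree; only the leaf count matters here. */
struct demo_tree {
    uint32_t nleafs;
};

/* Hypothetical stand-in for the superblock state. */
struct demo_sb {
    bool dirty; /* set when metadata is found to be inconsistent */
};

/* Record the inconsistency so that a later fsck can rebuild the map. */
static void demo_mark_dirty(struct demo_sb *sb, const char *why)
{
    sb->dirty = true;
    fprintf(stderr, "demo: %s, fsck required\n", why);
}

/*
 * Where the original code had assert(bsz < nleafs), check the same
 * condition and fail the operation with -EIO instead of crashing.
 */
static int demo_back_split(struct demo_sb *sb, const struct demo_tree *tp,
                           uint32_t bsz)
{
    if (bsz >= tp->nleafs) {
        demo_mark_dirty(sb, "block map is inconsistent");
        return -EIO;
    }
    /* ... the normal split work would go here ... */
    return 0;
}

/* Callers have to propagate the error instead of assuming success. */
static int demo_alloc(struct demo_sb *sb, const struct demo_tree *tp,
                      uint32_t bsz)
{
    int rc = demo_back_split(sb, tp, bsz);

    if (rc)
        return rc; /* hand -EIO up towards the write() caller */
    return 0;
}
```

With this shape, a corrupt map makes the write fail with an I/O error and leaves a flag for fsck, rather than taking the kernel down in the middle of an allocation.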
I'm not sure what caused the bug, but as a result a large file was found cross-linked with several small ones, so fsck.jfs deleted them all.
Created attachment 6278 [details] Replace asserts with sane error handling
I'm not sure what the initial cause of the problem is, but this patch will fix the oops.