Latest working kernel version: 2.6.23.1
Earliest failing kernel version: 2.6.24.2
Distribution:
Hardware Environment: validated for i386 (2x p3 & 2x p4)
Software Environment: gcc (GCC) 4.1.2 (Gentoo 4.1.2 p1.0.1)
Problem Description:

G'day,

We've caught this bug on two of our hosts, three days apart, just after migrating to 2.6.24.2 (the most recent one got stuck on I/O an hour ago). One of them (p3) is a fairly I/O-loaded mail exchanger, but the second one has negligible I/O most of the time (and that one was the first to oops).
I don't yet know how to reproduce it on demand within any sane timeframe (so I can't see how e.g. bisect could be useful here :( ). I'll play with the old 2x PIII over the weekend, something like dbench xxx in a loop, but I suspect it may not be all that I/O related.

[574124.902070] BUG: unable to handle kernel paging request at virtual address f8000008
[574124.902079] printing eip: c01d7154 *pde = 00000000
[574124.902087] Oops: 0000 [#1] SMP
[574124.905481] Modules linked in: ipt_REJECT xt_state xt_multiport iptable_filter
[574124.912940]
[574124.914537] Pid: 29766, comm: rm Not tainted (2.6.24.2 #1)
[574124.920127] EIP: 0060:[<c01d7154>] EFLAGS: 00010286 CPU: 1
[574124.925731] EIP is at xfs_file_readdir+0xea/0x199
[574124.930538] EAX: 00000000 EBX: 00000550 ECX: 00000020 EDX: 00000000
[574124.936917] ESI: 00000000 EDI: f8000000 EBP: d26f1f68 ESP: d26f1f10
[574124.943294] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[574124.948804] Process rm (pid: 29766, ti=d26f1000 task=d3f41a40 task.ti=d26f1000)
[574124.956046] Stack: 00000550 00000000 01811a81 00000000 00000000 c015eb80 d26f1f94 dc5386c0
[574124.964728] cd61f780 00000000 00000000 00000000 00000550 00000000 f7fff000 00001000
[574124.973409] 00001000 00000552 00000000 c02ff920 dc5386c0 f289e600 d26f1f88 c015ed6b
[574124.982090] Call Trace:
[574124.984846] [<c01035de>] show_trace_log_lvl+0x1a/0x2f
[574124.990125] [<c0103690>] show_stack_log_lvl+0x9d/0xa5
[574124.995401] [<c0103746>] show_registers+0xae/0x17d
[574125.000418] [<c010392b>] die+0x116/0x1e3
[574125.004558] [<c0110421>] do_page_fault+0x467/0x546
[574125.009575] [<c02f2d8a>] error_code+0x72/0x78
[574125.014159] [<c015ed6b>] vfs_readdir+0x5d/0x89
[574125.018828] [<c015edf5>] sys_getdents64+0x5e/0xa0
[574125.023759] [<c010264e>] sysenter_past_esp+0x5f/0x85
[574125.028948] =======================
[574125.032630] Code: 7f 31 f6 89 1c 24 89 74 24 04 8b 45 c0 ff 55 bc 85 c0 0f 85 9f 00 00 00 8b 4f 10 83 c1 1f 83 e1 f8 31 d2 29 4d d0 19 55 d4 01 cf <8b> 47 08 8b 57 0c 89 45 d8 89 55 dc 83 7d d4 00 7f a1 7c 06 83
[574125.053259] EIP: [<c01d7154>] xfs_file_readdir+0xea/0x199 SS:ESP 0068:d26f1f10
[574125.060956] ---[ end trace 140cd0f5f61f796f ]---
[600377.481204] a.out[8789]: segfault at 00000062 eip b7d5457b esp bfbaf318 error 4

piii config: http://sysadminday.org.ru/2.6.24.2/2.6.24.2-p3_config
p4 config: http://sysadminday.org.ru/2.6.24.2/2.6.24.2-p4_config
p3 bug (full): http://sysadminday.org.ru/2.6.24.2/2.6.24.2-p3_xfs_file_readdir.bug
p4 bug (short): http://sysadminday.org.ru/2.6.24.2/2.6.24.2-p4_xfs_file_readdir.bug

Steps to reproduce:
It's always worth checking the latest commits, not only the bugzilla history :( This is probably already fixed in 2.6.24.3:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=450790a2c51e6d9d47ed30dbdcf486656b8e186f;hp=cbc89dcfd24fd161f7a8e262266177db160a58fb
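
As a sanity check on the oops in the report, the faulting access can be reconstructed from the register dump. The sketch below (Python, purely illustrative) decodes the three bytes marked at the faulting EIP in the Code: line and compares the resulting address with the reported fault address. The helper name decode_mov_load_disp8 is made up for this example. Reading the adjacent stack words f7fff000 and 00001000 as a buffer start and length is only an assumption from the dump, not something the report states.

```python
# Values copied verbatim from the oops in the report.
FAULT_ADDR = 0xF8000008   # "unable to handle kernel paging request at ..."
EDI = 0xF8000000          # EDI from the register dump
FAULTING_INSN = bytes([0x8B, 0x47, 0x08])  # "<8b> 47 08" in the Code: line

# ASSUMPTION (not confirmed by the report): these two adjacent stack words
# are the start and length of the buffer being walked.
ASSUMED_BUF_START = 0xF7FFF000
ASSUMED_BUF_LEN = 0x00001000

REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_load_disp8(insn):
    """Decode 'mov r32, [base + disp8]' (opcode 0x8b, ModRM mod=01, no SIB)."""
    opcode, modrm, disp = insn[0], insn[1], insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("not a simple mov r32, [r32 + disp8]")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS32[reg], REGS32[rm], disp


dest, base, disp = decode_mov_load_disp8(FAULTING_INSN)
print(f"mov {disp:#x}(%{base}),%{dest}")
print("fault address matches EDI + disp:", EDI + disp == FAULT_ADDR)
print("EDI equals assumed buffer end:",
      EDI == ASSUMED_BUF_START + ASSUMED_BUF_LEN)
```

Both comparisons hold for the values in this oops: the load is relative to EDI, and EDI sits exactly one 0x1000-byte length past f7fff000. If the buffer assumption is right, that would be consistent with the loop in xfs_file_readdir stepping past the end of its buffer, but the dump alone does not prove it.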
Alexander, I suppose you've tested the kernel with the fix. Since it certainly looks like the right fix, I'm closing the bug. Please reopen if the problem is still there.