Distribution: Fedora Core 4
Hardware Environment: Dual AthlonMP 2000+, 760MPX, 2GB RAM, Sil3114
Problem Description: Kernel Oops when sata drive is under load
Steps to reproduce: dd if=/dev/sda of=/dev/null bs=1M

I can run the above dd command on hda and hdc all day long but sda will oops every time. Started when I added the Sil3114 and tried to create a raid5 array. The system would oops when initializing the array. Finally found that the sata drive would oops with the above dd read op. The drive is a WD2000JD-00H. I have tried with a PATA drive hooked up via an Sil3611 bridge but ran into the same problem.

Most Oops messages left on the console show "BUG: spinlock lockup on CPU#0"

...
Nov 9 17:11:28 mcp kernel: Unable to handle kernel paging request at virtual address 041242c7
Nov 9 17:11:28 mcp kernel: printing eip:
Nov 9 17:11:28 mcp kernel: c014f2fa
Nov 9 17:11:28 mcp kernel: *pde = 00000000
Nov 9 17:11:28 mcp kernel: Oops: 0002 [#1]
Nov 9 17:11:28 mcp kernel: SMP
Nov 9 17:11:28 mcp kernel: Modules linked in: ipv6 parport_pc lp parport autofs4 w83627hf hwmon_vid eeprom i2c_isa i2c_matroxfb i2c_algo_bit matroxfb_base matroxfb_DAC1064 matroxfb_accel matroxfb_Ti3026 matroxfb_g450 g450_pll matroxfb_misc rfcomm l2cap bluetooth sunrpc dm_mod video button battery ac ohci_hcd i2c_amd756 i2c_core shpchp snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc e1000 floppy sd_mod
Nov 9 17:11:28 mcp kernel: CPU: 0
Nov 9 17:11:28 mcp kernel: EIP: 0060:[<c014f2fa>] Not tainted VLI
Nov 9 17:11:28 mcp kernel: EFLAGS: 00010083 (2.6.14)
Nov 9 17:11:28 mcp kernel: EIP is at cache_alloc_refill+0xaa/0x280
Nov 9 17:11:28 mcp kernel: eax: 61c584c6 ebx: 0000003c ecx: f616c000 edx: 041242c3
Nov 9 17:11:28 mcp kernel: esi: f7ffe520 edi: 00000000 ebp: f7ff1a00 esp: f5e3de8c
Nov 9 17:11:28 mcp kernel: ds: 007b es: 007b ss: 0068
Nov 9 17:11:28 mcp kernel: Process automount (pid: 2100, threadinfo=f5e3d000 task=c2266030)
Nov 9 17:11:28 mcp kernel: Stack: 000000d0 f7ff0dc0 f7ffe548 f7ef8000 f7ffe520 80009000 8000a000 00000202
Nov 9 17:11:28 mcp kernel: 000000d0 f7ff0dc0 c22e9570 c014f6fa f7df2680 00000023 f7cef38c c011fa78
Nov 9 17:11:28 mcp kernel: f7cef34c f5e3d000 f7df26b0 f7818960 f78189ac f7cef398 f7cef3ac f7cef3a4
Nov 9 17:11:28 mcp kernel: Call Trace:
Nov 9 17:11:28 mcp kernel: [<c014f6fa>] kmem_cache_alloc+0x6a/0x70
Nov 9 17:11:28 mcp kernel: [<c011fa78>] copy_mm+0x1e8/0x3d0
Nov 9 17:11:28 mcp kernel: [<c0120719>] copy_process+0x569/0xec0
Nov 9 17:11:28 mcp kernel: [<c012116e>] do_fork+0x6e/0x206
Nov 9 17:11:28 mcp kernel: [<c010753f>] do_syscall_trace+0x20f/0x225
Nov 9 17:11:28 mcp kernel: [<c0101a92>] sys_clone+0x32/0x40
Nov 9 17:11:28 mcp kernel: [<c0102f55>] syscall_call+0x7/0xb
Nov 9 17:11:28 mcp kernel: Code: 85 d2 0f 85 59 01 00 00 85 db 7e 4f 8b 74 24 10 8b 0e 39 f1 0f 84 25 01 00 00 8b 54 24 04 8b 41 10 39 42 38 77 6f 8b 11 8b 41 04 <89> 42 04 89 10 83 79 14 ff c7 01 00 01 10 00 c7 41 04 00 02 20
That's a funny-looking backtrace - it has nothing to do with the IO system. Does the oops trace always look like that? If not, please send more instances. Please enable CONFIG_DEBUG_KERNEL, CONFIG_DEBUG_SLAB and CONFIG_DEBUG_PAGEALLOC before running more tests, thanks.
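For reference, the three options requested above would appear in the kernel's .config roughly as follows. This is only a sketch: it assumes the options are built in (=y) and leaves out any other settings they may depend on in a given tree.

    CONFIG_DEBUG_KERNEL=y
    CONFIG_DEBUG_SLAB=y
    CONFIG_DEBUG_PAGEALLOC=y

The kernel has to be rebuilt and rebooted after changing these before the dd test is repeated.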
Created attachment 6544 [details] Kernel debug output
Does it still reproduce in recent kernels? (libata is a moving target.) If so, can you try 2.6.18-rc3? There's better error handling there, and it may output more useful info.
Please reopen this bug if it's still present in kernel 2.6.18.