Bug 12084
Summary: | Silent data corruption on disk with nforce 630 | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Stéphane Birot (steb00-kernel) |
Component: | Serial ATA | Assignee: | Tejun Heo (tj) |
Status: | REJECTED INVALID | ||
Severity: | high | ||
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.28-rc6 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | dmesg.log |
Description
Stéphane Birot
2008-11-22 15:35:27 UTC
If it's happening on both PATA and SATA when copying to themselves, it's unlikely to be a specific driver problem. Bit flipping strongly indicates that the corruption happens while the data is in transit on some bus or in memory, as opposed to logic problems such as overwriting an in-use page or using the wrong page.

Can you please post the kernel boot log and the result of "lspci -nn"? Also, is it possible to get another memory stick and test whether the problem persists with a different module? Please note that memtest86 is conclusive when it reports failures, but a clean run doesn't necessarily mean there is no problem. Memory corruption can depend on a lot of things, such as the CPU and DMA trying to access close regions concurrently.

Created attachment 18975
dmesg.log
I have only one DDR2 module so I can't test with another one right now. I will see if I can get one. Also I made my tests using SATA drivers (the IDE drive appears as /dev/sdb). But the corruption also happened with the original Debian kernel where the IDE drive appears as /dev/hda.

--- lspci -------------------
00:00.0 RAM memory [0500]: nVidia Corporation MCP67 Memory Controller [10de:0547] (rev a2)
00:01.0 ISA bridge [0601]: nVidia Corporation MCP67 ISA Bridge [10de:0548] (rev a2)
00:01.1 SMBus [0c05]: nVidia Corporation MCP67 SMBus [10de:0542] (rev a2)
00:02.0 USB Controller [0c03]: nVidia Corporation MCP67 OHCI USB 1.1 Controller [10de:055e] (rev a2)
00:02.1 USB Controller [0c03]: nVidia Corporation MCP67 EHCI USB 2.0 Controller [10de:055f] (rev a2)
00:04.0 USB Controller [0c03]: nVidia Corporation MCP67 OHCI USB 1.1 Controller [10de:055e] (rev a2)
00:04.1 USB Controller [0c03]: nVidia Corporation MCP67 EHCI USB 2.0 Controller [10de:055f] (rev a2)
00:06.0 IDE interface [0101]: nVidia Corporation MCP67 IDE Controller [10de:0560] (rev a1)
00:07.0 Audio device [0403]: nVidia Corporation MCP67 High Definition Audio [10de:055c] (rev a1)
00:08.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Bridge [10de:0561] (rev a2)
00:09.0 SATA controller [0106]: nVidia Corporation MCP67 AHCI Controller [10de:0554] (rev a2)
00:0a.0 Ethernet controller [0200]: nVidia Corporation MCP67 Ethernet [10de:054c] (rev a2)
00:0b.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0562] (rev a2)
00:0c.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0563] (rev a2)
00:0d.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0563] (rev a2)
00:0e.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0563] (rev a2)
00:0f.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0563] (rev a2)
00:10.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0563] (rev a2)
00:11.0 PCI bridge [0604]: nVidia Corporation MCP67 PCI Express Bridge [10de:0563] (rev a2)
00:12.0 VGA compatible controller [0300]: nVidia Corporation GeForce 7050 PV / nForce 630a [10de:053b] (rev a2)
00:18.0 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration [1022:1100]
00:18.1 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map [1022:1101]
00:18.2 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller [1022:1102]
00:18.3 Host bridge [0600]: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control [1022:1103]

--- /proc/ioports -------------------
0000-001f : dma1
0020-0021 : pic1
0040-0043 : timer0
0050-0053 : timer1
0060-0060 : keyboard
0064-0064 : keyboard
0070-0071 : rtc0
0080-008f : dma page reg
00a0-00a1 : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : 0000:00:06.0
  0170-0177 : pata_amd
01f0-01f7 : 0000:00:06.0
  01f0-01f7 : pata_amd
0230-023f : pnp 00:09
0290-029f : pnp 00:09
0376-0376 : 0000:00:06.0
  0376-0376 : pata_amd
03c0-03df : vga+
03f6-03f6 : 0000:00:06.0
  03f6-03f6 : pata_amd
03f8-03ff : serial
04d0-04d1 : pnp 00:05
0500-057f : pnp 00:05
  0500-0503 : ACPI PM1a_EVT_BLK
  0504-0505 : ACPI PM1a_CNT_BLK
  0508-050b : ACPI PM_TMR
  0510-0515 : ACPI CPU throttle
  0520-0527 : ACPI GPE0_BLK
0580-05ff : pnp 00:05
0600-063f : 0000:00:01.1
0700-073f : 0000:00:01.1
0800-080f : pnp 00:05
0880-08ff : pnp 00:05
  08a0-08af : ACPI GPE1_BLK
0900-09ff : 0000:00:01.0
0a00-0a0f : pnp 00:09
0a10-0a1f : pnp 00:09
0cf8-0cff : PCI conf1
0d00-0d7f : pnp 00:05
0d80-0dff : pnp 00:05
1100-117f : pnp 00:05
1180-11ff : pnp 00:05
d880-d887 : 0000:00:0a.0
  d880-d887 : forcedeth
dc00-dc0f : 0000:00:09.0
  dc00-dc0f : ahci
e000-e003 : 0000:00:09.0
  e000-e003 : ahci
e080-e087 : 0000:00:09.0
  e080-e087 : ahci
e400-e403 : 0000:00:09.0
  e400-e403 : ahci
e480-e487 : 0000:00:09.0
  e480-e487 : ahci
ec00-ec3f : 0000:00:01.1
ffa0-ffaf : 0000:00:06.0
  ffa0-ffaf : pata_amd

This is due to a faulty memory module. Memtest86 finally showed errors after running again for hours. Sorry for the fake bug.

Heh.. thanks for finding out the actual problem. It's really relieving. :-) Marking INVALID.
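
For anyone landing here with similar symptoms: the reasoning in this thread, that isolated bit flips point to corruption in transit (a bus or memory) rather than a driver logic error, can be checked by comparing a source file with its copy byte by byte. The following is only a minimal sketch, not something that was used in this bug. The function names (`diff_bits`, `diff_files`, `classify`) and the classification labels are hypothetical, and it assumes both files have the same length; any trailing bytes of a longer file are ignored.

```python
def diff_bits(original: bytes, copy: bytes):
    """Return (offset, xor_mask, flipped_bit_count) for each differing byte.

    Only the common prefix is compared; a length mismatch is not reported.
    """
    diffs = []
    for offset, (a, b) in enumerate(zip(original, copy)):
        if a != b:
            mask = a ^ b  # bits set in the mask are the bits that changed
            diffs.append((offset, mask, bin(mask).count("1")))
    return diffs


def diff_files(path_a, path_b, chunk_size=1 << 20):
    """Compare two files chunk by chunk and collect per-byte differences."""
    diffs = []
    base = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break
            diffs.extend((base + off, mask, n) for off, mask, n in diff_bits(a, b))
            base += min(len(a), len(b))
    return diffs


def classify(diffs):
    """Rough label for the corruption pattern (hypothetical categories)."""
    if not diffs:
        return "identical"
    if all(n == 1 for _, _, n in diffs):
        return "isolated single-bit flips"
    return "multi-bit differences"


if __name__ == "__main__":
    import sys

    found = diff_files(sys.argv[1], sys.argv[2])
    for offset, mask, n in found[:20]:
        print(f"offset {offset:#x}: xor mask {mask:#04x} ({n} bit(s) flipped)")
    print(classify(found))
```

A result of "isolated single-bit flips" would be consistent with the faulty-memory diagnosis above, while whole misplaced or overwritten blocks would show up as long runs of multi-bit differences. Neither outcome is proof on its own.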