Most recent kernel where this bug did not occur: 2.6.14
Distribution: Gentoo, Slackware, SuSE, etc. (vanilla kernel from kernel.org)
Hardware Environment: tested on P4 3G, Athlon XP, Celeron 2.4, etc., with IDE and SATA drivers
Software Environment: XFS compiled into the kernel; bootloader both lilo and grub
Problem Description: XFS log recovery cannot identify the xlog client id and fails with error 5.

Steps to reproduce:
- create a root partition with XFS on it
- install kernel 2.6.15 or higher, with XFS built into the kernel
- (reboot the computer to confirm everything works)
- press the RESET BUTTON
- the kernel panics on the next boot, showing:

...
XFS mounting filesystem hda1
Starting XFS recovery on filesystem: hda1 (logdev: internal)
XFS: xlog_recover_process_data: bad clientid
XFS: log mount/recovery failed: error 5
XFS: log mount failed
VFS: Cannot open root device "301" or unknown-block(3,1)
Please append a correct "root=" boot option
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(3,1)
...

Booting 2.6.14 recovers the root filesystem sometimes, but not always.
Disable your device write cache? I do this kind of thing all day, every day, on the latest kernels and have not seen the behaviour you're reporting, though I'm typically using SCSI devices.

cheers.
Well, we tried to, but with no luck. The device still uses the write cache. But I think there's another problem: when the xlog is corrupted in any way, XFS is unable to recover. In 2.6.14 it was able to :)

root@adv01:~# hdparm -W0 /dev/hda

/dev/hda:
 setting drive write-caching to 0 (off)
root@adv01:~# hdparm -i /dev/hda

/dev/hda:

 Model=ST3120022A, FwRev=8.54, SerialNo=4LJ042JY
 Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
 RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
 BuffType=unknown, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=234441648
 IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: ATA/ATAPI-6 T13 1410D revision 2:

 * signifies the current active mode
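Editor's sketch, not part of the original report: a minimal way to read the write-cache state out of `hdparm -i` text such as the output above. The helper name `write_cache_state` is hypothetical, and the sample string is just the relevant line copied from the report. One caveat, per the hdparm man page: `-i` shows identification data the kernel drivers stored at boot/configuration time, so it may not reflect a change made afterwards with `-W0`; `-I` asks the drive directly and may be the better source.

```python
import re


def write_cache_state(hdparm_i_output):
    """Return the value of the WriteCache= field in `hdparm -i` text.

    Returns None when the field is absent. Hypothetical helper,
    written only to illustrate the check discussed in the thread.
    """
    match = re.search(r"\bWriteCache=(\w+)", hdparm_i_output)
    return match.group(1) if match else None


# Line copied from the hdparm -i output in the report.
sample = "AdvancedPM=no WriteCache=enabled"
print(write_cache_state(sample))  # prints: enabled
```

If this reports "enabled" right after `hdparm -W0`, that alone does not prove the drive ignored the request; it may only mean the cached identification data is stale.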
Hi once more,

It seems this happens only with the kernel option CONFIG_CC_OPTIMIZE_FOR_SIZE enabled (which is new since 2.6.15 and is enabled by default). Also, when I manually added the XFS_DEBUG option in fs/xfs/Kconfig, the problem disappeared.

Thanks in advance
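Editor's sketch, not part of the original comment: when comparing a failing and a working build, it can help to confirm what each kernel `.config` actually sets for the option named above. The helper `config_value` is hypothetical and the sample config text is made up for illustration; only the option names come from the thread and the kernel tree.

```python
def config_value(config_text, option):
    """Look up a Kconfig option in the text of a kernel .config file.

    Returns the value after '=' (for example 'y' or 'm'), or None when
    the option is marked "is not set" or does not appear at all.
    Hypothetical helper for illustration only.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == "# %s is not set" % option:
            return None
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


# Made-up sample resembling a 2.6.15-era .config fragment.
sample = """CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_XFS_FS=y
# CONFIG_XFS_DEBUG is not set
"""
print(config_value(sample, "CONFIG_CC_OPTIMIZE_FOR_SIZE"))  # prints: y
```

Running the same lookup against the `.config` of a kernel that recovers the log and one that does not would show whether the size-optimization option is the only difference that matters.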
Smells a bit like a gcc compilation bug. Which version of gcc are you using?
Hi. We were using GCC 3.2.3, 3.4.5, 3.4.6 and 4.1.1. I'm not sure that it occurred with all of these versions, but probably yes.

Tharrrk
Any updates on this? Does it work for you with newer kernels (as of now, 2.6.22+)? If the problem is still there, could you please run git bisect between 2.6.14 and 2.6.15 to find the commit that broke the kernel for you?

Thanks.
Please reopen this bug if it's still present with kernel 2.6.22.