Distribution: Any
Hardware Environment: Dual Pentium III/700 (GA6BXDU) and Dual Xeon 1.8GHz
Software Environment: xfsdump

Problem Description:

When running xfsdump on a filesystem created with -b size=1024 and
1769388 inodes used, in the phase when xfsdump dumps the directories the
system first shows kswapd0 taking 100% of system CPU, and freezes
shortly after.

Steps to reproduce:

1) Create a filesystem with:

      mkfs -V -t xfs -f -b size=1024 -L dbase /dev/hda3

2) Fill the filesystem with 1.7 million small files (a sketch of one
   way to do this follows these steps):

      # df -i
      Filesystem    Inodes   IUsed   IFree IUse% Mounted on
      /dev/hda3    4200960 1769388 2431572   43% /opt/backup/dbase

      # df
      Filesystem 1K-blocks    Used Available Use% Mounted on
      /dev/hda3    4190720 1667837   2522883  40% /opt/backup/dbase

3) Dump the filesystem to a SCSI DAT drive with xfsdump 2.2.16 using:

      /sbin/xfsdump -F -l 0 -L dbase -o -f /dev/nst2 /opt/backup/dbase
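A minimal sketch of one way to populate the filesystem for step 2. The
directory layout, file names, and file contents are illustrative
assumptions; the report only gives the final inode count and states
that the files were small:

      # Create ~1.77 million small files, split across subdirectories
      # so that no single directory grows too large. Run from the
      # mount point of the filesystem created in step 1.
      cd /opt/backup/dbase
      for d in $(seq 0 1768); do
          mkdir dir$d
          for f in $(seq 0 999); do
              echo data > dir$d/file$f
          done
      done

With a 1024-byte block size each file should occupy one data block, so
this yields roughly the 1.6 GB of usage shown in the df output above.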
When the dump gets to the following stage:

   /sbin/xfsdump: using scsi tape (drive_scsitape) strategy
   /sbin/xfsdump: version 2.2.16 (dump format 3.0) - Running single-threaded
   /sbin/xfsdump: WARNING: most recent level 0 dump was interrupted, but not resuming that dump since resume (-R) option not specified
   /sbin/xfsdump: level 0 dump of mars.ghiro.org:/opt/backup/dbase
   /sbin/xfsdump: dump date: Sat Oct 16 20:19:33 2004
   /sbin/xfsdump: session id: d7d33092-9380-44b0-bf81-c5e34522495e
   /sbin/xfsdump: session label: "ArkeiaServer"
   /sbin/xfsdump: ino map phase 1: skipping (no subtrees specified)
   /sbin/xfsdump: ino map phase 2: constructing initial dump list
   /sbin/xfsdump: ino map phase 3: skipping (no pruning necessary)
   /sbin/xfsdump: ino map phase 4: skipping (size estimated in phase 2)
   /sbin/xfsdump: ino map phase 5: skipping (only one dump stream)
   /sbin/xfsdump: ino map construction complete
   /sbin/xfsdump: estimated dump size: 1772858496 bytes
   /sbin/xfsdump: preparing drive
   /sbin/xfsdump: WARNING: media may contain data. Overwrite option specified
   /sbin/xfsdump: WARNING: no media label specified
   /sbin/xfsdump: creating dump session media file 0 (media 0, file 0)
   /sbin/xfsdump: dumping ino map
   /sbin/xfsdump: dumping directories

kswapd takes 100% CPU:

   top - 20:21:41 up 19 min, 3 users, load average: 1.35, 0.48, 0.17
   Tasks: 74 total, 4 running, 70 sleeping, 0 stopped, 0 zombie

   top - 20:22:14 up 20 min, 3 users, load average: 1.63, 0.65, 0.24
   Tasks: 74 total, 3 running, 71 sleeping, 0 stopped, 0 zombie
   Cpu0 :  2.0% us,  36.0% sy, 0.0% ni, 62.0% id, 0.0% wa, 0.0% hi, 0.0% si
   Cpu1 :  0.0% us, 100.0% sy, 0.0% ni,  0.0% id, 0.0% wa, 0.0% hi, 0.0% si
   Mem:   645608k total,  643208k used,    2400k free,   1044k buffers
   Swap: 4080488k total,       0k used, 4080488k free, 359628k cached

     PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
      40 root     25   0     0    0    0 R 99.9  0.0 1:11.45 kswapd0
    1945 root     18   0  7480 3480 3656 R 35.5  0.5 1:26.69 xfsdump
      39 root     15   0     0    0    0 S  1.0  0.0 0:00.70 pdflush
    1986 root     17   0  1824  900 1620 S  0.7  0.1 0:00.77 top
    1987 root     16   0  1824  896 1620 R  0.7  0.1 0:00.08 top
    1615 root     15   0  2348 1092 1908 S  0.3  0.2 0:00.18 arkvlib
    1618 root     15   0  2120 1028 1588 S  0.3  0.2 0:00.40 arklib
       1 root     16   0  1464  452 1316 S  0.0  0.1 0:01.09 init
       2 root     RT   0     0    0    0 S  0.0  0.0 0:00.00 migration/0
       3 root     34  19     0    0    0 S  0.0  0.0 0:00.04 ksoftirqd/0
       4 root     RT   0     0    0    0 S  0.0  0.0 0:00.00 migration/1
       5 root     34  19     0    0    0 S  0.0  0.0 0:00.00 ksoftirqd/1
       6 root      5 -10     0    0    0 S  0.0  0.0 0:00.02 events/0
       7 root      5 -10     0    0    0 S  0.0  0.0 0:00.00 events/1
       8 root      8 -10     0    0    0 S  0.0  0.0 0:00.01 khelper
       9 root      5 -10     0    0    0 S  0.0  0.0 0:00.03 kblockd/0
      10 root      5 -10     0    0    0 S  0.0  0.0 0:00.00 kblockd/1
      11 root     15   0     0    0    0 S  0.0  0.0 0:00.00 khubd
      38 root     20   0     0    0    0 S  0.0  0.0 0:00.00 pdflush
      37 root     15   0     0    0    0 S  0.0  0.0 0:00.00 kirqd
      41 root     14 -10     0    0    0 S  0.0  0.0 0:00.00 aio/0
      42 root      5 -10     0    0    0 S  0.0  0.0 0:00.00 aio/1
      43 root      5 -10     0    0    0 S  0.0  0.0 0:00.20 xfslogd/0
      44 root     10 -10     0    0    0 S  0.0  0.0 0:00.00 xfslogd/1

A few seconds later the dump stops (the process still runs, but the
tape and disk drives are inactive). The system also stops taking any
input from the console keyboard. A few minutes later the system is in
the following condition:

   top - 20:27:54 up 25 min, 3 users, load average: 9.89, 6.53, 2.93
   Tasks: 75 total, 5 running, 70 sleeping, 0 stopped, 0 zombie
   Cpu0 :  0.0% us, 100.0% sy, 0.0% ni, 0.0% id,  0.0% wa, 0.0% hi, 0.0% si
   Cpu1 :  0.0% us,   0.7% sy, 0.0% ni, 0.0% id, 99.3% wa, 0.0% hi, 0.0% si
   Mem:   645608k total,  643096k used,    2512k free,    104k buffers
   Swap: 4080488k total,      72k used, 4080416k free, 344176k cached

     PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
      40 root     25   0     0    0    0 R 99.9  0.0 6:51.70 kswapd0
    1994 root     16   0  1824  900 1620 R  0.7  0.1 0:00.14 top
       1 root     16   0  1464  452 1316 S  0.0  0.1 0:01.09 init
       2 root     RT   0     0    0    0 R  0.0  0.0 0:00.00 migration/0

The console cannot be accessed; however, the system still accepts
commands from a telnet session. The xfsdump process (launched from a
telnet session) cannot be stopped with Ctrl+C, nor can the process be
killed with "kill -9": issuing "kill -9" against the xfsdump process
causes kill to hang and never return to the prompt. Following the
above, any further command sent from an open telnet session hangs
indefinitely. Issuing "init 0" from a still-responding telnet session
does not shut the system down; the system instead hangs indefinitely
and requires a hardware reset/shutdown.

All of the above was repeated/reproduced 8 times using kernel
2.6.9-rc4, with the same result every time.
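For anyone trying to debug this state: assuming CONFIG_MAGIC_SYSRQ is
enabled in the kernel, a task-state dump could be captured from a
telnet session that is still responding, before further commands start
to hang. The commands below are a suggested sketch, not something that
was run for this report:

      # Enable the magic SysRq interface, then ask the kernel to log
      # the state and stack of every task; collect the result from the
      # kernel ring buffer (here into /tmp, which is tmpfs on this box).
      echo 1 > /proc/sys/kernel/sysrq
      echo t > /proc/sysrq-trigger
      dmesg > /tmp/sysrq-t.txt

A trace of where kswapd0 and xfsdump are spinning or blocked would
likely narrow down the hang.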
When repeated using kernel 2.6.8.1 the dump was successful; however,
the kernel repeatedly reported the following error:

   kernel: pagebuf_get: failed to lookup pages

When dumping a filesystem with a significantly smaller number of inodes
in use (/usr had the highest inode count after /opt/backup/dbase), the
problem could not be reproduced:

   Filesystem    Inodes  IUsed   IFree IUse% Mounted on
   /dev/hda1     126976  31141   95835   25% /
   none           80701      1   80700    1% /dev/shm
   tmpfs          80701      4   80697    1% /tmp
   /dev/hda5    3542272  95040 3447232    3% /usr

I hope this helps; please feel free to contact me should you require
any further information or tests.

Mauro
Is this issue still present in kernel 2.6.19?
Please reopen this bug if it's still present with kernel 2.6.20.