Bug 43013
| Summary: | creating files >8GB on xfs / very slow | | |
|---|---|---|---|
| Product: | File System | Reporter: | ITec (itec) |
| Component: | XFS | Assignee: | XFS Guru (xfs-masters) |
| Status: | RESOLVED OBSOLETE | | |
| Severity: | normal | CC: | itec, sandeen, szg00000, tsenapathy |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.2.x and 3.3.x | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | ver_linux and other output for 3.2.0-rc1; ver_linux and other output for 3.1.10 | | |
Created attachment 72756 [details]
ver_linux and other output for 3.1.10
This bug corresponds to the downstream Ubuntu bug #887793: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/887793

Hm, for starters, many other things are happening on /, which may be affecting the test. Secondly, things such as how much free space is on that filesystem, and how fragmented it is, could also be a factor. That's just a very rough first guess, but generally, performance testing on the root filesystem is not too useful.

It might be interesting to see how fragmented the file is when you complete the test, using xfs_bmap. Stating how much memory is in the machine would be useful too; the first part of your dd is likely doing buffered IO, and after that things start flushing.
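As a reference, here is a minimal sketch (not part of the original comments) of how the requested context could be captured in one run: memory size, free space before and after, and the extent layout of the finished file. It assumes the same target path and size as in this report (/tmp/zero1.tmp on the affected XFS root filesystem, 9 GB).

```bash
#!/bin/bash
# Sketch: gather the diagnostics suggested above around the slow-write test.
# Assumes /tmp lives on the affected XFS filesystem, as in this report.
TESTFILE=/tmp/zero1.tmp

free -m                      # how much RAM is in the machine (buffered IO vs. flushing)
df -m /tmp                   # free space on the filesystem before the test

dd if=/dev/zero of="$TESTFILE" bs=1024 count=9000000   # the 9 GB reproducer
sync

xfs_bmap -v "$TESTFILE"      # extent layout / fragmentation of the finished file
df -m /tmp                   # free space afterwards
rm -f "$TESTFILE"
```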
Yes, I know that there are many things that could affect the test. But I am not talking about losing "some" performance; I am talking about losing 98% of the performance. What could cause this? And why does it happen only on 3.2+?

About fragmentation:

```
# xfs_fsr -v /dev/sda3
/ start inode=0
ino=20294080
ino=20294080 already fully defragmented.
# df -m /tmp
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda3          15248  2549     12700  17% /
# dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=9000000
9000000+0 records in
9000000+0 records out
9216000000 bytes (9,2 GB) copied, 396,978 s, 23,2 MB/s
```

The first 8GB go at about 100MB/s, the rest at about 1.9MB/s.

```
# xfs_bmap /tmp/zero1.tmp
/tmp/zero1.tmp:
        0: [0..4194303]: 26708152..30902455
        1: [4194304..8388607]: 11237112..15431415
        2: [8388608..12076735]: 4123968..7812095
        3: [12076736..15671151]: 19841872..23436287
        4: [15671152..16201615]: 18355464..18885927
        5: [16201616..16547543]: 30902456..31248383
        6: [16547544..16797911]: 3523816..3774183
        7: [16797912..16816479]: 3783176..3801743
        8: [16816480..16835343]: 3804000..3822863
        9: [16835344..16950519]: 15431416..15546591
        10: [16950520..16950551]: 15546600..15546631
        11: [16950552..17065255]: 10757704..10872407
        12: [17065256..17065295]: 10873000..10873039
        13: [17065296..17083887]: 10739112..10757703
        14: [17083888..17102567]: 10720432..10739111
        15: [17102568..17219047]: 10603952..10720431
        16: [17219048..17219079]: 10603920..10603951
        17: [17219080..17241775]: 10581224..10603919
        18: [17241776..17262391]: 10560608..10581223
        19: [17262392..17283671]: 10539328..10560607
        20: [17283672..17302879]: 10018552..10037759
        21: [17302880..17302887]: 10037976..10037983
        22: [17302888..17323039]: 9998400..10018551
        23: [17323040..17342535]: 9966824..9986319
        24: [17342536..17573871]: 18960144..19191479
        25: [17573872..17655255]: 19264640..19346023
        26: [17655256..17688759]: 18919824..18953327
        27: [17688760..17706695]: 18901888..18919823
        28: [17706696..17725951]: 19191480..19210735
        29: [17725952..17745271]: 18142008..18161327
        30: [17745272..17764551]: 18122728..18142007
        31: [17764552..17783799]: 18103480..18122727
        32: [17783800..17861359]: 15546632..15624191
        33: [17861360..17899223]: 25742880..25780743
        34: [17899224..17969279]: 26181368..26251423
        35: [17969280..17999999]: 26399192..26429911
# df -m /tmp
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda3          15248 11338      3911  75% /
```

Oops! Sorry, accidentally committed too early!

Just so we can see it, do you have numbers for the same test under 3.1.10?

(Second attempt to post this, continuing from the numbers above:)

```
# rm /tmp/zero1.tmp
# dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=1000000; dd if=/dev/zero of=/tmp/zero2.tmp bs=1024 count=8000000; echo syncing; time sync
1000000+0 records in
1000000+0 records out
1024000000 bytes (1,0 GB) copied, 7,62766 s, 134 MB/s
8000000+0 records in
8000000+0 records out
8192000000 bytes (8,2 GB) copied, 77,9598 s, 105 MB/s
syncing

real    0m2.602s
user    0m0.000s
sys     0m0.000s
# xfs_bmap /tmp/zero*
/tmp/zero1.tmp:
        0: [0..1048575]: 26708152..27756727
        1: [1048576..1999999]: 4123968..5075391
/tmp/zero2.tmp:
        0: [0..2097151]: 27756728..29853879
        1: [2097152..4194303]: 5172544..7269695
        2: [4194304..8388607]: 11237112..15431415
        3: [8388608..11983023]: 19841872..23436287
        4: [11983024..13377527]: 29853880..31248383
        5: [13377528..13919927]: 7269696..7812095
        6: [13919928..14450391]: 18355464..18885927
        7: [14450392..14790055]: 10532744..10872407
        8: [14790056..15050135]: 18960144..19220223
        9: [15050136..15308935]: 3523816..3782615
        10: [15308936..15501711]: 15431416..15624191
        11: [15501712..15598863]: 5075392..5172543
        12: [15598864..15680247]: 19264640..19346023
        13: [15680248..15750303]: 26181368..26251423
        14: [15750304..15819727]: 26399192..26468615
        15: [15819728..15888935]: 25742880..25812087
        16: [15888936..15955831]: 18886432..18953327
        17: [15955832..15999999]: 18117160..18161327
```

So, creating one 9GB file takes about 400s, but one 1GB file plus another 8GB file together take about 90s.

The same test under 3.1.10: following soon!
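To make the two cases directly comparable, a hedged sketch (not from the original posts) of running "one 9 GB file" and "1 GB + 8 GB files" back to back, with a sync included in each timing and the page cache dropped in between so buffered IO does not skew the second run. File names and sizes follow the report; dropping caches needs root.

```bash
#!/bin/bash
# Sketch: time "one 9 GB file" vs. "1 GB + 8 GB files" on the same filesystem.
DIR=/tmp

sync; echo 3 > /proc/sys/vm/drop_caches        # cold page cache for the first case
echo "== single 9 GB file =="
time { dd if=/dev/zero of="$DIR/zero1.tmp" bs=1024 count=9000000; sync; }
rm -f "$DIR/zero1.tmp"

sync; echo 3 > /proc/sys/vm/drop_caches        # cold page cache for the second case
echo "== 1 GB file + 8 GB file =="
time { dd if=/dev/zero of="$DIR/zero1.tmp" bs=1024 count=1000000
       dd if=/dev/zero of="$DIR/zero2.tmp" bs=1024 count=8000000
       sync; }
rm -f "$DIR/zero1.tmp" "$DIR/zero2.tmp"
```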
The same test under 3.1.10:

```
# xfs_fsr -v /dev/sda3
/ start inode=0
ino=20294080
ino=20294080 already fully defragmented.
# df -m /tmp
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda3          15248  2553     12696  17% /
# dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=9000000; echo syncing; time sync
9000000+0 records in
9000000+0 records out
9216000000 bytes (9,2 GB) copied, 91,3123 s, 101 MB/s
syncing

real    0m3.029s
user    0m0.004s
sys     0m0.000s
# xfs_bmap /tmp/zero*
/tmp/zero1.tmp:
        0: [0..4194303]: 26708152..30902455
        1: [4194304..8388607]: 11237112..15431415
        2: [8388608..12076735]: 4123968..7812095
        3: [12076736..15671151]: 19841872..23436287
        4: [15671152..16201615]: 18355464..18885927
        5: [16201616..16547543]: 30902456..31248383
        6: [16547544..16777215]: 3523816..3753487
        7: [16777216..17113183]: 10532744..10868711
        8: [17113184..17139023]: 10011920..10037759
        9: [17139024..17399103]: 18960144..19220223
        10: [17399104..17569775]: 15431416..15602087
        11: [17569776..17633927]: 18886432..18950583
        12: [17633928..17715311]: 19264640..19346023
        13: [17715312..17785367]: 26181368..26251423
        14: [17785368..17854791]: 26399192..26468615
        15: [17854792..17923999]: 25742880..25812087
        16: [17924000..17988511]: 3155768..3220279
        17: [17988512..17999999]: 3223528..3235015
# df -m /tmp
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda3          15248 11343      3906  75% /
```

Now creation of a single 9GB file takes a bit more than 90s.

I'm batch-closing all xfs bugs which are more than 1 year old, sorry about that. If you still have this issue on a current kernel, please retest and re-open with this information. Thanks, -Eric
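Since the close request asks for a retest on a current kernel, and since benchmarking the root filesystem is noisy, here is a hedged sketch (not from the thread) of re-running the reproducer on a throwaway loop-mounted XFS image instead of /. The image size, path, and mount point are illustrative; run as root.

```bash
#!/bin/bash
# Sketch: retest on a current kernel using a scratch XFS image instead of the root fs.
IMG=/var/tmp/xfs-scratch.img     # illustrative location; needs ~16 GB of free space
MNT=/mnt/xfstest

truncate -s 16G "$IMG"           # sparse backing file for the scratch filesystem
mkfs.xfs -f "$IMG"
mkdir -p "$MNT"
mount -o loop "$IMG" "$MNT"

uname -r                                                # record the kernel under test
dd if=/dev/zero of="$MNT/zero1.tmp" bs=1024 count=9000000
xfs_bmap -v "$MNT/zero1.tmp"                            # fragmentation of the result

umount "$MNT"
rm -f "$IMG"
```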
Created attachment 72755 [details]
ver_linux and other output for 3.2.0-rc1

How to reproduce the issue:

```
dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=13000000
```

where "/tmp" is on "/" and "/" is an XFS filesystem. It does not happen if I create a file on another XFS filesystem than "/".

What happens: the first 8GB are written at about 100MB/s and the rest at about 1MB/s. After the first 8GB, kworker starts working heavily.

```
top - 18:20:57 up  2:44,  2 users,  load average: 1.33, 0.50, 0.42
Tasks: 110 total,   3 running, 107 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us, 49.8%sy,  0.0%ni, 50.2%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1791092k total,  1723636k used,    67456k free,        8k buffers
Swap:        0k total,        0k used,        0k free,  1564360k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4160 root      20   0     0    0    0 R   47  0.0   0:06.81 kworker/0:1
 4217 root      20   0     0    0    0 R   47  0.0   0:06.12 kworker/1:0
 4216 root      20   0 11500  632  516 D    5  0.0   0:38.42 dd
 3281 root      20   0 17220 1240  904 R    0  0.1   0:11.11 top
 4079 root      20   0     0    0    0 S    0  0.0   0:00.43 kworker/1:2
 4093 root      20   0     0    0    0 S    0  0.0   0:00.21 kworker/0:0
    1 root      20   0 24172 2016 1100 S    0  0.1   0:00.46 init
```

I tested different kernel versions:
first affected: kernel-3.2.0-rc1
last affected: current
last unaffected: kernel-3.1.10

Further information: see attachments.
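For watching the slowdown as it happens, a hedged sketch (not part of the original report) that runs the reproducer and periodically samples dd's progress, the dirty/writeback page counters, and the CPU use of the kworker threads, which is where the buffered-IO-then-flushing behaviour should show up. The sampling interval and grep patterns are illustrative.

```bash
#!/bin/bash
# Sketch: run the reproducer and sample writeback state while it runs.
dd if=/dev/zero of=/tmp/zero1.tmp bs=1024 count=13000000 &
DD_PID=$!

while kill -0 "$DD_PID" 2>/dev/null; do
    kill -USR1 "$DD_PID" 2>/dev/null              # ask GNU dd to print bytes copied so far
    grep -E 'Dirty|Writeback' /proc/meminfo       # pages waiting to be flushed
    top -b -n 1 | grep -E 'kworker|dd' | head -5  # CPU use of kworker threads and dd
    sleep 10
done
```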