With a recent kernel, iotop is producing data like this:

Total DISK READ: 817.54 B/s | Total DISK WRITE: 161.41 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
  297 be/4 root      545.03 G/s  223.02 T/s  0.00 % 99.99 % [btrfs-transacti]
  286 be/4 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [btrfs-submit-0]
 3914 be/4 brian       0.00 B/s   73.45 T/s  0.00 % 99.99 % firefox-4.0-bin
 2901 be/4 brian     545.03 G/s   15.44 T/s  0.00 % 99.99 % thunderbird-bin
  853 be/4 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [flush-btrfs-1]
 2145 be/4 brian       0.00 B/s   13.31 T/s  0.00 %  0.00 % firefox-4.0-bin
  287 be/4 root        0.00 B/s    3.19 T/s  0.00 %  0.00 % [btrfs-delalloc-]
 1417 be/4 root        0.00 B/s   58.02 T/s  0.00 %  0.00 % master
 3549 be/4 brian       0.00 B/s    3.19 T/s  0.00 %  0.00 % indicator-applet-complete
 1682 be/4 root        0.00 B/s   79.31 T/s  0.00 %  0.00 % [btrfs-endio-wri]
 3905 be/4 syslog      0.00 B/s   20.23 T/s  0.00 %  0.00 % rsyslogd -c4
 1906 be/4 brian       0.00 B/s   10.11 T/s  0.00 %  0.00 % nm-applet --sm-disable

The above was an average over 30 seconds (iotop -d 30), so plenty of programs had a chance to do a tiny bit of disk activity. It looks like the totals are correct, but the per-process numbers are way out of whack. I've even seen P/s reported before.

I've observed this on two amd64 machines and haven't tested i386. Both machines have btrfs and ext4, and the problem shows up with access to either. 2.6.36 produces correct results.
Could you post your dmesg and your .config? I believe iotop just polls /proc/<pid>/io. Could you also post the /proc/<pid>/io file for thunderbird? Maybe read it, wait for a second, then read the file again.
Created attachment 39802: dmesg on Ubuntu natty's 2.6.37-9-generic (based off 2.6.37-rc5)
Created attachment 39812: kernel config
Processes show 0.00 B/s when they're not writing, so I got /proc/<pid>/io for a process that is continually writing at the moment, mythbackend:

brian@fwiffo:/proc/2959$ cat io; sleep 10; echo; cat io
rchar: 3155467253
wchar: 457520817
syscr: 98710
syscw: 48422
read_bytes: 3858432
write_bytes: 566554624
cancelled_write_bytes: 0

rchar: 3170223737
wchar: 460091353
syscr: 99127
syscw: 48613
read_bytes: 3858432
write_bytes: 569700352
cancelled_write_bytes: 0

This process is showing numbers from 100 - 300 T/sec.
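For comparison, here is a small sketch of the rate those two samples imply. It only uses the read_bytes and write_bytes values quoted above; the helper name parse_io and the variable names are made up for illustration, and this is not how iotop itself computes rates.

```python
def parse_io(text):
    """Parse 'key: value' lines in the /proc/<pid>/io format into a dict of ints."""
    fields = {}
    for line in text.strip().splitlines():
        key, value = line.split(":")
        fields[key.strip()] = int(value)
    return fields

# The two samples from the comment above, taken 10 seconds apart.
first = parse_io("""
read_bytes: 3858432
write_bytes: 566554624
""")
second = parse_io("""
read_bytes: 3858432
write_bytes: 569700352
""")

interval = 10  # seconds, from the 'sleep 10' in the command

read_rate = (second["read_bytes"] - first["read_bytes"]) / interval
write_rate = (second["write_bytes"] - first["write_bytes"]) / interval

print(f"read:  {read_rate / 1024:.1f} K/s")   # read:  0.0 K/s
print(f"write: {write_rate / 1024:.1f} K/s")  # write: 307.2 K/s
```

So the counters in /proc describe a process writing roughly 300 K/s, nowhere near the T/s range.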
I don't think iotop gets its info from /proc/<pid>/io because the data there looks correct. If I look at cumulative io in kilobytes (iotop -ak), the number reported is always a multiple of 17179869184.00 K. That is, 2^44 bytes. Seems like it could be an endianness issue or misaligned data.
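A toy model of the misalignment guess, in case it helps: if a reader with a hard-coded offset looks 4 bytes too early at a little-endian 64-bit counter, the low 32 bits of the real value end up in the high 32 bits of what it reads. Since write_bytes in the sample above is a multiple of 4096 (2^12), the result would be a multiple of 2^32 * 2^12 = 2^44. The two-field layout, the field name prev_field and the offset below are hypothetical simplifications, not the real taskstats layout.

```python
import struct

# Hypothetical simplified layout: one preceding 64-bit counter, then write_bytes.
prev_field = 0
write_bytes = 566554624  # value from /proc/2959/io above, a multiple of 4096

payload = struct.pack("<QQ", prev_field, write_bytes)

# Model a kernel that inserts 4 bytes of padding in front of the structure.
shifted = b"\x00" * 4 + payload

OLD_OFFSET = 8  # where a reader with hard-coded offsets expects write_bytes

(correct,) = struct.unpack_from("<Q", payload, OLD_OFFSET)
(misread,) = struct.unpack_from("<Q", shifted, OLD_OFFSET)

print(correct)             # the real counter
print(misread)             # low 32 bits of the counter, shifted up by 32
print(misread % 2**44)     # 0, i.e. a multiple of 2^44 bytes
```

This assumes little-endian byte order, as on amd64, and that the upper half of the preceding field is zero.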
I think you are right. iotop uses hard-coded offsets to find the read_bytes struct members, but the layout got changed. Could you please try reverting this patch and let me know if that fixes it for you:

commit 85893120699f8bae8caa12a8ee18ab5fceac978e
Author: Jeff Mahoney <jeffm@suse.com>
Date:   Wed Oct 27 15:34:43 2010 -0700

    delayacct: align to 8 byte boundary on 64-bit systems
Yeah, reverting that fixes the problem.
First-Bad-Commit: 85893120699f8bae8caa12a8ee18ab5fceac978e
Created attachment 40922: taskstats: Pad taskstats netlink response for aligment issues on ia64

The taskstats structure is internally aligned on 8 byte boundaries but the layout of the aggregate reply, with two NLA headers and the pid (each 4 bytes), actually forces the entire structure to be unaligned. This causes the kernel to issue unaligned access warnings on some architectures like ia64.

Unfortunately, some software out there doesn't properly unroll the NLA packet and assumes that the start of the taskstats structure will always be 20 bytes from the start of the netlink payload. Aligning the start of the taskstats structure breaks this software, which we don't want. So, for now the alignment only happens on architectures that require it and those users will have to update to fixed versions of those packages. Space is reserved in the packet only when needed. This ifdef should be removed in several years, e.g. 2012, once we can be confident that fixed versions are installed on most systems. We add the padding before the aggregate since the aggregate is already a defined type.

Commit 85893120 previously addressed the alignment issues by padding out the pid field. This was supposed to be a compatible change but the circumstances described above mean that it wasn't. This patch backs out that change, since it was a hack, and introduces a new NULL attribute type to provide the padding. Padding the response with 4 bytes avoids allocating an aligned taskstats structure and copying it back. Since the structure weighs in at 328 bytes, it's too big to do it on the stack.
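To illustrate what "properly unrolling the NLA packet" means for userspace, here is a rough sketch that walks netlink attributes by their length fields instead of assuming a fixed offset. It builds a synthetic attribute stream rather than talking to the kernel. The helper names (nla, walk, find_stats) are made up, the 8-byte stats blob is a stand-in for the real 328-byte structure, and the attribute type numbers are my reading of linux/taskstats.h, so check them against your headers.

```python
import struct

NLA_HDR = struct.Struct("=HH")  # nla_len (header included), nla_type

# Assumed attribute type numbers; verify against linux/taskstats.h.
TYPE_PID, TYPE_STATS, TYPE_AGGR_PID, TYPE_NULL = 1, 3, 4, 6

def nla(nla_type, payload=b""):
    """Encode one attribute, padding the payload to a 4-byte boundary."""
    length = NLA_HDR.size + len(payload)
    return NLA_HDR.pack(length, nla_type) + payload + b"\x00" * (-length % 4)

def walk(buf):
    """Yield (type, payload) for each attribute found in buf."""
    off = 0
    while off + NLA_HDR.size <= len(buf):
        length, nla_type = NLA_HDR.unpack_from(buf, off)
        if length < NLA_HDR.size:
            break  # malformed attribute: stop instead of looping forever
        yield nla_type, buf[off + NLA_HDR.size:off + length]
        off += (length + 3) & ~3  # advance to the next 4-byte boundary

def find_stats(attrs):
    """Return the stats payload nested inside the aggregate, or None."""
    for outer_type, outer in walk(attrs):
        if outer_type == TYPE_AGGR_PID:
            for inner_type, inner in walk(outer):
                if inner_type == TYPE_STATS:
                    return inner
    return None

stats = struct.pack("<Q", 566554624)  # stand-in for struct taskstats
aggr = nla(TYPE_AGGR_PID,
           nla(TYPE_PID, struct.pack("=I", 2959)) + nla(TYPE_STATS, stats))

unpadded = aggr                  # layout without padding
padded = nla(TYPE_NULL) + aggr   # empty padding attribute before the aggregate

# Offset of the stats payload in the unpadded stream:
# aggregate header + pid header + pid + stats header, 4 bytes each.
FIXED = 16

print(unpadded[FIXED:FIXED + 8] == stats)  # fixed offset works without padding
print(padded[FIXED:FIXED + 8] == stats)    # and breaks once padding is present
print(find_stats(unpadded) == stats, find_stats(padded) == stats)
```

A reader written like find_stats simply skips the padding attribute because it does not match a type it is looking for, which is why fixed userspace is unaffected by the extra 4 bytes.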
Patch: https://bugzilla.kernel.org/attachment.cgi?id=40922

Is merged as:

commit 4be2c95d1f7706ca0e74499f2bd118e1cee19669
Author: Jeff Mahoney <jeffm@suse.com>
Date:   Tue Dec 21 17:24:30 2010 -0800

    taskstats: pad taskstats netlink response for aligment issues on ia64

And will probably be further tweaked by:

Patch: https://patchwork.kernel.org/patch/438641/