Bug 24272 - iotop reports insane per-process disk read/write statistics
Status: CLOSED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 low
Assignee: io_other
URL:
Keywords:
Depends on:
Blocks: 21782
Reported: 2010-12-03 12:00 UTC by Brian Rogers
Modified: 2011-01-02 12:28 UTC (History)
CC List: 4 users

See Also:
Kernel Version: 2.6.37-rc4-00153-g59e57c6
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg on Ubuntu natty's 2.6.37-9-generic (based off 2.6.37-rc5) (55.55 KB, text/plain)
2010-12-11 05:42 UTC, Brian Rogers
kernel config (123.85 KB, text/plain)
2010-12-11 05:44 UTC, Brian Rogers
taskstats: Pad taskstats netlink response for aligment issues on ia64 (6.06 KB, patch)
2010-12-19 15:44 UTC, Jeff Mahoney

Description Brian Rogers 2010-12-03 12:00:13 UTC
With a recent kernel, iotop is producing data like this:

Total DISK READ: 817.54 B/s | Total DISK WRITE: 161.41 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
  297 be/4 root      545.03 G/s  223.02 T/s  0.00 % 99.99 % [btrfs-transacti]
  286 be/4 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [btrfs-submit-0]
 3914 be/4 brian       0.00 B/s   73.45 T/s  0.00 % 99.99 % firefox-4.0-bin
 2901 be/4 brian     545.03 G/s   15.44 T/s  0.00 % 99.99 % thunderbird-bin
  853 be/4 root        0.00 B/s    0.00 B/s  0.00 % 99.99 % [flush-btrfs-1]
 2145 be/4 brian       0.00 B/s   13.31 T/s  0.00 %  0.00 % firefox-4.0-bin
  287 be/4 root        0.00 B/s    3.19 T/s  0.00 %  0.00 % [btrfs-delalloc-]
 1417 be/4 root        0.00 B/s   58.02 T/s  0.00 %  0.00 % master
 3549 be/4 brian       0.00 B/s    3.19 T/s  0.00 %  0.00 % indicator-applet-complete
 1682 be/4 root        0.00 B/s   79.31 T/s  0.00 %  0.00 % [btrfs-endio-wri]
 3905 be/4 syslog      0.00 B/s   20.23 T/s  0.00 %  0.00 % rsyslogd -c4
 1906 be/4 brian       0.00 B/s   10.11 T/s  0.00 %  0.00 % nm-applet --sm-disable

The above was an average over 30 seconds (iotop -d 30), so plenty of programs had a chance to do a tiny bit of disk activity. It looks like the totals are correct, but the per-process numbers are way out of whack. I've even seen P/s reported before.

I've observed this on two amd64 machines and haven't tested i386. Both machines have btrfs and ext4, and the problem shows up with access to either. 2.6.36 produces correct results.
Comment 1 Dan Carpenter 2010-12-09 09:59:13 UTC
Could you post your dmesg and your .config?

iotop is just polling /proc/<pid>/io I believe.  Could you post /proc/<pid>/io file for thunderbird?  Maybe read it, wait for a second, then read the file again.
Comment 2 Brian Rogers 2010-12-11 05:42:28 UTC
Created attachment 39802
dmesg on Ubuntu natty's 2.6.37-9-generic (based off 2.6.37-rc5)
Comment 3 Brian Rogers 2010-12-11 05:44:35 UTC
Created attachment 39812
kernel config
Comment 4 Brian Rogers 2010-12-11 05:51:18 UTC
Processes show 0.00 B/s when they're not writing, so I got /proc/<pid>/io for a process that is continually writing at the moment, mythbackend:

brian@fwiffo:/proc/2959$ cat io; sleep 10; echo; cat io
rchar: 3155467253
wchar: 457520817
syscr: 98710
syscw: 48422
read_bytes: 3858432
write_bytes: 566554624
cancelled_write_bytes: 0

rchar: 3170223737
wchar: 460091353
syscr: 99127
syscw: 48613
read_bytes: 3858432
write_bytes: 569700352
cancelled_write_bytes: 0

iotop shows this process at anywhere from 100 to 300 T/s.
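
As a sanity check on the two snapshots above, the per-second rates can be worked out from the counter deltas. The following is a minimal Python sketch; the helper names `parse_io` and `rates` are made up for illustration, and the 10-second interval is taken from the `sleep 10` in the command.

```python
def parse_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io snapshot into a dict."""
    counters = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        counters[key.strip()] = int(value)
    return counters


def rates(before, after, seconds):
    """Per-second rate of change for every counter present in 'before'."""
    return {key: (after[key] - before[key]) / seconds for key in before}


# The two snapshots posted in this comment.
FIRST = """
rchar: 3155467253
wchar: 457520817
syscr: 98710
syscw: 48422
read_bytes: 3858432
write_bytes: 566554624
cancelled_write_bytes: 0
"""

SECOND = """
rchar: 3170223737
wchar: 460091353
syscr: 99127
syscw: 48613
read_bytes: 3858432
write_bytes: 569700352
cancelled_write_bytes: 0
"""

per_second = rates(parse_io(FIRST), parse_io(SECOND), 10)
for name in ("read_bytes", "write_bytes"):
    print(f"{name}: {per_second[name] / 1024:.1f} KiB/s")
```

write_bytes grows by 3145728 bytes (3 MiB) over the interval, which is roughly 307 KiB/s, and read_bytes does not change at all. That is many orders of magnitude below the T/s figures iotop displays.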
Comment 5 Brian Rogers 2010-12-11 06:55:25 UTC
I don't think iotop gets its info from /proc/<pid>/io because the data there looks correct.

If I look at cumulative io in kilobytes (iotop -ak), the number reported is always a multiple of 17179869184.00 K. That is, 2^44 bytes. Seems like it could be an endianness issue or misaligned data.
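
To illustrate why multiples of 2^44 hint at misaligned data, here is a small Python sketch. It is not the real taskstats layout: the three-field buffer and `prev_field` are hypothetical, the byte order is assumed to be little-endian (as on amd64), and the 4-byte shift is an assumption made for the sake of the example. The counter values are the ones from comment 4, which are multiples of 4096.

```python
import struct

# 17179869184 K expressed in bytes is exactly 2^44.
assert 17179869184 * 1024 == 2**44

prev_field = 12345        # hypothetical neighbouring 64-bit counter (< 2^32)
read_bytes = 3858432      # from comment 4, a multiple of 4096
write_bytes = 566554624   # from comment 4, a multiple of 4096

# Three consecutive little-endian u64 fields, as the reader expects them.
expected = struct.pack("<QQQ", prev_field, read_bytes, write_bytes)

# Assumption: the producer starts the same data 4 bytes later than before.
shifted = b"\x00" * 4 + expected

# A reader that still uses the old hard-coded offsets (8 and 16).
(misread_read,) = struct.unpack_from("<Q", shifted, 8)
(misread_write,) = struct.unpack_from("<Q", shifted, 16)

# Each misread value is the low 32 bits of the real counter moved into the
# high 32 bits, because the upper halves of the preceding fields are zero.
print(misread_read == read_bytes << 32)    # True
print(misread_read % 2**44 == 0)           # True
print(misread_write % 2**44 == 0)          # True
```

Under these assumptions, any counter that is a multiple of 2^12 turns into a multiple of 2^44 when it is read 4 bytes off, which matches the pattern described above.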
Comment 6 Dan Carpenter 2010-12-13 06:02:15 UTC
I think you are right.  iotop uses hard-coded offsets to find struct members such as read_bytes, but the layout of the reply changed.

Could you try reverting this patch and let me know if that fixes it for you:

commit 85893120699f8bae8caa12a8ee18ab5fceac978e
Author: Jeff Mahoney <jeffm@suse.com>
Date:   Wed Oct 27 15:34:43 2010 -0700

    delayacct: align to 8 byte boundary on 64-bit systems
Comment 7 Brian Rogers 2010-12-13 09:34:49 UTC
Yeah, reverting that fixes the problem.
Comment 8 Florian Mickler 2010-12-17 00:59:20 UTC
First-Bad-Commit: 85893120699f8bae8caa12a8ee18ab5fceac978e
Comment 9 Jeff Mahoney 2010-12-19 15:44:36 UTC
Created attachment 40922
taskstats: Pad taskstats netlink response for aligment issues on ia64

The taskstats structure is internally aligned on 8 byte boundaries but
 the layout of the aggregate reply, with two NLA headers and the pid
 (each 4 bytes), actually forces the entire structure to be unaligned.
 This causes the kernel to issue unaligned access warnings on some
 architectures like ia64. Unfortunately, some software out there doesn't
 properly unroll the NLA packet and assumes that the start of the
 taskstats structure will always be 20 bytes from the start of the
 netlink payload. Aligning the start of the taskstats structure breaks
 this software, which we don't want. So, for now the alignment only
 happens on architectures that require it and those users will have to
 update to fixed versions of those packages. Space is reserved in the
 packet only when needed.  This ifdef should be removed in several years
 e.g. 2012 once we can be confident that fixed versions are installed on
 most systems. We add the padding before the aggregate since the
 aggregate is already a defined type.

 Commit 85893120 previously addressed the alignment issues by padding out
 the pid field. This was supposed to be a compatible change but the
 circumstances described above mean that it wasn't. This patch backs out
 that change, since it was a hack, and introduces a new NULL attribute
 type to provide the padding. Padding the response with 4 bytes avoids
 allocating an aligned taskstats structure and copying it back. Since
 the structure weighs in at 328 bytes, it's too big to do it on the stack.
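
The difference between "unrolling" the NLA packet and assuming a fixed offset can be sketched in Python as follows. This is an illustration under stated assumptions, not code from iotop or the kernel: the `TYPE_*` numbers are illustrative (check linux/taskstats.h for the real values), the helper names are invented, the stats payload is 328 dummy bytes, and offsets are counted from the first attribute, so the 20-byte offset mentioned above becomes 16 here if the 4-byte generic netlink header is left out.

```python
import struct

NLA_HDRLEN = 4  # struct nlattr: u16 nla_len, u16 nla_type

# Illustrative attribute type numbers; consult linux/taskstats.h.
TYPE_PID, TYPE_STATS, TYPE_AGGR_PID, TYPE_NULL = 1, 3, 4, 6


def nla_align(length):
    """Round up to the 4-byte netlink attribute alignment."""
    return (length + 3) & ~3


def make_attr(attr_type, payload):
    """Build one attribute: header, payload, then padding to 4 bytes."""
    length = NLA_HDRLEN + len(payload)
    return (struct.pack("=HH", length, attr_type) + payload
            + b"\x00" * (nla_align(length) - length))


def iter_attrs(buf):
    """Yield (type, payload) for each attribute by following nla_len."""
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        nla_len, nla_type = struct.unpack_from("=HH", buf, offset)
        if nla_len < NLA_HDRLEN or offset + nla_len > len(buf):
            break  # malformed or truncated attribute
        yield nla_type, buf[offset + NLA_HDRLEN:offset + nla_len]
        offset += nla_align(nla_len)


def find_stats(buf):
    """Locate the stats payload nested inside the aggregate attribute."""
    for attr_type, payload in iter_attrs(buf):
        if attr_type == TYPE_AGGR_PID:
            for inner_type, inner in iter_attrs(payload):
                if inner_type == TYPE_STATS:
                    return inner
    return None


stats = b"S" * 328  # dummy stand-in for the 328-byte structure
aggregate = make_attr(
    TYPE_AGGR_PID,
    make_attr(TYPE_PID, struct.pack("=I", 2959)) + make_attr(TYPE_STATS, stats),
)
old_reply = aggregate                             # no padding
new_reply = make_attr(TYPE_NULL, b"") + aggregate  # 4-byte pad attribute first

print(find_stats(old_reply) == stats, find_stats(new_reply) == stats)
print(old_reply[16:16 + 328] == stats, new_reply[16:16 + 328] == stats)
```

In this sketch the attribute walker finds the payload in both layouts, while the fixed-offset slice only works for the unpadded reply. That is the reason software relying on a constant offset breaks when padding is introduced.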
Comment 10 Florian Mickler 2011-01-02 12:28:12 UTC
Patch: https://bugzilla.kernel.org/attachment.cgi?id=40922
has been merged as:

commit 4be2c95d1f7706ca0e74499f2bd118e1cee19669
Author: Jeff Mahoney <jeffm@suse.com>
Date:   Tue Dec 21 17:24:30 2010 -0800

    taskstats: pad taskstats netlink response for aligment issues on ia64

And will probably be further tweaked by:
Patch: https://patchwork.kernel.org/patch/438641/
