Bug 24272
| Summary: | iotop reports insane per-process disk read/write statistics | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Brian Rogers (brian) |
| Component: | Other | Assignee: | io_other |
| Status: | CLOSED CODE_FIX | | |
| Severity: | low | CC: | error27, florian, maciej.rutecki, rjw |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.37-rc4-00153-g59e57c6 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 21782 | | |
| Attachments: | dmesg on Ubuntu natty's 2.6.37-9-generic (based off 2.6.37-rc5); kernel config; taskstats: Pad taskstats netlink response for aligment issues on ia64 | | |
Description
Brian Rogers
2010-12-03 12:00:13 UTC
Could you post your dmesg and your .config? iotop is just polling /proc/<pid>/io I believe. Could you post /proc/<pid>/io file for thunderbird? Maybe read it, wait for a second, then read the file again.

Created attachment 39802 [details]
dmesg on Ubuntu natty's 2.6.37-9-generic (based off 2.6.37-rc5)
Created attachment 39812 [details]
kernel config
Processes show 0.00 B/s when they're not writing, so I got /proc/<pid>/io for a process that is continually writing at the moment, mythbackend:

brian@fwiffo:/proc/2959$ cat io; sleep 10; echo; cat io
rchar: 3155467253
wchar: 457520817
syscr: 98710
syscw: 48422
read_bytes: 3858432
write_bytes: 566554624
cancelled_write_bytes: 0

rchar: 3170223737
wchar: 460091353
syscr: 99127
syscw: 48613
read_bytes: 3858432
write_bytes: 569700352
cancelled_write_bytes: 0

This process is showing numbers from 100 - 300 T/sec. I don't think iotop gets its info from /proc/<pid>/io because the data there looks correct.

If I look at cumulative io in kilobytes (iotop -ak), the number reported is always a multiple of 17179869184.00 K. That is, 2^44 bytes. Seems like it could be an endianness issue or misaligned data.

I think you are right. iotop uses hard-coded offsets to find the read_bytes struct members, but this got changed. Could you please try reverting this patch, and let me know if that fixes it for you:

commit 85893120699f8bae8caa12a8ee18ab5fceac978e
Author: Jeff Mahoney <jeffm@suse.com>
Date: Wed Oct 27 15:34:43 2010 -0700

    delayacct: align to 8 byte boundary on 64-bit systems

Yeah, reverting that fixes the problem.

First-Bad-Commit: 85893120699f8bae8caa12a8ee18ab5fceac978e

Created attachment 40922 [details]
taskstats: Pad taskstats netlink response for aligment issues on ia64
The taskstats structure is internally aligned on 8 byte boundaries but
the layout of the aggregate reply, with two NLA headers and the pid
(each 4 bytes), actually forces the entire structure to be unaligned.
This causes the kernel to issue unaligned access warnings on some
architectures like ia64. Unfortunately, some software out there doesn't
properly unroll the NLA packet and assumes that the start of the
taskstats structure will always be 20 bytes from the start of the
netlink payload. Aligning the start of the taskstats structure breaks
this software, which we don't want. So, for now the alignment only
happens on architectures that require it and those users will have to
update to fixed versions of those packages. Space is reserved in the
packet only when needed. This ifdef should be removed in several years
e.g. 2012 once we can be confident that fixed versions are installed on
most systems. We add the padding before the aggregate since the
aggregate is already a defined type.
Commit 85893120 previously addressed the alignment issues by padding out
the pid field. This was supposed to be a compatible change but the
circumstances described above mean that it wasn't. This patch backs out
that change, since it was a hack, and introduces a new NULL attribute
type to provide the padding. Padding the response with 4 bytes avoids
allocating an aligned taskstats structure and copying it back. Since
the structure weighs in at 328 bytes, it's too big to do it on the stack.
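
The difference between "unrolling" the NLA packet and assuming a fixed 20-byte offset can be illustrated with a small sketch. This is not iotop's code or the kernel's; it builds a synthetic reply in Python and walks the attributes by their length fields. The attribute type numbers are written as I recall them from linux/taskstats.h and should be treated as assumptions, as should the helper names (`nla`, `walk`, `find_stats`, `build_reply`) and the placeholder genetlink header values. Assuming the usual 16-byte nlmsghdr in front of the payload, an offset of 20 puts the structure at byte 36 of the message (not a multiple of 8), while the 4-byte NULL attribute moves it to byte 40.

```python
import struct

NLA_HDRLEN = 4        # struct nlattr: u16 nla_len, u16 nla_type
GENL_HDRLEN = 4       # struct genlmsghdr: u8 cmd, u8 version, u16 reserved
NLA_TYPE_MASK = 0x3FFF  # strips the nested / byte-order flag bits

# Attribute types as recalled from linux/taskstats.h (assumption).
TASKSTATS_TYPE_PID = 1
TASKSTATS_TYPE_STATS = 3
TASKSTATS_TYPE_AGGR_PID = 4
TASKSTATS_TYPE_NULL = 6

STATS_SIZE = 328      # size quoted in the patch description


def nla_align(n):
    """Round up to the 4-byte netlink attribute alignment."""
    return (n + 3) & ~3


def nla(nla_type, payload=b""):
    """Encode one attribute: header, payload, then padding to 4 bytes."""
    body = struct.pack("=HH", NLA_HDRLEN + len(payload), nla_type) + payload
    return body + b"\0" * (nla_align(len(body)) - len(body))


def walk(buf):
    """Yield (type, payload_offset, payload) for each attribute in buf."""
    off = 0
    while off + NLA_HDRLEN <= len(buf):
        nla_len, nla_type = struct.unpack_from("=HH", buf, off)
        if nla_len < NLA_HDRLEN or off + nla_len > len(buf):
            break  # malformed attribute, stop rather than read garbage
        start = off + NLA_HDRLEN
        yield nla_type & NLA_TYPE_MASK, start, buf[start:off + nla_len]
        off += nla_align(nla_len)


def find_stats(payload):
    """Locate the taskstats blob by walking attributes, not by offset."""
    for t, start, body in walk(payload[GENL_HDRLEN:]):
        if t != TASKSTATS_TYPE_AGGR_PID:
            continue  # unknown or padding attributes are simply skipped
        for t2, start2, body2 in walk(body):
            if t2 == TASKSTATS_TYPE_STATS:
                return GENL_HDRLEN + start + start2, body2
    return None


def build_reply(stats, padded):
    """Synthetic genetlink payload; cmd/version values are placeholders."""
    genl = struct.pack("=BBH", 2, 1, 0)
    inner = nla(TASKSTATS_TYPE_PID, struct.pack("=I", 2959))
    inner += nla(TASKSTATS_TYPE_STATS, stats)
    pad = nla(TASKSTATS_TYPE_NULL) if padded else b""
    return genl + pad + nla(TASKSTATS_TYPE_AGGR_PID, inner)


stats = bytes(i % 251 for i in range(STATS_SIZE))
old_layout = build_reply(stats, padded=False)
new_layout = build_reply(stats, padded=True)

print(find_stats(old_layout)[0])  # 20
print(find_stats(new_layout)[0])  # 24
```

A reader that slices `payload[20:20 + 328]` gets the right bytes only for the unpadded layout; the walker returns the same structure in both cases because it skips attribute types it does not recognise.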
Patch: https://bugzilla.kernel.org/attachment.cgi?id=40922

Is merged as:

commit 4be2c95d1f7706ca0e74499f2bd118e1cee19669
Author: Jeff Mahoney <jeffm@suse.com>
Date: Tue Dec 21 17:24:30 2010 -0800

    taskstats: pad taskstats netlink response for aligment issues on ia64

And will probably be further tweaked by:

Patch: https://patchwork.kernel.org/patch/438641/
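
As a cross-check of the numbers quoted in the comments above, the two /proc/<pid>/io samples and the 2^44 observation can be verified with a few lines of Python. This is only a sketch: the `io_rate` helper is hypothetical, the 10-second interval is taken from the `sleep 10` in the command and ignores the time spent in `cat`, and "K" in the iotop output is assumed to mean 1024 bytes.

```python
def io_rate(before, after, seconds):
    """Per-second change of each counter between two samples."""
    return {key: (after[key] - before[key]) / seconds for key in before}


# Counters copied from the two /proc/2959/io samples in the report.
before = {"read_bytes": 3858432, "write_bytes": 566554624}
after = {"read_bytes": 3858432, "write_bytes": 569700352}

rates = io_rate(before, after, 10)
print("read  B/s:", rates["read_bytes"])
print("write B/s:", rates["write_bytes"])

# The cumulative figure iotop showed was a multiple of this many K.
bogus_kib = 17179869184
print("is 2^44 bytes:", bogus_kib * 1024 == 2 ** 44)
```

The write rate comes out at a few hundred kilobytes per second, which is plausible for a recording backend and far from the terabytes per second that iotop displayed, consistent with the conclusion that the /proc data was correct and the taskstats reply was being misread.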