Latest working kernel version: 2.6.26 Earliest failing kernel version: 2.6.27-rc1 Distribution: Debian Hardware Environment: Dell Software Environment: Problem Description: /proc/diskstats does not contain all the block devices it should. /sys/block has all the devices, but /proc/diskstats does not. Steps to reproduce: boot a system with >9 (10?) disk devices (24 block devices?)
Reply-To: akpm@linux-foundation.org On Wed, 13 Aug 2008 12:12:44 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote: > > http://bugzilla.kernel.org/show_bug.cgi?id=11323 > > Summary: /proc/diskstats does not contain all disk devices > Product: File System > Version: 2.5 > KernelVersion: 2.6.27-rcX > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: Other > AssignedTo: fs_other@kernel-bugs.osdl.org > ReportedBy: genanr@emsphone.com > > > Latest working kernel version: 2.6.26 > Earliest failing kernel version: 2.6.27-rc1 Post-2.6.26 regression. > Distribution: Debian > Hardware Environment: Dell > Software Environment: > Problem Description: /proc/diskstats does not contain all the block devices > it > should. /sys/block has all the devices, but /proc/diskstats does not. > > Steps to reproduce: boot a system with >9 (10?) disk devices (24 block > devices?) The below would be a prime suspect. Unfortunately a simple revert results in an uncompilable kernel. (It drives me up the wall and across the ceiling how the patch has a commit "date" of three months prior to the 2.6.26 release, however it wasn't present in 2.6.26. What a dumb feature. How do I make it stop doing this? gitk kind of gets it right, but isn't useful across DSL) commit 27f302519148f311307637d4c9a6d0fd87d07e4c Author: Greg Kroah-Hartman <gregkh@suse.de> Date: Thu May 22 17:21:08 2008 -0400 block: make /proc/partitions and /proc/diskstats use class_find_device() Use the proper class iterator function instead of mucking around in the internals of the class structures. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> diff --git a/block/genhd.c b/block/genhd.c index 70f1d70..c13cc77 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -317,17 +317,21 @@ static void *part_start(struct seq_file return NULL; } +static int find_next(struct device *dev, void *data) +{ + if (dev->type == &disk_type) + return 1; + return 0; +} + static void *part_next(struct seq_file *part, void *v, loff_t *pos) { struct gendisk *gp = v; struct device *dev; ++*pos; - list_for_each_entry(dev, &gp->dev.node, node) { - if (&dev->node == &block_class.devices) - return NULL; - if (dev->type == &disk_type) - return dev_to_disk(dev); - } + dev = class_find_device(&block_class, &gp->dev, NULL, find_next); + if (dev) + return dev_to_disk(dev); return NULL; } @@ -578,12 +582,9 @@ static void *diskstats_next(struct seq_f struct device *dev; ++*pos; - list_for_each_entry(dev, &gp->dev.node, node) { - if (&dev->node == &block_class.devices) - return NULL; - if (dev->type == &disk_type) - return dev_to_disk(dev); - } + dev = class_find_device(&block_class, &gp->dev, NULL, find_next); + if (dev) + return dev_to_disk(dev); return NULL; }
On Wed, Aug 13, 2008 at 01:01:58PM -0700, Andrew Morton wrote: > > Problem Description: /proc/diskstats does not contain all the block devices > it > > should. /sys/block has all the devices, but /proc/diskstats does not. > > > > Steps to reproduce: boot a system with >9 (10?) disk devices (24 block > > devices?) > > The below would be a prime suspect. > > Unfortunately a simple revert results in an uncompilable kernel. > > > (It drives me up the wall and across the ceiling how the patch has a > commit "date" of three months prior to the 2.6.26 release, however it > wasn't present in 2.6.26. What a dumb feature. How do I make it stop > doing this? gitk kind of gets it right, but isn't useful across DSL) $ git show --pretty=fuller 27f302519148f311307637d4c9a6d0fd87d07e4c commit 27f302519148f311307637d4c9a6d0fd87d07e4c Author: Greg Kroah-Hartman <gregkh@suse.de> AuthorDate: Thu May 22 17:21:08 2008 -0400 Commit: Greg Kroah-Hartman <gregkh@suse.de> CommitDate: Mon Jul 21 21:54:49 2008 -0700 There is a commit date, and the date the patch was written. Both are preserved in git. And even if it was committed to a branch before 2.6.26 was released, and then pulled in, that's fine, it's distributed development :) $ git describe --contains 27f302519148f311307637d4c9a6d0fd87d07e4c v2.6.27-rc1~866^2~40 showing it first showed up on 2.6.27-rc1. Anyway, I don't have any systems with such a large number of devices to test with. Running git-bisect should narrow the problem down, you can't just revert this patch as later-on patches relied on it, as you found out... Also, what is the output of these files, what exactly is missing? thanks, greg k-h
Reply-To: akpm@linux-foundation.org On Wed, 13 Aug 2008 16:51:12 -0700 Greg KH <greg@kroah.com> wrote: > On Wed, Aug 13, 2008 at 01:01:58PM -0700, Andrew Morton wrote: > > > Problem Description: /proc/diskstats does not contain all the block > devices it > > > should. /sys/block has all the devices, but /proc/diskstats does not. > > > > > > Steps to reproduce: boot a system with >9 (10?) disk devices (24 block > > > devices?) > > > > The below would be a prime suspect. > > > > Unfortunately a simple revert results in an uncompilable kernel. > > > > > > (It drives me up the wall and across the ceiling how the patch has a > > commit "date" of three months prior to the 2.6.26 release, however it > > wasn't present in 2.6.26. What a dumb feature. How do I make it stop > > doing this? gitk kind of gets it right, but isn't useful across DSL) > > $ git show --pretty=fuller 27f302519148f311307637d4c9a6d0fd87d07e4c <writes a script> > commit 27f302519148f311307637d4c9a6d0fd87d07e4c > Author: Greg Kroah-Hartman <gregkh@suse.de> > AuthorDate: Thu May 22 17:21:08 2008 -0400 > Commit: Greg Kroah-Hartman <gregkh@suse.de> > CommitDate: Mon Jul 21 21:54:49 2008 -0700 > > There is a commit date, and the date the patch was written. Both are > preserved in git. > > And even if it was committed to a branch before 2.6.26 was released, and > then pulled in, that's fine, it's distributed development :) It's useless. I have never ever ever ever wanted to know when random person X committed a patch to some local tree. The overwhelmingly most common question is "when did that go into Linux". Sigh. > $ git describe --contains 27f302519148f311307637d4c9a6d0fd87d07e4c > v2.6.27-rc1~866^2~40 > > showing it first showed up on 2.6.27-rc1. Spose that works. My usual recourse is searching the commits list, which has useful-to-humans ordering information. Is the date at which it went into mainline recorded? > Anyway, I don't have any systems with such a large number of devices to > test with. I suppose that partitioning a junk disk with lots of little partitions will show it. parted wants to go stupid on me though. > Running git-bisect should narrow the problem down, you can't > just revert this patch as later-on patches relied on it, as you found > out... > > Also, what is the output of these files, what exactly is missing?
Hmm, I can not reproduce this here with 2.6.27-rc3-00171-gb635ace: $ cat /proc/diskstats 8 16 sdb 489 1349 13612 8923 0 0 0 0 0 2338 8923 8 17 sdb1 2 0 4 789 0 0 0 0 0 789 789 8 21 sdb5 52 230 1176 344 0 0 0 0 0 198 344 8 22 sdb6 26 108 1072 553 0 0 0 0 0 417 553 8 23 sdb7 38 108 1168 456 0 0 0 0 0 298 456 8 24 sdb8 26 108 1072 684 0 0 0 0 0 547 684 8 25 sdb9 38 108 1168 620 0 0 0 0 0 485 620 8 26 sdb10 26 108 1072 918 0 0 0 0 0 771 918 8 27 sdb11 38 108 1168 1090 0 0 0 0 0 959 1090 8 28 sdb12 26 108 1072 983 0 0 0 0 0 842 983 8 29 sdb13 38 108 1168 492 0 0 0 0 0 358 492 8 30 sdb14 26 108 1072 741 0 0 0 0 0 607 741 8 31 sdb15 38 108 1168 865 0 0 0 0 0 720 865 Care to attach the output of: cat /proc/partitions cat /proc/diskstats ls -l /sys/class/block Thanks!
Ah, seems we need disks, not partitions in the "long" list to see it truncated. If I try: modprobe scsi_debug max_luns=25 num_parts=2 I get with: $ cat /proc/partitions major minor #blocks name 8 0 97685784 sda 8 1 1 sda1 8 5 31463239 sda5 8 6 20972826 sda6 8 7 9767488 sda7 8 8 9767488 sda8 8 9 23438803 sda9 8 10 2273166 sda10 9 0 9767424 md0 8 96 8192 sdg 8 97 3952 sdg1 8 98 4224 sdg2 8 112 8192 sdh 8 113 3952 sdh1 8 114 4224 sdh2 8 128 8192 sdi 8 129 3952 sdi1 8 130 4224 sdi2 8 144 8192 sdj 8 145 3952 sdj1 8 146 4224 sdj2 8 160 8192 sdk 8 161 3952 sdk1 8 162 4224 sdk2 8 176 8192 sdl 8 177 3952 sdl1 8 178 4224 sdl2 8 192 8192 sdm 8 193 3952 sdm1 8 194 4224 sdm2 8 208 8192 sdn 8 209 3952 sdn1 8 210 4224 sdn2 8 224 8192 sdo 8 225 3952 sdo1 8 226 4224 sdo2 8 240 8192 sdp 8 241 3952 sdp1 8 242 4224 sdp2 But with: $ less /proc/partitions major minor #blocks name 8 0 97685784 sda 8 1 1 sda1 8 5 31463239 sda5 8 6 20972826 sda6 8 7 9767488 sda7 8 8 9767488 sda8 8 9 23438803 sda9 8 10 2273166 sda10 9 0 9767424 md0 8 96 8192 sdg 8 97 3952 sdg1 8 98 4224 sdg2 8 112 8192 sdh 8 113 3952 sdh1 8 114 4224 sdh2 8 128 8192 sdi 8 129 3952 sdi1 8 130 4224 sdi2 8 144 8192 sdj 8 145 3952 sdj1 8 146 4224 sdj2 8 160 8192 sdk 8 161 3952 sdk1 8 162 4224 sdk2 8 176 8192 sdl 8 177 3952 sdl1 8 178 4224 sdl2 8 192 8192 sdm 8 193 3952 sdm1 8 194 4224 sdm2 8 208 8192 sdn 8 209 3952 sdn1 8 210 4224 sdn2 8 224 8192 sdo 8 225 3952 sdo1 8 226 4224 sdo2 8 240 8192 sdp 8 241 3952 sdp1 8 242 4224 sdp2 65 0 8192 sdq 65 1 3952 sdq1 65 2 4224 sdq2 65 16 8192 sdr 65 17 3952 sdr1 65 18 4224 sdr2 65 32 8192 sds 65 33 3952 sds1 65 34 4224 sds2 65 48 8192 sdt 65 49 3952 sdt1 65 50 4224 sdt2 65 64 8192 sdu 65 65 3952 sdu1 65 66 4224 sdu2 65 80 8192 sdv 65 81 3952 sdv1 65 82 4224 sdv2 65 96 8192 sdw 65 97 3952 sdw1 65 98 4224 sdw2 65 112 8192 sdx 65 113 3952 sdx1 65 114 4224 sdx2 65 128 8192 sdy 65 129 3952 sdy1 65 130 4224 sdy2 65 144 8192 sdz 65 145 3952 sdz1 65 146 4224 sdz2 65 160 8192 sdaa 65 161 3952 sdaa1 65 162 4224 sdaa2 65 176 8192 sdab 65 177 3952 sdab1 65 178 4224 sdab2 65 192 8192 sdac 65 193 3952 sdac1 65 194 4224 sdac2 65 208 8192 sdad 65 209 3952 sdad1 65 210 4224 sdad2 65 224 8192 sdae 65 225 3952 sdae1 65 226 4224 sdae2 stracing "cat" shows: 8 224 8192 sdo 8 225 3952 sdo1 8 226 4224 sdo2 8 240 8192 sdp 8 2) = 1024 read(3, "41 3952 sdp1\n 8 242 "..., 1024) = 46 write(1, "41 3952 sdp1\n 8 242 "..., 4641 3952 sdp1 8 242 4224 sdp2 ) = 46 read(3, "", 1024) = 0 close(3) = 0 Seems we return 0 bytes somewhere in the device list loop, while we still should have data to print.
Handled-By : Greg Kroah-Hartman <greg@kroah.com> Handled-By : Kay Sievers <kay.sievers@vrfy.org>
Created attachment 17255 [details] find_start fix The attached patch seems to work for me. Care to test it? Thanks!
Created attachment 17256 [details] find_start fix2 Oops, sorry, 5am here. :) Here's the right patch.
Created attachment 17257 [details] find_start3 This fixes /proc/partitions and also /proc/diskstats here.
Patch : http://bugzilla.kernel.org/attachment.cgi?id=17257&action=view
The patch fixes the problem for me.
Fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=068281d302def5efd96d84a163ba17e85f5ea564