Bug 11323

Summary: /proc/diskstats does not contain all disk devices
Product: File System Reporter: Andy Ryan (genanr)
Component: OtherAssignee: fs_other
Status: CLOSED CODE_FIX    
Severity: normal CC: bunk, greg, kay, randy.dunlap, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27-rcX Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 11167    
Attachments: find_start fix
find_start fix2
find_start3

Description Andy Ryan 2008-08-13 12:12:44 UTC
Latest working kernel version: 2.6.26
Earliest failing kernel version: 2.6.27-rc1
Distribution: Debian
Hardware Environment: Dell
Software Environment:
Problem Description: /proc/diskstats does not contain all the block devices it should. /sys/block has all the devices, but /proc/diskstats does not.

Steps to reproduce: boot a system with >9 (10?) disk devices (24 block devices?)
Comment 1 Anonymous Emailer 2008-08-13 13:02:26 UTC
Reply-To: akpm@linux-foundation.org

On Wed, 13 Aug 2008 12:12:44 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> 
> http://bugzilla.kernel.org/show_bug.cgi?id=11323
> 
>            Summary: /proc/diskstats does not contain all disk devices
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.27-rcX
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: fs_other@kernel-bugs.osdl.org
>         ReportedBy: genanr@emsphone.com
> 
> 
> Latest working kernel version: 2.6.26
> Earliest failing kernel version: 2.6.27-rc1

Post-2.6.26 regression.

> Distribution: Debian
> Hardware Environment: Dell
> Software Environment:
> Problem Description: /proc/diskstats does not contain all the block devices
> it
> should. /sys/block has all the devices, but /proc/diskstats does not.
> 
> Steps to reproduce: boot a system with >9 (10?) disk devices (24 block
> devices?)

The below would be a prime suspect.

Unfortunately a simple revert results in an uncompilable kernel.


(It drives me up the wall and across the ceiling how the patch has a
commit "date" of three months prior to the 2.6.26 release, however it
wasn't present in 2.6.26.  What a dumb feature.  How do I make it stop
doing this?  gitk kind of gets it right, but isn't useful across DSL)



commit 27f302519148f311307637d4c9a6d0fd87d07e4c
Author: Greg Kroah-Hartman <gregkh@suse.de>
Date:   Thu May 22 17:21:08 2008 -0400

    block: make /proc/partitions and /proc/diskstats use class_find_device()
    
    Use the proper class iterator function instead of mucking around in the
    internals of the class structures.
    
    Cc: Kay Sievers <kay.sievers@vrfy.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

diff --git a/block/genhd.c b/block/genhd.c
index 70f1d70..c13cc77 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -317,17 +317,21 @@ static void *part_start(struct seq_file 
 	return NULL;
 }
 
+static int find_next(struct device *dev, void *data)
+{
+	if (dev->type == &disk_type)
+		return 1;
+	return 0;
+}
+
 static void *part_next(struct seq_file *part, void *v, loff_t *pos)
 {
 	struct gendisk *gp = v;
 	struct device *dev;
 	++*pos;
-	list_for_each_entry(dev, &gp->dev.node, node) {
-		if (&dev->node == &block_class.devices)
-			return NULL;
-		if (dev->type == &disk_type)
-			return dev_to_disk(dev);
-	}
+	dev = class_find_device(&block_class, &gp->dev, NULL, find_next);
+	if (dev)
+		return dev_to_disk(dev);
 	return NULL;
 }
 
@@ -578,12 +582,9 @@ static void *diskstats_next(struct seq_f
 	struct device *dev;
 
 	++*pos;
-	list_for_each_entry(dev, &gp->dev.node, node) {
-		if (&dev->node == &block_class.devices)
-			return NULL;
-		if (dev->type == &disk_type)
-			return dev_to_disk(dev);
-	}
+	dev = class_find_device(&block_class, &gp->dev, NULL, find_next);
+	if (dev)
+		return dev_to_disk(dev);
 	return NULL;
 }
 
Comment 2 Greg Kroah-Hartman 2008-08-13 16:55:00 UTC
On Wed, Aug 13, 2008 at 01:01:58PM -0700, Andrew Morton wrote:
> > Problem Description: /proc/diskstats does not contain all the block devices
> it
> > should. /sys/block has all the devices, but /proc/diskstats does not.
> > 
> > Steps to reproduce: boot a system with >9 (10?) disk devices (24 block
> > devices?)
> 
> The below would be a prime suspect.
> 
> Unfortunately a simple revert results in an uncompilable kernel.
> 
> 
> (It drives me up the wall and across the ceiling how the patch has a
> commit "date" of three months prior to the 2.6.26 release, however it
> wasn't present in 2.6.26.  What a dumb feature.  How do I make it stop
> doing this?  gitk kind of gets it right, but isn't useful across DSL)

$ git show --pretty=fuller 27f302519148f311307637d4c9a6d0fd87d07e4c

commit 27f302519148f311307637d4c9a6d0fd87d07e4c
Author:     Greg Kroah-Hartman <gregkh@suse.de>
AuthorDate: Thu May 22 17:21:08 2008 -0400
Commit:     Greg Kroah-Hartman <gregkh@suse.de>
CommitDate: Mon Jul 21 21:54:49 2008 -0700

There is a commit date, and the date the patch was written.  Both are
preserved in git.

And even if it was committed to a branch before 2.6.26 was released, and
then pulled in, that's fine, it's distributed development :)

$ git describe --contains 27f302519148f311307637d4c9a6d0fd87d07e4c
v2.6.27-rc1~866^2~40

showing it first showed up on 2.6.27-rc1.

Anyway, I don't have any systems with such a large number of devices to
test with.  Running git-bisect should narrow the problem down, you can't
just revert this patch as later-on patches relied on it, as you found
out...

Also, what is the output of these files, what exactly is missing?

thanks,

greg k-h
Comment 3 Anonymous Emailer 2008-08-13 17:20:53 UTC
Reply-To: akpm@linux-foundation.org

On Wed, 13 Aug 2008 16:51:12 -0700
Greg KH <greg@kroah.com> wrote:

> On Wed, Aug 13, 2008 at 01:01:58PM -0700, Andrew Morton wrote:
> > > Problem Description: /proc/diskstats does not contain all the block
> devices it
> > > should. /sys/block has all the devices, but /proc/diskstats does not.
> > > 
> > > Steps to reproduce: boot a system with >9 (10?) disk devices (24 block
> > > devices?)
> > 
> > The below would be a prime suspect.
> > 
> > Unfortunately a simple revert results in an uncompilable kernel.
> > 
> > 
> > (It drives me up the wall and across the ceiling how the patch has a
> > commit "date" of three months prior to the 2.6.26 release, however it
> > wasn't present in 2.6.26.  What a dumb feature.  How do I make it stop
> > doing this?  gitk kind of gets it right, but isn't useful across DSL)
> 
> $ git show --pretty=fuller 27f302519148f311307637d4c9a6d0fd87d07e4c

<writes a script>

> commit 27f302519148f311307637d4c9a6d0fd87d07e4c
> Author:     Greg Kroah-Hartman <gregkh@suse.de>
> AuthorDate: Thu May 22 17:21:08 2008 -0400
> Commit:     Greg Kroah-Hartman <gregkh@suse.de>
> CommitDate: Mon Jul 21 21:54:49 2008 -0700
> 
> There is a commit date, and the date the patch was written.  Both are
> preserved in git.
> 
> And even if it was committed to a branch before 2.6.26 was released, and
> then pulled in, that's fine, it's distributed development :)

It's useless.  I have never ever ever ever wanted to know when random
person X committed a patch to some local tree.  The overwhelmingly most
common question is "when did that go into Linux".  Sigh.

> $ git describe --contains 27f302519148f311307637d4c9a6d0fd87d07e4c
> v2.6.27-rc1~866^2~40
> 
> showing it first showed up on 2.6.27-rc1.

Spose that works.  My usual recourse is searching the commits list,
which has useful-to-humans ordering information.

Is the date at which it went into mainline recorded?

> Anyway, I don't have any systems with such a large number of devices to
> test with.

I suppose that partitioning a junk disk with lots of little partitions
will show it.  parted wants to go stupid on me though.

>  Running git-bisect should narrow the problem down, you can't
> just revert this patch as later-on patches relied on it, as you found
> out...
> 
> Also, what is the output of these files, what exactly is missing?
Comment 4 Kay Sievers 2008-08-14 07:37:54 UTC
Hmm, I can not reproduce this here with 2.6.27-rc3-00171-gb635ace:
  $ cat /proc/diskstats
   8   16 sdb 489 1349 13612 8923 0 0 0 0 0 2338 8923
   8   17 sdb1 2 0 4 789 0 0 0 0 0 789 789
   8   21 sdb5 52 230 1176 344 0 0 0 0 0 198 344
   8   22 sdb6 26 108 1072 553 0 0 0 0 0 417 553
   8   23 sdb7 38 108 1168 456 0 0 0 0 0 298 456
   8   24 sdb8 26 108 1072 684 0 0 0 0 0 547 684
   8   25 sdb9 38 108 1168 620 0 0 0 0 0 485 620
   8   26 sdb10 26 108 1072 918 0 0 0 0 0 771 918
   8   27 sdb11 38 108 1168 1090 0 0 0 0 0 959 1090
   8   28 sdb12 26 108 1072 983 0 0 0 0 0 842 983
   8   29 sdb13 38 108 1168 492 0 0 0 0 0 358 492
   8   30 sdb14 26 108 1072 741 0 0 0 0 0 607 741
   8   31 sdb15 38 108 1168 865 0 0 0 0 0 720 865

Care to attach the output of:
  cat /proc/partitions
  cat /proc/diskstats
  ls -l /sys/class/block

Thanks!
Comment 5 Kay Sievers 2008-08-14 08:00:43 UTC
Ah, seems we need disks, not partitions in the "long" list to see it truncated.

If I try:
  modprobe scsi_debug max_luns=25 num_parts=2

I get with:
  $ cat /proc/partitions
  major minor  #blocks  name

   8     0   97685784 sda
   8     1          1 sda1
   8     5   31463239 sda5
   8     6   20972826 sda6
   8     7    9767488 sda7
   8     8    9767488 sda8
   8     9   23438803 sda9
   8    10    2273166 sda10
   9     0    9767424 md0
   8    96       8192 sdg
   8    97       3952 sdg1
   8    98       4224 sdg2
   8   112       8192 sdh
   8   113       3952 sdh1
   8   114       4224 sdh2
   8   128       8192 sdi
   8   129       3952 sdi1
   8   130       4224 sdi2
   8   144       8192 sdj
   8   145       3952 sdj1
   8   146       4224 sdj2
   8   160       8192 sdk
   8   161       3952 sdk1
   8   162       4224 sdk2
   8   176       8192 sdl
   8   177       3952 sdl1
   8   178       4224 sdl2
   8   192       8192 sdm
   8   193       3952 sdm1
   8   194       4224 sdm2
   8   208       8192 sdn
   8   209       3952 sdn1
   8   210       4224 sdn2
   8   224       8192 sdo
   8   225       3952 sdo1
   8   226       4224 sdo2
   8   240       8192 sdp
   8   241       3952 sdp1
   8   242       4224 sdp2

But with:
  $ less /proc/partitions
  major minor  #blocks  name

   8     0   97685784 sda
   8     1          1 sda1
   8     5   31463239 sda5
   8     6   20972826 sda6
   8     7    9767488 sda7
   8     8    9767488 sda8
   8     9   23438803 sda9
   8    10    2273166 sda10
   9     0    9767424 md0
   8    96       8192 sdg
   8    97       3952 sdg1
   8    98       4224 sdg2
   8   112       8192 sdh
   8   113       3952 sdh1
   8   114       4224 sdh2
   8   128       8192 sdi
   8   129       3952 sdi1
   8   130       4224 sdi2
   8   144       8192 sdj
   8   145       3952 sdj1
   8   146       4224 sdj2
   8   160       8192 sdk
   8   161       3952 sdk1
   8   162       4224 sdk2
   8   176       8192 sdl
   8   177       3952 sdl1
   8   178       4224 sdl2
   8   192       8192 sdm
   8   193       3952 sdm1
   8   194       4224 sdm2
   8   208       8192 sdn
   8   209       3952 sdn1
   8   210       4224 sdn2
   8   224       8192 sdo
   8   225       3952 sdo1
   8   226       4224 sdo2
   8   240       8192 sdp
   8   241       3952 sdp1
   8   242       4224 sdp2
  65     0       8192 sdq
  65     1       3952 sdq1
  65     2       4224 sdq2
  65    16       8192 sdr
  65    17       3952 sdr1
  65    18       4224 sdr2
  65    32       8192 sds
  65    33       3952 sds1
  65    34       4224 sds2
  65    48       8192 sdt
  65    49       3952 sdt1
  65    50       4224 sdt2
  65    64       8192 sdu
  65    65       3952 sdu1
  65    66       4224 sdu2
  65    80       8192 sdv
  65    81       3952 sdv1
  65    82       4224 sdv2
  65    96       8192 sdw
  65    97       3952 sdw1
  65    98       4224 sdw2
  65   112       8192 sdx
  65   113       3952 sdx1
  65   114       4224 sdx2
  65   128       8192 sdy
  65   129       3952 sdy1
  65   130       4224 sdy2
  65   144       8192 sdz
  65   145       3952 sdz1
  65   146       4224 sdz2
  65   160       8192 sdaa
  65   161       3952 sdaa1
  65   162       4224 sdaa2
  65   176       8192 sdab
  65   177       3952 sdab1
  65   178       4224 sdab2
  65   192       8192 sdac
  65   193       3952 sdac1
  65   194       4224 sdac2
  65   208       8192 sdad
  65   209       3952 sdad1
  65   210       4224 sdad2
  65   224       8192 sdae
  65   225       3952 sdae1
  65   226       4224 sdae2

stracing "cat" shows:
   8   224       8192 sdo
   8   225       3952 sdo1
   8   226       4224 sdo2
   8   240       8192 sdp
   8   2) = 1024
read(3, "41       3952 sdp1\n   8   242   "..., 1024) = 46
write(1, "41       3952 sdp1\n   8   242   "..., 4641       3952 sdp1
   8   242       4224 sdp2
) = 46
read(3, "", 1024)                       = 0
close(3)                                = 0

Seems we return 0 bytes somewhere in the device list loop, while we still should have data to print.
Comment 6 Rafael J. Wysocki 2008-08-14 08:41:21 UTC
Handled-By : Greg Kroah-Hartman <greg@kroah.com>
Handled-By : Kay Sievers <kay.sievers@vrfy.org>
Comment 7 Kay Sievers 2008-08-14 21:04:33 UTC
Created attachment 17255 [details]
find_start fix

The attached patch seems to work for me. Care to test it? Thanks!
Comment 8 Kay Sievers 2008-08-14 21:46:24 UTC
Created attachment 17256 [details]
find_start fix2

Oops, sorry, 5am here. :) Here's the right patch.
Comment 9 Kay Sievers 2008-08-14 22:05:37 UTC
Created attachment 17257 [details]
find_start3

This fixes /proc/partitions and also /proc/diskstats here.
Comment 10 Rafael J. Wysocki 2008-08-15 06:42:42 UTC
Patch : http://bugzilla.kernel.org/attachment.cgi?id=17257&action=view
Comment 11 Andy Ryan 2008-08-20 08:22:04 UTC
The patch fixes the problem for me.