Bug 12591

Summary: NULL pointer dereference in blk_queue_io_stat
Product: IO/Storage Reporter: Petr Vandrovec (vandrove)
Component: Block LayerAssignee: Jens Axboe (axboe)
Status: CLOSED CODE_FIX    
Severity: normal CC: rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.29-rc3-64-00113-gbc58ba9 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 12398    

Description Petr Vandrovec 2009-01-31 23:58:22 UTC
Latest working kernel version: 2.6.29-rc3-64-00112-g7598909e3ee2a08726276d6415b69dadb52d0d76
Earliest failing kernel version: 2.6.29-rc3-64-00113-gbc58ba9468d94d62c56ab9b47173583ec140b165
Distribution:  Debian
Hardware Environment:  AMD 64bit kernel, 32bit userspace
Software Environment:  hddtemp (happens to run on hdd connected to sata_nv)
Problem Description:

Since commit bc58ba9468d94d62c56ab9b47173583ec140b165 my box crashes during startup when hddtemp tries to start.  It occurs when hddtemp tries to issue INQUIRY request through SG_IO on /dev/sg0.  During completion of request new accounting code crashes because request being completed happens to have non-NULL rq_disk, but that disk happens to have NULL queue.  Backtrace from oopses is:

__end_that_request_first + 0xe1/0x380
end_that_request_data + 0x2c/0x70
blk_end_io + 0x2d/0xb0
blk_end_request + 0xe/0x10
scsi_io_completion + 0x133/0x4e0
scsi_finish_command + 0xac/0xe0
scsi_softirq_done + 0xb8/0140
blk_done_softirq + 0x85/0xa0
__do_softirq + 0x7d/0x140
call_softirq + 0x1c/0x30
do_softirq + 0x65/0xb0
irq_exit + 0x8d/0xb0
do_IRQ + 0xa/0x140

For now I've put this into my tree...

diff --git a/block/blk-core.c b/block/blk-core.c
index ca69f3d..da3c9fa 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -69,7 +69,7 @@ static void drive_stat_acct(struct request *rq, int new_io)
        int rw = rq_data_dir(rq);
        int cpu;
 
-       if (!blk_fs_request(rq) || !disk || !blk_queue_io_stat(disk->queue))
+       if (!blk_fs_request(rq) || !disk || !disk->queue || !blk_queue_io_stat(disk->queue))
                return;
 
        cpu = part_stat_lock();
@@ -1667,7 +1667,7 @@ static void blk_account_io_completion(struct request *req, unsigned int bytes)
 {
        struct gendisk *disk = req->rq_disk;
 
-       if (!disk || !blk_queue_io_stat(disk->queue))
+       if (!disk || !disk->queue || !blk_queue_io_stat(disk->queue))
                return;
 
        if (blk_fs_request(req)) {
@@ -1686,7 +1686,7 @@ static void blk_account_io_done(struct request *req)
 {
        struct gendisk *disk = req->rq_disk;
 
-       if (!disk || !blk_queue_io_stat(disk->queue))
+       if (!disk || !disk->queue || !blk_queue_io_stat(disk->queue))
                return;
 
        /*


Steps to reproduce:

Run 'hddtemp /dev/sg0'
Comment 1 Petr Vandrovec 2009-02-01 01:18:48 UTC
> Software Environment:  hddtemp (happens to run on hdd connected to sata_nv)

Actually problematic device is DVD drive without any media inside, not harddisk.  I guess that explains why it has no queue...
Comment 2 Jens Axboe 2009-02-01 08:57:24 UTC
It'll still have a persistent queue, even without media. So a little odd, I'll take a look and fix it tomorrow.
Comment 3 Rafael J. Wysocki 2009-02-01 14:36:27 UTC
Handled-By : Jens Axboe <jens.axboe@oracle.com>

Caused by:

commit bc58ba9468d94d62c56ab9b47173583ec140b165
Author: Jens Axboe <jens.axboe@oracle.com>
Date:   Fri Jan 23 10:54:44 2009 +0100

    block: add sysfs file for controlling io stats accounting

    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

First-Bad-Commit : bc58ba9468d94d62c56ab9b47173583ec140b165
Comment 4 Rafael J. Wysocki 2009-02-04 17:14:52 UTC
On Wednesday 04 February 2009, Jens Axboe wrote:
> On Wed, Feb 04 2009, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.28.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=12591
> > Subject             : NULL pointer dereference in blk_queue_io_stat
> > Submitter   : Petr Vandrovec <vandrove@vc.cvut.cz>
> > Date                : 2009-01-31 23:58 (5 days old)
> > First-Bad-Commit:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=bc58ba9468d94d62c56ab9b47173583ec140b165
> > Handled-By  : Jens Axboe <jens.axboe@oracle.com>
> 
> This one is already fixed in -git by commit
> fb8ec18c316d869271137c97320dbfd2def56569.