Bug 13942

Summary: Troubles with AoE and uninitialized object
Product: IO/Storage
Reporter: Rafael J. Wysocki (rjw)
Component: Other
Assignee: io_other
Status: CLOSED CODE_FIX    
Severity: normal
CC: cebbert, ecashin, rm+bko
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31-rc5
Tree: Mainline
Regression: Yes
Bug Depends on:    
Bug Blocks: 13615    
Attachments:
  aoe: allocate unused request queue
  aoe: end barrier bios with EOPNOTSUPP

Description Rafael J. Wysocki 2009-08-09 21:45:06 UTC
Subject    : [2.6.31-rc5 regression] Troubles with AoE and uninitialized object
Submitter  : Bruno Prémont <bonbons@linux-vserver.org>
Date       : 2009-08-04 10:12
References : http://marc.info/?l=linux-kernel&m=124938117104811&w=4

This entry is being used for tracking a regression from 2.6.30.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2009-08-20 21:28:26 UTC
On Thursday 20 August 2009, Bruno Prémont wrote:
> On Wed, 19 August 2009 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.30.  Please verify if it still should be listed and let me
> > know (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13942
> > Subject             : Troubles with AoE and uninitialized object
> > Submitter   : Bruno Prémont <bonbons@linux-vserver.org>
> > Date                : 2009-08-04 10:12 (16 days old)
> > References  :
> > http://marc.info/?l=linux-kernel&m=124938117104811&w=4
> > 
> 
> This one is still valid for 2.6.31-rc6 though I have not yet taken
> the time to attempt bisecting it. I will probably bisect over the
> week-end.
Comment 2 Chuck Ebbert 2009-09-08 03:32:17 UTC
[ 3161.831201] kobject '<NULL>' (dc083948): tried to add an uninitialized object, something is seriously wrong.
[ 3161.831215] Pid: 4, comm: events/0 Tainted: G   M       2.6.31-rc5 #68
[ 3161.831222] Call Trace:
[ 3161.831240]  [<c12b70a2>] ? printk+0x18/0x1e
[ 3161.831276]  [<c10fe62d>] kobject_add+0x4d/0x60
[ 3161.831303]  [<c10f7f30>] ? exact_match+0x0/0x10
[ 3161.831313]  [<c10f46a5>] blk_register_queue+0x45/0xb0
[ 3161.831323]  [<c10f7f30>] ? exact_match+0x0/0x10
[ 3161.831333]  [<c10f89a2>] add_disk+0xe2/0x130
[ 3161.831342]  [<c10f7f30>] ? exact_match+0x0/0x10
[ 3161.831351]  [<c10f8430>] ? exact_lock+0x0/0x20
[ 3161.831364]  [<c11efa93>] aoeblk_gdalloc+0x113/0x170

Looks like the aoe driver embeds a block queue directly in the aoe device instead of dynamically allocating it with blk_alloc_queue(). Doing it dynamically automatically calls kobject_init() for the queue's kobject -- and I can't see that being called anywhere in the aoe code.
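
A userspace sketch of that reasoning may help. It is a toy model, not kernel code: every toy_* name is hypothetical, and the only behaviour it assumes is what the trace and the paragraph above describe, namely that adding a kobject fails when it was never initialized, and that the allocating path initializes it as a side effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-ins for kernel types (hypothetical, for illustration only). */
struct toy_kobject {
    bool state_initialized;
};

struct toy_queue {
    struct toy_kobject kobj;
};

/* Models kobject_init(): marks the object as ready to be added. */
static void toy_kobject_init(struct toy_kobject *k)
{
    assert(k != NULL);
    k->state_initialized = true;
}

/* Models kobject_add(): refuses an object that was never initialized. */
static int toy_kobject_add(const struct toy_kobject *k)
{
    assert(k != NULL);
    if (!k->state_initialized) {
        fprintf(stderr, "tried to add an uninitialized object\n");
        return -1;
    }
    return 0;
}

/* Models blk_alloc_queue(): allocation and kobject init happen together. */
static struct toy_queue *toy_alloc_queue(void)
{
    struct toy_queue *q = calloc(1, sizeof(*q));

    if (q)
        toy_kobject_init(&q->kobj);
    return q;
}

/* Driver variant 1: the queue is embedded in the device structure. */
struct toy_device_embedded {
    struct toy_queue queue; /* zeroed, never passed to toy_kobject_init() */
};

/* Driver variant 2: the device holds a pointer to an allocated queue. */
struct toy_device_allocated {
    struct toy_queue *queue;
};

/* Registers the embedded queue; fails because init was skipped. */
static int toy_register_embedded(void)
{
    struct toy_device_embedded d = { 0 };

    return toy_kobject_add(&d.queue.kobj);
}

/* Registers the allocated queue; succeeds because the allocator ran init. */
static int toy_register_allocated(void)
{
    struct toy_device_allocated d = { .queue = toy_alloc_queue() };
    int ret;

    if (!d.queue)
        return -1;
    ret = toy_kobject_add(&d.queue->kobj);
    free(d.queue);
    return ret;
}
```

Calling toy_register_embedded() from a test program takes the failing path seen in the trace, while toy_register_allocated() does not, which is the difference the attachment title "aoe: allocate unused request queue" points at.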
Comment 3 Chuck Ebbert 2009-09-08 03:39:07 UTC
When attempting to unmount an XFS filesystem residing on the AoE device:
[ 5259.349897] aoe: bi_io_vec is NULL
[ 5259.349940] ------------[ cut here ]------------
[ 5259.349958] kernel BUG at /usr/src/linux-2.6/drivers/block/aoe/aoeblk.c:177!

[ 5259.350941]  [<c10f2a3d>] ? generic_make_request+0x28d/0x360
[ 5259.350956]  [<c101df0e>] ? update_curr+0x12e/0x160
[ 5259.350987]  [<c101f72e>] ? set_next_entity+0x2e/0x70
[ 5259.351003]  [<c12b73c3>] ? schedule+0x203/0x340
[ 5259.351035]  [<c1050b1e>] ? mempool_alloc_slab+0xe/0x10
[ 5259.351047]  [<c1050b1e>] ? mempool_alloc_slab+0xe/0x10
[ 5259.351077]  [<c10f2b52>] ? submit_bio+0x42/0xb0
[ 5259.351089]  [<c12b75a5>] ? _cond_resched+0x25/0x40
[ 5259.351119]  [<c1096a5b>] ? bio_alloc_bioset+0x2b/0xe0
[ 5259.351132]  [<c10f4c74>] ? blkdev_issue_flush+0x74/0xb0
[ 5259.351245]  [<df3eb73d>] ? xfs_blkdev_issue_flush+0xd/0x10 [xfs]

blkdev_issue_flush() does:
        bio = bio_alloc(GFP_KERNEL, 0);
        bio->bi_end_io = bio_end_empty_barrier;
        bio->bi_private = &wait;
        bio->bi_bdev = bdev;
        submit_bio(WRITE_BARRIER, bio);

I think having a null bi_io_vec in the flush request is normal?
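
One way a driver could cope with such an empty flush bio, sketched in userspace C. This is an assumption based only on the title of the second attachment ("aoe: end barrier bios with EOPNOTSUPP"), not a copy of the patch; struct toy_bio and the toy_* functions are hypothetical stand-ins for the kernel's bio handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct bio (hypothetical, for illustration only). */
struct toy_bio {
    void *bi_io_vec; /* NULL for an empty barrier/flush bio */
    int is_barrier;  /* nonzero if submitted as a barrier */
    int error;       /* completion status, valid once completed is set */
    int completed;   /* nonzero after toy_bio_endio() has run */
};

/* Models bio_endio(): completes the bio with the given status. */
static void toy_bio_endio(struct toy_bio *bio, int error)
{
    bio->error = error;
    bio->completed = 1;
}

/*
 * Models a driver's request entry point. Barrier bios are ended with
 * -EOPNOTSUPP before anything looks at bi_io_vec, so a NULL vector in
 * a flush request never reaches the data path.
 * Returns 0 if the bio was accepted for I/O, or a negative errno if it
 * was completed early.
 */
static int toy_make_request(struct toy_bio *bio)
{
    assert(bio != NULL);

    if (bio->is_barrier) {
        toy_bio_endio(bio, -EOPNOTSUPP);
        return -EOPNOTSUPP;
    }
    if (bio->bi_io_vec == NULL) {
        /* A non-barrier bio without data is treated as invalid here. */
        toy_bio_endio(bio, -EINVAL);
        return -EINVAL;
    }
    return 0; /* a real driver would queue the I/O at this point */
}
```

The ordering is the point of the sketch: the barrier check comes first, so the NULL bi_io_vec case that triggered the BUG above is completed with an error instead of being treated as a driver bug.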
Comment 4 Ed Cashin 2009-09-11 12:47:07 UTC
Created attachment 23067 [details]
aoe: allocate unused request queue
Comment 5 Ed Cashin 2009-09-11 12:50:32 UTC
Comment on attachment 23067 [details]
aoe: allocate unused request queue

This patch is now in the mainline git tree.
Comment 6 Ed Cashin 2009-09-11 12:52:40 UTC
Created attachment 23068 [details]
aoe: end barrier bios with EOPNOTSUPP

Jens Axboe intends to push this patch to the mainline.
Comment 7 Rafael J. Wysocki 2009-10-02 21:21:17 UTC
Fixed by commits 18d8217bc441630c3c5ec7416c5a65c69e8a0979 and 7135a71b19be1faf48b7148d77844d03bc0717d6.
Comment 8 Roman Mamedov 2009-10-07 21:32:12 UTC
*** Bug 14343 has been marked as a duplicate of this bug. ***