Subject : [2.6.31-rc5 regression] Troubles with AoE and uninitialized object
Submitter : Bruno Prémont <bonbons@linux-vserver.org>
Date : 2009-08-04 10:12
References : http://marc.info/?l=linux-kernel&m=124938117104811&w=4

This entry is being used for tracking a regression from 2.6.30. Please don't close it until the problem is fixed in the mainline.
On Thursday 20 August 2009, Bruno Prémont wrote:
> On Wed, 19 August 2009 "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.30. Please verify if it still should be listed and let me
> > know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13942
> > Subject : Troubles with AoE and uninitialized object
> > Submitter : Bruno Prémont <bonbons@linux-vserver.org>
> > Date : 2009-08-04 10:12 (16 days old)
> > References :
> > http://marc.info/?l=linux-kernel&m=124938117104811&w=4
> >
>
> This one is still valid for 2.6.31-rc6 though I have not yet taken
> the time to attempt bisecting it. I will probably bisect over the
> week-end.
[ 3161.831201] kobject '<NULL>' (dc083948): tried to add an uninitialized object, something is seriously wrong.
[ 3161.831215] Pid: 4, comm: events/0 Tainted: G M 2.6.31-rc5 #68
[ 3161.831222] Call Trace:
[ 3161.831240] [<c12b70a2>] ? printk+0x18/0x1e
[ 3161.831276] [<c10fe62d>] kobject_add+0x4d/0x60
[ 3161.831303] [<c10f7f30>] ? exact_match+0x0/0x10
[ 3161.831313] [<c10f46a5>] blk_register_queue+0x45/0xb0
[ 3161.831323] [<c10f7f30>] ? exact_match+0x0/0x10
[ 3161.831333] [<c10f89a2>] add_disk+0xe2/0x130
[ 3161.831342] [<c10f7f30>] ? exact_match+0x0/0x10
[ 3161.831351] [<c10f8430>] ? exact_lock+0x0/0x20
[ 3161.831364] [<c11efa93>] aoeblk_gdalloc+0x113/0x170

Looks like the aoe driver embeds a block queue directly in the aoe device instead of dynamically allocating it with blk_alloc_queue(). Doing it dynamically automatically calls kobject_init() for the queue's kobject -- and I can't see that being called anywhere in the aoe code.
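To make the difference between the two paths concrete, here is a minimal userspace sketch in C. It is not kernel code: every toy_* name is hypothetical, the state_initialized flag only models the check that produces the "uninitialized object" message, and the -EINVAL return value is an assumption of the model rather than something shown in the trace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel types (hypothetical, userspace only). */
struct toy_kobject {
    bool state_initialized;      /* set only by toy_kobject_init() */
};

struct toy_queue {
    struct toy_kobject kobj;
};

/* Embedded layout: the queue lives inside the device structure. */
struct toy_aoedev {
    struct toy_queue blkq;
};

void toy_kobject_init(struct toy_kobject *kobj)
{
    assert(kobj != NULL);
    kobj->state_initialized = true;
}

/* Models the sanity check: adding an uninitialized object is refused. */
int toy_kobject_add(struct toy_kobject *kobj)
{
    if (!kobj->state_initialized)
        return -EINVAL;          /* assumed error code for the model */
    return 0;
}

/* Dynamic path: the allocator initializes the embedded kobject itself. */
struct toy_queue *toy_alloc_queue(void)
{
    struct toy_queue *q = calloc(1, sizeof(*q));

    if (q != NULL)
        toy_kobject_init(&q->kobj);
    return q;
}
```

In this model, a zeroed struct toy_aoedev passes a kobject that nobody initialized, so toy_kobject_add() on its blkq.kobj fails, while the same call on a queue returned by toy_alloc_queue() succeeds.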
When attempting to unmount XFS filesystem lying on the AoE device:

[ 5259.349897] aoe: bi_io_vec is NULL
[ 5259.349940] ------------[ cut here ]------------
[ 5259.349958] kernel BUG at /usr/src/linux-2.6/drivers/block/aoe/aoeblk.c:177!
[ 5259.350941] [<c10f2a3d>] ? generic_make_request+0x28d/0x360
[ 5259.350956] [<c101df0e>] ? update_curr+0x12e/0x160
[ 5259.350987] [<c101f72e>] ? set_next_entity+0x2e/0x70
[ 5259.351003] [<c12b73c3>] ? schedule+0x203/0x340
[ 5259.351035] [<c1050b1e>] ? mempool_alloc_slab+0xe/0x10
[ 5259.351047] [<c1050b1e>] ? mempool_alloc_slab+0xe/0x10
[ 5259.351077] [<c10f2b52>] ? submit_bio+0x42/0xb0
[ 5259.351089] [<c12b75a5>] ? _cond_resched+0x25/0x40
[ 5259.351119] [<c1096a5b>] ? bio_alloc_bioset+0x2b/0xe0
[ 5259.351132] [<c10f4c74>] ? blkdev_issue_flush+0x74/0xb0
[ 5259.351245] [<df3eb73d>] ? xfs_blkdev_issue_flush+0xd/0x10 [xfs]

blkdev_issue_flush() does:

	bio = bio_alloc(GFP_KERNEL, 0);
	bio->bi_end_io = bio_end_empty_barrier;
	bio->bi_private = &wait;
	bio->bi_bdev = bdev;
	submit_bio(WRITE_BARRIER, bio);

I think having a null bi_io_vec in the flush request is normal?
Created attachment 23067: aoe: allocate unused request queue
Comment on attachment 23067: aoe: allocate unused request queue
This patch is now in the mainline git tree.
Created attachment 23068: aoe: end barrier bios with EOPNOTSUPP
Jens Axboe intends to push this patch to the mainline.
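Going only by the attachment title, the idea can be sketched as a userspace model in C: a request handler that completes barrier bios with -EOPNOTSUPP before it ever looks at the (possibly NULL) I/O vector. This is an illustration, not the attached patch. All toy_* names and fields are hypothetical, and the -EINVAL returned for a non-barrier bio without data is an assumption standing in for the BUG seen in the trace.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures (hypothetical, userspace only). */
struct toy_bio_vec {
    void *page;
    unsigned int len;
};

struct toy_bio {
    bool barrier;                /* models a WRITE_BARRIER submission */
    struct toy_bio_vec *io_vec;  /* NULL when the bio carries no data */
    int error;                   /* completion status seen by the submitter */
    bool done;                   /* true once the bio has been completed */
};

void toy_bio_endio(struct toy_bio *bio, int error)
{
    bio->error = error;
    bio->done = true;
}

/* Returns 0 once the bio has been consumed, whatever its completion status. */
int toy_make_request(struct toy_bio *bio)
{
    assert(bio != NULL);

    /* Reject barriers first, so an empty flush bio never reaches the
     * io_vec check below. */
    if (bio->barrier) {
        toy_bio_endio(bio, -EOPNOTSUPP);
        return 0;
    }

    /* The trace shows the driver hitting BUG at this point; the model
     * reports an error instead so it stays testable (assumption). */
    if (bio->io_vec == NULL) {
        toy_bio_endio(bio, -EINVAL);
        return 0;
    }

    /* A real driver would queue the data here; the model completes it. */
    toy_bio_endio(bio, 0);
    return 0;
}
```

With this ordering, the empty bio built by a flush (barrier set, no I/O vector) completes with -EOPNOTSUPP, which lets the submitter fall back to not using barriers, while ordinary data bios take the normal path.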
Fixed by commits 18d8217bc441630c3c5ec7416c5a65c69e8a0979 and 7135a71b19be1faf48b7148d77844d03bc0717d6.
*** Bug 14343 has been marked as a duplicate of this bug. ***