Bug 29302 - Null pointer dereference with large max_sectors_kb
Summary: Null pointer dereference with large max_sectors_kb
Status: RESOLVED OBSOLETE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Jens Axboe
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-17 13:20 UTC by Tomas Zvala
Modified: 2013-12-11 12:28 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.36 - 2.6.38-rc5
Tree: Mainline
Regression: No


Attachments
dmesg with error displayed. (83.75 KB, text/plain)
2011-02-17 13:20 UTC, Tomas Zvala
Details
kernel config (53.36 KB, text/plain)
2011-02-17 13:21 UTC, Tomas Zvala
Details

Description Tomas Zvala 2011-02-17 13:20:18 UTC
Created attachment 48132 [details]
dmesg with error displayed.

Hello, 
I'm not really sure I identified the problem product/component correctly, but this seems most appropriate.


[140517]stg-bot ~ # echo 256 >  /sys/block/sdb/queue/max_sectors_kb
[140523]stg-bot ~ # mkfs.btrfs /dev/sdb

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdb
        nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
Btrfs Btrfs v0.19
[140532]stg-bot ~ # mount /dev/sdb /mnt
[140540]stg-bot ~ # umount /mnt
[140543]stg-bot ~ # cat /sys/block/sdb/queue/max_hw_sectors_kb > /sys/block/sdb/queue/max_sectors_kb
[140710]stg-bot ~ # mkfs.btrfs /dev/sdb

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdb
        nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
Btrfs Btrfs v0.19
[140713]stg-bot ~ # mount /dev/sdb /mnt
Killed
[140715]stg-bot ~ #

Now there is a bug in dmesg (output attached) and another attempt to mount the device kind of freezes it. The mount blocks, sync blocks, but i can read/write the device using dd. And if I, instead of trying to mount again, zero out first 1MB, mkfs.btrfs and mount, I get the bug again. Freeze again on second mount attempt after that. 

This happens on 2.6.36 and 2.6.37 with scst patches, 2.6.37 vanilla and on 2.6.38-rc5 it just freezes the first time I try to mount. No outuput in dmesg.

The hardware is Dual Xeon E5620, 12GB ram, Areca 1880 with 3 arrays (testing on 3TB raid10 and 10TB raid6), SuperMicro X8DTU-F.

If I left out any important info, please let me know ;).
Comment 1 Tomas Zvala 2011-02-17 13:21:51 UTC
Created attachment 48142 [details]
kernel config
Comment 2 Andrew Morton 2011-02-17 23:37:48 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 17 Feb 2011 13:20:20 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=29302
> 
>            Summary: Null pointer dereference with large max_sectors_kb
>            Product: IO/Storage
>            Version: 2.5
>     Kernel Version: 2.6.36 - 2.6.38-rc5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Block Layer
>         AssignedTo: axboe@kernel.dk
>         ReportedBy: fox@murder.cz
>         Regression: No
> 
> 
> Created an attachment (id=48132)
>  --> (https://bugzilla.kernel.org/attachment.cgi?id=48132)
> dmesg with error displayed.
> 
> Hello, 
> I'm not really sure I identified the problem product/component correctly, but
> this seems most appropriate.
> 
> 
> [140517]stg-bot ~ # echo 256 >  /sys/block/sdb/queue/max_sectors_kb
> [140523]stg-bot ~ # mkfs.btrfs /dev/sdb
> 
> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
> 
> fs created label (null) on /dev/sdb
>         nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
> Btrfs Btrfs v0.19
> [140532]stg-bot ~ # mount /dev/sdb /mnt
> [140540]stg-bot ~ # umount /mnt
> [140543]stg-bot ~ # cat /sys/block/sdb/queue/max_hw_sectors_kb >
> /sys/block/sdb/queue/max_sectors_kb
> [140710]stg-bot ~ # mkfs.btrfs /dev/sdb
> 
> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
> 
> fs created label (null) on /dev/sdb
>         nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
> Btrfs Btrfs v0.19
> [140713]stg-bot ~ # mount /dev/sdb /mnt
> Killed
> [140715]stg-bot ~ #
> 
> Now there is a bug in dmesg (output attached) and another attempt to mount
> the
> device kind of freezes it. The mount blocks, sync blocks, but i can
> read/write
> the device using dd. And if I, instead of trying to mount again, zero out
> first
> 1MB, mkfs.btrfs and mount, I get the bug again. Freeze again on second mount
> attempt after that. 
> 
> This happens on 2.6.36 and 2.6.37 with scst patches, 2.6.37 vanilla and on
> 2.6.38-rc5 it just freezes the first time I try to mount. No outuput in
> dmesg.
> 
> The hardware is Dual Xeon E5620, 12GB ram, Areca 1880 with 3 arrays (testing
> on
> 3TB raid10 and 10TB raid6), SuperMicro X8DTU-F.
> 
> If I left out any important info, please let me know ;).
> 

A btrfs bug, I suspect.


> [  605.109630] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000010
> [  605.109928] IP: [<ffffffff81100a7a>] bio_add_page+0xa/0x40
> [  605.110089] PGD 277d70067 PUD 277e0a067 PMD 0 
> [  605.110247] Oops: 0000 [#1] SMP 
> [  605.110394] last sysfs file:
> /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
> [  605.110686] CPU 6 
> [  605.110698] Modules linked in: ip6table_filter ip6_tables nf_nat_tftp
> nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp
> nf_nat_amanda nf_conntrack_amanda nf_conntrack_tftp nf_conntrack_sip
> nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre
> nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc
> nf_conntrack_h323 nf_conntrack_ftp xt_physdev xt_hashlimit nfs ib_iser
> libiscsi scsi_transport_iscsi ib_ucm ib_ipoib rdma_ucm rdma_cm ib_cm iw_cm
> ib_sa ib_addr ib_uverbs ib_umad mlx4_ib ib_mthca ib_mad ib_core i7core_edac
> edac_core mlx4_core iTCO_wdt iTCO_vendor_support
> [  605.112285] 
> [  605.112419] Pid: 16666, comm: mount Not tainted 2.6.37stg #6 X8DTU/X8DTU
> [  605.112586] RIP: 0010:[<ffffffff81100a7a>]  [<ffffffff81100a7a>]
> bio_add_page+0xa/0x40
> [  605.112879] RSP: 0000:ffff8801833b39b8  EFLAGS: 00010296
> [  605.113035] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> [  605.113207] RDX: 0000000000001000 RSI: ffffea000c3cd200 RDI:
> 0000000000000000
> [  605.113382] RBP: ffff8801833b3ba0 R08: 0000000000000000 R09:
> 0000000000000000
> [  605.113554] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> [  605.113723] R13: 0000000000000000 R14: 000000000000a000 R15:
> ffff88024a19ab98
> [  605.113895] FS:  00007fbcfd971740(0000) GS:ffff880339c80000(0000)
> knlGS:0000000000000000
> [  605.114188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  605.114352] CR2: 0000000000000010 CR3: 00000001c17d5000 CR4:
> 00000000000006e0
> [  605.114525] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  605.114695] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  605.114864] Process mount (pid: 16666, threadinfo ffff8801833b2000, task
> ffff8801b8b48cf0)
> [  605.115157] Stack:
> [  605.115290]  0000000000000000 ffffffff81251384 0000000001400000
> ffffea000c3cd200
> [  605.115590]  0000000000000000 000000004a19ab88 ffff8801b966f380
> 0000100000000000
> [  605.115884]  ffffffff81255810 0000000000000000 0000000000000002
> 0000000001400000
> [  605.116180] Call Trace:
> [  605.116320]  [<ffffffff81251384>] ? submit_extent_page+0x164/0x280
> [  605.116488]  [<ffffffff81255810>] ? end_bio_extent_readpage+0x0/0x210
> [  605.116654]  [<ffffffff81257241>] ? __extent_read_full_page+0x4e1/0x680
> [  605.116820]  [<ffffffff81255810>] ? end_bio_extent_readpage+0x0/0x210
> [  605.116990]  [<ffffffff8122c260>] ? btree_get_extent+0x0/0x1e0
> [  605.117151]  [<ffffffff81257660>] ? read_extent_buffer_pages+0x280/0x3c0
> [  605.117320]  [<ffffffff812d77ec>] ? radix_tree_insert+0x1bc/0x210
> [  605.117488]  [<ffffffff8122c260>] ? btree_get_extent+0x0/0x1e0
> [  605.117651]  [<ffffffff8122e945>] ?
> btree_read_extent_buffer_pages+0x55/0xb0
> [  605.117820]  [<ffffffff8122ea05>] ? read_tree_block+0x35/0x60
> [  605.117980]  [<ffffffff8122ffc2>] ? open_ctree+0xd22/0x1440
> [  605.118140]  [<ffffffff812118f0>] ? btrfs_set_super+0x0/0x20
> [  605.118300]  [<ffffffff81212302>] ? btrfs_mount+0x372/0x4e0
> [  605.118465]  [<ffffffff810d7c85>] ? vfs_kern_mount+0x75/0x1b0
> [  605.118627]  [<ffffffff810ee19e>] ? get_fs_type+0x3e/0xd0
> [  605.118783]  [<ffffffff810d7e33>] ? do_kern_mount+0x53/0x130
> [  605.118942]  [<ffffffff810f15b9>] ? do_mount+0x2d9/0x840
> [  605.119100]  [<ffffffff810ab7eb>] ? memdup_user+0x3b/0x80
> [  605.119257]  [<ffffffff810f1bba>] ? sys_mount+0x9a/0x100
> [  605.119417]  [<ffffffff81002d7b>] ? system_call_fastpath+0x16/0x1b
> [  605.119579] Code: ff ff ff 44 29 e2 31 c0 41 89 57 08 e9 7b fe ff ff 48 83
> 63 18 f7 e9 44 ff ff ff 66 0f 1f 44 00 00 48 83 ec 08 48 89 f8 41 89 c8 <48>
> 8b 4f 10 48 8b 89 98 00 00 00 48 8b b9 f0 01 00 00 89 d1 44 
> [  605.120217] RIP  [<ffffffff81100a7a>] bio_add_page+0xa/0x40
> [  605.120384]  RSP <ffff8801833b39b8>
> [  605.120527] CR2: 0000000000000010
> [  605.121058] ---[ end trace a5eba365422d1ba8 ]---
Comment 3 Anonymous Emailer 2011-02-18 19:05:51 UTC
Reply-To: dave@jikos.cz

On Thu, Feb 17, 2011 at 03:37:19PM -0800, Andrew Morton wrote:
> > [  605.109630] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000010
> > [  605.109928] IP: [<ffffffff81100a7a>] bio_add_page+0xa/0x40

disassembly of the crashpoint obtained from Code: bytes below:

0:   48 8b 4f 10             mov    0x10(%rdi),%rcx
4:   48 8b 89 98 00 00 00    mov    0x98(%rcx),%rcx
b:   48 8b b9 f0 01 00 00    mov    0x1f0(%rcx),%rdi
12:  89 d1                   mov    %edx,%ecx

line 0, accessing some struct member at offset 0x10

struct bio {
        sector_t                bi_sector;      /* device address in 512 byte
                                                   sectors */
        struct bio              *bi_next;       /* request queue link */
        struct block_device     *bi_bdev;
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

so this is zero in bio_add_page:

int bio_add_page(struct bio *bio, struct page *page, unsigned int len,
                 unsigned int offset)
{
        struct request_queue *q = bdev_get_queue(bio->bi_bdev);
        return __bio_add_page(q, bio, page, len, offset, queue_max_sectors(q));
}

static inline struct request_queue *bdev_get_queue(struct block_device *bdev)
{
	return bdev->bd_disk->queue;
}

just to verify following asm instructions (lines 4 and b):
4: dereference something at offset 0x98, roughly matches bd_disk
b: then dereference something at 0x1f0, rougly matches queue

So, bio->bi_bdev == NULL, going backwards in call stack to
__extent_read_full_page(), where the bdev is set:
2050                 bdev = em->bdev;

em = get_extent(...), indirect call of btree_get_extent()
passed from btree_read_extent_buffer_pages() to read_extent_buffer_pages()

165         em->bdev = BTRFS_I(inode)->root->fs_info->fs_devices->latest_bdev;

(or at line 149, but I do not think that there could be any succesful lookup
 147         em = lookup_extent_mapping(em_tree, start, len);
be it either way, the assigned value is exactly the same
)

Now let's hunt why latest_bdev is NULL. Startin in btrfs_mount, where fs_devices
is being filled in.

fs/btrfs/volumes.c:__btrfs_open_devices:
579         struct block_device *latest_bdev = NULL;
...
583         u64 latest_transid = 0;
...
590         list_for_each_entry(device, head, dev_list) {
591                 if (device->bdev)
592                         continue;
593                 if (!device->name)
594                         continue;

...
618                 device->generation = btrfs_super_generation(disk_super);
619                 if (!latest_transid || device->generation > latest_transid) {
620                         latest_devid = devid;
621                         latest_transid = device->generation;
622                         latest_bdev = bdev;
623                 }
...
653         }
...
660         fs_devices->latest_bdev = latest_bdev;

Line 591 or 593 will not stop the loop, device is not filled yet.

Could be, that any of the devices from 'head' list does not satisfy condition
on line 619, but it cannot be the first device, as !latest_transid would
work. There is only one device, /dev/sdb, latest_bdev is set and later on
set on line 660.

/me sees no more options

Meanwhile I've tried it myself and the error does not happen here, with
head at 795abaf1e4e185 (.38-rc4-178-g795abaf). I'll try it with latest -rc5.




> > [  605.110089] PGD 277d70067 PUD 277e0a067 PMD 0 
> > [  605.110247] Oops: 0000 [#1] SMP 
> > [  605.110394] last sysfs file:
> /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
> > [  605.110686] CPU 6 
> > [  605.110698] Modules linked in: ip6table_filter ip6_tables nf_nat_tftp
> nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp
> nf_nat_amanda nf_conntrack_amanda nf_conntrack_tftp nf_conntrack_sip
> nf_conntrack_proto_sctp nf_conntrack_pptp nf_conntrack_proto_gre
> nf_conntrack_netlink nf_conntrack_netbios_ns nf_conntrack_irc
> nf_conntrack_h323 nf_conntrack_ftp xt_physdev xt_hashlimit nfs ib_iser
> libiscsi scsi_transport_iscsi ib_ucm ib_ipoib rdma_ucm rdma_cm ib_cm iw_cm
> ib_sa ib_addr ib_uverbs ib_umad mlx4_ib ib_mthca ib_mad ib_core i7core_edac
> edac_core mlx4_core iTCO_wdt iTCO_vendor_support
> > [  605.112285] 
> > [  605.112419] Pid: 16666, comm: mount Not tainted 2.6.37stg #6 X8DTU/X8DTU
> > [  605.112586] RIP: 0010:[<ffffffff81100a7a>]  [<ffffffff81100a7a>]
> bio_add_page+0xa/0x40
> > [  605.112879] RSP: 0000:ffff8801833b39b8  EFLAGS: 00010296
> > [  605.113035] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> 0000000000000000
> > [  605.113207] RDX: 0000000000001000 RSI: ffffea000c3cd200 RDI:
> 0000000000000000
> > [  605.113382] RBP: ffff8801833b3ba0 R08: 0000000000000000 R09:
> 0000000000000000
> > [  605.113554] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000000
> > [  605.113723] R13: 0000000000000000 R14: 000000000000a000 R15:
> ffff88024a19ab98
> > [  605.113895] FS:  00007fbcfd971740(0000) GS:ffff880339c80000(0000)
> knlGS:0000000000000000
> > [  605.114188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  605.114352] CR2: 0000000000000010 CR3: 00000001c17d5000 CR4:
> 00000000000006e0
> > [  605.114525] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> > [  605.114695] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> > [  605.114864] Process mount (pid: 16666, threadinfo ffff8801833b2000, task
> ffff8801b8b48cf0)
> > [  605.115157] Stack:
> > [  605.115290]  0000000000000000 ffffffff81251384 0000000001400000
> ffffea000c3cd200
> > [  605.115590]  0000000000000000 000000004a19ab88 ffff8801b966f380
> 0000100000000000
> > [  605.115884]  ffffffff81255810 0000000000000000 0000000000000002
> 0000000001400000
> > [  605.116180] Call Trace:
> > [  605.116320]  [<ffffffff81251384>] ? submit_extent_page+0x164/0x280
> > [  605.116488]  [<ffffffff81255810>] ? end_bio_extent_readpage+0x0/0x210
> > [  605.116654]  [<ffffffff81257241>] ? __extent_read_full_page+0x4e1/0x680
> > [  605.116820]  [<ffffffff81255810>] ? end_bio_extent_readpage+0x0/0x210
> > [  605.116990]  [<ffffffff8122c260>] ? btree_get_extent+0x0/0x1e0
> > [  605.117151]  [<ffffffff81257660>] ? read_extent_buffer_pages+0x280/0x3c0
> > [  605.117320]  [<ffffffff812d77ec>] ? radix_tree_insert+0x1bc/0x210
> > [  605.117488]  [<ffffffff8122c260>] ? btree_get_extent+0x0/0x1e0
> > [  605.117651]  [<ffffffff8122e945>] ?
> btree_read_extent_buffer_pages+0x55/0xb0
> > [  605.117820]  [<ffffffff8122ea05>] ? read_tree_block+0x35/0x60
> > [  605.117980]  [<ffffffff8122ffc2>] ? open_ctree+0xd22/0x1440
> > [  605.118140]  [<ffffffff812118f0>] ? btrfs_set_super+0x0/0x20
> > [  605.118300]  [<ffffffff81212302>] ? btrfs_mount+0x372/0x4e0
> > [  605.118465]  [<ffffffff810d7c85>] ? vfs_kern_mount+0x75/0x1b0
> > [  605.118627]  [<ffffffff810ee19e>] ? get_fs_type+0x3e/0xd0
> > [  605.118783]  [<ffffffff810d7e33>] ? do_kern_mount+0x53/0x130
> > [  605.118942]  [<ffffffff810f15b9>] ? do_mount+0x2d9/0x840
> > [  605.119100]  [<ffffffff810ab7eb>] ? memdup_user+0x3b/0x80
> > [  605.119257]  [<ffffffff810f1bba>] ? sys_mount+0x9a/0x100
> > [  605.119417]  [<ffffffff81002d7b>] ? system_call_fastpath+0x16/0x1b
> > [  605.119579] Code: ff ff ff 44 29 e2 31 c0 41 89 57 08 e9 7b fe ff ff 48
> 83 63 18 f7 e9 44 ff ff ff 66 0f 1f 44 00 00 48 83 ec 08 48 89 f8 41 89 c8
> <48> 8b 4f 10 48 8b 89 98 00 00 00 48 8b b9 f0 01 00 00 89 d1 44 
> > [  605.120217] RIP  [<ffffffff81100a7a>] bio_add_page+0xa/0x40
> > [  605.120384]  RSP <ffff8801833b39b8>
> > [  605.120527] CR2: 0000000000000010
> > [  605.121058] ---[ end trace a5eba365422d1ba8 ]---
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Comment 4 Chris Mason 2011-02-18 19:16:46 UTC
Excerpts from David Sterba's message of 2011-02-18 12:43:15 -0500:

[ great analysis, thanks so much ]

> 
> Could be, that any of the devices from 'head' list does not satisfy condition
> on line 619, but it cannot be the first device, as !latest_transid would
> work. There is only one device, /dev/sdb, latest_bdev is set and later on
> set on line 660.
> 
> /me sees no more options
> 
> Meanwhile I've tried it myself and the error does not happen here, with
> head at 795abaf1e4e185 (.38-rc4-178-g795abaf). I'll try it with latest -rc5.
> 
> > > [  605.110089] PGD 277d70067 PUD 277e0a067 PMD 0 
> > > [  605.110247] Oops: 0000 [#1] SMP 

Are there any more kernel messages involved before the oops starts?  It
really feels like we're failing to open the block device somehow inside
the btrfs scanning code.

-chris
Comment 5 Andrew Morton 2011-02-18 19:57:03 UTC
On Fri, 18 Feb 2011 14:16:12 -0500
Chris Mason <chris.mason@oracle.com> wrote:

> Are there any more kernel messages involved before the oops starts? 

The full dmesg is in bugzilla. https://bugzilla.kernel.org/show_bug.cgi?id=29302
Comment 6 Chris Mason 2011-02-18 20:21:11 UTC
Excerpts from Andrew Morton's message of 2011-02-18 14:56:03 -0500:
> On Fri, 18 Feb 2011 14:16:12 -0500
> Chris Mason <chris.mason@oracle.com> wrote:
> 
> > Are there any more kernel messages involved before the oops starts? 
> 
> The full dmesg is in bugzilla.
> https://bugzilla.kernel.org/show_bug.cgi?id=29302

Ok, so it isn't part of the open devices code that prints errors, my
guess is we're failing to read a good super.

Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want
to make sure things are really getting written.

Here's a patch that makes sure we have a good bdev after scanning,
hopefully it will let us debug things without your box going boom.

-chris

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d39a989..d8f7ee0 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -757,6 +757,10 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
 	tree_root->fs_info = fs_info;
 
 	bdev = fs_devices->latest_bdev;
+	if (!bdev) {
+		printk(KERN_CRIT "btrfs read devices failed on %s\n", dev_name);
+		goto error_close_devices;
+	}
 	s = sget(fs_type, btrfs_test_super, btrfs_set_super, tree_root);
 	if (IS_ERR(s))
 		goto error_s;
Comment 7 Anonymous Emailer 2011-02-20 12:00:34 UTC
Reply-To: tomas@zvala.cz

On 18.2.2011 21:18, Chris Mason wrote:
> Ok, so it isn't part of the open devices code that prints errors, my
> guess is we're failing to read a good super.
>
> Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want
> to make sure things are really getting written.
>
> Here's a patch that makes sure we have a good bdev after scanning,
> hopefully it will let us debug things without your box going boom.
>
> -chris
>
>
Hello Chris,
I patched the kernel, recompiled and installed, but I don't see the 
message in the kernel log.  I forgot to mention before, that 
max_hw_sectors_kb and thus max_sectors_kb is 4096.

Linux stg-top 2.6.37stg #7 SMP Sun Feb 20 10:48:57 CET 2011 x86_64 
Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux


With default max_sectors_kb = 512
mkfs, sync, btrfsck going through fine
first mount is fine

[112005]stg-top ~ # cat /sys/block/sdb/queue/max_sectors_kb
512
[112012]stg-top ~ # mkfs.btrfs /dev/sdb

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdb
         nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
Btrfs Btrfs v0.19
[112045]stg-top ~ # sync
[112046]stg-top ~ # btrfsck /dev/sdb
found 28672 bytes used err is 0
total csum bytes: 0
total tree bytes: 28672
total fs tree bytes: 8192
btree space waste bytes: 23875
file data blocks allocated: 0
  referenced 0
Btrfs Btrfs v0.19
[112050]stg-top ~ # mount /dev/sdb /mnt/
[112054]stg-top ~ # df /mnt/
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb             2929685760        56 2927559936   1% /mnt
[112057]stg-top ~ # umount /mnt


With max_sectors_kb = 4096
mkfs, sync, btrfsck is fine
second mount  BUGs out

[112100]stg-top ~ # echo 4096 > /sys/block/sdb/queue/max_sectors_kb
[112107]stg-top ~ # mkfs.btrfs /dev/sdb

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdb
         nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
Btrfs Btrfs v0.19
[112113]stg-top ~ # sync
[112117]stg-top ~ # btrfsck /dev/sdb
found 28672 bytes used err is 0
total csum bytes: 0
total tree bytes: 28672
total fs tree bytes: 8192
btree space waste bytes: 23875
file data blocks allocated: 0
  referenced 0
Btrfs Btrfs v0.19
[112123]stg-top ~ # mount /dev/sdb /mnt/
Killed

Back to max_sectors_kb = 512
mkfs, can't sync, because that'd block, btrfsck is fine
third mount fails

[112126]stg-top ~ # echo 512 > /sys/block/sdb/queue/max_sectors_kb
[112141]stg-top ~ # mkfs.btrfs /dev/sdb

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdb
         nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
Btrfs Btrfs v0.19
[112144]stg-top ~ # btrfs
btrfs          btrfs-bcp      btrfs-convert  btrfs-image    
btrfs-show     btrfs-vol      btrfsck        btrfsctl       btrfstune
[112144]stg-top ~ # btrfsck /dev/sdb
found 28672 bytes used err is 0
total csum bytes: 0
total tree bytes: 28672
total fs tree bytes: 8192
btree space waste bytes: 23875
file data blocks allocated: 0
  referenced 0
Btrfs Btrfs v0.19
[112150]stg-top ~ # mount /dev/sdb /mnt
mount: wrong fs type, bad option, bad superblock on /dev/sdb,
        missing codepage or helper program, or other error
        In some cases useful info is found in syslog - try
        dmesg | tail  or so

Run sync in background (it just sits there) and try to mount again (no 
other commands in meantime)

[112155]stg-top ~ # sync &
[1] 15800
[112202]stg-top ~ # mount /dev/sdb /mnt
[112411]stg-top ~ # df /mnt/
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb             2929685760        56 2927559936   1% /mnt
[112456]stg-top ~ #

(dmesg after 1st mount)
[  122.085837] device fsid 8d440bbadcd5b357-9e8f05f44e926498 devid 1 
transid 7 /dev/sdb

(dmesg after 2nd mount is pretty much the same as before - no mention of 
the patch)

(dmesg after 3rd mount)
[  183.535484] device fsid 945db83dbf98e60-a27fadce8932f892 devid 1 
transid 7 /dev/sdb
[  183.536785] btrfs bad fsid on block 20971520
[  183.537041] btrfs bad fsid on block 20971520
[  183.537268] btrfs bad fsid on block 20971520
[  183.537469] btrfs: failed to read chunk root on sdb
[  183.545654] btrfs: open_ctree failed

(dmesg after 4th mount)
[  192.481985] device fsid 945db83dbf98e60-a27fadce8932f892 devid 1 
transid 7 /dev/sdb


Tomas
Comment 8 Chris Mason 2011-02-21 15:09:15 UTC
Excerpts from Tomas Zvala's message of 2011-02-20 06:06:42 -0500:
> On 18.2.2011 21:18, Chris Mason wrote:
> > Ok, so it isn't part of the open devices code that prints errors, my
> > guess is we're failing to read a good super.
> >
> > Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want
> > to make sure things are really getting written.
> >
> > Here's a patch that makes sure we have a good bdev after scanning,
> > hopefully it will let us debug things without your box going boom.
> >
> > -chris
> >
> >
> Hello Chris,
> I patched the kernel, recompiled and installed, but I don't see the 
> message in the kernel log.  I forgot to mention before, that 
> max_hw_sectors_kb and thus max_sectors_kb is 4096.

ok, more below.

> 
> Linux stg-top 2.6.37stg #7 SMP Sun Feb 20 10:48:57 CET 2011 x86_64 
> Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux
> 
> 
> With default max_sectors_kb = 512
> mkfs, sync, btrfsck going through fine
> first mount is fine
> 
> [112005]stg-top ~ # cat /sys/block/sdb/queue/max_sectors_kb
> 512
> [112012]stg-top ~ # mkfs.btrfs /dev/sdb
> 
> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
> 
> fs created label (null) on /dev/sdb
>          nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
> Btrfs Btrfs v0.19
> [112045]stg-top ~ # sync
> [112046]stg-top ~ # btrfsck /dev/sdb
> found 28672 bytes used err is 0
> total csum bytes: 0
> total tree bytes: 28672
> total fs tree bytes: 8192
> btree space waste bytes: 23875
> file data blocks allocated: 0
>   referenced 0
> Btrfs Btrfs v0.19
> [112050]stg-top ~ # mount /dev/sdb /mnt/
> [112054]stg-top ~ # df /mnt/
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sdb             2929685760        56 2927559936   1% /mnt
> [112057]stg-top ~ # umount /mnt
> 
> 
> With max_sectors_kb = 4096
> mkfs, sync, btrfsck is fine
> second mount  BUGs out
> 
> [112100]stg-top ~ # echo 4096 > /sys/block/sdb/queue/max_sectors_kb
> [112107]stg-top ~ # mkfs.btrfs /dev/sdb
> 
> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
> WARNING! - see http://btrfs.wiki.kernel.org before using
> 
> fs created label (null) on /dev/sdb
>          nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
> Btrfs Btrfs v0.19
> [112113]stg-top ~ # sync
> [112117]stg-top ~ # btrfsck /dev/sdb
> found 28672 bytes used err is 0
> total csum bytes: 0
> total tree bytes: 28672
> total fs tree bytes: 8192
> btree space waste bytes: 23875
> file data blocks allocated: 0
>   referenced 0
> Btrfs Btrfs v0.19
> [112123]stg-top ~ # mount /dev/sdb /mnt/
> Killed

At this point, your box is broken.  You have to reboot before anything
new is valid (sorry).  Could you please:

Set max_sectors_kb to 4096
mkfs.btrfs /dev/xxx
sync
echo 3 > /proc/sys/vm/drop_caches
btrfsck /dev/xxx

Don't try and mount, clearly my patch isn't saving us.

-chris
Comment 9 Anonymous Emailer 2011-02-28 12:31:55 UTC
Reply-To: tomas@zvala.cz

On 02/21/2011 04:08 PM, Chris Mason wrote:
> Excerpts from Tomas Zvala's message of 2011-02-20 06:06:42 -0500:
>> On 18.2.2011 21:18, Chris Mason wrote:
>>> Ok, so it isn't part of the open devices code that prints errors, my
>>> guess is we're failing to read a good super.
>>>
>>> Could you please mkfs.btrfs /dev/xxx, sync, then btrfsck /dev/xxx, I want
>>> to make sure things are really getting written.
>>>
>>> Here's a patch that makes sure we have a good bdev after scanning,
>>> hopefully it will let us debug things without your box going boom.
>>>
>>> -chris
>>>
>>>
>> Hello Chris,
>> I patched the kernel, recompiled and installed, but I don't see the
>> message in the kernel log.  I forgot to mention before, that
>> max_hw_sectors_kb and thus max_sectors_kb is 4096.
> ok, more below.
>
>> Linux stg-top 2.6.37stg #7 SMP Sun Feb 20 10:48:57 CET 2011 x86_64
>> Intel(R) Xeon(R) CPU E5620 @ 2.40GHz GenuineIntel GNU/Linux
>>
>>
>> With default max_sectors_kb = 512
>> mkfs, sync, btrfsck going through fine
>> first mount is fine
>>
>> [112005]stg-top ~ # cat /sys/block/sdb/queue/max_sectors_kb
>> 512
>> [112012]stg-top ~ # mkfs.btrfs /dev/sdb
>>
>> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
>> WARNING! - see http://btrfs.wiki.kernel.org before using
>>
>> fs created label (null) on /dev/sdb
>>           nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
>> Btrfs Btrfs v0.19
>> [112045]stg-top ~ # sync
>> [112046]stg-top ~ # btrfsck /dev/sdb
>> found 28672 bytes used err is 0
>> total csum bytes: 0
>> total tree bytes: 28672
>> total fs tree bytes: 8192
>> btree space waste bytes: 23875
>> file data blocks allocated: 0
>>    referenced 0
>> Btrfs Btrfs v0.19
>> [112050]stg-top ~ # mount /dev/sdb /mnt/
>> [112054]stg-top ~ # df /mnt/
>> Filesystem           1K-blocks      Used Available Use% Mounted on
>> /dev/sdb             2929685760        56 2927559936   1% /mnt
>> [112057]stg-top ~ # umount /mnt
>>
>>
>> With max_sectors_kb = 4096
>> mkfs, sync, btrfsck is fine
>> second mount  BUGs out
>>
>> [112100]stg-top ~ # echo 4096>  /sys/block/sdb/queue/max_sectors_kb
>> [112107]stg-top ~ # mkfs.btrfs /dev/sdb
>>
>> WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
>> WARNING! - see http://btrfs.wiki.kernel.org before using
>>
>> fs created label (null) on /dev/sdb
>>           nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
>> Btrfs Btrfs v0.19
>> [112113]stg-top ~ # sync
>> [112117]stg-top ~ # btrfsck /dev/sdb
>> found 28672 bytes used err is 0
>> total csum bytes: 0
>> total tree bytes: 28672
>> total fs tree bytes: 8192
>> btree space waste bytes: 23875
>> file data blocks allocated: 0
>>    referenced 0
>> Btrfs Btrfs v0.19
>> [112123]stg-top ~ # mount /dev/sdb /mnt/
>> Killed
> At this point, your box is broken.  You have to reboot before anything
> new is valid (sorry).  Could you please:
>
> Set max_sectors_kb to 4096
> mkfs.btrfs /dev/xxx
> sync
> echo 3>  /proc/sys/vm/drop_caches
> btrfsck /dev/xxx
>
> Don't try and mount, clearly my patch isn't saving us.
>
> -chris
>
Hello Chris,
I'm sorry it took so long, I've been a bit busy lately ...

[132926]stg-bot linux-2.6.37-scst # cat 
/sys/block/sdb/queue/max_hw_sectors_kb > /sys/block/sdb/queue/max_sectors_kb
[132932]stg-bot linux-2.6.37-scst # mkfs.btrfs /dev/sdb

WARNING! - Btrfs Btrfs v0.19 IS EXPERIMENTAL
WARNING! - see http://btrfs.wiki.kernel.org before using

fs created label (null) on /dev/sdb
     nodesize 4096 leafsize 4096 sectorsize 4096 size 2.73TB
Btrfs Btrfs v0.19
[132945]stg-bot linux-2.6.37-scst # sync
[132948]stg-bot linux-2.6.37-scst # echo 3 > /proc/sys/vm/drop_caches
[132958]stg-bot linux-2.6.37-scst # btrfsc
btrfsck   btrfsctl
[132958]stg-bot linux-2.6.37-scst # btrfsck /dev/sdb
found 28672 bytes used err is 0
total csum bytes: 0
total tree bytes: 28672
total fs tree bytes: 8192
btree space waste bytes: 23875
file data blocks allocated: 0
  referenced 0
Btrfs Btrfs v0.19


And still cannot mount.


Tomas
Comment 10 Alan 2013-12-11 12:28:21 UTC

Closing as obsolete, if this is still seen on modern kernels please update

Note You need to log in before you can comment on or make changes to this bug.