Bug 16393
Summary: | kernel BUG at fs/block_dev.c:765! | ||
---|---|---|---|
Product: | File System | Reporter: | Maciej Rutecki (maciej.rutecki) |
Component: | Other | Assignee: | Tejun Heo (tj) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | cebbert, florian, Larry.Finger, maciej.rutecki, markus, rjw, tj |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.35-rc5 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 16055 | ||
Attachments: |
dmesg
dmesg full fdisk -l parted -l Full dmesg showing warning in mark_inode_dirty for bdi-block fdisk -l output fix-bd_prepare_to_claim |
Description
Maciej Rutecki
2010-07-14 19:50:49 UTC
It happened again today. See attachment. Created attachment 27217 [details]
dmesg
On Friday, July 23, 2010, Markus Trippelsdorf wrote:
> On Fri, Jul 23, 2010 at 01:47:36PM +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a summary report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.34. Please verify if it still should be listed and let the
> tracking team
> > know (either way).
>
> It happend again today:
>
> ...
> Jul 23 09:22:00 arch kernel: r8169 0000:02:05.0: eth0: link up
> Jul 23 09:22:00 arch kernel: r8169: WARNING! Changing of MTU on this NIC may
> lead to frame reception errors!
> Jul 23 13:41:43 arch kernel: ------------[ cut here ]------------
> Jul 23 13:41:43 arch kernel: kernel BUG at fs/block_dev.c:765!
> Jul 23 13:41:43 arch kernel: invalid opcode: 0000 [#1] SMP
> Jul 23 13:41:43 arch kernel: last sysfs file:
> /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
> Jul 23 13:41:43 arch kernel: CPU 1
> Jul 23 13:41:43 arch kernel: Pid: 2616, comm: mount Not tainted
> 2.6.35-rc6-00012-g7e5a57e #32 M4A78T-E/System Product Name
> Jul 23 13:41:43 arch kernel: RIP: 0010:[<ffffffff810f66c7>]
> [<ffffffff810f66c7>] __bd_abort_claiming+0x17/0x40
> Jul 23 13:41:43 arch kernel: RSP: 0018:ffff880035d07d18 EFLAGS: 00010207
> Jul 23 13:41:43 arch kernel: RAX: 0000000000009292 RBX: ffff88011f8249c0 RCX:
> 0000000000000002
> Jul 23 13:41:43 arch kernel: RDX: 0000000000000003 RSI: ffffffff8182f0a0 RDI:
> ffff88011f8249c0
> Jul 23 13:41:43 arch kernel: RBP: ffff880035d07d28 R08: ffff880035d07b68 R09:
> ffffffff81877ea8
> Jul 23 13:41:43 arch kernel: R10: ffff880035d07cb8 R11: 0000000000000010 R12:
> ffff88011f8249c0
> Jul 23 13:41:43 arch kernel: R13: 0000000000000001 R14: ffffffff8182f0a0 R15:
> 00000000ffffff85
> Jul 23 13:41:43 arch kernel: FS: 00007f727e6a8740(0000)
> GS:ffff880001a80000(0000) knlGS:0000000000000000
> Jul 23 13:41:43 arch kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
> 000000008005003b
> Jul 23 13:41:43 arch kernel: CR2: 00007f99480f2000 CR3: 000000003c34d000 CR4:
> 00000000000006e0
> Jul 23 13:41:43 arch kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> Jul 23 13:41:43 arch kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> Jul 23 13:41:43 arch kernel: Process mount (pid: 2616, threadinfo
> ffff880035d06000, task ffff88011afe0000)
> Jul 23 13:41:43 arch kernel: Stack:
> Jul 23 13:41:43 arch kernel: ffff880035d07d28 ffffffffffffff85
> ffff880035d07d68 ffffffff810f7b6c
> Jul 23 13:41:43 arch kernel: <0> ffff880035d07d98 0000000000000001
> ffffffff8182f0a0 0000000000000001
> Jul 23 13:41:43 arch kernel: <0> ffff88003c3c5000 ffffffff8182f0a0
> ffff880035d07de8 ffffffff810d3daa
> Jul 23 13:41:43 arch kernel: Call Trace:
> Jul 23 13:41:43 arch kernel: [<ffffffff810f7b6c>]
> open_bdev_exclusive+0xb3/0xc3
> Jul 23 13:41:43 arch kernel: [<ffffffff810d3daa>] get_sb_bdev+0x47/0x178
> Jul 23 13:41:43 arch kernel: [<ffffffff8113121c>] ?
> ext4_fill_super+0x0/0x208d
> Jul 23 13:41:43 arch kernel: [<ffffffff8112e6ba>] ext4_get_sb+0x18/0x1a
> Jul 23 13:41:43 arch kernel: [<ffffffff810d40af>] vfs_kern_mount+0x62/0xff
> Jul 23 13:41:43 arch kernel: [<ffffffff810d431b>] do_kern_mount+0x4d/0xef
> Jul 23 13:41:43 arch kernel: [<ffffffff810e86ad>] do_mount+0x6cb/0x744
> Jul 23 13:41:43 arch kernel: [<ffffffff810e87fb>] sys_mount+0x8c/0xcd
> Jul 23 13:41:43 arch kernel: [<ffffffff8102b93b>]
> system_call_fastpath+0x16/0x1b
> Jul 23 13:41:43 arch kernel: Code: c7 c3 f4 ff ff ff 5Jul 23 13:42:38 arch
> kernel: Linux version 2.6.35-rc6-00012-g7e5a57e (markus@arch.tripp.de) (gcc
> version 4.5.0 20100610 (prerelease) (GCC) ) #32 SMP Fri Jul 23 09:21:31 CEST
> 2010
> Jul 23 13:42:38 arch kernel: Command line:
> BOOT_IMAGE=/usr/src/linux/arch/x86/boot/bzImage root=/dev/sdb2 fbcon=rotate:3
> drm.edid_force_checksum=1 quiet
> Jul 23 13:42:38 arch kernel: BIOS-provided physical RAM map:
Can you please attach boot log of a successful boot and the output of "fdisk -l"? Thanks. Sure. See new attachments. Thanks. Created attachment 27320 [details]
dmesg full
Created attachment 27321 [details]
fdisk -l
Created attachment 27322 [details]
parted -l
Created attachment 27325 [details]
Full dmesg showing warning in mark_inode_dirty for bdi-block
This dmesg is for my machine, which shows the warning at fs/fs-writeback.c:964 - a different location than for the OP.
Created attachment 27326 [details]
fdisk -l output
On my system, / is mounted on /dev/sda1 and /home is on /dev/sda3. Partition /dev/sda5 is not mounted.
Patch posted upstream. http://thread.gmane.org/gmane.linux.acpi.devel/46852/focus=60778 Thanks. Created attachment 27349 [details]
fix-bd_prepare_to_claim
Patch attached. Can you please verify the fix?
Thanks.
Thanks Tejun. The bug happens very rarely (once every 3-4 weeks on this machine) so it will be hard to verify. But the fix looks obviously right. (BTW the problem was reported by me, Maciej just opened this bug report) Oops, sorry about that. Updated. Unfortunately, my box that shows the problem had a disk drive failure, and will be going into the shop today for warranty repair. Nothing lost and it will not take long to do the test when I get it back as I was already running with a replacement drive, but Best Buy owes me a 250 GB 2.5 inch disk. Should be fixed by Tejun's patch, now upstream: commit e75aa85892b2ee78c79edac720868cbef16e62eb "block_dev: always serialize exclusive open attempts" This commit also went into 2.6.35.1 Also reported to be fixed by Markus: On Mon, 30 Aug 2010 02:56:04 +0200 Markus Trippelsdorf <markus@trippelsdorf.de> wrote: > On Mon, Aug 30, 2010 at 01:13:29AM +0200, Rafael J. Wysocki wrote: > > This message has been generated automatically as a part of a report > > of regressions introduced between 2.6.34 and 2.6.35. > > > > The following bug entry is on the current list of known regressions > > introduced between 2.6.34 and 2.6.35. Please verify if it still should > > be listed and let the tracking team know (either way). > > > > > > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16393 > > Subject : kernel BUG at fs/block_dev.c:765! > > Submitter : Markus Trippelsdorf <markus@trippelsdorf.de> > > Date : 2010-07-14 13:52 (47 days old) > > Message-ID : <20100714135217.GA1797@arch.tripp.de> > > References : http://marc.info/?l=linux-kernel&m=127911564213748&w=2 > > > > This bug is already fixed by: > e75aa85892b2ee78c79edac720868cbef16e62eb > > Please close this entry. > Thanks. > Commit e75aa85892b2ee78c79edac720868cbef16e62eb does not fix the problem on my system. Running 2.6.36-rc3, I get the following traceback: [ 12.192109] ------------[ cut here ]------------ [ 12.192118] WARNING: at fs/fs-writeback.c:972 __mark_inode_dirty+0x1d4/0x1e0() [ 12.192122] Hardware name: HP Pavilion dv2700 Notebook PC [ 12.192124] bdi-block not registered [ 12.192126] Modules linked in: loop(+) dm_mod ide_cd_mod cdrom snd_hda_codec_conexant ide_pci_generic snd_hda_intel snd_hda_codec r8712u(C) amd74xx snd_pcm ide_core snd_timer snd k8temp sg joydev snd_page_alloc forcedeth hwmon i2c_nforce2 serio_raw i2c_core battery ac button ext3 jbd mbcache ohci_hcd sd_mod ehci_hcd usbcore fan processor ahci libahci libata scsi_mod thermal [ 12.192164] Pid: 2338, comm: udisks-part-id Tainted: G C 2.6.36-rc3-realtek+ #7 [ 12.192167] Call Trace: [ 12.192174] [<ffffffff8104614a>] warn_slowpath_common+0x7a/0xb0 [ 12.192178] [<ffffffff81046221>] warn_slowpath_fmt+0x41/0x50 [ 12.192183] [<ffffffff81325584>] ? _raw_spin_lock+0x34/0x40 [ 12.192186] [<ffffffff8112ded5>] ? __mark_inode_dirty+0x65/0x1e0 [ 12.192190] [<ffffffff8112e044>] __mark_inode_dirty+0x1d4/0x1e0 [ 12.192195] [<ffffffff811210bd>] touch_atime+0x12d/0x170 [ 12.192201] [<ffffffff810cbb61>] generic_file_aio_read+0x5c1/0x720 [ 12.192205] [<ffffffff810e7fbd>] ? handle_mm_fault+0x75d/0x950 [ 12.192211] [<ffffffff8110a862>] do_sync_read+0xd2/0x110 [ 12.192214] [<ffffffff8107a4a0>] ? mark_held_locks+0x70/0xa0 [ 12.192218] [<ffffffff8132436f>] ? __mutex_unlock_slowpath+0xbf/0x150 [ 12.192222] [<ffffffff8107a6cd>] ? trace_hardirqs_on+0xd/0x10 [ 12.192226] [<ffffffff8110af93>] vfs_read+0xc3/0x170 [ 12.192229] [<ffffffff8110b08c>] sys_read+0x4c/0x80 [ 12.192234] [<ffffffff81002d2b>] system_call_fastpath+0x16/0x1b [ 12.192237] ---[ end trace 9b3df7d4a57af8dc ]--- [ 12.198091] loop: module loaded Should this one be reopened, or should I post a new bug? Although mine has the same symptom as others have posted, it is likely to arise from a different cause. This is probably up to Tejun. I've reopened it in the meantime, so we don't forget about it. Larry, you're hitting a different WARN_ON(). Your symptom is not the same at all. :-) Please file a separate bug report. Thanks. Closed. The new bug report is #17461. |