Bug 41102
Summary: | BUG dentry: Poison overwritten / BUG buffer_head: Poison overwritten | ||
---|---|---|---|
Product: | Memory Management | Reporter: | Paul Bolle (pebolle) |
Component: | Slab Allocator | Assignee: | Andrew Morton (akpm) |
Status: | NEW --- | ||
Severity: | normal | CC: | alan, florian, hart3778avery, maciej.rutecki, michael2012zhao, rjw |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.0.1 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Paul Bolle
2011-08-13 22:29:57 UTC
I marked this as a regression. I *think* it's a 2.6.39->3.0 regression?

(In reply to comment #1)
> I marked this as a regression. I *think* it's a 2.6.39->3.0 regression?

0) That might well be correct. But please note that the 3.0.1 kernel I triggered this issue with seems to be the first I ran on this machine that has CONFIG_SLUB_DEBUG_ON set. So on this machine this issue - if it's more than a one time goof - would never have showed up in the logs unless I manually booted with "slub_debug" on the command line. And I can't remember booting with "slub_debug".

(1) Background: mm/slub.c:check_bytes_and_report() and work your way up in code and CONFIG_* options.)

(2) Uninteresting detail: I changed the way I built the kernel for this machine - which still tracks Fedora 14 - after it was made clear that v2.6.39.4 would be the last stable release for v2.6.39. I now sort of rebuilt a Fedora Rawhide kernel, but dropping Fedora's non-upstreamed patches.)

On Tuesday, August 30, 2011, Paul Bolle wrote:
> On Sun, 2011-08-28 at 21:01 +0200, Rafael J. Wysocki wrote:
> > Please verify if it still should
> > be listed and let the tracking team know (either way).
>
> From comment #2:
> > [...] please note that the 3.0.1 kernel I triggered this issue with seems
> > to be the first I ran on this machine that has CONFIG_SLUB_DEBUG_ON set.
> > So on this machine this issue - if it's more than a one time goof - would
> > never have showed up in the logs unless I manually booted with
> > "slub_debug" on the command line. And I can't remember booting with
> > "slub_debug".
>
> At this moment I see little reason to track this bug entry as a
> regression.
Dropping from the list of recent regressions.

(0) This is a copy of a message I sent directly to one of the people tracking this bug on Sep 11, 2011, since kernel.org was down at that time. I forgot to update this report once bugzilla.kernel.org was up again. I add this to this report on the odd chance that someone running v3.0.x runs into something similar.)

1) I've hit this again (that is, the first of the free "Poison overwritten" messages). Thawing from hibernation seems to be a possible cause of this. Kernel is now v3.0.4. So it seems not to be just a one time goof and might even be reproducible. (I haven't yet dared to reproduce this.)

2) And this was what I found in the logs:

=============================================================================
BUG dentry: Poison overwritten
-----------------------------------------------------------------------------
INFO: 0xffff88012f6ef000-0xffff88012f6ef01f. First byte 0x0 instead of 0x6b
INFO: Allocated in d_alloc+0x27/0x1b3 age=8009211 cpu=1 pid=9833
INFO: Freed in __d_free+0x59/0x5e age=10700 cpu=1 pid=13756
INFO: Slab 0xffffea0004260448 objects=12 used=5 fp=0xffff88012f6ef000 flags=0x400000000000c1
INFO: Object 0xffff88012f6ef000 @offset=0 fp=0xffff88012f6efbd0
Object 0xffff88012f6ef000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object 0xffff88012f6ef010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Object 0xffff88012f6ef020: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef030: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef040: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef050: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef060: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef070: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef090: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef0a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef0b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef0c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef0d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef0e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef0f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
Object 0xffff88012f6ef100: 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkk.
Redzone 0xffff88012f6ef108: bb bb bb bb bb bb bb bb ........
Padding 0xffff88012f6ef148: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
Pid: 14674, comm: pm-powersave Not tainted 3.0.4-local0.fc14.x86_64 #1
Call Trace:
[<ffffffff8112f7db>] print_trailer+0x133/0x13c
[<ffffffff8112faf8>] check_bytes_and_report+0xc1/0xff
[<ffffffff8113088d>] check_object+0xc5/0x1b0
[<ffffffff81155a26>] ? d_alloc+0x27/0x1b3
[<ffffffff81130b92>] alloc_debug_processing+0x7f/0x120
[<ffffffff81131bb5>] __slab_alloc+0x302/0x372
[<ffffffff81155a26>] ? d_alloc+0x27/0x1b3
[<ffffffff8104a356>] ? should_resched+0xe/0x2e
[<ffffffff81155a26>] ? d_alloc+0x27/0x1b3
[<ffffffff81132f6b>] kmem_cache_alloc+0x53/0x105
[<ffffffff81156aa0>] ? __d_lookup+0xf8/0x10a
[<ffffffff81155a26>] d_alloc+0x27/0x1b3
[<ffffffff8114c785>] d_alloc_and_lookup+0x2c/0x6d
[<ffffffff8114db93>] walk_component+0x1ea/0x3da
[<ffffffff8114cec4>] ? handle_dots+0x218/0x218
[<ffffffff8114f1fa>] path_lookupat+0xae/0x34b
[<ffffffff81252fb1>] ? __strncpy_from_user+0x1f/0x4e
[<ffffffff8114f4c1>] do_path_lookup+0x2a/0x99
[<ffffffff8114f8fb>] user_path_at+0x56/0x93
[<ffffffff8107eb40>] ? up_read+0x2b/0x33
[<ffffffff814f2b9a>] ? do_page_fault+0x31e/0x3a8
[<ffffffff8110ffcc>] ? might_fault+0x5c/0xac
[<ffffffff81147227>] vfs_fstatat+0x49/0x74
[<ffffffff8108f8e7>] ? lock_release+0x18a/0x1b2
[<ffffffff8114728d>] vfs_stat+0x1b/0x1d
[<ffffffff811473a3>] sys_newstat+0x1f/0x39
[<ffffffff8114c5a1>] ? path_put+0x22/0x26
[<ffffffff8110ffcc>] ? might_fault+0x5c/0xac
[<ffffffff810b4949>] ? audit_syscall_entry+0x11c/0x148
[<ffffffff81252d1e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff814f62c2>] system_call_fastpath+0x16/0x1b
FIX dentry: Restoring 0xffff88012f6ef000-0xffff88012f6ef01f=0x6b
FIX dentry: Marking all objects used

(0) And this is a copy of a message I sent directly to one of the people tracking this bug on Sep 20, 2011, since kernel.org was still down at that time. I again add this to this report on the odd chance that someone running v3.0.x runs into something similar.)

1) I've just hit an almost identical message, and again after thawing from hibernation. Kernel is still v3.0.4. So it really seems to be reproducible by thawing from hibernation. We're making progress, I'd guess.

(2) But I have to note that I've stopped using hibernation on this machine some time after sending this message. I seem to remember running into filesystem trouble after thawing. The filesystem concerned gave me a Desktop Manager with one of its executables having the contents of some PNG. Try debugging that!
Anyhow, this machine has an Intel 965M chipset, and so it uses the i915 module. Perhaps this all is related to the issues fixed with commit 3fa016a0b5c5237e9c387fc3249592b2cb5391c6 ("drm/i915: suspend fbdev device around suspend/hibernate"), which shipped in v3.4-rc1 and later. See bug #37142 for some background. Perhaps I'll try again once this machine is running v3.4.)

Did 3.4 help ?

(In reply to comment #7)

0) Thanks for taking the time to take a look at this report. I, of course, had already forgotten it.

> Did 3.4 help ?

1) I'd have to check. I don't think I ever ran v3.4 with SLUB_DEBUG_ON set. Nor did I ever use the slub_debug parameter. And, more importantly, after filing this report I stopped using hibernation (see comment #6).

2) So feel free to close this report (with insufficient data or whatever).

3) In the meantime I'll have to think whether I again dare to hibernate this machine. And if I do dare to do that, I'll do so with "slub_debug" set. Then we'll see whether v3.4 helped. If not, I could always reopen this report.
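
For anyone reading the dump in this report: with poisoning enabled, a freed object is filled with 0x6b and its last byte is set to 0xa5 (POISON_FREE and POISON_END in include/linux/poison.h), while 0xbb and 0x5a are the inactive redzone and padding values. When the object is handed out again, SLUB checks that pattern and reports the range that no longer matches. What follows is only a rough userspace sketch of that idea, not the code in mm/slub.c. The names poison_intact() and print_poison_report() are made up for illustration, and the 0x108 byte object size is an assumption read off the addresses in the dump (0x...f000 up to the redzone at 0x...f108).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Poison values as defined in the kernel's include/linux/poison.h. */
#define POISON_FREE 0x6b /* fills a freed object */
#define POISON_END  0xa5 /* last byte of a freed object */

/*
 * Hypothetical helper, not a kernel function: scan a freed object and
 * record the first and last offset that no longer hold the expected
 * poison. Returns 1 if the poison is intact, 0 otherwise.
 */
int poison_intact(const unsigned char *obj, size_t size,
                  size_t *first_bad, size_t *last_bad)
{
    int ok = 1;

    assert(size > 0); /* size - 1 below would wrap around for size 0 */

    for (size_t i = 0; i < size; i++) {
        unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;

        if (obj[i] != want) {
            if (ok)
                *first_bad = i;
            *last_bad = i;
            ok = 0;
        }
    }
    return ok;
}

/*
 * Hypothetical helper: print a line loosely modelled on the "INFO:" line
 * in the report when the poison has been overwritten.
 */
void print_poison_report(const unsigned char *obj, size_t size)
{
    size_t first = 0, last = 0;

    if (poison_intact(obj, size, &first, &last))
        return;

    printf("Poison overwritten: offsets 0x%zx-0x%zx. "
           "First byte 0x%x instead of 0x%x\n",
           first, last, (unsigned)obj[first], (unsigned)POISON_FREE);
}
```

Filling a 0x108 byte buffer with the poison pattern, zeroing its first 0x20 bytes and passing it to print_poison_report() gives the same kind of range as the report above: 32 bytes at the start of a freed dentry that something wrote zeroes into after the free.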