Hi Linus, sorry to say this, but the following commit caused trouble on my machine, which with 2.6.35.2 began to freeze, crash, vomit traces... My bisection shows:
52423b90e1f5b1bdbbcc6e32f4d37ada29b790c4 is the first bad commit
commit 52423b90e1f5b1bdbbcc6e32f4d37ada29b790c4
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Thu Aug 12 17:54:33 2010 -0700
mm: keep a guard page below a grow-down stack segment
commit 320b2b8de12698082609ebbc1a17165727f4c893 upstream.
All the following kernels are tested and affected:
* 2.6.35.2
* 2.6.34.4
* 2.6.32.19
Carrot for search engines:
[ 112.320284] CPUFREQ: Per core ondemand sysfs interface is deprecated - up_threshold
[ 182.864528] BUG: scheduling while atomic: bash/6444/0x10000001
[ 182.864533] Modules linked in: cpufreq_stats nouveau ttm drm_kms_helper drm i2c_algo_bit serio_raw [last unloaded: scsi_wait_scan]
[ 182.864546] Pid: 6444, comm: bash Not tainted 2.6.35.2 #1
[ 182.864549] Call Trace:
[ 182.864560] [<c13e65cb>] ? schedule+0x46b/0x540
[ 182.864566] [<c13e8879>] ? apic_timer_interrupt+0x31/0x38
[ 182.864573] [<c103007b>] ? sched_debug_show+0x85b/0xe00
[ 182.864577] [<c13e6779>] ? _cond_resched+0x29/0x40
[ 182.864583] [<c10a6c55>] ? kmem_cache_alloc+0x75/0xa0
[ 182.864589] [<c109b701>] ? anon_vma_prepare+0xe1/0x110
[ 182.864594] [<c109733c>] ? expand_downwards+0x1c/0x160
[ 182.864600] [<c1025c33>] ? pte_alloc_one+0x33/0x40
[ 182.864604] [<c109534f>] ? handle_mm_fault+0x9af/0xa20
[ 182.864609] [<c10954d9>] ? __get_user_pages+0x119/0x3d0
[ 182.864614] [<c10b189e>] ? get_arg_page+0x4e/0xb0
[ 182.864619] [<c1196d93>] ? strnlen_user+0x23/0x50
[ 182.864623] [<c10b19cc>] ? copy_strings+0xcc/0x190
[ 182.864628] [<c10b1ab1>] ? copy_strings_kernel+0x21/0x40
[ 182.864633] [<c10b3291>] ? do_execve+0x191/0x260
[ 182.864639] [<c100a2a3>] ? sys_execve+0x33/0x80
[ 182.864643] [<c1002cfa>] ? ptregs_execve+0x12/0x18
[ 182.864647] [<c1002c97>] ? sysenter_do_call+0x12/0x26
[ 687.818752] BUG: scheduling while atomic: make/12066/0x10000001
[ 687.818757] Modules linked in: cpufreq_stats nouveau ttm drm_kms_helper drm i2c_algo_bit serio_raw [last unloaded: scsi_wait_scan]
[ 687.818771] Pid: 12066, comm: make Not tainted 2.6.35.2 #1
[ 687.818774] Call Trace:
[ 687.818784] [<c13e65cb>] ? schedule+0x46b/0x540
[ 687.818791] [<c13e8879>] ? apic_timer_interrupt+0x31/0x38
[ 687.818795] [<c13e6779>] ? _cond_resched+0x29/0x40
[ 687.818801] [<c109b63d>] ? anon_vma_prepare+0x1d/0x110
[ 687.818806] [<c109733c>] ? expand_downwards+0x1c/0x160
[ 687.818812] [<c108ea5b>] ? inc_zone_page_state+0x1b/0x20
[ 687.818819] [<c1025c33>] ? pte_alloc_one+0x33/0x40
[ 687.818823] [<c109534f>] ? handle_mm_fault+0x9af/0xa20
[ 687.818828] [<c10954d9>] ? __get_user_pages+0x119/0x3d0
[ 687.818834] [<c10b189e>] ? get_arg_page+0x4e/0xb0
[ 687.818840] [<c1196d93>] ? strnlen_user+0x23/0x50
[ 687.818844] [<c10b19cc>] ? copy_strings+0xcc/0x190
[ 687.818849] [<c10b1ab1>] ? copy_strings_kernel+0x21/0x40
[ 687.818853] [<c10b3291>] ? do_execve+0x191/0x260
[ 687.818859] [<c100a2a3>] ? sys_execve+0x33/0x80
[ 687.818864] [<c1002cfa>] ? ptregs_execve+0x12/0x18
[ 687.818868] [<c1002c97>] ? sysenter_do_call+0x12/0x26
[ 744.816563] BUG: scheduling while atomic: gcc/12308/0x10000001
[ 744.816568] Modules linked in: cpufreq_stats nouveau ttm drm_kms_helper drm i2c_algo_bit serio_raw [last unloaded: scsi_wait_scan]
[ 744.816582] Pid: 12308, comm: gcc Not tainted 2.6.35.2 #1
[ 744.816585] Call Trace:
[ 744.816596] [<c13e65cb>] ? schedule+0x46b/0x540
[ 744.816602] [<c13e8879>] ? apic_timer_interrupt+0x31/0x38
[ 744.816606] [<c13e6779>] ? _cond_resched+0x29/0x40
[ 744.816612] [<c109b63d>] ? anon_vma_prepare+0x1d/0x110
[ 744.816617] [<c109733c>] ? expand_downwards+0x1c/0x160
[ 744.816623] [<c108ea5b>] ? inc_zone_page_state+0x1b/0x20
[ 744.816629] [<c1025c33>] ? pte_alloc_one+0x33/0x40
[ 744.816634] [<c109534f>] ? handle_mm_fault+0x9af/0xa20
[ 744.816639] [<c10954d9>] ? __get_user_pages+0x119/0x3d0
[ 744.816645] [<c10b189e>] ? get_arg_page+0x4e/0xb0
[ 744.816650] [<c1196d93>] ? strnlen_user+0x23/0x50
[ 744.816655] [<c10b19cc>] ? copy_strings+0xcc/0x190
[ 744.816659] [<c10b1ab1>] ? copy_strings_kernel+0x21/0x40
[ 744.816664] [<c10b3291>] ? do_execve+0x191/0x260
[ 744.816670] [<c100a2a3>] ? sys_execve+0x33/0x80
[ 744.816674] [<c1002cfa>] ? ptregs_execve+0x12/0x18
[ 744.816678] [<c1002c97>] ? sysenter_do_call+0x12/0x26
Sometimes it even crashes during boot, so the machine won't boot up. Searching the net gives me other possible victims: http://lkml.org/lkml/2010/8/14/42 http://www.spinics.net/lists/kernel/msg1070907.html I wanted to add gregkh to let him know about the problems in stable, but Bugzilla would not let me add him to CC ;)
You also need to revert: commit 3aba3fa070fc0f38de2d41252aee9ff17d2de984 "mm: fix missing page table unmap for stack guard page failure case" [1] http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-allstable.git;a=commit;h=3aba3fa070fc0f38de2d41252aee9ff17d2de984 P.S.: I will attach revert patches to this BR.
Created attachment 27442 [details] revert-patch #1
Created attachment 27443 [details] revert-patch #2
BTW, upstream (issues with 2.6.35-git{12..15}) also seems to be affected here on Debian/sid i386 (see [1]). - Sedat - [1] http://lkml.org/lkml/2010/8/14/54
https://bugzilla.kernel.org/show_bug.cgi?id=16589 is a duplicate of this issue.
Created attachment 27444 [details] Potential fix Duh. I should have thought about the locking more. Does this fix it?
Gaah, and this is why I think we shouldn't hurry with -stable patches. They go into -devel first for two reasons - to make sure that we don't miss any bugs that were fixed in -stable but never got fixed in devel, but _also_ because that way the -stable patches have hopefully gotten more testing.
(In reply to comment #7) > Does this fix it? I'm testing it now, give me a few mins ;)
This commit also causes the following problem when creating logical volumes: lvcreate -L1G -n backup gentoo Internal error: Maps lock 14405632 < unlock 14409728 Logical volume "backup" created I don't know if it's serious or not. However, the proposed patch doesn't solve this issue.
OK, it survived a whole round of stress testing. That's the first time with kernel 2.6.35.2 :c) I'll load that machine with another ten rounds and will come back... But it looks good to me now. Thanks for the quick fix!
On Sat, Aug 14, 2010 at 10:16 AM, <bugzilla-daemon@bugzilla.kernel.org> wrote: > This commit also causes the following problem when creating logical volumes: > > lvcreate -L1G -n backup gentoo > Internal error: Maps lock 14405632 < unlock 14409728 > Logical volume "backup" created Are you sure that's from this commit? That seems to be some lvcreate internal error, and it makes no sense whatsoever that any of these patches would have anything to do with that. > I don't know if it's serious or not. However, the proposed patch doesn't solve this issue. Indeed. I don't think the proposed patch should matter in any way. I've tested my patch, and now I enabled all the lock debugging. I do see locking problems in current -git even with that patch, but they are unrelated to the VM layer (there seems to be an "inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage" thing going on in the ipv6 counters code). Greg - let me mull on the whole do_anonymous_page locking a bit more. Linus
I did a git bisect run and it pointed this commit as the first bad one. Indeed, if I revert it (after fixing trivial conflicts due to commit 3aba3fa070fc0f38de2d41252aee), the error doesn't occur.
On Sat, Aug 14, 2010 at 10:36 AM, <bugzilla-daemon@bugzilla.kernel.org> wrote: > I did a git bisect run and it pointed this commit as the first bad one. Indeed, if I revert it (after fixing trivial conflicts due to commit 3aba3fa070fc0f38de2d41252aee), the error doesn't occur. Hmm. Turning those "Maps lock" numbers into hex gives us "Internal error: Maps lock 0xdbd000 < unlock 0xdbe000" which is a bit suggestive - it's "page aligned". I wonder if the "Maps" part refers to /proc/self/maps, and the process reads its own maps file to see where the segments are. The whole (and only) point of the whole stack guard page patch is to make sure that a grow-down stack segment always has a guard page as its lowest page. And that guard page obviously means that the stack segment is now one page bigger (downwards). If the lvcreate binary for some reason looks at how big its virtual mappings are, and assumes that they must be some particular size, then that would explain the error message. I wonder who knows. I'm cc'ing lvm-devel@redhat.com on this email, to see if I can reach somebody who knows what that "Internal error" means. Linus
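The hex conversion above can be double-checked with a few lines of C. This is only a minimal sketch: the 4096-byte page size is an assumption (it matches the x86 machines in this report), and the helper names are made up for illustration.

```c
#include <stdio.h>

/* Assumed page size for the x86 systems discussed in this report. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Returns 1 if value is a whole multiple of the assumed page size. */
int is_page_aligned(unsigned long value)
{
    return (value % ASSUMED_PAGE_SIZE) == 0;
}

/* Formats value as lowercase hex ("0x...") into buf and returns buf. */
const char *to_hex(unsigned long value, char *buf, size_t len)
{
    snprintf(buf, len, "0x%lx", value);
    return buf;
}

/* Number of assumed-size pages between the lock and unlock counters. */
unsigned long pages_between(unsigned long lock, unsigned long unlock)
{
    return (unlock - lock) / ASSUMED_PAGE_SIZE;
}
```

Fed the two numbers from the lvcreate message (14405632 and 14409728), to_hex() gives 0xdbd000 and 0xdbe000, both values are page aligned, and pages_between() returns 1, which fits the "stack segment is now one page bigger" theory.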
lvremove or lvchange also give similar errors but with different numbers. For example: lvchange -a n /dev/gentoo/portage Internal error: Maps lock 14426112 < unlock 14430208 Node /dev/mapper/gentoo-portage was not removed by udev. Falling back to direct node removal. The link /dev/gentoo/portage should have been removed by udev but it is still present. Falling back to direct link removal. lvremove gentoo/backup Do you really want to remove active logical volume backup? [y/n]: y Internal error: Maps lock 14409728 < unlock 14413824 Logical volume "backup" successfully removed
On Sat, Aug 14, 2010 at 10:53 AM, <bugzilla-daemon@bugzilla.kernel.org> wrote: > > lvremove gentoo/backup > Do you really want to remove active logical volume backup? [y/n]: y > Internal error: Maps lock 14409728 < unlock 14413824 > Logical volume "backup" successfully removed It looks like things still work, though, and this is just a really annoying warning. No? If so, and assuming we can get confirmation from the lvm people that it's really just lvm being odd and verifying some detail in /proc/self/maps but not relying on it, I suspect we'll just let it go. The alternative would be to have to make the whole stack guard page be something subtly configurable, which is something I worried about but would really like to avoid (it increases the complexity and fragility of the thing quite a bit, and makes the whole reliability of "stack cannot grow on top of something else" questionable, because it's then not necessarily enabled). Linus
I'm back again and happy to announce that I'm unable to reproduce the originally reported problem on 2.6.35.2 (I tested just this stable release) with Linus' patch from comment #7 applied. As I see, you are still looking at some issues which are outside my knowledge and also not reproducible here, so I yield this bug report to you ;) But if you think I could help somehow, just ping me by mail... Last thing. Could someone give me a git command for testing stable? I was following Linus's development git without a problem, but when I decided to bisect between 2.6.35.1 and 2.6.35.2 I didn't know how to reuse the downloaded tree. So I cloned the stable kernel tree -> downloaded another 500 MB :-( Is there a simple command for switching my current development tree to the stable tree on my local machine without redownloading all the history? This could help me and speed things up next time ;) Thank you!
Created attachment 27445 [details] Output of lvcreate -vvvv.... Maybe this will help to verify whether we should worry about the errors reported by LVM. This is the output of lvcreate -vvvv -L1G -n backup gentoo.
(In reply to comment #17) > Last thing. Could someone give me a git command for testing stable? I was following Linus's development git without a problem, but when i decided to bisect between 2.6.35.1 and 2.6.35.2 i didn't know how to reuse the downloaded tree. So i cloned the stable kernel tree -> downloaded another 500Megs :-( Is there a simple command for switching current developmnet tree into stable tree on my local machine without redownloading all the history?? This could help me and speed up things next time ;) Thank you!
I use the git URL given in the stable announcement email (which you can get from lwn.net if you did not get it via email) together with "git pull":
git pull git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.35.y.git master
The following should also work:
git pull git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-2.6.35.y.git v2.6.35.2
You can also play games with "git clone --reference" if you want to.
(In reply to comment #6) > https://bugzilla.kernel.org/show_bug.cgi?id=16589 > > Is duplicate of this issue. Stress Tested Linus's patch against 2.6.35.2 no problems found for me. Thanks Stuart
OK, I applied this patch on top of 2.6.35-git15 (upstream aka 2.6.36). Then, stressed the new kernel with a mesa-from-git build. Looks fine on first sight. For more Details (dmesg, kernel-config) [1]. BTW, the official patch "mm: fix page table unmap for stack guard page properly" is now in upstream [2]. - Sedat - [1] http://lkml.org/lkml/2010/8/14/120 [2] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=11ac552477e32835cb6970bf0a70c210807f5673
On Sat, Aug 14, 2010 at 09:26:59PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > Stress Tested Linus's patch against 2.6.35.2 no problems found for me. Thanks for testing, I'll wait till next week to queue up another .35-stable tree with this patch in it after it gets some more testing. greg k-h
I can also reproduce the LVM annoyance. AMD64, latest Debian squeeze userspace. I can confirm it is a problem exposed by the recent stable changes. Here's the offending code from lvm:
static void _unlock_mem(struct cmd_context *cmd)
{
        size_t unlock_mstats;
        log_very_verbose("Unlocking memory");
        if (!_memlock_maps(cmd, LVM_MUNLOCK, &unlock_mstats))
                stack;
        if (!_use_mlockall) {
                if (fclose(_mapsh))
                        log_sys_error("fclose", _procselfmaps);
                if (_mstats < unlock_mstats)
                        log_error(INTERNAL_ERROR "Maps lock %ld < unlock %ld", (long)_mstats, (long)unlock_mstats);
        }
        if (setpriority(PRIO_PROCESS, 0, _priority))
                log_error("setpriority %u failed: %s", _priority, strerror(errno));
        _release_memory();
}
Where: _memlock_maps will use mlockall(MCL_CURRENT | MCL_FUTURE). Then it rewind(_mapsh), zeroes _mstats, reads every line of _mapsh, parses it and does some counting of certain map entries using "to - from" math, and accumulates that in _mstats. _mapsh is /proc/self/maps. I don't know why lvm thinks it has to mess with /proc/self/maps, but it doesn't look like anything will break, as long as mlockall() is still locking everything that needs to be locked, and munlockall() is unlocking everything that needs to be unlocked.
There's an error in my analysis. mlockall()/munlockall() are NOT being used. When in !_use_mlockall mode (i.e. when the warnings happen), _memlock_maps() does a lot more than just count pages in /proc/self/maps. It calls mlock or munlock on the pages it decides to be "relevant", using "to - from" math to get the size of the range it needs to lock/unlock (rounding up to change from bytes to KiB). Ah, found this somewhere in the lvm2 example configs:
# While activating devices, I/O to devices being (re)configured is
# suspended, and as a precaution against deadlocks, LVM2 needs to pin
# any memory it is using so it is not paged out. Groups of pages that
# are known not to be accessed during activation need not be pinned
# into memory. Each string listed in this setting is compared against
# each line in /proc/self/maps, and the pages corresponding to any
# lines that match are not pinned. On some systems locale-archive was
# found to make up over 80% of the memory used by the process.
# mlock_filter = [ "locale/locale-archive", "gconv/gconv-modules.cache" ]
# Set to 1 to revert to the default behaviour prior to version 2.02.62
# which used mlockall() to pin the whole process's memory while activating
# devices.
use_mlockall = 0
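To make the "to - from" accounting concrete, here is a rough sketch of summing range sizes over maps-style text. It is NOT the lvm2 code: the function names, the single skip string (standing in for the mlock_filter idea) and the 512-byte line buffer are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parses one /proc/<pid>/maps style line ("from-to perms ...") and
 * returns the size of the range in bytes, or 0 if the line does not parse.
 */
unsigned long map_line_size(const char *line)
{
    unsigned long from, to;

    if (sscanf(line, "%lx-%lx", &from, &to) != 2 || to < from)
        return 0;
    return to - from;
}

/*
 * Sums the sizes of all lines in a newline-separated maps text, skipping
 * lines that contain the (hypothetical) filter substring. Pass NULL for
 * no filtering.
 */
unsigned long sum_map_sizes(const char *maps_text, const char *skip)
{
    unsigned long total = 0;
    const char *p = maps_text;

    while (*p) {
        char line[512];
        size_t n = strcspn(p, "\n");
        size_t copy = n < sizeof(line) - 1 ? n : sizeof(line) - 1;

        /* Copy one line (truncated if overlong) so it can be searched. */
        memcpy(line, p, copy);
        line[copy] = '\0';
        if (!skip || !strstr(line, skip))
            total += map_line_size(line);
        p += n;
        if (*p == '\n')
            p++;
    }
    return total;
}
```

If the stack mapping is reported one page lower in the second pass than in the first, the two sums differ by exactly one page, which is the shape of the "Maps lock N < unlock N+4096" messages quoted earlier in this report.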
On Sun, Aug 15, 2010 at 8:13 AM, Henrique de Moraes Holschuh <hmh@hmh.eng.br> changed: > > I don't know why lvm thinks it has to mess with /proc/self/maps, but it doesn't look like anything will break, as long as mlockall() is still locking everything that needs to be locked, and munlockall() is unlocking everything that needs to be unlocked. Hey, thanks for the analysis. That all makes sense, and at least to some degree explains why lvm showed any difference at all, even if it doesn't explain why lvm cared, or why it doesn't just use mlockall. Anyway, it looks purely cosmetic, but it does strike me that we _can_ (and perhaps even should) teach things like /proc/self/maps about the guard page, and indeed we should probably do the same for mlock{all} even if lvm doesn't use that. I have an experimental patch for it. I'll attach it to the bugzilla. Linus
Created attachment 27453 [details] Possible patch for the lvm warning Ok, so this fixes up some of the user-visible effects of the stack guard page:
- it doesn't show the guard page in /proc/<pid>/maps
- it doesn't try to mlock the guard page (which would just create a new one)
It's UNTESTED. It's one of my famous "this can't possibly break anything" patches, so caveat emptor.
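The idea of hiding the guard page from user-visible views can be pictured with a toy model. This is an illustration only and not the attached kernel patch: the struct, the function name and the 4096-byte page size are all hypothetical.

```c
/* Assumed page size; the real value is architecture dependent. */
#define TOY_PAGE_SIZE 4096UL

/* Toy stand-in for a memory mapping; NOT the kernel's vm_area_struct. */
struct toy_vma {
    unsigned long start;   /* lowest address, including any guard page */
    unsigned long end;     /* one past the highest address */
    int grows_down;        /* nonzero for a grow-down stack segment */
};

/*
 * Start address to report (or to begin locking from): for a grow-down
 * segment the lowest page is treated as the guard page and skipped.
 */
unsigned long toy_visible_start(const struct toy_vma *vma)
{
    if (vma->grows_down)
        return vma->start + TOY_PAGE_SIZE;
    return vma->start;
}
```

In this model a stack segment spanning 0xbffde000-0xc0000000 would be reported as starting at 0xbffdf000, so a tool that sums range sizes from the reported addresses would no longer count the extra page.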
I'm marking the bugzilla as "resolved", because I think the lvm issue is a small detail and on the whole the bug is resolved. But please do test the proposed lvm patch too, and feel free to re-open it (or open the lvm issue as a separate bugzilla entry)
Your last patch solves the LVM warning and doesn't seem to break anything. Setting use_mlockall = 1 in lvm.conf also avoids the warning. Thanks for your help.
On Sun, Aug 15, 2010 at 11:22 AM, François Valenduc <francois.valenduc@tvcablenet.be> wrote: > Your last patch solves the LVM warning and doesn't seems to break anything. > Setting use_mlockall = 1 in lvm.conf also avoids the warning. Hey, thanks for testing. It's now in my tree as commit d7824370e263. Linus
(In reply to comment #7) > Created an attachment (id=27444) [details] > Potential fix > > Duh. I should have thought about the locking more. > > Does this fix it? In my particular case this patch doesn't fix the bug. I don't want to flood bugzilla, so I'm attaching my files elsewhere. Here's a video of the boot sequence (too many screens of oopses): http://www.mediafire.com/?kdiyl4v1q2astme The first call trace begins with: get_page_from_freelist The second and third call traces begin with: rcu_process_callbacks The last call trace (JPEG: http://yfrog.com/mrlastcalltracej ) begins with: kstrdup 2.6.35.1 works fine here. I can post any required sw/hw information.
On Mon, Aug 16, 2010 at 5:02 PM Artem S. Tashkinov <t.artem@mailcity.com> wrote: > > In my particular case this patch doesn't fix a bug. Your bug is unrelated. I don't know what it is, but it's not the same thing. > I don't want to flood bugzilla, so I'm attaching my files elsewhere. Here's boot sequence video - too many screens of oopses: http://www.mediafire.com/?kdiyl4v1q2astme The first oops is modprobe doing a "memset()" on some bogus address (page fault at 0xf6a28000). The call trace is hard to make out, because you don't have frame pointers enabled, but it seems to be based in agp_intel_init -> agp_intel_probe -> agp_add_bridge -> intel_i965_create_gatt_table -> ioremap_nocache -> some page allocator -> oopsing memset() I wonder if somebody did a HIGHMEM alloc, and didn't map the page before memsetting it. That's what it kind of smells like to me. But it is definitely not the same bug as this bugzilla is about. But if 2.6.35.1 works fine, I don't really see what changed in 35.2 that could cause this. Could you try to bisect it? Linus
I have found the bad commit and filed a new bug report, 16612.
I upgraded to 35.2 today and got a few full freezes during the day. Firefox, and maybe other apps, also began to slow down and load the processor, although I did not update them.
Launchpad is getting flooded with apport bugs from this. In case apport picked up any useful information, here is the master bug report I made: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/620810
Reply-To: zkabelac@redhat.com On 14.8.2010 19:47, Linus Torvalds wrote: > On Sat, Aug 14, 2010 at 10:36 AM, <bugzilla-daemon@bugzilla.kernel.org> wrote: >> I did a git bisect run and it pointed this commit as the first bad one. Indeed, if I revert it (after fixing trivial conflicts due to commit 3aba3fa070fc0f38de2d41252aee), the error doesn't occur. > > Hmm. Turning those "Maps lock" numbers into hex gives us > > "Internal error: Maps lock 0xdbd000 < unlock 0xdbe000" > > which is a bit suggestive - it's "page aligned". I wonder if the > "Maps" part refers to /proc/self/maps, and the process reads its own > maps file to see where the segments are. > > The whole (and only) point of the whole stack guard page patch is to > make sure that a grow-down stack segment always has a guard page as > it's lowest page. And that guard page obviously means that the stack > segment is now one page bigger (downwards). If the lvcreate binary for > some reason looks at how big its virtual mappings are, and assumes > that they must be some particular size, then that would explain the > error message. > > I wonder who knows. I'm cc'ing lvm-devel@redhat.com on this email, to > see if I can reach somebody who knows that that "Internal error" > means. LVM is doing its 'own' version of the mlockall() functionality (mainly to avoid locking the 100MB locale files which are used on some distributions, like Fedora, to speed up setlocale()).
Internal Error here means that LVM failed to avoid an allocation which would require increasing the heap size (lvm preallocates some memory to have some 'spare' room) and tried to allocate some extra space during the 'memory locked' state. This is an internal error which is usually mostly harmless, but under high memory pressure lvm can fail to get this extra memory and deadlock the system (in the case where it has suspended some root/swap devices). The check basically sums locked and unlocked memory, which is not the best we could do but seems sufficient; the numbers should be equal. With the stack-guard fix patch and its backport to older kernels there was a missing patch (see https://bugzilla.redhat.com/show_bug.cgi?id=643500). I've posted this on the lkml list, and I'm unsure whether the backporting was properly done. Anyway, the new version of the LVM code (>= 2.02.75) reads this maps file in one read operation and only after that does the memory locking: http://www.redhat.com/archives/lvm-devel/2010-September/msg00091.html (With the stackguard patch the maps list is modified during the read.) Zdenek