Latest working kernel version: 2.6.25
Earliest failing kernel version: 2.6.26-rc3
Distribution: Gentoo
Hardware Environment:
Software Environment:
Problem Description: Just after the kernel update (which made my ath5k wireless card work), all the programs mentioned here stopped working. Maybe I should try 2.6.26-rc2.
Steps to reproduce: Run openoffice, vlc, mplayer or kaffeine (watch a YouTube video with the players).

Here is the end of the strace of vlc:

open("/usr/lib/libavutil.so.49", O_RDONLY) = 6
read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p\24\0\0004\0\0\0"..., 512) = 512
fstat64(6, {st_mode=S_IFREG|0755, st_size=45958, ...}) = 0
mmap2(NULL, 41680, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 6, 0) = 0xb6933000
mmap2(0xb693a000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 6, 0x6) = 0xb693a000
mmap2(0xb693c000, 4816, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb693c000
close(6) = 0
open("/usr/lib/libvorbis.so.0", O_RDONLY) = 6
read(6, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0P*\0\0004\0\0\0"..., 512) = 512
fstat64(6, {st_mode=S_IFREG|0755, st_size=164132, ...}) = 0
mmap2(NULL, 162996, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 6, 0) = 0xb690b000
mmap2(0xb6925000, 57344, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 6, 0x1a) = 0xb6925000
close(6) = 0
mprotect(0xb693a000, 4096, PROT_READ) = 0
mprotect(0xb6a73000, 4096, PROT_READ) = 0
mprotect(0xb6a85000, 4096, PROT_READ) = 0
mprotect(0xb6a9a000, 626688, PROT_READ|PROT_WRITE) = 0
mprotect(0xb6a9a000, 626688, PROT_READ|PROT_EXEC) = 0
mprotect(0xb6b33000, 4096, PROT_READ) = 0
mprotect(0xb6c42000, 4096, PROT_READ) = 0
mprotect(0xb6c7e000, 4096, PROT_READ) = 0
mprotect(0xb6c95000, 3399680, PROT_READ|PROT_WRITE) = 0
mprotect(0xb6c95000, 3399680, PROT_READ|PROT_EXEC) = 0
mprotect(0xb6fd3000, 28672, PROT_READ) = 0
mprotect(0xb7147000, 8192, PROT_READ) = 0
mprotect(0xb728d000, 32768, PROT_READ|PROT_WRITE) = 0
mprotect(0xb728d000, 32768, PROT_READ|PROT_EXEC) = 0
mprotect(0xb7295000, 4096, PROT_READ) = 0
mprotect(0xb72a8000, 4096, PROT_READ) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++
Created attachment 16270
my .config of the kernel

Note that some security features added in 2.6.25 are enabled; maybe that could be the cause of the problem.
I have glibc-2.6.1.
Created attachment 16271
009-allow-ap-vlan-modes.patch

Ah yes, I forgot: I have applied a patch. I attached it here.
Are there any interesting messages in the kernel logs? /bin/dmesg?
Unfortunately not. I can attach my whole dmesg if you want. Should I recompile my kernel with some debug options?
Created attachment 16272
2.6.26-rc3-dmesg

As there are memory-related messages at the beginning, I attached the whole dmesg.
2.6.26-rc2 also has the same bug.
Oops, I booted the wrong kernel. I don't know why.
2.6.26-rc2 is fine, so the bug was introduced between rc2 and rc3.
Does an unpatched 2.6.26-rc3 kernel show the same problem? If yes, does enabling CONFIG_X86_PAT help?
The unpatched kernel does the same thing.
Enabling CONFIG_X86_PAT doesn't change anything.
Oops, it was the trace of mplayer, not vlc.
This time I have a message in dmesg:

vlc[6546]: segfault at b02299dc ip b02299dc sp b2153bec error 15 in libxvidcore.so.4.1[b0220000+99000]
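The "error 15" field in that dmesg line can be decoded by hand. The sketch below is only an illustration: it assumes the kernel prints the page-fault error code in hexadecimal (so the value is 0x15) and uses the architectural x86 error-code bit layout. The helper name `decode_pf_error` is made up for this example.

```python
# x86 page-fault error code bits, as (bit, meaning if set, meaning if clear).
PF_BITS = [
    (0, "protection violation (page was present)", "page not present"),
    (1, "write access", "not a write"),
    (2, "user mode", "kernel mode"),
    (3, "reserved bit set in a page-table entry", None),
    (4, "instruction fetch", None),
]

def decode_pf_error(code):
    """Return human-readable descriptions for a page-fault error code."""
    out = []
    for bit, if_set, if_clear in PF_BITS:
        if code & (1 << bit):
            out.append(if_set)
        elif if_clear is not None:
            out.append(if_clear)
    return out

# Assumption: "error 15" in the segfault line is hexadecimal, i.e. 0x15.
decoded = decode_pf_error(0x15)
for line in decoded:
    print(line)
```

Read this way, the fault is a user-mode instruction fetch from a page that is present but not executable, which fits the fault address and the instruction pointer being the same value (b02299dc).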
Well I don't know what could have caused this and afaik nobody else is hitting it. So I'm afraid I'll have to ask if you are able to perform the dreaded git bisection search. It will take a couple of hours. http://www.kernel.org/doc/local/git-quick.html has the instructions. Thanks.
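To give a feel for why a bisection finishes in a bounded number of build-and-boot cycles, here is a minimal sketch of the underlying binary search. It is not how git itself is implemented (real history is not a simple list); `commits` and `is_bad` are hypothetical stand-ins for the commit range and for "build, boot, run vlc, see whether it segfaults".

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (first_bad_commit, tests_run).
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad], steps

# Hypothetical history of 1000 commits where number 613 introduced the bug.
history = list(range(1000))
culprit, steps = bisect_first_bad(history, lambda c: c >= 613)
print(culprit, steps)
```

Each test halves the remaining range, so even a range of about a thousand commits needs only around ten kernel builds.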
What hardware is your computer (especially which CPU)?
Created attachment 16276
Can you try whether _reverting_ this patch fixes it?
(In reply to comment #16)
> What hardware is your computer (especially which CPU)?

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 13
model name      : Intel(R) Pentium(R) M processor 2.00GHz
stepping        : 8
cpu MHz         : 2000.000
cache size      : 2048 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx bts est tm2
bogomips        : 3992.51
clflush size    : 64
My computer is a Sony Vaio laptop (VGN-BX297XP).
tremulous and nexuiz also segfault. I'll revert the patch. Does anyone need more straces? With vlc they are different.
(In reply to comment #17)
> Created an attachment (id=16276)
> Can you try whether _reverting_ this patch fixes it?

With this patch reverted and CONFIG_X86_PAT enabled, it works again as expected (the bug disappeared).
Thanks for testing!

Caused by:

commit 1c12c4cf9411eb130b245fa8d0fbbaf989477c7b
Author: Venki Pallipadi <venkatesh.pallipadi@intel.com>
Date:   Wed May 14 16:05:51 2008 -0700

    mprotect: prevent alteration of the PAT bits

    There is a defect in mprotect, which lets the user change the page cache
    type bits by-passing the kernel reserve_memtype and free_memtype
    wrappers. Fix the problem by not letting mprotect change the PAT bits.

    Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
    Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>
    Signed-off-by: Hugh Dickins <hugh@veritas.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Does reverting the patch and disabling CONFIG_X86_PAT again also fix the problem?
(In reply to comment #23)
> Does reverting the patch and disabling CONFIG_X86_PAT again also fix the
> problem?

Yes, it also fixes the problem.
I'm fairly sure you'll find these segfaults were fixed in 2.6.26-rc3-git2 and so in current git: please retest with a recent kernel when you can. The PAT pte_modify() changes fell foul of a misdefined PTE_MASK when NX is used: see http://lkml.org/lkml/2008/5/19/446 for a link into the mailthread; but Linus rightly put in Jeremy Fitzhardinge's proper PTE_MASK fixes instead of the two-line hack in that mail.
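A toy model may help show how a misdefined mask can make NX "sticky" across mprotect. This is a sketch under stated assumptions, not the kernel's code: it assumes the misdefinition amounted to a 32-bit mask being sign-extended to 64 bits (so that it also covered bit 63, the NX bit, in a PAE pte), and the names `modify`, `bad_mask` and `good_mask`, as well as the exact mask values, are hypothetical.

```python
NX_BIT = 1 << 63            # no-execute bit in a 64-bit PAE pte
PRESENT = 1 << 0
U64 = (1 << 64) - 1

def sign_extend_32_to_64(value):
    """Widen a 32-bit value to 64 bits the way a signed type would."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value

# Assumption: the frame mask started life as a 32-bit signed quantity.
bad_mask = sign_extend_32_to_64(0xFFFFF000)   # 0xFFFFFFFFFFFFF000, includes bit 63
good_mask = 0x0000FFFFFFFFF000                # hypothetical: frame bits only

def modify(pte, new_prot, keep_mask):
    """Keep the bits in keep_mask from the old pte, take the rest from new_prot."""
    return (pte & keep_mask) | (new_prot & ~keep_mask & U64)

# A page that was mapped non-executable, then mprotect()ed to be executable.
old_pte = NX_BIT | 0x12345000 | PRESENT
new_prot = PRESENT                            # NX clear: execution allowed

print(hex(modify(old_pte, new_prot, bad_mask)))   # NX survives the change
print(hex(modify(old_pte, new_prot, good_mask)))  # NX is cleared as requested
```

Under that assumption, a page that was ever mapped without PROT_EXEC stays non-executable after a later mprotect to PROT_READ|PROT_EXEC, which would match the PROT_READ|PROT_WRITE followed by PROT_READ|PROT_EXEC pairs in the strace and the instruction-fetch fault reported in dmesg.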
Short summary of the discussion in this bug: Regression introduced between 2.6.26-rc2 and 2.6.26-rc3. Caused by commit 1c12c4cf9411eb130b245fa8d0fbbaf989477c7b (mprotect: prevent alteration of the PAT bits). It does *not* matter whether CONFIG_X86_PAT is enabled or disabled in the kernel - according to the submitter it breaks in both cases and reverting the commit fixes it in both cases.
(In reply to comment #25)
> I'm fairly sure you'll find these segfaults were fixed in 2.6.26-rc3-git2
> and so in current git: please retest with a recent kernel when you can.
>
> The PAT pte_modify() changes fell foul of a misdefined PTE_MASK when NX
> is used: see http://lkml.org/lkml/2008/5/19/446 for a link into the
> mailthread; but Linus rightly put in Jeremy Fitzhardinge's proper
> PTE_MASK fixes instead of the two-line hack in that mail.

Your comment arrived after I had started writing mine. Just to avoid misunderstandings: my comment was not meant as a reply to yours.
This is the known PAE bug. It's fixed in current -git by commit 2bd3a99c9d1851182f73d0a024dc5bdb0a470e8c ("x86: define PTE_MASK in a universally useful way") and will be in -rc4 which is planned for later today.
As described in the bug it should already be fixed. Please reopen this bug if it's still present with kernel 2.6.26-rc4.