As first reported in http://lists.debian.org/debian-ia64/2012/01/msg00016.html, I'm getting stability issues with kernels newer than 2.6.38. I've bisected the problem to commit 37a9d912b24f96a0591773e6e6c3642991ae5a70 (futex: Sanitize cmpxchg_futex_value_locked API).

Here are some issues that can be easily observed:

- In an X session (tested under GNOME Classic as well as TWM), hitting the Tab key while in a terminal window instantly triggers an X restart. X hasn't crashed, as I wrongly assumed initially. From the logs, it is definitely shut down properly and restarted.
- Still in an X session, clicking on the "Edit" menu or the "Back" button of Firefox/Iceweasel triggers a crash of Firefox/Iceweasel. For this scenario, I have some kind of gdb stack trace in PulseAudio, taken before gdb itself goes wrong (more on this later; core file available).

My first investigations led me to wrongly assume that these issues were related to something bad in PulseAudio, as uninstalling PulseAudio fixed both of them (the Tab key in a terminal issue and the Firefox/Iceweasel crash). However, other crashes make me believe that PulseAudio was only a symptom of something more general that has been broken since the bisected commit.

Most notably, gdb can't be started at all. Every attempt to debug a program immediately ends with a SIGTRAP signal. Quite problematic for debugging further...

It's noteworthy that the exact same system doesn't exhibit these issues when rebooted with kernel 2.6.38 (Debian linux-image-2.6.38-2-mckinley if needed).
While also testing GL rendering on my system, trying to run the ioQuake3-based Quake 3 demo gives at startup:

    ------ Initializing Sound ------
    Assertion 'pthread_mutex_unlock(&m->mutex) == 0' failed at pulsecore/mutex-posix.c:106, function pa_mutex_unlock(). Aborting.
    ----- Client Shutdown (Received signal 6) -----
    RE_Shutdown( 1 )

So, still triggered in PulseAudio, but definitely mutex-related...
Forwarded to the linux-ia64 list, though I chose a poor subject line. http://thread.gmane.org/gmane.linux.kernel/1111752/focus=22096
Created attachment 72915: proposed patch

I think I found the problem. GCC reorders code because it does not know that the ia64 fault handler may change the value of register r8 to -EFAULT.
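For context, the bisected commit changed cmpxchg_futex_value_locked so that it returns an error code (0 or -EFAULT) and passes the value found at the user address back through a pointer. The following is only a userspace sketch of that contract, not kernel code: the name `model_cmpxchg_futex_value`, the NULL check standing in for a faulting user address, and the use of the GCC/Clang `__atomic` builtin are all assumptions made for illustration. On ia64 the returned status is the value that lives in r8, which is why a stale or reordered read of r8 would hide the -EFAULT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model, not the kernel implementation.
 * Returns 0 on success or -EFAULT on a (simulated) fault; the value
 * found at *uaddr is passed back through *uval.
 */
static int model_cmpxchg_futex_value(uint32_t *uval, uint32_t *uaddr,
                                     uint32_t oldval, uint32_t newval)
{
    uint32_t found = oldval;

    if (uaddr == NULL)          /* stand-in for a faulting user address */
        return -EFAULT;

    /* On mismatch the builtin writes the current value into `found`. */
    __atomic_compare_exchange_n(uaddr, &found, newval, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    *uval = found;
    return 0;
}

/* Small usage check: a matching compare swaps and reports the old value. */
static void model_demo(void)
{
    uint32_t word = 5, seen = 0;

    assert(model_cmpxchg_futex_value(&seen, &word, 5, 7) == 0);
    assert(seen == 5 && word == 7);
}
```

The caller has to check the return value before trusting `*uval`; that is the part that goes wrong if the compiler believes the status can never change.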
Hello,

I just rebuilt the kernel with the patch proposed in attachment #72915. Issue fixed :-)

Many thanks,
Emeric

PS: gdb is still returning early with SIGTRAP when debugging Iceweasel (I didn't have time to try other programs). However, all the other reported issues (futex test suite, Tab keystroke in a terminal window, Iceweasel's buttons and menus, ioQuake3-based Quake 3 demo) now work fine. So it may be a problem with gdb itself.
Fix (slightly modified from patch attached here because Linus pointed out that we should tell GCC that the __asm__ code modifies r8) is now upstream: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=c76f39bddb84f93f70a5520d9253ec0317bec216
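As a rough illustration of what "telling GCC that the asm modifies a register" means in GNU C, here is a minimal sketch. It is not the upstream patch: the function name `status_after_asm` is hypothetical, and the asm body is deliberately empty so that it builds on any GCC or Clang target. The point is only the `"+r"` read-write operand, which makes the compiler assume the variable may have a different value after the asm statement, so it cannot fold the return value to the initial constant or move the read ahead of the asm.

```c
#include <assert.h>

/*
 * Hypothetical sketch: returns whatever is in `err` after the asm.
 * In real ia64 code the asm body would be the cmpxchg sequence and the
 * variable would be tied to r8; here the body is empty on purpose.
 */
static int status_after_asm(int initial)
{
    int err = initial;

    /* "+r": err is both read and written by the asm, as far as the
     * compiler knows. "memory": the asm may also touch memory. */
    __asm__ __volatile__("" : "+r"(err) : : "memory");

    return err;     /* must be re-read after the asm statement */
}

/* Small usage check: with an empty body the value passes through. */
static void status_demo(void)
{
    assert(status_after_asm(0) == 0);
}
```

Without the output operand (or an equivalent clobber), the compiler is free to treat `err` as still holding its initial value, which matches the symptom described in this report.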
New patch version works flawlessly. Thanks, Emeric