Summary: 2.6.27-rc-7: BUG: scheduling while atomic, c1e_idle+0x98/0xe0
Product: Platform Specific/Hardware
Reporter: Rafael J. Wysocki (rjw)
Attachments: screenshot after crash; shot of the MCE occurring
Description Rafael J. Wysocki 2008-10-04 11:33:48 UTC
Subject : 2.6.27-rc-7: BUG: scheduling while atomic: swapper/0/0x00000102
Submitter : Prakash Punnoor <firstname.lastname@example.org>
Date : 2008-09-28 17:45
References : http://marc.info/?l=linux-kernel&m=122262403415629&w=4
Handled-By : Thomas Gleixner <email@example.com>

This entry is being used for tracking a regression from 2.6.26. Please don't close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2008-10-24 14:09:41 UTC
Thomas, I think this is fixed now, correct?
Comment 2 Prakash Punnoor 2008-10-26 02:07:35 UTC
At least I didn't have the problem again (now being on 126.96.36.199). But I also don't know how to properly reproduce it.
Comment 3 Marcel Partap 2008-11-19 03:17:17 UTC
Created attachment 18934: screenshot after crash
Comment 5 Marcel Partap 2008-11-19 03:20:34 UTC
My system has been going down very regularly because of this for the past week or two. That coincided with inserting an ATI PCIE card into my system, but as I am getting crashes with both the ATI proprietary driver and the radeonhd driver, I don't know if that is relevant to this. So far I haven't been successful in getting my kdump kernel to boot after the freeze, for whatever reason, so I am voting for reopening this bug and attaching two JPEG stackdumps ;)
Comment 6 Marcel Partap 2008-11-19 11:24:16 UTC
..or should I rather open a new bug for a newer version?
Comment 7 Thomas Gleixner 2008-11-19 14:01:43 UTC
One of the screenshots shows a machine check exception, which means that there is something seriously wrong with your system. The second screenshot also has the machine check taint flag set ("M"). That's a hardware problem.

Does it go away when you replace the PCIE card you added recently?

Any chance to capture the printks via a serial console?

Thanks,
tglx
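For readers who want to check the "M" taint flag mentioned above on a running system, here is a minimal Python sketch. It assumes the usual Linux taint bitmask in which bit 4 marks a machine check exception; the function names are hypothetical and only illustrate the idea.

```python
from pathlib import Path

# Assumption: bit 4 of the kernel taint mask is the machine check flag,
# which oops output prints as the letter "M".
TAINT_MACHINE_CHECK_BIT = 4


def has_machine_check_taint(taint_value: int) -> bool:
    """Return True if the taint mask has the machine check bit set."""
    return bool(taint_value & (1 << TAINT_MACHINE_CHECK_BIT))


def read_taint(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the current taint mask from procfs (Linux only)."""
    return int(Path(path).read_text().strip())
```

On a Linux machine, `has_machine_check_taint(read_taint())` would report whether a machine check has been seen since boot.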
Comment 8 Marcel Partap 2008-11-29 10:00:15 UTC
Created attachment 19073: shot of the MCE occurring

Hi Thomas,

sorry for the delay. I have not yet got a null-modem cable to set up a serial console, nor have I taken that PCIE card out. The strange thing is that right in between those incidents my machine was running fine for almost a week (?), and now it is back to regular crashes; I just watched another one occur. The attached screenshot shows the syslog on tty12, and feeding that line into mcelog --ascii manually gives:

    # echo "CPU 0: Machine Check Exception: 4 Bank 4: b200000000070f0f" |mcelog --ascii
    WARNING: with --dmi mcelog --ascii must run on the same machine with the same BIOS/memory configuration as where the machine check occurred.
    HARDWARE ERROR. This is *NOT* a software problem!
    Please contact your hardware vendor
    CPU 0 4 northbridge
      Northbridge Watchdog error
           bit57 = processor context corrupt
           bit61 = error uncorrected
      bus error 'generic participation, request timed out
          generic error mem transaction
          generic access, level generic'
    STATUS b200000000070f0f MCGSTATUS 4

Can you read anything into this?
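The "bit57" and "bit61" lines in the mcelog output can be cross-checked by hand against the STATUS value. The following Python sketch assumes the architectural x86 layout of the upper bits of a machine check bank status register (VAL, OVER, UC, EN, MISCV, ADDRV, PCC in bits 63 down to 57). It does not reproduce mcelog's vendor-specific decoding of the northbridge or bus error fields, and the function name is hypothetical.

```python
# Assumed architectural flag bits of an x86 MCi_STATUS register.
STATUS_FLAGS = [
    (63, "VAL"),    # register contents are valid
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # error uncorrected
    (60, "EN"),     # error reporting enabled
    (59, "MISCV"),  # MISC register valid
    (58, "ADDRV"),  # ADDR register valid
    (57, "PCC"),    # processor context corrupt
]


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit status word into set flags and the low 16-bit error code."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    return {"flags": flags, "mca_error_code": status & 0xFFFF}


result = decode_mci_status(0xB200000000070F0F)
# result["flags"] lists UC (bit 61) and PCC (bit 57), matching the mcelog lines.
```

With UC and PCC both set, the error was not corrected and the processor state cannot be trusted, which is consistent with the hard crashes described in this report.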
Comment 9 Marcel Partap 2008-12-02 19:29:22 UTC
How stupid.. after making sure memtest showed no problems, I removed the card, and now with the onboard nforce GPU the crashes are gone, but so is my X performance (shared memory fake VRAM).. Is this a known problem with nforce4 chipsets in combination with ATI PCIE cards?
Comment 10 Thomas Gleixner 2008-12-12 00:03:40 UTC
> How stupid.. after making sure memtest showed up no problems i
> removed the card and now with the onboard nforce GPU the crashes are
> gone, but also my X performance (shared memory fake VRAM).. is this
> a know problem with nforce4 chipsets in combination with ATI PCIE
> cards?

Not that I know. I have no idea what might cause the massive corruption on your board. Did you check for BIOS updates for the board already?

Thanks,
tglx