Bug 13933

Summary: System lockup on dual Pentium-3 with kernel 2.6.30
Product: Other Reporter: Martin Rogge (marogge)
Component: Other Assignee: Linus Torvalds (torvalds)
Status: CLOSED CODE_FIX    
Severity: blocking CC: beauwinters, bugs-a21, devzero, enouf4u, hilld, iordanov, mingo, nemesis, rjw, rogerx.oss, rusty, thomas.bjornell, torvalds, tpfaff, wylda
Priority: P1    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.30.4 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 13070    
Attachments: uname -a
lsmod
.config
syslog
lspci -vv
dmesg
Opps after first bisection
git bisecting - Call for HELP
final git bisect log
One of the rare trace in this saga

Description Martin Rogge 2009-08-08 13:16:26 UTC
Created attachment 22637 [details]
uname -a

After upgrade to 2.6.30 my machine locks up cold within a number of hours (TTL ranges from minutes to days). The mainboard is an MSI-9105 with dual P-3 1400Mhz.

On the LKML similar cases have been reported for dual P2s and dual P3s. 

Further system details have been attached.
Comment 1 Martin Rogge 2009-08-08 13:17:01 UTC
Created attachment 22638 [details]
lsmod
Comment 2 Martin Rogge 2009-08-08 13:24:33 UTC
gosh, what does a man have to do to attach more than one file around here? I'll give it one more try.
Comment 3 Martin Rogge 2009-08-08 13:37:55 UTC
Sorry, bugzilla won't let me attach any more info like .config, nor can I change the attachments already posted. Anyway, it's on the LKML, and you have my email.
Comment 4 Martin Rogge 2009-08-08 21:12:27 UTC
Created attachment 22641 [details]
.config

after change of IP I managed to attach the .config... thanks, bugzilla!
Comment 5 Martin Rogge 2009-08-09 10:27:37 UTC
Created attachment 22642 [details]
syslog

A lockup occurred between the time stamps 20:20:43 and 20:24:26.
Comment 6 Martin Rogge 2009-08-09 15:00:30 UTC
Created attachment 22647 [details]
lspci -vv

bugzilla only lets me attach one file every 12 hours or so. But I am persistent. ;-)
Comment 7 Roland Kletzing 2009-08-09 18:09:01 UTC
here is a similar report from Osipov Stanislav:
http://marc.info/?l=linux-kernel&m=124928938311561&w=2

here the report from martin just for reference:
http://marc.info/?l=linux-kernel&m=124931667320530&w=2
Comment 8 Roland Kletzing 2009-08-09 18:39:01 UTC
here is the report from  Frank de Jong:
http://marc.info/?l=linux-kernel&m=124967492815396&w=2

and here from John Stoffel:
http://marc.info/?l=linux-kernel&m=124967565617179&w=2
Comment 9 Roland Kletzing 2009-08-09 18:41:24 UTC
most likely a regression, so please someone with appropriate permission mark this as a regression!
Comment 10 Martin Rogge 2009-08-09 20:17:59 UTC
Created attachment 22652 [details]
dmesg
Comment 11 Roland Kletzing 2009-08-10 18:40:15 UTC
we probably have a duplicate bug report at http://bugzilla.kernel.org/show_bug.cgi?id=13219 from David Hill
Comment 12 Martin Rogge 2009-08-10 20:10:56 UTC
I've just had a lockup after an uptime of 3 days plus.
Comment 13 Rafael J. Wysocki 2009-08-10 20:50:45 UTC
*** Bug 13945 has been marked as a duplicate of this bug. ***
Comment 14 David Hill 2009-08-13 12:07:39 UTC
This is weird though.   I never had more than 4 hours uptime since the bug is present... 

Is that server under heavy use?  Can you simulate heavy disk read/write for a while?   Or heavy CPU usage?  Memory ? etc ?
Comment 15 Martin Rogge 2009-08-13 15:01:04 UTC
(In reply to comment #14)
> This is weird though.   I never had more than 4 hours uptime since the bug is
> present... 

I've had anything from 2 minutes to 3 days.

> 
> Is that server under heavy use?  Can you simulate heavy disk read/write for a
> while?   Or heavy CPU usage?  Memory ? etc ?

the machine is mostly used as a workstation. Originally I had the feeling the lockups coincided with screen updates, but there is no conclusive evidence. I remember, on one occasion the machine was idle when it happened. I can try and simulate other workloads, but don't wait for a meaningful result.
Comment 16 Ognjan Iordanov 2009-08-15 04:34:48 UTC
I have the same problem with kernel 2.6.30.4. The motherboard is Tyan S1834/Tiger 133 with P3 1000Mhz processors.
Comment 17 Roland Kletzing 2009-08-15 16:49:19 UTC
maybe another dupe: http://bugzilla.kernel.org/show_bug.cgi?id=13982 (sorry, i don't have permission to mark as duplicate, so just posting the link)
Comment 18 Rafael J. Wysocki 2009-08-15 20:33:44 UTC
*** Bug 13982 has been marked as a duplicate of this bug. ***
Comment 19 Wylda 2009-08-15 23:57:28 UTC
Rafael, you are right, my bug report #13982 is the same problem. In my tests, the following kernels work perfectly:

 * 2.6.26.8
 * 2.6.27.29
 * 2.6.28.10
 * 2.6.29.6

Following freezes:
 * 2.6.30.4
 * 2.6.30.5rc2

So i decided to bisect, following this nice howto:
http://wiki.winehq.org/RegressionTesting

git bisect start
git bisect good v2.6.29
git bisect bad v2.6.30

But when i compile such a kernel, cat /proc/version shows 2.6.29, which is not correct i guess. After a few rounds it showed me v2.6.29-rc4, which is even before "good v2.6.29" (it should be something like v2.6.30-rc4, shouldn't it??)
Comment 20 Wylda 2009-08-16 02:00:58 UTC
Oldschool bisecting (without git) - following kernel seems OK:

 * 2.6.30-rc4


It would help me if someone could advise:

1. What tag should i use now for bisect good and bad?

2. If i have local clone of Git repository, how can i set the source code for example to version 2.6.30-rc6 (so i will not have to download it separately again)?

3. Is it possible to export from local clone of Git repository for example linux-2.6.30-rc6.tar.bz2 (so i will get exactly the same file like on kernel.org)?
Comment 21 Rafael J. Wysocki 2009-08-16 10:08:51 UTC
(In reply to comment #20)
> Oldschool bisecting (without git) - following kernel seems OK:
> 
>  * 2.6.30-rc4
> 
> 
> It would help me, if someone could advice:
> 
> 1. What tag should i use now for bisect good and bad?

Your procedure in comment #19 is correct and the fact that you got v2.6.29-rc4 in the process only reflects the history of development.  Apparently, you got a bisection point in a branch that was originally based on 2.6.29-rc4 and then merged into 2.6.30.  So, you should do

git bisect start
git bisect good v2.6.30-rc4
git bisect bad v2.6.30

and do not care too much for the versions you get in the middle of bisection (that can be anything from 2.6.29 upwards).

> 2. If i have local clone of Git repository, how can i set the source code for
> example to version 2.6.30-rc6 (so i will not have to download it separately
> again)?

git checkout v2.6.30-rc6

It will complain that you don't have a local branch for this kernel, but that's fine.

> 3. Is it possible to export from local clone of Git repository for example
> linux-2.6.30-rc6.tar.bz2 (so i will get exactly the same file like on
> kernel.org)?

After checking out a particular tag, you should get exactly the same tree as from the corresponding tarball.
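
For question 3, `git archive` can write a tagged tree from the local clone as a tarball without touching the working copy. The following is only a minimal sketch: the helper name export_tag and its arguments are made up for illustration, and while the archive should contain the same tree as the kernel.org tarball, it is not guaranteed to be byte-for-byte the same file.

```shell
#!/bin/sh
# export_tag is a hypothetical helper, not part of git.
# Usage: export_tag <repo-dir> <tag> <output.tar>
export_tag() {
    repo=$1
    tag=$2
    out=$3
    # Strip the leading "v" so that v2.6.30-rc6 unpacks into linux-2.6.30-rc6/
    prefix="linux-${tag#v}/"
    # Write the tree of <tag> as an uncompressed tar archive.
    git -C "$repo" archive --format=tar --prefix="$prefix" "$tag" > "$out"
}
```

Running bzip2 on the resulting file would then give a .tar.bz2 comparable to the one on kernel.org.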
Comment 22 Rafael J. Wysocki 2009-08-16 10:10:13 UTC
One mistake, I should have said "that can be anything from 2.6.28 upwards".
Comment 23 Wylda 2009-08-16 13:06:12 UTC
Going to semifinal :)

Following freezes:

 * 2.6.30-rc5

What makes me a little nervous is that this time it took nearly 50 mins, i.e. it looks like it is much harder to trigger. So hopefully bisecting shows something. Right now bisecting v2.6.30-rc4/v2.6.30-rc5 (good/bad).


Rafael, two more questions (if you find some time):

a) if i do "git checkout v2.6.30-rc6", then how i set it back to the latest clone version (something like git checkout latest...)?

b) how to update local git clone to Linus's latest version (to be up to date)?

I will never be a git specialist, but these are probably the last, that will be enough for me to say i can handle it ;)
Comment 24 Wylda 2009-08-16 14:57:19 UTC
Created attachment 22747 [details]
Opps after first bisection

I don't know if this is important, so for the record... After the first bisection v2.6.30-rc4/v2.6.30-rc5 there was an Oops, but everything worked as usual (OK, one FTP transfer died;) Oops attached.

Anyway i did a reboot to be sure there is no influence caused by this Oops. The kernel froze in a few mins - giving "git bisect bad" and going on.


Let me know if there is no need to waste energy on Oopses in this case (i.e. they don't bring any light into this bug).
Comment 25 Ryan Underwood 2009-08-16 15:53:05 UTC
I also have a dual PPro system with NO problems so far with exactly the same kernel that is giving the dual PIII system fits.

The only high level difference I can tell is that the PPro system does not have ACPI while the PIII does.  Also, a different RAID card (3ware on the PPro vs aacraid on the PIII)
Comment 26 Ryan Underwood 2009-08-16 15:55:31 UTC
I forgot to mention, due to this difference I booted with acpi=off on the PIII system with no effect on the bug; it crashed just the same.
Comment 27 David Hill 2009-08-16 18:04:54 UTC
Which NIC?  Is it a Realtek?

David Hill

Comment 28 Ryan Underwood 2009-08-16 18:20:29 UTC
e100 in both working and non-working systems
Comment 29 Wylda 2009-08-16 18:30:31 UTC
David, i have a Realtek in my machine, but i dont use it. Even driver is not in kernel nor as module (ie. kernel does not see it) and still freezes :-/

My currently used config is as minimal as possible:
 * No modules
 * No power management, no ACPI, no Frequency scaling
 * No AGP, no ISA, ...
 * No wireless, IPv6, etc.
 * Minimal SCSI support
 * No I2C, GPIO, HW monitoring, multimedia, DVB, Sound, FB
 * No USB, HID, LED
 * (filesystem) Nothing except Ext3
 * No security options, Cryptography, Virtualization, Library routines

...and even though, it freezes. Just few regression left, but all bisections were marked as bad till now. So i have a feeling, that 2.6.30-rc4 is also bad. But stay tuned, still working on :-/
Comment 30 David Hill 2009-08-16 18:47:36 UTC
I'm bisecting from 2.6.29 to 2.6.30...  Bug is between those...

I happen to also have a tyan dual p3 system!!!
Comment 31 Ryan Underwood 2009-08-16 20:11:53 UTC
Anyone compiled working and non-working kernels with the same compiler?  Maybe it's a subtle toolchain bug.
Comment 32 Wylda 2009-08-16 20:21:17 UTC
David, i also did a bisection between 2.6.29 and 2.6.30, but i had to use "git bisect skip" because of broken compilation (sata_sil - which is a must have), and when i then saw versions like 2.6.29-rc4, rc1 and commit dates somewhere from January 2009 i thought i f***ed git ("_the_greatest_tool_ever_made_") up and did a reset after 14! rounds :-/ Based on comment #21 i know i should not have done that, but you know ;-) It's my first time with git - not perfect ;)

Following freezes:

 * 2.6.30-rc4 (unfortunately)
 * 2.6.30-rc2
 * 2.6.30-rc1

So to be sure that i am still chasing a bug and not ghosts, i did a careful test of Debian's lenny kernel 2.6.26-17lenny1 -> works perfectly. So bisecting again, but this time between 2.6.30-rc1 and 2.6.29.



Ryan, see comment #19.

following kernels work perfectly:

 * 2.6.26.8
 * 2.6.27.29
 * 2.6.28.10
 * 2.6.29.6

all were compiled with the same gcc and other tools.
Comment 33 Wylda 2009-08-17 07:04:04 UTC
Created attachment 22750 [details]
git bisecting - Call for HELP


*** Call for HELP - Gitmaster wanted ***

After some good/bad, i'm not able to overcome the sata_sil build failure with "git bisect skip". Even if i remove sata_sil from the kernel, i immediately get a kernel panic during the boot process (not related to the missing SATA Sil3114 driver - i have the system on PATA).

If i count correctly 36x skip * 7min = more than 4 hours of wasted time.

Now i take this bisecting and building process to an 8xCPU machine, but maybe it's possible that i got with git to a point where i'm not able to continue. I have time to bisect until Tuesday evening (UTC). On Wednesday the server goes to production and i will lose the opportunity to test and bisect. I read about "Reverse Regression Testing" (http://wiki.winehq.org/ReverseRegressionTesting), but that's too much for me.

*** Call for HELP - Gitmaster wanted ***


(from the log):

git-bisect start
git-bisect good 2.6.29
git-bisect bad 2.6.30-rc1
git-bisect bad 577c9c456f0e1371cbade38eaf91ae8e8a308555 
git-bisect bad 5658ae9007490c18853fbf112f1b3516f5949e62 
git-bisect good 08abe18af1f78ee80c3c3a5ac47c3e0ae0beadf6 
git-bisect bad 6e15cf04860074ad032e88c306bea656bbdd0f22 
git-bisect bad 7c178a26d3e753d2a4346d3e4b8aa549d387f698 
git-bisect skip e2c75d9f54334646b3dcdf1fea0d1afe7bfbf644 
git-bisect skip e0c7ae376a13fd79a4dad8becab51040d13dfa90
git-bisect skip 3769e7b4d8ef113e08221a210f849ba57475aff5
git-bisect skip 6a48565ed6ac76f351def25cd5e9f181331065f6
git-bisect skip 9c39801763ed6e08ea8bc694c5ab936643a2b763
git-bisect skip fbeb2ca0224182033f196cf8f63989c3e6b90aba
git-bisect skip 4272ebfbefd0db40073f3ee5990bceaf2894f08b
git-bisect skip f67ae5c9e52e385492b94c14376e322004701555
git-bisect skip 36ef4944ee8118491631e317e406f9bd15e20e97
git-bisect skip 9e111f3e167a14dd6252cff14fc7dd2ba4c650c6
git-bisect skip 06ac8346af04f6a972072f6c5780ba734832ad13
git-bisect skip 1ff2f20de354a621ef4b56b9cfe6f9139a7e493b
git-bisect skip 1ec2dafd937c0f6fed46cbd8f6878f2c1db4a623
git-bisect skip 43f39890db2959b10891cf7bbf3f53fffc8ce3bd
git-bisect skip 1c61d8c309a4080980474de8c6689527be180782
git-bisect skip 26f7ef14a76b0e590a3797fd7b2f3cee868d9664
git-bisect skip 4b19ed915576e8034c3653b4b10b79bde10f69fa
git-bisect skip 6b64ee02da20d6c0d97115e0b1ab47f9fa2f0d8f
git-bisect skip 193c81b979adbc4a540bf89e75b9039fae75bf82
git-bisect skip e006235e5b9cfb785ecbc05551788e33f96ea0ce
git-bisect skip 7cd92366a593246650cc7d6198e2c7d3af8c1d8a
git-bisect skip d1de36f5b5a30b8f9dae7142516fb122ce1e0661
git-bisect skip 8f47e16348e8e25eedf639092a8a2f10a66aba34
git-bisect skip c3e6a2042fef33b747d2ae3961f5312af801973d
git-bisect skip 54523edd237b9e792a3b76988fde23a91d739f43
git-bisect skip 5da690d29f0de17cc1835dd3eb8f8bd0945521f0
git-bisect skip 647ad94fc0479e33958cb4d0e20e241c0bcf599c
git-bisect skip e084e531000a488d2d27864266c13ac824575a8b
git-bisect skip ed74ca6d5a3e57eb0969d4e14e46cf9f88d25d3f
git-bisect skip f154f47d5180c2012bf97999e6c600d45db8af2d
git-bisect skip 36619a8a80793a803588a17f772313d5c948357d
git-bisect skip 3e92ab3d7e2edef5dccd8b0db21528699c81d2c0
git-bisect skip 550fe4f198558c147c6b8273a709568222a1668a
git-bisect skip 9fc2e79d4f239c1c1dfdab7b10854c7588b39d9a
git-bisect skip c379698fdac7cb65c96dec549850ce606dd6ceba
git-bisect skip f095df0a0cb35a52605541f619d038339b90d7cc
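
The "36x skip * 7min" estimate can be checked directly against a saved bisect log. The sketch below is illustrative only: skip_cost is a made-up helper name, and the 7 minutes per build is the figure quoted in this comment, not something measured by the script.

```shell
#!/bin/sh
# skip_cost is a hypothetical helper for illustration only.
# Usage: skip_cost <bisect-log-file> <minutes-per-build>
skip_cost() {
    log=$1
    per_build=$2
    # Matches both the old "git-bisect skip" and the newer "git bisect skip".
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    skips=$(grep -c '^git[- ]bisect skip' "$log" || true)
    echo "$skips skips x $per_build min = $((skips * per_build)) min"
}
```

For the log above (36 skip lines) and 7 minutes per build this gives 252 minutes, i.e. a little over 4 hours.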
Comment 34 Wylda 2009-08-17 07:48:04 UTC
The problem with sata_sil:

drivers/ata/sata_sil.c: In function ‘sil_broken_system_poweroff’:
drivers/ata/sata_sil.c:713: error: implicit declaration of function ‘dmi_first_match’
drivers/ata/sata_sil.c:713: warning: initialization makes pointer from integer without a cast
make[2]: *** [drivers/ata/sata_sil.o] Error 1
make[1]: *** [drivers/ata] Error 2
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs....
Comment 35 Wylda 2009-08-17 08:39:04 UTC
Finally overcame the build failure (8x CPU Xeon - what a difference).

git bisect skip
Bisecting: 313 revisions left to test after this
[9d45cf9e36bf9bcf16df6e1cbf049807c8402823] Merge branch 'x86/urgent' into x86/apic

I'll go on with bisecting today after 18h (UTC).
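
To gauge what "313 revisions left" means in builds: each good/bad answer roughly halves the candidates, so the remaining work grows with log2 of that count. The sketch below is only a rough estimate under that assumption; bisect_steps is a made-up name, and skipped commits (as in comment #33) add builds on top.

```shell
#!/bin/sh
# bisect_steps is a hypothetical helper for illustration only.
# Usage: bisect_steps <revisions-left>
# Prints how many times the count can be halved before nothing is left.
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 0 ]; do
        n=$((n / 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}
```

For 313 revisions this prints 9, so roughly 8 or 9 more build-and-test rounds if nothing has to be skipped.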
Comment 36 Martin Rogge 2009-08-17 16:46:02 UTC
(In reply to comment #35)
> Finaly overcome build failure (8x CPU Xeon - what a difference).

Excellent. I can't wait to find out what is causing the problem.

Thanks for putting in all this work, Pavel. You are the chosen one because you can trigger the bug within minutes. ;-)
Comment 37 Wylda 2009-08-18 11:37:19 UTC
I hope, that "release early and release often" also goes for this kind of spam ;c)

So to keep you informed... I don't know who said that the best practice for git bisecting is to use two close releases as good/bad. In this case it's not true. Bisecting between 2.6.29/2.6.30-rc1 led me down a blind alley :-/ When i finally overcame the sata_sil.c build failure, i thought i had won... But that correctly built kernel did not boot (same kernel panic). After that i tried many git bisect skip, but got nothing but panics. So after a deep breath i did a reset and began again with 2.6.29/2.6.30.

I think i have 1, max 2, bisect turns ahead and i also have a good commit (after a lot of bad). So now i take my son for a walk and give that machine a hard try. I think that this evening i should know the bad commit. So Martin (possibly others) - could you test reverting that commit with me, to be sure we got the right one?

If i'm too optimistic, that would mean that statement "release early and release often" is horribly incorrect and is second wrong thing about linux (after the git statement) ;c)
Comment 38 Wylda 2009-08-18 18:26:28 UTC
Created attachment 22764 [details]
final git bisect log

Git bisect gave me the following commit. Does it make sense??

Now i have to find a way how to remove this patch from git's v2.6.30.5 and give it a try. Please test with me...



# git bisect good
4595f9620cda8a1e973588e743cf5f8436dd20c6 is first bad commit
commit 4595f9620cda8a1e973588e743cf5f8436dd20c6
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Sat Jan 10 21:58:09 2009 -0800

    x86: change flush_tlb_others to take a const struct cpumask

    Impact: reduce stack usage, use new cpumask API.

    This is made a little more tricky by uv_flush_tlb_others which
    actually alters its argument, for an IPI to be sent to the remaining
    cpus in the mask.

    I solve this by allocating a cpumask_var_t for this case and falling back
    to IPI should this fail.

    To eliminate temporaries in the caller, all flush_tlb_others implementations
    now do the this-cpu-elimination step themselves.

    Note also the curious "cpus_or(f->flush_cpumask, cpumask, f->flush_cpumask)"
    which has been there since pre-git and yet f->flush_cpumask is always zero
    at this point.

    Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    Signed-off-by: Mike Travis <travis@sgi.com>

:040000 040000 f970a9bfa4ae30de22b4e9ef9e38836b1ff583cd 68b4e9c75b11bf81d5e4193a46328c34ca74415d M      arch
Comment 39 Wylda 2009-08-18 19:09:06 UTC
I can't test it :-/ There were probably some changes 

 * a4a0acf8e17e3d08e28b721ceceb898fbc959ceb

 * 694aa960608d2976666d850bd4ef78053bbd0c84


which lead to: 
 # git bisect reset
 # git checkout v2.6.30-rc1
 # git show 4595f9620cda8a1e973588e743cf5f8436dd20c6 | patch -p1 -R

patching file arch/x86/include/asm/paravirt.h
Hunk #1 succeeded at 273 (offset 29 lines).
Hunk #2 succeeded at 1076 (offset 92 lines).
patching file arch/x86/include/asm/tlbflush.h
Hunk #3 succeeded at 163 (offset -3 lines).
patching file arch/x86/include/asm/uv/uv_bau.h
Hunk #1 FAILED at 325.
1 out of 1 hunk FAILED -- saving rejects to file arch/x86/include/asm/uv/uv_bau.h.rej
can't find file to patch at input line 105
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/arch/x86/kernel/tlb_32.c b/arch/x86/kernel/tlb_32.c
|index ce50546..ec53818 100644
|--- a/arch/x86/kernel/tlb_32.c
|+++ b/arch/x86/kernel/tlb_32.c
--------------------------
File to patch: 


Call for HELP - What should i do now?
Comment 40 Rafael J. Wysocki 2009-08-18 21:21:36 UTC
On Tuesday 18 August 2009, John Stoffel wrote:
> 
> Just a quick followup, I've been doing a git bisect run over the past
> week or so trying to narrow this down.  It's slow, since the system
> doesn't hang at any one point reliably.  So far, here's my git log:
> 
> > git bisect log
> git bisect start
> # bad: [f4b9a988685da6386d7f9a72df3098bcc3270526] Merge branch
> 'for-linus' of git://git.infradead.org/ubi-2.6
> git bisect bad f4b9a988685da6386d7f9a72df3098bcc3270526
> # good: [8e0ee43bc2c3e19db56a4adaa9a9b04ce885cd84] Linux 2.6.29
> git bisect good 8e0ee43bc2c3e19db56a4adaa9a9b04ce885cd84
> # bad: [095342389e2ed8deed07b3076f990260ce3c7c9f] perf_counter, x86:
> generic use of cpuc->active
> git bisect bad 095342389e2ed8deed07b3076f990260ce3c7c9f
> # bad: [095342389e2ed8deed07b3076f990260ce3c7c9f] perf_counter, x86:
> generic use of cpuc->active
> git bisect bad 095342389e2ed8deed07b3076f990260ce3c7c9f
> # bad: [095342389e2ed8deed07b3076f990260ce3c7c9f] perf_counter, x86:
> generic use of cpuc->active
> git bisect bad 095342389e2ed8deed07b3076f990260ce3c7c9f
> # bad: [095342389e2ed8deed07b3076f990260ce3c7c9f] perf_counter, x86:
> generic use of cpuc->active
> git bisect bad 095342389e2ed8deed07b3076f990260ce3c7c9f
> # bad: [ebc8eca169be0283d5a7ab54c4411dd59cfb0f27] Merge branch 'next'
> of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Comment 41 Martin Rogge 2009-08-18 22:57:19 UTC
Adding Rusty as cc. Rusty, does this make sense to you?
Comment 42 Wylda 2009-08-19 13:36:49 UTC
Rafael, isn't it a weird bisection you posted? Because:

 * There is just one "good" - just starting one
 * Why 4x "git bisect bad" gives same commit "perf_counter, x86:...."
 * Last commit is from powerpc (not x86)
 * is suspiciously short


Anyway, i wanted to prove, that Rusty's work in this particular case brought some badness into kernel (at least for 2x CPU P3 and P2 machines).

So we all agreed, that 2.6.29 is OK (does not freeze at least). So i reverted the logic and instead of removing the 4595f9620cda8a1e973588e743cf5f8436dd20c6 (which is not possible in 2.6.30-rc1 and later) i applied this commit to 2.6.29. So i did:

git checkout v2.6.29
git show 4595f9620cda8a1e973588e743cf5f8436dd20c6 | patch -p1

and guess what... 2.6.29 began to freeze exactly the same way like 2.6.30[.12345]. Is this enough or should i do something more?


Last thing Rafael. If this bug is marked as "Blocking", does it mean that 2.6.31 cannot be released till this is fixed? Because i can confirm that this freezing also happens in 2.6.31-rc6.
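
The forward-apply test described above (put the suspect commit on top of a known-good release instead of reverting it from a bad one) can be preceded by a dry run, so that a patch with rejects is noticed before the tree is modified. This is a sketch under assumptions: applies_cleanly is a made-up helper, and it uses `git apply --check` in place of the `patch -p1` used in this comment.

```shell
#!/bin/sh
# applies_cleanly is a hypothetical helper for illustration only.
# Usage: applies_cleanly <repo-dir> <base-tag> <commit>
# Returns 0 if the diff of <commit> would apply to <base-tag> without rejects.
applies_cleanly() {
    repo=$1
    base=$2
    commit=$3
    # Detached-HEAD checkout of the known-good release.
    git -C "$repo" checkout -q "$base" || return 1
    # --check only tests the patch; nothing in the tree is modified.
    git -C "$repo" show "$commit" | git -C "$repo" apply --check -
}
```

If the check succeeds, the same pipeline without --check applies the change, which corresponds to the `git show ... | patch -p1` step used on v2.6.29.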
Comment 43 Martin Rogge 2009-08-19 20:11:09 UTC
(In reply to comment #42)

Googling 4595f9620cda8a1e973588e743cf5f8436dd20c6 or searching lkml.org for it reveals that the commit caused some crashes at the time (January/February 2009) and was subsequently fixed by Ingo Molnar. The fix was tested on hyperthreading machines because they were thought to be most vulnerable. Maybe it is possible that the fix fails on dual P2s and P3s?
Comment 44 Wylda 2009-08-19 20:29:46 UTC
Created attachment 22779 [details]
One of the rare trace in this saga


While preparing my server i could not resist giving it a try with Debian's kernel and an attached serial console, just in case... And i was lucky and got a trace; after that i pressed Alt-SysRq l/m/s a few times. After a few seconds the machine died completely. At least i got something. Complete log attached.

This is probably one of my last contributions. Hope that those 10 days and more than 160 restarts were not wasted. Good luck!


[  201.865003] BUG: soft lockup - CPU#1 stuck for 61s! [aptitude:2183]
[  201.865003] Modules linked in: loop psmouse evdev snd_pcm snd_timer serio_raw snd soundcore snd_page_alloc pcspkr i2c_piix4 i2c_core parport_pc parport processor button sworks_agp agpgart ext3 jbd mbcache raid0 md_mod sg sr_mod cdrom sd_mod crc_t10dif ide_gd_mod ata_generic sata_sil ohci_hcd 8139cp serverworks ide_pci_generic libata e1000 usbcore 8139too mii ide_core scsi_mod thermal fan thermal_sys
[  201.865003] 
[  201.865003] Pid: 2183, comm: aptitude Not tainted (2.6.30-1-686 #1) STL2
[  201.865003] EIP: 0060:[<c031e818>] EFLAGS: 00000297 CPU: 1
[  201.865003] EIP is at _spin_lock+0xe/0x15
[  201.865003] EAX: f63ac990 EBX: 000000c3 ECX: 00000000 EDX: 0000a5a4
[  201.865003] ESI: 00000000 EDI: f63ac968 EBP: c14ca060 ESP: d1863c90
[  201.865003]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[  201.865003] CR0: 80050033 CR2: 0b0ad01c CR3: 1183a000 CR4: 000006d0
[  201.865003] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  201.865003] DR6: ffff0ff0 DR7: 00000400
[  201.865003] Call Trace:
[  201.865003]  [<c018132b>] ? page_referenced_file+0x2e/0x82
[  201.865003]  [<c02c414e>] ? tcp_v4_rcv+0x3e3/0x5de
[  201.865003]  [<c012afbe>] ? irq_enter+0xf/0x45
[  201.865003]  [<c012b2c2>] ? irq_exit+0x31/0x53
[  201.865003]  [<c0103996>] ? error_interrupt+0x2a/0x30
[  201.865003]  [<c0181d8e>] ? page_referenced+0xbf/0xf1
[  201.865003]  [<c012007b>] ? find_lowest_rq+0x75/0x106
[  201.865003]  [<c01200d8>] ? find_lowest_rq+0xd2/0x106
[  201.865003]  [<c017313a>] ? shrink_page_list+0x121/0x568
[  201.865003]  [<c01724dd>] ? isolate_pages_global+0x91/0x1d0
[  201.865003]  [<c012afbe>] ? irq_enter+0xf/0x45
[  201.865003]  [<c012b2c2>] ? irq_exit+0x31/0x53
[  201.865003]  [<c0103996>] ? error_interrupt+0x2a/0x30
[  201.865003]  [<c01737af>] ? shrink_list+0x22e/0x4d6
[  201.865003]  [<c012afbe>] ? irq_enter+0xf/0x45
[  201.865003]  [<c012b2c2>] ? irq_exit+0x31/0x53
[  201.865003]  [<c0103996>] ? error_interrupt+0x2a/0x30
[  201.865003]  [<c012afbe>] ? irq_enter+0xf/0x45
[  201.865003]  [<c012afbe>] ? irq_enter+0xf/0x45
[  201.865003]  [<c012b2c2>] ? irq_exit+0x31/0x53
[  201.865003]  [<c0103996>] ? error_interrupt+0x2a/0x30
[  201.865003]  [<c0173c77>] ? shrink_zone+0x220/0x2af
[  201.865003]  [<c01748cb>] ? try_to_free_pages+0x225/0x34c
[  201.865003]  [<c017244c>] ? isolate_pages_global+0x0/0x1d0
[  201.865003]  [<c016f832>] ? __alloc_pages_internal+0x219/0x39d
[  201.865003]  [<c017b78a>] ? handle_mm_fault+0x162/0x652
[  201.865003]  [<c011774d>] ? do_page_fault+0x1d8/0x1e7
[  201.865003]  [<c0117575>] ? do_page_fault+0x0/0x1e7
[  201.865003]  [<c031ea45>] ? error_code+0x6d/0x74
[  201.865003]  [<c0117575>] ? do_page_fault+0x0/0x1e7
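
When comparing serial console logs from several machines, the "BUG: soft lockup" header line is the part that identifies the stuck CPU and task. A small sketch for pulling those fields out of a log follows; soft_lockups is a made-up helper name, and the pattern assumes the exact message format shown in the trace above.

```shell
#!/bin/sh
# soft_lockups is a hypothetical helper for illustration only.
# Reads kernel log text on stdin and prints one summary line per
# "BUG: soft lockup" report; all other lines are ignored.
soft_lockups() {
    sed -nE 's/.*BUG: soft lockup - CPU#([0-9]+) stuck for ([0-9]+)s! \[([^]:]+):([0-9]+)\].*/cpu=\1 secs=\2 comm=\3 pid=\4/p'
}
```

Fed the first line of the trace above, it prints "cpu=1 secs=61 comm=aptitude pid=2183".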
Comment 45 Wylda 2009-08-19 20:51:54 UTC
(In reply to comment #43)

> 
> ...fixed by Ingo Molnar. The fix was tested on hyperthreading
> machines because they were thought to be most vulnerable. Maybe it is
> possible
> that the fix fails on dual P2s and P3s?

Later fixes should not be the reason for this, because i took 4595f9620cda8a1e973588e743cf5f8436dd20c6 (without subsequent Ingo's fixes) and applied to 2.6.29. After that, machine began to freeze.
Comment 46 Roland Kletzing 2009-08-19 22:13:57 UTC
congrats for the analysis and thanks for the hard work!

let me try putting that together - this change probably introduced the issue:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4595f9620cda8a1e973588e743cf5f8436dd20c6

and this one fixed it to some degree, but obviously not entirely (maybe one more race being introduced?) :
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5766b842b23c6b40935a5f3bd435b2bcdaff2143


correct ?


>If this bug is marked as "Blocking", does it mean that 2.6.31 cannot
>be released till this is fixed?
i hope that linus won't release a kernel with such a treacherous bug.
Comment 47 Thomas Björnell 2009-08-19 23:33:32 UTC
While most people seem to experience these lockups with dual P2s or P3s, I might as well add that 2.6.30.x is also locking up for me on my dual Athlon MP box.
Comment 48 Ryan Underwood 2009-08-20 02:46:12 UTC
Looks like this particular commit caused some controversy before:
http://lkml.indiana.edu/hypermail/linux/kernel/0901.2/01662.html
Comment 49 Ryan Underwood 2009-08-20 03:05:01 UTC
The dual Pentium Pro I earlier reported as working (for >10 days!)  crashed today, soft lockup on CPU#1.  So PPro is susceptible too, it just takes much longer to manifest.  My P3 system would typically lock up in less than an hour.
Comment 50 Beau Winters 2009-08-20 04:19:01 UTC
I've also had this problem since 2.6.30 as well.  I'm also running a dual Athlon MP system.  Perhaps this is an issue related to dual processor systems?
Comment 51 Roger 2009-08-20 07:23:31 UTC
Ditto with freezing with dual P3's here.  But I seem to have tracked it down to
e100.c.  Preventing e100.c from loading (ie. compiling as module & blacklisting
it), I then got 16+ hours uptime before rebooting to further document the bugs.

The freeze is so bad, even serial console locks-up/freezes.

Here's my quick documentation on the e100.c freeze:
http://bugzilla.kernel.org/show_bug.cgi?id=13991

The other reason I'm pointing my finger at e100, there are tons of patches
within the past version.  Not to mention previous patches which killed wake on
LAN.  I might be jumping to conclusions about e100.c, though.
Comment 52 Wylda 2009-08-20 07:34:42 UTC
(In reply to comment #51)
> Ditto with freezing with dual P3's here.  But I seem to have tracked it down
> to
> e100.c.

OK, if you solve it, then it seems as different bug, because this happens to me with:

 * Network card: PCI-X, Intel 1Gbps 82543GC

 * Network card: PCI Realtek RT8139

None of these use e100.c.
Comment 53 Roger 2009-08-20 08:03:21 UTC
Pavel, I haven't been able to set up kgdb over a serial console, but I have had a serial
console attached on 2.6.30 for the past few days and didn't get any trace. I did get
erratic e100 (NIC) up & down messages just prior to a freeze.

Any other way of getting more debug output out of the kernel?

btw, only PCI here, (Tyan Tiger 100 i440bx)
Comment 54 Wylda 2009-08-20 09:48:54 UTC
Roger, you did not get my point. I don't argue with you - you found a bug, but you will have to decide:


 * A *) #13991 _is_ the same as #13933, then CLOSE #13991 as DUPLICATE and _drop_ the idea of e100 being the source of all problems

 * B *) #13991 _is_not_ the same as #13933, then do the bisecting in e100 and find the commit which is causing you trouble. _Then_ find a developer of e100 and try to persuade him that he screwed up your machine. After that be happy to close #13991.


But _PLEASE_ do not mix things together. On the contrary, we need to separate things and make them as clear as possible (such as by bisecting), otherwise developers will be confused and rather go away. And we don't want this.


If you are confused and do not know whether to choose A or B, take the following steps:

 * Turn off one of your CPUs (HW or SW way) and then simulate your lockup with
   e100. Does it still occur with only one CPU?

     YES: Then go for B and good luck with bisecting
     NO : Then forget about freezing bug in e100 and you are welcomed in
          #13933 club


So _before_ you answer me, PLEASE read this comment three times :c) and be sure to choose A or B before. If you don't have time to take the step you don't have time to post here ;) That's OK, i also have problems with time...

PS: I'm not a developer so i don't use KGDB and really can't help you with debugging. Sorry :(
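
For the "SW way" of turning off one CPU mentioned above, one option is the sysfs CPU hotplug switch; booting with maxcpus=1 or nosmp on the kernel command line is another. The sketch below assumes a kernel built with CPU hotplug support, which exposes /sys/devices/system/cpu/cpuN/online. The helper name offline_cpu and its optional second argument (an alternative sysfs root, only there to make it testable) are made up for illustration.

```shell
#!/bin/sh
# offline_cpu is a hypothetical helper for illustration only.
# Usage: offline_cpu <cpu-number> [sysfs-root]
# Needs root and a kernel with CPU hotplug support when used on /sys.
offline_cpu() {
    cpu=$1
    root=${2:-/sys}
    ctl="$root/devices/system/cpu/cpu$cpu/online"
    if [ ! -w "$ctl" ]; then
        echo "cannot offline cpu$cpu: $ctl is missing or not writable" >&2
        return 1
    fi
    # Writing 0 takes the CPU offline; writing 1 brings it back.
    echo 0 > "$ctl"
}
```

With the second CPU offline (offline_cpu 1), the same workload can be rerun to see whether the freeze still occurs.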
Comment 55 Thomas Pfaff 2009-08-20 11:55:36 UTC
I have a DELL Precision dual P3 server that also freezes with 2.6.30.

My first thought was that the nvidia driver is not yet ready for 2.6.30 and i switched back to 2.6.27. Next i tried again without the nvidia driver and it also froze under high load. Then i built the kernel without smp support and it survived a five hour stress test.

Therefore i do think that this bug and the other bug reports like http://bugzilla.kernel.org/show_bug.cgi?id=13219 are smp problems on P2 and P3 systems and not directly NIC or chipset related.

I have not seen a freeze bug report for a single P3 system nor on P4 and above (single or multicore) so far.
Comment 56 Wylda 2009-08-20 12:07:07 UTC
Still pretty quiet on the developer front :-/ So i also reported this problem at Debian (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=542551). You can take it as a template for your distro's bug tracking system. If your distro offers 2.6.30 or later, give it a try. Reason:

 * Let distro developer know about such issue

 * Save distro developers time with searching, replicating, bisecting 

 * Linking to this bug report from other sites will make it the #1 search
   result on Google

This is all done to attract more people, so we will know whether there are 10, 10 hundred or 10 zillion affected users, and so we can also get more valuable reports and platforms, like those with dual Athlon MP.


If this doesn't help to attract some developer, we will probably have to fill in a petition at www.petitiononline.com or do some advertisement in the Times like Mozilla did for the release of a new version :-D

OK, enough jokes. Anyone who attracts a developer who begins to work on this will get the special tag "Nail-developer-down-by: Mr. X Y" (of course it will be in front of all those "Signed-off-by: " things)
Comment 57 David Hill 2009-08-20 12:09:44 UTC
Is the common point of all freezing computers SMP and e100?
Or did I miss something?
Comment 58 Wylda 2009-08-20 12:46:13 UTC
I was a few mins off... Of course not because I was waiting for a sedative to take effect before replying, but to summarize the facts.

Seriously, David. I don't think this is related to e100, e1000, realtek driver or whatever network driver. I even think this has nothing to do with general networking, IPv4 stack or whatever.

I can trigger this bug without network connection. Hope it's clarified now.

If I may correct your conclusion, David:
 *** common point of all freezing computers _in_this_bugreport_ is 2x CPU _and_
 *** commit 4595f9620cda8a1e973588e743cf5f8436dd20c6
Comment 59 David Hill 2009-08-20 13:35:29 UTC
Great news if we found THE commit...  My bisection isn't going fast
enough, as I can sometimes reach more than 4 hours of uptime!!!!

Is there a way to test this without losing the point I've reached?

David Hill

Comment 60 Wylda 2009-08-20 14:09:54 UTC
(In reply to comment #59)
> Great news if we found THE commit...  My bisection isn't going fast  
> enough as I can reach more than 4 hours of uptime sometimes!!!!
> 
> Is there a way to test this without loosing the point I've reached?
> 

I'm a git noob... But I think that if you back up your bisection, you can return to it anytime. Do "git bisect log > my_bisection.log" (or back up the file .git/BISECT_LOG)

If the following freezes your machine, you share the same pain and you belong to #13933:

git bisect reset
git reset --hard HEAD
git checkout v2.6.29
git show 4595f9620cda8a1e973588e743cf5f8436dd20c6 | patch -p1
make mrproper
cp your_config .config
make oldconfig
make -j 2

Please don't come back after one day saying that I'm wrong and that the above works perfectly for you. There are reports that the freeze occurs only after 10 days of work/uptime.

If you want to return to your bisection:

git bisect reset
git reset --hard HEAD
git fetch ; git rebase origin
git bisect start

and then do

git bisect good/bad <hash_from_my_bisection.log> -- do it based on your backup log.

If this does not work for you, sorry: I warned you that I'm a git noob ;)
Comment 61 Roland Kletzing 2009-08-20 14:49:27 UTC
>Is there a way to test this without loosing the point I've reached?

yes - you need to test with a kernel w/ and w/o commit 4595f9620cda8a1e973588e743cf5f8436dd20c6

can't you just check out into a clean/new git repo?


let me try putting things together again - this change probably introduced the issue:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4595f9620cda8a1e973588e743cf5f8436dd20c6

and this one fixed it to some degree, but obviously not entirely (maybe one more race being introduced?) :
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5766b842b23c6b40935a5f3bd435b2bcdaff2143

and these ones are more fixes for commit 4595f9620cda8a1e973588e743cf5f8436dd20c6:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=694aa960608d2976666d850bd4ef78053bbd0c84
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a4a0acf8e17e3d08e28b721ceceb898fbc959ceb
Comment 62 Linus Torvalds 2009-08-20 17:15:28 UTC
[ Added more people and bugzilla to the cc, since I have a random patch. 
  Quite frankly, this patch is not really deeply thought out, it's just a 
  "Hmm, that situation could have different behavior on different 
  hardware" kind of random musing ]

On Thu, 20 Aug 2009, Mike Travis wrote:
> 
> I've been quite a ways away from this code for a while but I'll look closer
> at it today, especially your observations.  Unfortunately, a 32-bit test
> machine is a hard thing to find around here.  (Even my laptop runs a 64-bit
> kernel [w/NR_CPUS=4096 of course].)

Well, even then, some indications seem to be that it's mainly older 
machines. I'm not seeing any Core 2's or even P4's. Of course, it might 
be timing (and need a slower CPU), but there are no celerons or Atoms 
there either (not that there are all that many reports, so it might be 
just pure bad luck).

So it could literally be some interaction issue with "older APIC" or 
similar.

Anyway, the whole "empty mask" thing does strike me as a special case, and 
something that I could imagine different hardware does different things 
for, so what happens with a patch like this?

		Linus

---
 arch/x86/kernel/apic/ipi.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index dbf5445..6ef00ba 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -106,6 +106,9 @@ void default_send_IPI_mask_logical(const struct cpumask *cpumask, int vector)
 	unsigned long mask = cpumask_bits(cpumask)[0];
 	unsigned long flags;
 
+	if (WARN_ONCE(!mask, "empty IPI mask"))
+		return;
+
 	local_irq_save(flags);
 	WARN_ON(mask & ~cpumask_bits(cpu_online_mask)[0]);
 	__default_send_IPI_dest_field(mask, vector, apic->dest_logical);
Comment 63 Martin Rogge 2009-08-20 19:06:08 UTC
(In reply to comment #62)
> +    if (WARN_ONCE(!mask, "empty IPI mask"))
> +        return;
> +

Testing it right now (on untainted 2.6.30.4). It might take a while to trigger.
Comment 64 Linus Torvalds 2009-08-20 19:39:49 UTC
On Thu, 20 Aug 2009, Linus Torvalds wrote:
> 
> Anyway, the whole "empty mask" thing does strike me as a special case, and 
> something that I could imagine different hardware does different things 
> for, so what happens with a patch like this?

Just a quick note: commit 694aa960608d2976666d850bd4ef78053bbd0c84 seems 
to imply that this "empty CPUmask" really does happen, and confused the 
xen_flush_tlb_others() code. 

I do suspect that if this really is it (ie that WARN_ON() actually 
triggers, and returning early from default_send_IPI_mask_logical() fixes 
the hang), then we should fix it at a higher level, rather than in the 
actual IPI code.

It looks trivial to make 'bitmask_and[not]()' return whether the result 
has any bits set or not, and then we could do something like this 
instead..

However, this is only relevant if my previous hacky patch actually 
triggers. But the fact that Xen had issues with empty CPUmasks does seem 
to indicate that it really could trigger.

The patch below is totally untested. It may or may not compile, much less 
actually work. Caveat emptor.

		Linus

---
 arch/x86/mm/tlb.c          |   21 ++++++++++-----------
 include/linux/bitmap.h     |   18 ++++++++----------
 include/linux/cpumask.h    |   20 ++++++++++----------
 lib/bitmap.c               |   12 ++++++++----
 4 files changed, 36 insertions(+), 35 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 821e970..c814e14 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -183,18 +183,17 @@ static void flush_tlb_others_ipi(const struct cpumask *cpumask,
 
 	f->flush_mm = mm;
 	f->flush_va = va;
-	cpumask_andnot(to_cpumask(f->flush_cpumask),
-		       cpumask, cpumask_of(smp_processor_id()));
-
-	/*
-	 * We have to send the IPI only to
-	 * CPUs affected.
-	 */
-	apic->send_IPI_mask(to_cpumask(f->flush_cpumask),
-		      INVALIDATE_TLB_VECTOR_START + sender);
+	if (cpumask_andnot(to_cpumask(f->flush_cpumask), cpumask, cpumask_of(smp_processor_id()))) {
+		/*
+		 * We have to send the IPI only to
+		 * CPUs affected.
+		 */
+		apic->send_IPI_mask(to_cpumask(f->flush_cpumask),
+			      INVALIDATE_TLB_VECTOR_START + sender);
 
-	while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
-		cpu_relax();
+		while (!cpumask_empty(to_cpumask(f->flush_cpumask)))
+			cpu_relax();
+	}
 
 	f->flush_mm = NULL;
 	f->flush_va = 0;
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 2878811..756d78b 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -94,13 +94,13 @@ extern void __bitmap_shift_right(unsigned long *dst,
                         const unsigned long *src, int shift, int bits);
 extern void __bitmap_shift_left(unsigned long *dst,
                         const unsigned long *src, int shift, int bits);
-extern void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
+extern int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
 			const unsigned long *bitmap2, int bits);
 extern void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 			const unsigned long *bitmap2, int bits);
 extern void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
 			const unsigned long *bitmap2, int bits);
-extern void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+extern int __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
 			const unsigned long *bitmap2, int bits);
 extern int __bitmap_intersects(const unsigned long *bitmap1,
 			const unsigned long *bitmap2, int bits);
@@ -171,13 +171,12 @@ static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
 	}
 }
 
-static inline void bitmap_and(unsigned long *dst, const unsigned long *src1,
+static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
 	if (small_const_nbits(nbits))
-		*dst = *src1 & *src2;
-	else
-		__bitmap_and(dst, src1, src2, nbits);
+		return (*dst = *src1 & *src2) != 0;
+	return __bitmap_and(dst, src1, src2, nbits);
 }
 
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
@@ -198,13 +197,12 @@ static inline void bitmap_xor(unsigned long *dst, const unsigned long *src1,
 		__bitmap_xor(dst, src1, src2, nbits);
 }
 
-static inline void bitmap_andnot(unsigned long *dst, const unsigned long *src1,
+static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, int nbits)
 {
 	if (small_const_nbits(nbits))
-		*dst = *src1 & ~(*src2);
-	else
-		__bitmap_andnot(dst, src1, src2, nbits);
+		return (*dst = *src1 & ~(*src2)) != 0;
+	return __bitmap_andnot(dst, src1, src2, nbits);
 }
 
 static inline void bitmap_complement(unsigned long *dst, const unsigned long *src,
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index c5ac87c..796df12 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -43,10 +43,10 @@
  * int cpu_isset(cpu, mask)		true iff bit 'cpu' set in mask
  * int cpu_test_and_set(cpu, mask)	test and set bit 'cpu' in mask
  *
- * void cpus_and(dst, src1, src2)	dst = src1 & src2  [intersection]
+ * int cpus_and(dst, src1, src2)	dst = src1 & src2  [intersection]
  * void cpus_or(dst, src1, src2)	dst = src1 | src2  [union]
  * void cpus_xor(dst, src1, src2)	dst = src1 ^ src2
- * void cpus_andnot(dst, src1, src2)	dst = src1 & ~src2
+ * int cpus_andnot(dst, src1, src2)	dst = src1 & ~src2
  * void cpus_complement(dst, src)	dst = ~src
  *
  * int cpus_equal(mask1, mask2)		Does mask1 == mask2?
@@ -179,10 +179,10 @@ static inline int __cpu_test_and_set(int cpu, cpumask_t *addr)
 }
 
 #define cpus_and(dst, src1, src2) __cpus_and(&(dst), &(src1), &(src2), NR_CPUS)
-static inline void __cpus_and(cpumask_t *dstp, const cpumask_t *src1p,
+static inline int __cpus_and(cpumask_t *dstp, const cpumask_t *src1p,
 					const cpumask_t *src2p, int nbits)
 {
-	bitmap_and(dstp->bits, src1p->bits, src2p->bits, nbits);
+	return bitmap_and(dstp->bits, src1p->bits, src2p->bits, nbits);
 }
 
 #define cpus_or(dst, src1, src2) __cpus_or(&(dst), &(src1), &(src2), NR_CPUS)
@@ -201,10 +201,10 @@ static inline void __cpus_xor(cpumask_t *dstp, const cpumask_t *src1p,
 
 #define cpus_andnot(dst, src1, src2) \
 				__cpus_andnot(&(dst), &(src1), &(src2), NR_CPUS)
-static inline void __cpus_andnot(cpumask_t *dstp, const cpumask_t *src1p,
+static inline int __cpus_andnot(cpumask_t *dstp, const cpumask_t *src1p,
 					const cpumask_t *src2p, int nbits)
 {
-	bitmap_andnot(dstp->bits, src1p->bits, src2p->bits, nbits);
+	return bitmap_andnot(dstp->bits, src1p->bits, src2p->bits, nbits);
 }
 
 #define cpus_complement(dst, src) __cpus_complement(&(dst), &(src), NR_CPUS)
@@ -738,11 +738,11 @@ static inline void cpumask_clear(struct cpumask *dstp)
  * @src1p: the first input
  * @src2p: the second input
  */
-static inline void cpumask_and(struct cpumask *dstp,
+static inline int cpumask_and(struct cpumask *dstp,
 			       const struct cpumask *src1p,
 			       const struct cpumask *src2p)
 {
-	bitmap_and(cpumask_bits(dstp), cpumask_bits(src1p),
+	return bitmap_and(cpumask_bits(dstp), cpumask_bits(src1p),
 				       cpumask_bits(src2p), nr_cpumask_bits);
 }
 
@@ -779,11 +779,11 @@ static inline void cpumask_xor(struct cpumask *dstp,
  * @src1p: the first input
  * @src2p: the second input
  */
-static inline void cpumask_andnot(struct cpumask *dstp,
+static inline int cpumask_andnot(struct cpumask *dstp,
 				  const struct cpumask *src1p,
 				  const struct cpumask *src2p)
 {
-	bitmap_andnot(cpumask_bits(dstp), cpumask_bits(src1p),
+	return bitmap_andnot(cpumask_bits(dstp), cpumask_bits(src1p),
 					  cpumask_bits(src2p), nr_cpumask_bits);
 }
 
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 35a1f7f..ec221a7 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -179,14 +179,16 @@ void __bitmap_shift_left(unsigned long *dst,
 }
 EXPORT_SYMBOL(__bitmap_shift_left);
 
-void __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
+int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
 				const unsigned long *bitmap2, int bits)
 {
 	int k;
 	int nr = BITS_TO_LONGS(bits);
+	unsigned long result = 0;
 
 	for (k = 0; k < nr; k++)
-		dst[k] = bitmap1[k] & bitmap2[k];
+		result |= (dst[k] = bitmap1[k] & bitmap2[k]);
+	return result != 0;
 }
 EXPORT_SYMBOL(__bitmap_and);
 
@@ -212,14 +214,16 @@ void __bitmap_xor(unsigned long *dst, const unsigned long *bitmap1,
 }
 EXPORT_SYMBOL(__bitmap_xor);
 
-void __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
+int __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
 				const unsigned long *bitmap2, int bits)
 {
 	int k;
 	int nr = BITS_TO_LONGS(bits);
+	unsigned long result = 0;
 
 	for (k = 0; k < nr; k++)
-		dst[k] = bitmap1[k] & ~bitmap2[k];
+		result |= dst[k] = bitmap1[k] & ~bitmap2[k];
+	return result != 0;
 }
 EXPORT_SYMBOL(__bitmap_andnot);
Comment 65 Thomas Björnell 2009-08-20 21:42:11 UTC
(In reply to comment #62)
> +    if (WARN_ONCE(!mask, "empty IPI mask"))
> +        return;
> +

Also applied it to my 2.6.30.4 source tree, recompiled and tested a pretty surefire way to hang the box, which is starting Opera with a bunch of saved tabs, and got

[  154.157381] ------------[ cut here ]------------
[  154.157420] WARNING: at arch/x86/kernel/apic/ipi.c:109 default_send_IPI_mask_logical+0x2f/0xdd()
[  154.157430] Hardware name: MS-6501
[  154.157437] empty IPI maskModules linked in: fuse netconsole snd_au8820 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_mpu401 ac97_bus snd_mpu401_uart snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_rawmidi snd_timer snd_seq_device snd i2c_amd756 i2c_core ns558 parport_pc parport gameport soundcore evdev usbhid ohci_hcd e100 usbcore
[  154.157667] Pid: 5907, comm: opera Not tainted 2.6.30.4 #2
[  154.157673] Call Trace:
[  154.157686]  [<b012454c>] warn_slowpath_common+0x60/0x90
[  154.157704]  [<b01245b0>] warn_slowpath_fmt+0x24/0x27
[  154.157712]  [<b011046f>] default_send_IPI_mask_logical+0x2f/0xdd
[  154.157726]  [<b0117c15>] flush_tlb_others_ipi+0x87/0xb4
[  154.157746]  [<b0117db8>] flush_tlb_mm+0x59/0x5d
[  154.157756]  [<b016b19e>] mprotect_fixup+0x212/0x296
[  154.157764]  [<b016b38c>] sys_mprotect+0x16a/0x1c6
[  154.157776]  [<b0102958>] sysenter_do_call+0x12/0x36
[  154.157793] ---[ end trace ced042bf780fb0a5 ]---

And returning early from the function means my box is still alive.
Comment 66 Martin Rogge 2009-08-20 23:35:06 UTC
(In reply to comment #63)
> (In reply to comment #62)
> > +    if (WARN_ONCE(!mask, "empty IPI mask"))
> > +        return;
> > +
> 
> Testing it right now (on untainted 2.6.30.4). It might take a while to
> trigger.

caught a warning, machine survived:

Aug 21 01:14:01 arnold kernel: ------------[ cut here ]------------
Aug 21 01:14:01 arnold kernel: WARNING: at arch/x86/kernel/apic/ipi.c:109 default_send_IPI_mask_logical+0x2a/0xb0()
Aug 21 01:14:01 arnold kernel: Hardware name: VT8653-8233
Aug 21 01:14:01 arnold kernel: empty IPI maskModules linked in: via_agp agpgart
Aug 21 01:14:01 arnold kernel: Pid: 2561, comm: ktorrent Not tainted 2.6.30.4 #6
Aug 21 01:14:01 arnold kernel: Call Trace:
Aug 21 01:14:01 arnold kernel:  [<c0122994>] ? warn_slowpath_common+0x5e/0x8a
Aug 21 01:14:01 arnold kernel:  [<c01229f2>] ? warn_slowpath_fmt+0x26/0x2a
Aug 21 01:14:01 arnold kernel:  [<c010f63e>] ? default_send_IPI_mask_logical+0x2a/0xb0
Aug 21 01:14:01 arnold kernel:  [<c011640d>] ? flush_tlb_others_ipi+0x83/0xad
Aug 21 01:14:01 arnold kernel:  [<c0116517>] ? flush_tlb_mm+0x60/0x7a
Aug 21 01:14:01 arnold kernel:  [<c0159a06>] ? unmap_region+0xe4/0x118
Aug 21 01:14:01 arnold kernel:  [<c015a667>] ? do_munmap+0x1de/0x228
Aug 21 01:14:01 arnold kernel:  [<c015a6d8>] ? sys_munmap+0x27/0x35
Aug 21 01:14:01 arnold kernel:  [<c0102941>] ? syscall_call+0x7/0xb
Aug 21 01:14:01 arnold kernel: ---[ end trace c311cb19727383c1 ]---
Comment 67 Linus Torvalds 2009-08-21 17:04:54 UTC
Ok, current -git now has all the commits, and marked with cc: stable@kernel.org. 
Commits:

  b04e637 x86: don't call '->send_IPI_mask()' with an empty mask
  f4b0373 Make bitmask 'and' operators return a result code
  83d349f x86: don't send an IPI to the empty set of CPU's

so this bug entry should probably be closed once people have double-checked it.
Comment 68 Ingo Molnar 2009-08-21 19:37:23 UTC
upstream commits resolving this bugzilla are:

 b04e637: x86: don't call '->send_IPI_mask()' with an empty mask
 f4b0373: Make bitmask 'and' operators return a result code
 83d349f: x86: don't send an IPI to the empty set of CPU's
Comment 69 Martin Rogge 2009-08-21 21:52:59 UTC
I shall keep running 83d349f just to make sure no more lockups occur. 

Thanks to everybody who contributed. Good teamwork. ;-)
Comment 70 Roger 2009-08-24 20:14:21 UTC
*** Bug 13991 has been marked as a duplicate of this bug. ***
Comment 71 Ognjan Iordanov 2009-09-07 06:31:37 UTC
I tried the patches from Linus and they work on my Tyan S1834/Tiger 133 mainboard with P3 1000Mhz processors, but I still have freezes on an Asus P2B-D mainboard with 600Mhz PIII processors.
Comment 72 Ognjan Iordanov 2009-09-08 07:15:10 UTC
I tried the 2.6.25.9 kernel, but it freezes too.
Comment 73 Ryan Underwood 2009-09-10 22:35:45 UTC
Ognjan, this was a problem introduced around 2.6.30, so if you are having problems with 2.6.25.9, there is a different problem.  Try booting with acpi=off and/or noapic, and also check all of the cylindrical capacitors on the motherboard for tops that are not flat indicating they have gone bad.  Also consider running memtest86+ overnight.
Comment 74 Linus Torvalds 2009-09-10 22:39:23 UTC
On Mon, 7 Sep 2009, bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> I tried the patches from Linus and it works on my Tyan S1834/Tiger 133
> mainboard with P3 1000Mhz processors, but i still have freezes on Asus P2B-D
> Mainboard with 600Mhz PIII processors.

I suspect your 600MHz P-III board has some other issues. The fact that the 
problems with that board go back to 2.6.25.9 also implies that - the TLB 
flush IPI problem was new to 2.6.30.

So your PIII lockup is different, and should probably get a bugzilla of 
its own rather than be mixed up with this one.

Feel free to open a new bugzilla entry, but before you do that, can you 
try enabling the NMI watchdog and try to see if you can get it to hang in 
text-mode so that the NMI watchdog has a chance to trigger and show 
anything? (See Documentation/nmi_watchdog.txt for details)

Also, one word of warning: how sure are you about the stability of that 
machine in general? Hangs under load can easily be due to borderline power 
supplies (which includes things like the capacitors on the motherboard, 
not just the PSU unit itself) causing CPU power brownouts etc.

			Linus
Comment 75 Roger 2009-09-11 08:35:00 UTC
The "Tyan S1834/Tiger 133" is a VIA chipset board.  The "Asus P2B-D" is an Intel 440BX chipset motherboard.  IMO, the 440BX boards are pretty darn stable; I was working on the LinuxBIOS/Coreboot project with three of them here.

The common problem on these (or any) boards is dust in the RAM, PCI and/or CPU slots -- it doesn't look like you have slot style CPU slots on the P2B-D.  Getting the dust out and re-seating the boards always seems to solve the problem(s) here.

I have a Tyan Tiger 100 440BX along with two other cheaper 440BX boards around and they tend to be rock solid (aside from the dust issues).
Comment 76 Ognjan Iordanov 2009-09-11 10:06:31 UTC
Sorry for the confusion. It was my fault. It seems to work now with noapic and acpi=off. Thanks a lot.
Comment 77 richie 2009-09-11 15:51:03 UTC
hi;

Just to offer some information that may help track this issue down;

I'm running on a dual Xeon (old style) MP system here (a Dell Precision WorkStation 530), and I do _NOT_ get these lockups using a Debian
2.6.30-1-686 sid kernel.
Uptime ==  15:21:19 up 31 days,  8:17,  1 user,  load average: 0.09, 0.05, 0.01

* cat /proc/version
Linux version 2.6.30-1-686 (Debian 2.6.30-4) (waldi@debian.org) (gcc version 4.3.3 (Debian 4.3.3-15) ) #1 SMP Thu Jul 30 14:45:30 UTC 2009 
(note the gcc version as someone above alluded to as perhaps part of the issue).
 
* Full System specs ; 
http://pompone.cs.ucsb.edu/admin/530_Workstation/00:00.0 Host bridge [0600]: 

* lspci -nn output;
Intel Corporation 82860 860 (Wombat) Chipset Host Bridge (MCH) [8086:2531] (rev 04)
00:01.0 PCI bridge [0604]: Intel Corporation 82850 850 (Tehama) Chipset AGP Bridge [8086:2532] (rev 04)
00:02.0 PCI bridge [0604]: Intel Corporation 82860 860 (Wombat) Chipset AGP Bridge [8086:2533] (rev 04)
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev 04)
00:1f.0 ISA bridge [0601]: Intel Corporation 82801BA ISA Bridge (LPC) [8086:2440] (rev 04)
00:1f.1 IDE interface [0101]: Intel Corporation 82801BA IDE U100 Controller [8086:244b] (rev 04)
00:1f.2 USB Controller [0c03]: Intel Corporation 82801BA/BAM USB Controller #1 [8086:2442] (rev 04)
00:1f.3 SMBus [0c05]: Intel Corporation 82801BA/BAM SMBus Controller [8086:2443] (rev 04)
00:1f.4 USB Controller [0c03]: Intel Corporation 82801BA/BAM USB Controller #1 [8086:2444] (rev 04)
00:1f.5 Multimedia audio controller [0401]: Intel Corporation 82801BA/BAM AC'97 Audio Controller [8086:2445] (rev 04)
01:00.0 VGA compatible controller [0300]: nVidia Corporation NV34 [GeForce FX 5500] [10de:0326] (rev a1)
02:1f.0 PCI bridge [0604]: Intel Corporation 82806AA PCI64 Hub PCI Bridge [8086:1360] (rev 03)
03:00.0 PIC [0800]: Intel Corporation 82806AA PCI64 Hub Advanced Programmable Interrupt Controller [8086:1161] (rev 01)
04:0b.0 Ethernet controller [0200]: 3Com Corporation 3c905C-TX/TX-M [Tornado] [10b7:9200] (rev 78)
04:0c.0 FireWire (IEEE 1394) [0c00]: Texas Instruments TSB12LV26 IEEE-1394 Controller (Link) [104c:8020]
1specs.htm

cat /proc/cmdline;
root=/dev/hda2 ro acpi=force vga=79

cat /proc/cpuinfo;
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 0
model name      : Intel(R) Xeon(TM) CPU 1700MHz
stepping        : 10
cpu MHz         : 1695.037
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pebs bts
bogomips        : 3390.07
clflush size    : 64
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 0
model name      : Intel(R) Xeon(TM) CPU 1700MHz
stepping        : 10
cpu MHz         : 1695.037
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pebs bts
bogomips        : 3389.93
clflush size    : 64
power management:

* free -m
             total       used       free     shared    buffers     cached
Mem:           502        486         15          0         48        124
-/+ buffers/cache:        313        189
Swap:          964         82        882


You can see the CPU versions/family and Chipset from info above.
If you can use any other relevant info, i'd be glad to post it.
p.s. I also have a pentiumII _single_ CPU system, it's using a debian
.29 kernel, i have yet to upgrade that system, however, AIUI, this
issue does not manifest itself in non-MP ppro/pII/pIII systems.

rich
Comment 78 richie 2009-09-11 16:07:15 UTC
hi again;

just some correction and additional info (and i _didn't_ explicitly/intentionally do that cc mailing stuff i see when posting).

* root=/dev/hda2 ro acpi=force vga=791  <--corrected
(i forget if acpi=force is even needed, it's a carryover from my pII system, prior to 2.6.18 - seems the pII needed it to enable ACPI - and my kernel line
for the pII also contains 'lapic' ..it's now running a .29 though).

* even though you see the 'ht' cpuflag above, there is no BIOS setting for it to enable/disable, and i found that in anything less than 1.8GHz Xeons, even though the flag is present, the CPUs don't have HT ability;
http://lists.us.dell.com/pipermail/linux-poweredge/2002-June/003037.html

$ egrep 'X86_HT|HT_IRQ' /boot/config-2.6.30-1-686
CONFIG_X86_HT=y
CONFIG_HT_IRQ=y

Also; i'm using only the Xorg 'nv' driver, so _not_ the proprietary nvidia one.
Also; apologies about the horrible lspci output formatting .. uff.

rich
Comment 79 richie 2009-09-11 16:15:00 UTC
crap;

one more fix; broken URL for System specs;

http://pompone.cs.ucsb.edu/admin/530_Workstation/1specs.htm  <--corrected

sorry about so much noise and needed correction.

rich