Bug 7727

Summary: spinlock CPU recursion
Product: Other Reporter: support
Component: OtherAssignee: David Howells (dhowells)
Status: RESOLVED CODE_FIX    
Severity: normal CC: klute, s
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.18-1.2239 Subsystem:
Regression: --- Bisected commit-id:
Attachments: Patch to fix the key serial number collision problem

Description support 2006-12-21 10:59:24 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.14
Distribution:Fedora Core 5
Hardware Environment:pogolinux PW1465A Dual opteron server with mirrored 160 Gig SATA drives
Software Environment:Fedora Core 5; Linux Kernel 2.16.18-1.2239 for x86_64; Apache 2.2.3; php 5.1
Problem Description:
We would like to report an error we received from one of our web servers.  We are hesitantly suggesting 
that this is a software issue:  we have an identical machine which has not exhibited this error.  A line 
from the error log seems to provide some insight:

Dec 12 10:13:01 clio kernel:  <0>BUG: spinlock cpu recursion on CPU#1, suexec/27413 (Not tainted)

the complete text of the error log is as follows:

Dec 12 10:13:00 clio kernel: Unable to handle kernel NULL pointer dereference at 0000000000000010 
RIP: Dec 12 10:13:00 clio kernel:  [<ffffffff80225942>] __rb_rotate_left+0x7/0x5b
Dec 12 10:13:00 clio kernel: PGD 3a828067 PUD 3d934067 PMD 0 Dec 12 10:13:00 clio kernel: Oops: 
0000 [1] SMP Dec 12 10:13:00 clio kernel: last sysfs file: /block/hdb/size
Dec 12 10:13:00 clio kernel: CPU 1 Dec 12 10:13:00 clio kernel: Modules linked in: ipv6 nfs lockd 
fscache nfs_acl rfcomm l2cap bluetooth sunrpc dm_mirror dm_mod video sbs i2c_ec i2c_core button 
battery asu
s_acpi ac lp parport_pc parport sg tg3 ide_cd cdrom shpchp k8_edac edac_mc ohci_hcd serio_raw 
floppy ehci_hcd pcspkr raid1 ext3 jbd sata_svw libata sd_mod scsi_mod
Dec 12 10:13:00 clio kernel: Pid: 27406, comm: suexec Not tainted 2.6.18-1.2239.fc5 #1
Dec 12 10:13:00 clio kernel: RIP: 0010:[<ffffffff80225942>]  [<ffffffff80225942>] __rb_rotate_left
+0x7/0x5b
Dec 12 10:13:00 clio kernel: RSP: 0018:ffff810151397df0  EFLAGS: 00010282
Dec 12 10:13:00 clio kernel: RAX: ffff81005a1ded48 RBX: ffff810102505508 RCX: 0000000000000000
Dec 12 10:13:00 clio kernel: RDX: 0000000000000000 RSI: ffffffff806de5e0 RDI: ffff810203166088
Dec 12 10:13:00 clio kernel: RBP: ffff810203166088 R08: ffff8102031668c8 R09: 0000000000000000
Dec 12 10:13:00 clio kernel: R10: 000000005e4ae5f3 R11: ffff810151397c70 R12: ffff810102505508
Dec 12 10:13:00 clio kernel: R13: ffff81005a1ded48 R14: ffffffff806de5e0 R15: 0000000000000026
Dec 12 10:13:00 clio kernel: FS:  00002aaaaaabb850(0000) GS:ffff810103c3b1c0(0000) knlGS:
00000000f7fee8d0
Dec 12 10:13:00 clio kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 12 10:13:00 clio kernel: CR2: 0000000000000010 CR3: 00000000da38b000 CR4: 
00000000000006e0
Dec 12 10:13:00 clio kernel: Process suexec (pid: 27406, threadinfo ffff810151396000, task 
ffff8101d7cf5080)
Dec 12 10:13:00 clio kernel: Stack:  ffffffff80212aff ffff81005a1ded40 ffff810102505518 
0000000000000000
Dec 12 10:13:00 clio kernel:  ffff81005a1ded40 ffff810151397eb8 ffffffff80312779 
0000000046f0a978
Dec 12 10:13:00 clio kernel:  0000000000000000 1f3f0000aa8adfff ffff8101d7cf5080 000003eaffffffff
Dec 12 10:13:00 clio kernel: Call Trace:
Dec 12 10:13:00 clio kernel:  [<ffffffff80212aff>] rb_insert_color+0xb2/0xda
Dec 12 10:13:00 clio kernel:  [<ffffffff80312779>] key_alloc+0x2b0/0x384
Dec 12 10:13:00 clio kernel:  [<ffffffff8031377b>] keyring_alloc+0x29/0x5f
Dec 12 10:13:00 clio kernel:  [<ffffffff80314ea2>] alloc_uid_keyring+0x3d/0xa6
Dec 12 10:13:00 clio kernel:  [<ffffffff80293a5c>] alloc_uid+0xa9/0x16f
Dec 12 10:13:00 clio kernel:  [<ffffffff802963d6>] set_user+0xf/0x97
Dec 12 10:13:00 clio kernel:  [<ffffffff80297b5c>] sys_setuid+0x7d/0x154
Dec 12 10:13:00 clio kernel:  [<ffffffff8025c00e>] system_call+0x7e/0x83
Dec 12 10:13:00 clio kernel: DWARF2 unwinder stuck at system_call+0x7e/0x83
Dec 12 10:13:00 clio kernel: Leftover inexact backtrace:
Dec 12 10:13:00 clio kernel: Dec 12 10:13:00 clio kernel: Dec 12 10:13:00 clio kernel: Code: 48 8b 51 
10 49 83 e0 fc 48 85 d2 48 89 57 08 74 0c 48 8b 02 Dec 12 10:13:00 clio kernel: RIP  
[<ffffffff80225942>] __rb_rotate_left+0x7/0x5b
Dec 12 10:13:00 clio kernel:  RSP <ffff810151397df0>
Dec 12 10:13:00 clio kernel: CR2: 0000000000000010
Dec 12 10:13:01 clio kernel:  <0>BUG: spinlock cpu recursion on CPU#1, suexec/27413 (Not tainted)
Dec 12 10:13:01 clio kernel:  lock: ffffffff80566200, .magic: dead4ead, .owner: hazel.cgi/
27408, .owner_cpu: 1
Dec 12 10:13:01 clio kernel: Dec 12 10:13:01 clio kernel: Call Trace:
Dec 12 10:13:01 clio kernel:  [<ffffffff802691d9>] show_trace+0x34/0x47
Dec 12 10:13:01 clio kernel:  [<ffffffff802691fe>] dump_stack+0x12/0x17
Dec 12 10:13:01 clio kernel:  [<ffffffff80207749>] _raw_spin_lock+0x5d/0xf3
Dec 12 10:13:01 clio kernel:  [<ffffffff803126dd>] key_alloc+0x214/0x384
Dec 12 10:13:01 clio kernel:  [<ffffffff8031377b>] keyring_alloc+0x29/0x5f
Dec 12 10:13:01 clio kernel:  [<ffffffff80314ea2>] alloc_uid_keyring+0x3d/0xa6
Dec 12 10:13:01 clio kernel:  [<ffffffff80293a5c>] alloc_uid+0xa9/0x16f
Dec 12 10:13:01 clio kernel:  [<ffffffff802963d6>] set_user+0xf/0x97
Dec 12 10:13:01 clio kernel:  [<ffffffff80297b5c>] sys_setuid+0x7d/0x154
Dec 12 10:13:01 clio kernel:  [<ffffffff8025c00e>] system_call+0x7e/0x83
Dec 12 10:13:01 clio kernel: DWARF2 unwinder stuck at system_call+0x7e/0x83
Dec 12 10:13:01 clio kernel: Leftover inexact backtrace:
Dec 12 10:13:01 clio kernel: 
Steps to reproduce:
Comment 1 David Woodhouse 2007-02-05 02:49:09 UTC
How often does this happen, and does it manifest itself in other ways?

Can we discount the possibility of a simple RAM problem? 
Comment 2 David Howells 2007-02-05 02:52:17 UTC
I would say that the spinlock recursion bug is almost certainly a dead herring 
as it occurs after the NULL-pointer deref message occurs.  It'll be due to the 
key management spinlock not having been released because the process that was 
holding it got killed, and so CPU#1 still holds the lock.
Comment 3 David Howells 2007-02-05 03:44:07 UTC
This is where is looks to be:

<__rb_rotate_left+0>:        mov    0x8(%rdi),%rcx
<__rb_rotate_left+4>:        mov    (%rdi),%r8
<__rb_rotate_left+7>:        mov    0x10(%rcx),%rdx               <----
<__rb_rotate_left+11>:       and    $0xfffffffffffffffc,%r8

So, it looks like:

static void __rb_rotate_left(struct rb_node *node, struct rb_root *root)
{
	struct rb_node *right = node->rb_right;
	struct rb_node *parent = rb_parent(node);

	if ((node->rb_right = right->rb_left))    <---

in the source code.  %RDI contains the "node" argument on function entry.  
The "right" variable lurks in %RCX from the first insn.  So "right" is zero at 
the time of the crash - ie: when the third instruction tries to dereference it 
to get "right->rb_left".
Comment 4 David Howells 2007-02-05 04:15:25 UTC
Can you enter this in the Red Hat bugzilla system?
Comment 5 David Howells 2007-02-05 09:01:38 UTC
Can you try running this script to see if this causes the fault to occur?  It 
should exercise the creation and destruction of keys.

#!/bin/bash

NR_CPUS=2

if [ "$1" = "go" ]
then
    for ((i=1; i<=500; i++))
    do
	for ((i=1; i<=42; i++))
	do
	    keyctl add user a$i a @s >/dev/null || exit $?
	done
	keyctl clear @s || exit $?
    done
    exit 0
fi

for ((i=1; i<=$NR_CPUS; i++))
do
    keyctl session - $0 go &
done

wait
Comment 6 support 2007-02-05 10:39:44 UTC
We will have to schedule some downtime for this machine so we don't disrupt its
normal traffic - sometime in the evening - and post further information as we
have it.

This machine has only seen this occur the one time - and at that time it was
under heavy load - which is why we feel confident that this is not a hardware issue.

This is being entered into the bugzilla.redhat system as well.
Comment 7 David Howells 2007-02-06 02:23:34 UTC
Thanks.

I left that script running on my testbox overnight and it crashed too, though 
I managed to lose the original fault report (it was followed up by many many 
spinlock retake fault reports).
Comment 8 David Howells 2007-02-06 03:12:05 UTC
Okay... Found it: the key serial number collision avoidance code is wrong.

This didn't use to be a problem as the key serial numbers were allocated from 
a simple incremented counter, and you'd have to go through 2 billion keys 
before encountering a collision.

However, now that random numbers are used instead, collisions are much more 
likely.
Comment 9 David Howells 2007-02-06 05:37:39 UTC
Created attachment 10312 [details]
Patch to fix the key serial number collision problem
Comment 10 Andrew Morton 2007-02-06 15:42:20 UTC
merged a patch from David into -mm.
Comment 11 Adrian Bunk 2007-02-24 09:53:56 UTC
*** Bug 8067 has been marked as a duplicate of this bug. ***
Comment 12 David Howells 2007-05-23 14:09:59 UTC
*** Bug 7915 has been marked as a duplicate of this bug. ***