Bug 11796

Summary: SYSV ipc lock
Product: Other Reporter: Clement Calmels (cboulte)
Component: OtherAssignee: other_other
Status: CLOSED OBSOLETE    
Severity: high CC: alan
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27 Subsystem:
Regression: No Bisected commit-id:
Attachments: A simple test that create/delete and sysv message queue, during msgctl request.

Description Clement Calmels 2008-10-20 11:33:06 UTC
Latest working kernel version:
Earliest failing kernel version:
Distribution: Debian Lenny
Hardware Environment: HP Proliant DL140
Software Environment:
Problem Description:

When playing with SYSV ipc, lock happens. 

Steps to reproduce:

I will attach a "simple" program. Launch it several times. The first weird kernel output looks like:

[  390.327265] general protection fault: 0000 [1] SMP
[  390.329004] CPU 2
[  390.329004] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys
[  390.329004] Pid: 2858, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #2
[  390.329004] RIP: 0010:[<ffffffff8025b021>]  [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a
[  390.329004] RSP: 0018:ffff8801455e7d18  EFLAGS: 00010006
[  390.329004] RAX: ffff8801451dada8 RBX: 0000000000000246 RCX: 0000000000000000[  390.329004] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801451dada8[  390.329004] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[  390.329004] R10: 0000000000000000 R11: ffff880144d08200 R12: 0038004000000000[  390.329004] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880144d08200[  390.329004] FS:  00007fe439afd6e0(0000) GS:ffff8801470eebc0(0000) knlGS:0000000000000000
[  390.329004] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  390.329004] CR2: 00007fe439668c70 CR3: 00000001455a3000 CR4: 00000000000006e0[  390.329004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[  390.329004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[  390.329004] Process sysv_test2 (pid: 2858, threadinfo ffff8801455e6000, task ffff880144d08200)
[  390.329004] Stack:  ffff880144d08200 000000000000003f 0000000080993830 0000000000000002
[  390.329004]  0000000000000000 ffff8801451dada8 0000000041b06f50 0000000200000001
[  390.329004]  0000000000000000 ffffffff80598980 0000000000000000 0000000180255e60
[  390.329004] Call Trace:
[  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
[  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
[  390.329004]  [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2
[  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[  390.329004]  [<ffffffff80458637>] _spin_lock+0x2d/0x59
[  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[  390.329004]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
[  390.329004]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
[  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
[  390.329004]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
[  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
[  390.329004]  [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a
[  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
[  390.329004]  [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a
[  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
[  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
[  390.329004]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[  390.329004]
[  390.329004]
[  390.329004] Code: 41 39 54 24 28 74 24 e8 7e 42 0e 00 85 c0 0f 84 d6 0f 00 00 83 3d a7 8c c8 00 00 be 47 03 00 00 0f 85 c4 0f 00 00 e9 eb 0e 00 00 <f0> 41 ff 84 24 38 01 00 00 45 8b b7 10 06 00 00 41 83 fe 2f 76
[  390.329004] RIP  [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a
[  390.329004]  RSP <ffff8801455e7d18>
[  390.329004] ---[ end trace 507bf06c1bae10b7 ]---

If the program is launched again and again,the cpu finaly locks:

[ 1332.517502] BUG: soft lockup - CPU#3 stuck for 61s! [sysv_test2:3224]
[ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys
[ 1332.517505] irq event stamp: 0
[ 1332.517505] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[ 1332.517505] hardirqs last disabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4
[ 1332.517505] softirqs last  enabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4
[ 1332.517505] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 1332.517505] CPU 3:
[ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys
[ 1332.517505] Pid: 3224, comm: sysv_test2 Tainted: G      D   2.6.27-ipc_lock #2
[ 1332.517505] RIP: 0010:[<ffffffff80213529>]  [<ffffffff80213529>] native_read_tsc+0x8/0x1d
[ 1332.517505] RSP: 0018:ffff88014502bdb8  EFLAGS: 00000206
[ 1332.517505] RAX: 00000000e234ed82 RBX: ffffffffe234ecb7 RCX: 00000000e234ecb7[ 1332.517505] RDX: 00000000000003bd RSI: ffffffff80301e0a RDI: 0000000000000001[ 1332.517505] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 1332.517505] R10: 0000000000000000 R11: ffffffff8033e409 R12: ffff88002f62a160[ 1332.517505] R13: ffff8800aee01000 R14: ffff88014502a000 R15: ffffffff80829810[ 1332.517505] FS:  00007f4ab15246e0(0000) GS:ffff8801470ee340(0000) knlGS:0000000000000000
[ 1332.517505] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1332.517505] CR2: 00007f4ab108fc70 CR3: 00000001441cf000 CR4: 00000000000006e0[ 1332.517505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 1332.517505] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 1332.517505]
[ 1332.517505] Call Trace:
[ 1332.517505]  [<ffffffff8033e42a>] delay_tsc+0x21/0x51
[ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[ 1332.517505]  [<ffffffff80220ec9>] __ticket_spin_trylock+0x11/0x18
[ 1332.517505]  [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102
[ 1332.517505]  [<ffffffff80458658>] _spin_lock+0x4e/0x59
[ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[ 1332.517505]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
[ 1332.517505]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
[ 1332.517505]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
[ 1332.517505]  [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95
[ 1332.517505]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 1332.517505]  [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95
[ 1332.517505]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 1332.517505]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[ 1332.517505]

Every 'ipcs' command will lock a cpu and the node become completely unusable.
A different output:

[ 1473.176004] BUG: soft lockup - CPU#0 stuck for 61s! [ipcs:3287]
[ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys
[ 1473.176008] irq event stamp: 0
[ 1473.176008] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[ 1473.176008] hardirqs last disabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4
[ 1473.176008] softirqs last  enabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4
[ 1473.176008] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 1473.176008] CPU 0:
[ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys
[ 1473.176008] Pid: 3287, comm: ipcs Tainted: G      D   2.6.27-ipc_lock #2
[ 1473.176008] RIP: 0010:[<ffffffff80213527>]  [<ffffffff80213527>] native_read_tsc+0x6/0x1d
[ 1473.176008] RSP: 0018:ffff880141033dd8  EFLAGS: 00000202
[ 1473.176008] RAX: 00000419948a6e9b RBX: ffffffff948a6e9b RCX: 00000000948a6e9b[ 1473.176008] RDX: 00000000948a6e9b RSI: ffffffff80301e0a RDI: 0000000000000001[ 1473.176008] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 1473.176008] R10: 8101010101010100 R11: ffffffff8033e409 R12: 0000000000000001[ 1473.176008] R13: ffff8800ae95d000 R14: ffff880141032000 R15: ffffffff80829810[ 1473.176008] FS:  00007ff8a89866e0(0000) GS:ffffffff80624a80(0000) knlGS:0000000000000000
[ 1473.176008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1473.176008] CR2: 00007ff8a8486ac0 CR3: 000000014102b000 CR4: 00000000000006e0[ 1473.176008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 1473.176008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 1473.176008]
[ 1473.176008] Call Trace:
[ 1473.176008]  [<ffffffff8033e42a>] delay_tsc+0x21/0x51
[ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[ 1473.176008]  [<ffffffff80220ebb>] __ticket_spin_trylock+0x3/0x18
[ 1473.176008]  [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102
[ 1473.176008]  [<ffffffff80458658>] _spin_lock+0x4e/0x59
[ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
[ 1473.176008]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
[ 1473.176008]  [<ffffffff802ed00a>] dnotify_parent+0x1f/0x6d
[ 1473.176008]  [<ffffffff8030333d>] sys_msgctl+0x16f/0x42f
[ 1473.176008]  [<ffffffff8034163d>] _raw_spin_trylock+0xd/0x31
[ 1473.176008]  [<ffffffff802b1c46>] vfs_write+0x121/0x156
[ 1473.176008]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 1473.176008]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[ 1473.176008]

These locks are also in the kernel 2.6.27.2.
Comment 1 Clement Calmels 2008-10-20 11:35:49 UTC
Created attachment 18380 [details]
A simple test that create/delete and sysv message queue, during msgctl request.

compile:
$ gcc -g -O2 -Wall sysv_test2.c -o sysv_test2
launch
$ while [ 1 ]; do ./sysv_test2 ; done
Comment 2 Anonymous Emailer 2008-10-20 12:27:36 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

(Pierre and Nadia don't appear to have bugzilla accounts?)


On Mon, 20 Oct 2008 11:33:08 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11796
> 
>            Summary: SYSV ipc lock
>            Product: Other
>            Version: 2.5
>      KernelVersion: 2.6.27
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Other
>         AssignedTo: other_other@kernel-bugs.osdl.org
>         ReportedBy: cboulte@gmail.com
> 
> 
> Latest working kernel version:
> Earliest failing kernel version:
> Distribution: Debian Lenny
> Hardware Environment: HP Proliant DL140
> Software Environment:
> Problem Description:
> 
> When playing with SYSV ipc, lock happens. 

OK, this is bad.  Thanks for reporting it, and thanks heaps for the
test case.

Nadia, Pierre: would you have time to look into this one?

Thanks.

> Steps to reproduce:
> 
> I will attach a "simple" program. Launch it several times. The first weird
> kernel output looks like:
> 
> [  390.327265] general protection fault: 0000 [1] SMP
> [  390.329004] CPU 2
> [  390.329004] Modules linked in: autofs4 video output ac battery ipv6 nfs
> lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev
> ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> thermal_sys
> [  390.329004] Pid: 2858, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #2
> [  390.329004] RIP: 0010:[<ffffffff8025b021>]  [<ffffffff8025b021>]
> __lock_acquire+0x4f4/0x153a
> [  390.329004] RSP: 0018:ffff8801455e7d18  EFLAGS: 00010006
> [  390.329004] RAX: ffff8801451dada8 RBX: 0000000000000246 RCX:
> 0000000000000000[  390.329004] RDX: 0000000000000000 RSI: 0000000000000000
> RDI:
> ffff8801451dada8[  390.329004] RBP: 0000000000000000 R08: 0000000000000002
> R09:
> 0000000000000000[  390.329004] R10: 0000000000000000 R11: ffff880144d08200
> R12:
> 0038004000000000[  390.329004] R13: 0000000000000000 R14: 0000000000000000
> R15:
> ffff880144d08200[  390.329004] FS:  00007fe439afd6e0(0000)
> GS:ffff8801470eebc0(0000) knlGS:0000000000000000
> [  390.329004] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  390.329004] CR2: 00007fe439668c70 CR3: 00000001455a3000 CR4:
> 00000000000006e0[  390.329004] DR0: 0000000000000000 DR1: 0000000000000000
> DR2:
> 0000000000000000[  390.329004] DR3: 0000000000000000 DR6: 00000000ffff0ff0
> DR7:
> 0000000000000400[  390.329004] Process sysv_test2 (pid: 2858, threadinfo
> ffff8801455e6000, task ffff880144d08200)
> [  390.329004] Stack:  ffff880144d08200 000000000000003f 0000000080993830
> 0000000000000002
> [  390.329004]  0000000000000000 ffff8801451dada8 0000000041b06f50
> 0000000200000001
> [  390.329004]  0000000000000000 ffffffff80598980 0000000000000000
> 0000000180255e60
> [  390.329004] Call Trace:
> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
> [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
> [  390.329004]  [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2
> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [  390.329004]  [<ffffffff80458637>] _spin_lock+0x2d/0x59
> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [  390.329004]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> [  390.329004]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
> [  390.329004]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
> [  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  390.329004]  [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a
> [  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  390.329004]  [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a
> [  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
> [  390.329004]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
> [  390.329004]
> [  390.329004]
> [  390.329004] Code: 41 39 54 24 28 74 24 e8 7e 42 0e 00 85 c0 0f 84 d6 0f 00
> 00 83 3d a7 8c c8 00 00 be 47 03 00 00 0f 85 c4 0f 00 00 e9 eb 0e 00 00 <f0>
> 41
> ff 84 24 38 01 00 00 45 8b b7 10 06 00 00 41 83 fe 2f 76
> [  390.329004] RIP  [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a
> [  390.329004]  RSP <ffff8801455e7d18>
> [  390.329004] ---[ end trace 507bf06c1bae10b7 ]---
> 
> If the program is launched again and again,the cpu finaly locks:
> 
> [ 1332.517502] BUG: soft lockup - CPU#3 stuck for 61s! [sysv_test2:3224]
> [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs
> lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev
> ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> thermal_sys
> [ 1332.517505] irq event stamp: 0
> [ 1332.517505] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [ 1332.517505] hardirqs last disabled at (0): [<ffffffff80238b37>]
> copy_process+0x55c/0x12e4
> [ 1332.517505] softirqs last  enabled at (0): [<ffffffff80238b37>]
> copy_process+0x55c/0x12e4
> [ 1332.517505] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 1332.517505] CPU 3:
> [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs
> lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev
> ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> thermal_sys
> [ 1332.517505] Pid: 3224, comm: sysv_test2 Tainted: G      D  
> 2.6.27-ipc_lock
> #2
> [ 1332.517505] RIP: 0010:[<ffffffff80213529>]  [<ffffffff80213529>]
> native_read_tsc+0x8/0x1d
> [ 1332.517505] RSP: 0018:ffff88014502bdb8  EFLAGS: 00000206
> [ 1332.517505] RAX: 00000000e234ed82 RBX: ffffffffe234ecb7 RCX:
> 00000000e234ecb7[ 1332.517505] RDX: 00000000000003bd RSI: ffffffff80301e0a
> RDI:
> 0000000000000001[ 1332.517505] RBP: 0000000000000000 R08: 0000000000000002
> R09:
> 0000000000000000[ 1332.517505] R10: 0000000000000000 R11: ffffffff8033e409
> R12:
> ffff88002f62a160[ 1332.517505] R13: ffff8800aee01000 R14: ffff88014502a000
> R15:
> ffffffff80829810[ 1332.517505] FS:  00007f4ab15246e0(0000)
> GS:ffff8801470ee340(0000) knlGS:0000000000000000
> [ 1332.517505] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1332.517505] CR2: 00007f4ab108fc70 CR3: 00000001441cf000 CR4:
> 00000000000006e0[ 1332.517505] DR0: 0000000000000000 DR1: 0000000000000000
> DR2:
> 0000000000000000[ 1332.517505] DR3: 0000000000000000 DR6: 00000000ffff0ff0
> DR7:
> 0000000000000400[ 1332.517505]
> [ 1332.517505] Call Trace:
> [ 1332.517505]  [<ffffffff8033e42a>] delay_tsc+0x21/0x51
> [ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [ 1332.517505]  [<ffffffff80220ec9>] __ticket_spin_trylock+0x11/0x18
> [ 1332.517505]  [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102
> [ 1332.517505]  [<ffffffff80458658>] _spin_lock+0x4e/0x59
> [ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [ 1332.517505]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> [ 1332.517505]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
> [ 1332.517505]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
> [ 1332.517505]  [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95
> [ 1332.517505]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> [ 1332.517505]  [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95
> [ 1332.517505]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> [ 1332.517505]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
> [ 1332.517505]
> 
> Every 'ipcs' command will lock a cpu and the node become completely unusable.
> A different output:
> 
> [ 1473.176004] BUG: soft lockup - CPU#0 stuck for 61s! [ipcs:3287]
> [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs
> lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev
> ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> thermal_sys
> [ 1473.176008] irq event stamp: 0
> [ 1473.176008] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> [ 1473.176008] hardirqs last disabled at (0): [<ffffffff80238b37>]
> copy_process+0x55c/0x12e4
> [ 1473.176008] softirqs last  enabled at (0): [<ffffffff80238b37>]
> copy_process+0x55c/0x12e4
> [ 1473.176008] softirqs last disabled at (0): [<0000000000000000>] 0x0
> [ 1473.176008] CPU 0:
> [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs
> lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev
> ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> thermal_sys
> [ 1473.176008] Pid: 3287, comm: ipcs Tainted: G      D   2.6.27-ipc_lock #2
> [ 1473.176008] RIP: 0010:[<ffffffff80213527>]  [<ffffffff80213527>]
> native_read_tsc+0x6/0x1d
> [ 1473.176008] RSP: 0018:ffff880141033dd8  EFLAGS: 00000202
> [ 1473.176008] RAX: 00000419948a6e9b RBX: ffffffff948a6e9b RCX:
> 00000000948a6e9b[ 1473.176008] RDX: 00000000948a6e9b RSI: ffffffff80301e0a
> RDI:
> 0000000000000001[ 1473.176008] RBP: 0000000000000000 R08: 0000000000000002
> R09:
> 0000000000000000[ 1473.176008] R10: 8101010101010100 R11: ffffffff8033e409
> R12:
> 0000000000000001[ 1473.176008] R13: ffff8800ae95d000 R14: ffff880141032000
> R15:
> ffffffff80829810[ 1473.176008] FS:  00007ff8a89866e0(0000)
> GS:ffffffff80624a80(0000) knlGS:0000000000000000
> [ 1473.176008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 1473.176008] CR2: 00007ff8a8486ac0 CR3: 000000014102b000 CR4:
> 00000000000006e0[ 1473.176008] DR0: 0000000000000000 DR1: 0000000000000000
> DR2:
> 0000000000000000[ 1473.176008] DR3: 0000000000000000 DR6: 00000000ffff0ff0
> DR7:
> 0000000000000400[ 1473.176008]
> [ 1473.176008] Call Trace:
> [ 1473.176008]  [<ffffffff8033e42a>] delay_tsc+0x21/0x51
> [ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [ 1473.176008]  [<ffffffff80220ebb>] __ticket_spin_trylock+0x3/0x18
> [ 1473.176008]  [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102
> [ 1473.176008]  [<ffffffff80458658>] _spin_lock+0x4e/0x59
> [ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> [ 1473.176008]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> [ 1473.176008]  [<ffffffff802ed00a>] dnotify_parent+0x1f/0x6d
> [ 1473.176008]  [<ffffffff8030333d>] sys_msgctl+0x16f/0x42f
> [ 1473.176008]  [<ffffffff8034163d>] _raw_spin_trylock+0xd/0x31
> [ 1473.176008]  [<ffffffff802b1c46>] vfs_write+0x121/0x156
> [ 1473.176008]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> [ 1473.176008]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
> [ 1473.176008]
> 
> These locks are also in the kernel 2.6.27.2.
> 
Comment 3 Anonymous Emailer 2008-10-20 22:54:19 UTC
Reply-To: Nadia.Derbey@bull.net

On Mon, 2008-10-20 at 12:26 -0700, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> (Pierre and Nadia don't appear to have bugzilla accounts?)
> 
> 
> On Mon, 20 Oct 2008 11:33:08 -0700 (PDT)
> bugme-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=11796
> > 
> >            Summary: SYSV ipc lock
> >            Product: Other
> >            Version: 2.5
> >      KernelVersion: 2.6.27
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Other
> >         AssignedTo: other_other@kernel-bugs.osdl.org
> >         ReportedBy: cboulte@gmail.com
> > 
> > 
> > Latest working kernel version:
> > Earliest failing kernel version:
> > Distribution: Debian Lenny
> > Hardware Environment: HP Proliant DL140
> > Software Environment:
> > Problem Description:
> > 
> > When playing with SYSV ipc, lock happens. 
> 
> OK, this is bad.  Thanks for reporting it, and thanks heaps for the
> test case.
> 
> Nadia, Pierre: would you have time to look into this one?

I will have a look at it.

Regards,
Nadia

> 
> Thanks.
> 
> > Steps to reproduce:
> > 
> > I will attach a "simple" program. Launch it several times. The first weird
> > kernel output looks like:
> > 
> > [  390.327265] general protection fault: 0000 [1] SMP
> > [  390.329004] CPU 2
> > [  390.329004] Modules linked in: autofs4 video output ac battery ipv6 nfs
> > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev
> evdev
> > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> > thermal_sys
> > [  390.329004] Pid: 2858, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #2
> > [  390.329004] RIP: 0010:[<ffffffff8025b021>]  [<ffffffff8025b021>]
> > __lock_acquire+0x4f4/0x153a
> > [  390.329004] RSP: 0018:ffff8801455e7d18  EFLAGS: 00010006
> > [  390.329004] RAX: ffff8801451dada8 RBX: 0000000000000246 RCX:
> > 0000000000000000[  390.329004] RDX: 0000000000000000 RSI: 0000000000000000
> RDI:
> > ffff8801451dada8[  390.329004] RBP: 0000000000000000 R08: 0000000000000002
> R09:
> > 0000000000000000[  390.329004] R10: 0000000000000000 R11: ffff880144d08200
> R12:
> > 0038004000000000[  390.329004] R13: 0000000000000000 R14: 0000000000000000
> R15:
> > ffff880144d08200[  390.329004] FS:  00007fe439afd6e0(0000)
> > GS:ffff8801470eebc0(0000) knlGS:0000000000000000
> > [  390.329004] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [  390.329004] CR2: 00007fe439668c70 CR3: 00000001455a3000 CR4:
> > 00000000000006e0[  390.329004] DR0: 0000000000000000 DR1: 0000000000000000
> DR2:
> > 0000000000000000[  390.329004] DR3: 0000000000000000 DR6: 00000000ffff0ff0
> DR7:
> > 0000000000000400[  390.329004] Process sysv_test2 (pid: 2858, threadinfo
> > ffff8801455e6000, task ffff880144d08200)
> > [  390.329004] Stack:  ffff880144d08200 000000000000003f 0000000080993830
> > 0000000000000002
> > [  390.329004]  0000000000000000 ffff8801451dada8 0000000041b06f50
> > 0000000200000001
> > [  390.329004]  0000000000000000 ffffffff80598980 0000000000000000
> > 0000000180255e60
> > [  390.329004] Call Trace:
> > [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
> > [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
> > [  390.329004]  [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2
> > [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [  390.329004]  [<ffffffff80458637>] _spin_lock+0x2d/0x59
> > [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [  390.329004]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> > [  390.329004]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
> > [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
> > [  390.329004]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
> > [  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> > [  390.329004]  [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a
> > [  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> > [  390.329004]  [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a
> > [  390.329004]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> > [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
> > [  390.329004]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
> > [  390.329004]
> > [  390.329004]
> > [  390.329004] Code: 41 39 54 24 28 74 24 e8 7e 42 0e 00 85 c0 0f 84 d6 0f
> 00
> > 00 83 3d a7 8c c8 00 00 be 47 03 00 00 0f 85 c4 0f 00 00 e9 eb 0e 00 00
> <f0> 41
> > ff 84 24 38 01 00 00 45 8b b7 10 06 00 00 41 83 fe 2f 76
> > [  390.329004] RIP  [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a
> > [  390.329004]  RSP <ffff8801455e7d18>
> > [  390.329004] ---[ end trace 507bf06c1bae10b7 ]---
> > 
> > If the program is launched again and again,the cpu finaly locks:
> > 
> > [ 1332.517502] BUG: soft lockup - CPU#3 stuck for 61s! [sysv_test2:3224]
> > [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs
> > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev
> evdev
> > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> > thermal_sys
> > [ 1332.517505] irq event stamp: 0
> > [ 1332.517505] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> > [ 1332.517505] hardirqs last disabled at (0): [<ffffffff80238b37>]
> > copy_process+0x55c/0x12e4
> > [ 1332.517505] softirqs last  enabled at (0): [<ffffffff80238b37>]
> > copy_process+0x55c/0x12e4
> > [ 1332.517505] softirqs last disabled at (0): [<0000000000000000>] 0x0
> > [ 1332.517505] CPU 3:
> > [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs
> > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev
> evdev
> > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> > thermal_sys
> > [ 1332.517505] Pid: 3224, comm: sysv_test2 Tainted: G      D  
> 2.6.27-ipc_lock
> > #2
> > [ 1332.517505] RIP: 0010:[<ffffffff80213529>]  [<ffffffff80213529>]
> > native_read_tsc+0x8/0x1d
> > [ 1332.517505] RSP: 0018:ffff88014502bdb8  EFLAGS: 00000206
> > [ 1332.517505] RAX: 00000000e234ed82 RBX: ffffffffe234ecb7 RCX:
> > 00000000e234ecb7[ 1332.517505] RDX: 00000000000003bd RSI: ffffffff80301e0a
> RDI:
> > 0000000000000001[ 1332.517505] RBP: 0000000000000000 R08: 0000000000000002
> R09:
> > 0000000000000000[ 1332.517505] R10: 0000000000000000 R11: ffffffff8033e409
> R12:
> > ffff88002f62a160[ 1332.517505] R13: ffff8800aee01000 R14: ffff88014502a000
> R15:
> > ffffffff80829810[ 1332.517505] FS:  00007f4ab15246e0(0000)
> > GS:ffff8801470ee340(0000) knlGS:0000000000000000
> > [ 1332.517505] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 1332.517505] CR2: 00007f4ab108fc70 CR3: 00000001441cf000 CR4:
> > 00000000000006e0[ 1332.517505] DR0: 0000000000000000 DR1: 0000000000000000
> DR2:
> > 0000000000000000[ 1332.517505] DR3: 0000000000000000 DR6: 00000000ffff0ff0
> DR7:
> > 0000000000000400[ 1332.517505]
> > [ 1332.517505] Call Trace:
> > [ 1332.517505]  [<ffffffff8033e42a>] delay_tsc+0x21/0x51
> > [ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [ 1332.517505]  [<ffffffff80220ec9>] __ticket_spin_trylock+0x11/0x18
> > [ 1332.517505]  [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102
> > [ 1332.517505]  [<ffffffff80458658>] _spin_lock+0x4e/0x59
> > [ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [ 1332.517505]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [ 1332.517505]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> > [ 1332.517505]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
> > [ 1332.517505]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
> > [ 1332.517505]  [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95
> > [ 1332.517505]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> > [ 1332.517505]  [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95
> > [ 1332.517505]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> > [ 1332.517505]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
> > [ 1332.517505]
> > 
> > Every 'ipcs' command will lock a cpu and the node become completely
> unusable.
> > A different output:
> > 
> > [ 1473.176004] BUG: soft lockup - CPU#0 stuck for 61s! [ipcs:3287]
> > [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs
> > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev
> evdev
> > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> > thermal_sys
> > [ 1473.176008] irq event stamp: 0
> > [ 1473.176008] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
> > [ 1473.176008] hardirqs last disabled at (0): [<ffffffff80238b37>]
> > copy_process+0x55c/0x12e4
> > [ 1473.176008] softirqs last  enabled at (0): [<ffffffff80238b37>]
> > copy_process+0x55c/0x12e4
> > [ 1473.176008] softirqs last disabled at (0): [<0000000000000000>] 0x0
> > [ 1473.176008] CPU 0:
> > [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs
> > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm
> snd_timer
> > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse
> > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev
> evdev
> > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless
> > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic
> scsi_transport_spi
> > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan
> > thermal_sys
> > [ 1473.176008] Pid: 3287, comm: ipcs Tainted: G      D   2.6.27-ipc_lock #2
> > [ 1473.176008] RIP: 0010:[<ffffffff80213527>]  [<ffffffff80213527>]
> > native_read_tsc+0x6/0x1d
> > [ 1473.176008] RSP: 0018:ffff880141033dd8  EFLAGS: 00000202
> > [ 1473.176008] RAX: 00000419948a6e9b RBX: ffffffff948a6e9b RCX:
> > 00000000948a6e9b[ 1473.176008] RDX: 00000000948a6e9b RSI: ffffffff80301e0a
> RDI:
> > 0000000000000001[ 1473.176008] RBP: 0000000000000000 R08: 0000000000000002
> R09:
> > 0000000000000000[ 1473.176008] R10: 8101010101010100 R11: ffffffff8033e409
> R12:
> > 0000000000000001[ 1473.176008] R13: ffff8800ae95d000 R14: ffff880141032000
> R15:
> > ffffffff80829810[ 1473.176008] FS:  00007ff8a89866e0(0000)
> > GS:ffffffff80624a80(0000) knlGS:0000000000000000
> > [ 1473.176008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > [ 1473.176008] CR2: 00007ff8a8486ac0 CR3: 000000014102b000 CR4:
> > 00000000000006e0[ 1473.176008] DR0: 0000000000000000 DR1: 0000000000000000
> DR2:
> > 0000000000000000[ 1473.176008] DR3: 0000000000000000 DR6: 00000000ffff0ff0
> DR7:
> > 0000000000000400[ 1473.176008]
> > [ 1473.176008] Call Trace:
> > [ 1473.176008]  [<ffffffff8033e42a>] delay_tsc+0x21/0x51
> > [ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [ 1473.176008]  [<ffffffff80220ebb>] __ticket_spin_trylock+0x3/0x18
> > [ 1473.176008]  [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102
> > [ 1473.176008]  [<ffffffff80458658>] _spin_lock+0x4e/0x59
> > [ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [ 1473.176008]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> > [ 1473.176008]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> > [ 1473.176008]  [<ffffffff802ed00a>] dnotify_parent+0x1f/0x6d
> > [ 1473.176008]  [<ffffffff8030333d>] sys_msgctl+0x16f/0x42f
> > [ 1473.176008]  [<ffffffff8034163d>] _raw_spin_trylock+0xd/0x31
> > [ 1473.176008]  [<ffffffff802b1c46>] vfs_write+0x121/0x156
> > [ 1473.176008]  [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f
> > [ 1473.176008]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
> > [ 1473.176008]
> > 
> > These locks are also in the kernel 2.6.27.2.
> > 
> 
> 
Comment 4 Manfred Spraul 2008-10-22 11:10:46 UTC
Nadia Derbey wrote:
>>> [  390.329004] Call Trace:
>>> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
>>> [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
>>> [  390.329004]  [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2
>>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
>>> [  390.329004]  [<ffffffff80458637>] _spin_lock+0x2d/0x59
>>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
>>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
>>> [  390.329004]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
>>> [  390.329004]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
>>> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
>>> [  390.329004]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
>>>       

IPC_STAT is very simple, it just does ipc_lock(), copy a few numbers, 
ipc_unlock().
Perhaps the idr rcu code (or rcu by itself or idr by itself) causes a 
problem.
A few questions:
- Do you have preempt enabled?
- Do you have CONFIG_NOHZ enabled?
- Do you have any CONFIG_DEBUG options enabled?

If you have some time:
Could you enable CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK and 
perhaps CONFIG_DEBUG_SLAB?

--
    Manfred
Comment 5 Clement Calmels 2008-10-23 01:04:34 UTC
On Wed, Oct 22, 2008 at 8:12 PM, Manfred Spraul
<manfred@colorfullife.com> wrote:
> Nadia Derbey wrote:
>>>>
>>>> [  390.329004] Call Trace:
>>>> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
>>>> [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
>>>> [  390.329004]  [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2
>>>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
>>>> [  390.329004]  [<ffffffff80458637>] _spin_lock+0x2d/0x59
>>>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
>>>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
>>>> [  390.329004]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
>>>> [  390.329004]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
>>>> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
>>>> [  390.329004]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
>>>>
>
> IPC_STAT is very simple, it just does ipc_lock(), copy a few numbers,
> ipc_unlock().
> Perhaps the idr rcu code (or rcu by itself or idr by itself) causes a
> problem.
> A few questions:
> - Do you have preempt enabled?
> - Do you have CONFIG_NOHZ enabled?
> - Do you have any CONFIG_DEBUG options enabled?
>
> If you have some time:
> Could you enable CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK and
> perhaps CONFIG_DEBUG_SLAB?
>
> --
>   Manfred
>

I attach my config file, it will be cleaner.
gcc version: gcc (Debian 4.3.2-1) 4.3.2

I add CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SLAB but the kernel
output remane the same. I can copy paste it if needed.
I also tried with IPC namespace disable: same behaviour.

c.
Comment 6 Anonymous Emailer 2008-10-23 04:54:05 UTC
Reply-To: Nadia.Derbey@bull.net

On Wed, 2008-10-22 at 20:12 +0200, Manfred Spraul wrote:
> Nadia Derbey wrote:
> >>> [  390.329004] Call Trace:
> >>> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
> >>> [  390.329004]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
> >>> [  390.329004]  [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2
> >>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> >>> [  390.329004]  [<ffffffff80458637>] _spin_lock+0x2d/0x59
> >>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> >>> [  390.329004]  [<ffffffff80301e0a>] ipc_lock+0x63/0x9b
> >>> [  390.329004]  [<ffffffff80301da7>] ipc_lock+0x0/0x9b
> >>> [  390.329004]  [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57
> >>> [  390.329004]  [<ffffffff8020c83e>] restore_args+0x0/0x30
> >>> [  390.329004]  [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f
> >>>       
> 
> IPC_STAT is very simple, it just does ipc_lock(), copy a few numbers, 
> ipc_unlock().
> Perhaps the idr rcu code (or rcu by itself or idr by itself) causes a 
> problem.
> A few questions:
> - Do you have preempt enabled?
> - Do you have CONFIG_NOHZ enabled?
> - Do you have any CONFIG_DEBUG options enabled?
> 
> If you have some time:
> Could you enable CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK and 
> perhaps CONFIG_DEBUG_SLAB?
> 

Hi,

I think, I've got an idea of what's happening: the problem, I think,
comes from msgget() and msgctl() running in parallel:
when we are calling ipc_addid() from msgget() the msq structure is not
completely initialized. So idr_get_new() is inserting a pointer into the
idr tree, and the structure which is pointed to has, among other fields,
its lock uninitialized.
Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the
pointer as soon as it is inserted into the IDR tree. And ipc_lock()
calls spin_lock(&mqs->lock), while we have not initilaized that lock
yet.

I have changed the order in which things are initilaized in msget()
(atctually newque() and ipc_addid()) and the test referenced in bugzilla

while [ 1 ]; do ./sysv_test2 ; done

has not failed yet (while I used to reproduce the problem before)

Regards,
Nadia
Comment 7 Manfred Spraul 2008-11-16 05:18:32 UTC
cboulte@gmail.com wrote:
> I attach my config file, it will be cleaner.
> gcc version: gcc (Debian 4.3.2-1) 4.3.2
>
>   
Thanks.
- classic rcu
- no preempt
- HP Proliant DL140, i.e. Intel Xeon, 64-bit
- with Nadia's patch applied, kernel/lockdep.c complains about a 
non-static key.

Clement, could you apply the attached patch [on top of Nadia's] patch 
and send us the kernel output?
The patch adds a hexdump of the lock structure. Perhaps that helps us to 
figure out what's happening.

Thanks,
--
    Manfred
diff --git a/kernel/lockdep.c b/kernel/lockdep.c
index dbda475..289e476 100644
--- a/kernel/lockdep.c
+++ b/kernel/lockdep.c
@@ -755,11 +755,26 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
 	 * Debug-check: all keys must be persistent!
  	 */
 	if (!static_obj(lock->key)) {
+		int i;
+		unsigned char *p;
+
 		debug_locks_off();
 		printk("INFO: trying to register non-static key.\n");
 		printk("the code is fine but needs lockdep annotation.\n");
 		printk("turning off the locking correctness validator.\n");
 		dump_stack();
+		printk(KERN_INFO "lock %p key %p \n",lock, lock->key);
+		p = (unsigned char*)lock;
+		p = (char*) (((unsigned long)lock)&~0x3f);
+
+		for(i=0;i<128;i++) {
+			if (i%16==0)
+				printk(KERN_ERR "%p:", &p[i]);
+			printk(" %02x", p[i]);
+			if (i%16 == 15)
+				printk("\n");
+		}
+		
 
 		return NULL;
 	}
Comment 8 Clement Calmels 2008-11-17 07:04:31 UTC
On Sun, Nov 16, 2008 at 2:18 PM, Manfred Spraul
<manfred@colorfullife.com> wrote:
> cboulte@gmail.com wrote:
>>
>> I attach my config file, it will be cleaner.
>> gcc version: gcc (Debian 4.3.2-1) 4.3.2
>>
>>
>
> Thanks.
> - classic rcu
> - no preempt
> - HP Proliant DL140, i.e. Intel Xeon, 64-bit
> - with Nadia's patch applied, kernel/lockdep.c complains about a non-static
> key.
>
> Clement, could you apply the attached patch [on top of Nadia's] patch and
> send us the kernel output?
> The patch adds a hexdump of the lock structure. Perhaps that helps us to
> figure out what's happening.
>
> Thanks,
> --
>   Manfred
>
> diff --git a/kernel/lockdep.c b/kernel/lockdep.c
> index dbda475..289e476 100644
> --- a/kernel/lockdep.c
> +++ b/kernel/lockdep.c
> @@ -755,11 +755,26 @@ register_lock_class(struct lockdep_map *lock, unsigned
> int subclass, int force)
>         * Debug-check: all keys must be persistent!
>         */
>        if (!static_obj(lock->key)) {
> +               int i;
> +               unsigned char *p;
> +
>                debug_locks_off();
>                printk("INFO: trying to register non-static key.\n");
>                printk("the code is fine but needs lockdep annotation.\n");
>                printk("turning off the locking correctness validator.\n");
>                dump_stack();
> +               printk(KERN_INFO "lock %p key %p \n",lock, lock->key);
> +               p = (unsigned char*)lock;
> +               p = (char*) (((unsigned long)lock)&~0x3f);
> +
> +               for(i=0;i<128;i++) {
> +                       if (i%16==0)
> +                               printk(KERN_ERR "%p:", &p[i]);
> +                       printk(" %02x", p[i]);
> +                       if (i%16 == 15)
> +                               printk("\n");
> +               }
> +
>
>                return NULL;
>        }
>
>

Hi,

Sorry for the delay, but I was not available before. I tried your
patch on top of Nadya's one on a 2.6.27 kernel and got:

[ 2981.113314] INFO: trying to register non-static key.
[ 2981.116008] the code is fine but needs lockdep annotation.
[ 2981.116008] turning off the locking correctness validator.
[ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20
[ 2981.116008]
[ 2981.116008] Call Trace:
[ 2981.116008]  [<ffffffff802585d2>] static_obj+0x51/0x68
[ 2981.116008]  [<ffffffff8025ad54>] __lock_acquire+0x1e6/0x15bb
[ 2981.116008]  [<ffffffff80212f29>] native_sched_clock+0x8c/0xa5
[ 2981.116008]  [<ffffffff8020c83e>] restore_args+0x0/0x30
[ 2981.116008]  [<ffffffff8025c1be>] lock_acquire+0x95/0xc2
[ 2981.116008]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 2981.116008]  [<ffffffff80459df7>] _spin_lock+0x2d/0x59
[ 2981.116008]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 2981.116008]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 2981.116008]  [<ffffffff80303797>] ipc_lock+0x0/0x9b
[ 2981.116008]  [<ffffffff8030383e>] ipc_lock_check+0xc/0x57
[ 2981.116008]  [<ffffffff8020c83e>] restore_args+0x0/0x30
[ 2981.116008]  [<ffffffff80304c41>] sys_msgctl+0x18e/0x502
[ 2981.116008]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 2981.116008]  [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a
[ 2981.116008]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 2981.116008]  [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a
[ 2981.116008]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 2981.116008]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
[ 2981.116008]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[ 2981.116008]
[ 2981.116008] lock ffff880144598b88 key ffff880144598b88
[ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88 ff ff
[ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 3046.208003] BUG: soft lockup - CPU#0 stuck for 61s! [sysv_test2:5381]
[ 3046.208006] Modules linked in: autofs4 video output ac battery ipv6
nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core
snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc
psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev
ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata
dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd
mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy
thermal processor fan thermal_sys
[ 3046.208006] irq event stamp: 2111863
[ 3046.208006] hardirqs last  enabled at (2111863):
[<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3046.208006] hardirqs last disabled at (2111862):
[<ffffffff8045989c>] trace_hardirqs_off_thunk+0x3a/0x3c
[ 3046.208006] softirqs last  enabled at (2109196):
[<ffffffff8020da0c>] call_softirq+0x1c/0x28
[ 3046.208006] softirqs last disabled at (2109191):
[<ffffffff8020da0c>] call_softirq+0x1c/0x28
[ 3046.208006] CPU 0:
[ 3046.208006] Modules linked in: autofs4 video output ac battery ipv6
nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core
snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc
psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev
ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata
dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd
mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy
thermal processor fan thermal_sys
[ 3046.208006] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20
[ 3046.208006] RIP: 0010:[<ffffffff80213531>]  [<ffffffff80213531>]
native_read_tsc+0x10/0x1d
[ 3046.208006] RSP: 0018:ffff880144945db8  EFLAGS: 00000286
[ 3046.208006] RAX: 000000001e96df9f RBX: ffff880144598b70 RCX:
000000001e96df9f[ 3046.208006] RDX: 000000000000081b RSI:
ffffffff805425d7 RDI: 0000000000000001[ 3046.208006] RBP:
0000000000000000 R08: 0000000000000002 R09: 0000000000000000[
3046.208006] R10: ffffffff80250954 R11: ffffffff8033fbb9 R12:
ffff88002f186160[ 3046.208006] R13: ffff8800ae95a000 R14:
ffff880144944000 R15: ffffffff8082c810[ 3046.208006] FS:
00007fbd589216e0(0000) GS:ffffffff80625a80(0000)
knlGS:0000000000000000
[ 3046.208006] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3046.208006] CR2: 00007fbd5848cc70 CR3: 000000014416a000 CR4:
00000000000006e0[ 3046.208006] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000[ 3046.208006] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[
3046.208006]
[ 3046.208006] Call Trace:
[ 3046.208006]  [<ffffffff8033fbd1>] delay_tsc+0x18/0x51
[ 3046.208006]  [<ffffffff80220eb8>] __ticket_spin_trylock+0x0/0x18
[ 3046.208006]  [<ffffffff8034319b>] _raw_spin_lock+0x96/0x102
[ 3046.208006]  [<ffffffff80459e18>] _spin_lock+0x4e/0x59
[ 3046.208006]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 3046.208006]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 3046.208006]  [<ffffffff80303797>] ipc_lock+0x0/0x9b
[ 3046.208006]  [<ffffffff8030383e>] ipc_lock_check+0xc/0x57
[ 3046.208006]  [<ffffffff8020c83e>] restore_args+0x0/0x30
[ 3046.208006]  [<ffffffff80304c41>] sys_msgctl+0x18e/0x502
[ 3046.208006]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3046.208006]  [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a
[ 3046.208006]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3046.208006]  [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a
[ 3046.208006]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3046.208006]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
[ 3046.208006]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[ 3046.208006]
[ 3111.704002] BUG: soft lockup - CPU#0 stuck for 61s! [sysv_test2:5381]
[ 3111.704007] Modules linked in: autofs4 video output ac battery ipv6
nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core
snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc
psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev
ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata
dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd
mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy
thermal processor fan thermal_sys
[ 3111.704007] irq event stamp: 2111863
[ 3111.704007] hardirqs last  enabled at (2111863):
[<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3111.704007] hardirqs last disabled at (2111862):
[<ffffffff8045989c>] trace_hardirqs_off_thunk+0x3a/0x3c
[ 3111.704007] softirqs last  enabled at (2109196):
[<ffffffff8020da0c>] call_softirq+0x1c/0x28
[ 3111.704007] softirqs last disabled at (2109191):
[<ffffffff8020da0c>] call_softirq+0x1c/0x28
[ 3111.704007] CPU 0:
[ 3111.704007] Modules linked in: autofs4 video output ac battery ipv6
nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core
snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc
psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev
ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata
dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd
mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy
thermal processor fan thermal_sys
[ 3111.704007] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20
[ 3111.704007] RIP: 0010:[<ffffffff80213529>]  [<ffffffff80213529>]
native_read_tsc+0x8/0x1d
[ 3111.704007] RSP: 0018:ffff880144945db8  EFLAGS: 00000206
[ 3111.704007] RAX: 00000000d15828f8 RBX: ffffffffd158282d RCX:
00000000d158282d[ 3111.704007] RDX: 0000000000000845 RSI:
ffffffff805425d7 RDI: 0000000000000001[ 3111.704007] RBP:
0000000000000000 R08: 0000000000000002 R09: 0000000000000000[
3111.704007] R10: ffffffff80250954 R11: ffffffff8033fbb9 R12:
ffff88002f186160[ 3111.704007] R13: ffff8800ae95a000 R14:
ffff880144944000 R15: ffffffff8082c810[ 3111.704007] FS:
00007fbd589216e0(0000) GS:ffffffff80625a80(0000)
knlGS:0000000000000000
[ 3111.704007] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3111.704007] CR2: 00007fbd5848cc70 CR3: 000000014416a000 CR4:
00000000000006e0[ 3111.704007] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000[ 3111.704007] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[
3111.704007]
[ 3111.704007] Call Trace:
[ 3111.704007]  [<ffffffff8033fbda>] delay_tsc+0x21/0x51
[ 3111.704007]  [<ffffffff8033fc16>] __delay+0x0/0xa
[ 3111.704007]  [<ffffffff8034319b>] _raw_spin_lock+0x96/0x102
[ 3111.704007]  [<ffffffff80459e18>] _spin_lock+0x4e/0x59
[ 3111.704007]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 3111.704007]  [<ffffffff803037fa>] ipc_lock+0x63/0x9b
[ 3111.704007]  [<ffffffff80303797>] ipc_lock+0x0/0x9b
[ 3111.704007]  [<ffffffff8030383e>] ipc_lock_check+0xc/0x57
[ 3111.704007]  [<ffffffff8020c83e>] restore_args+0x0/0x30
[ 3111.704007]  [<ffffffff80304c41>] sys_msgctl+0x18e/0x502
[ 3111.704007]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3111.704007]  [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a
[ 3111.704007]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3111.704007]  [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a
[ 3111.704007]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[ 3111.704007]  [<ffffffff80212dd1>] sched_clock+0x5/0x7
[ 3111.704007]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[ 3111.704007]

Hope this can help.

Best regards, Clement.
Comment 9 Manfred Spraul 2008-11-17 11:31:27 UTC
cboulte@gmail.com wrote:
> Sorry for the delay, but I was not available before. I tried your
> patch on top of Nadya's one on a 2.6.27 kernel and got:
>
>   
Thank you!
> [ 2981.113314] INFO: trying to register non-static key.
> [ 2981.116008] the code is fine but needs lockdep annotation.
> [ 2981.116008] turning off the locking correctness validator.
> [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20
> [ 2981.116008]
> [snip]
> [ 2981.116008] lock ffff880144598b88 key ffff880144598b88
> [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88 ff
> ff
> [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
> [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
> [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
> [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
> [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
> [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
> [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00
>   
Ok.
basically, everything is 0x00. lock->key points to it self.
But that was done by lookup_lock_class():
 >       if (unlikely(!lock->key))
 >                lock->key = (void *)lock;
Thus lock->key was 0x00, too.

How is that possible? the msg queue structure is allocated with 
ipc_rcu_alloc.
It doesn't perform a memset.

Perhaps the problem is in idr_find? what happens if we have concurrently
- idr_find() on an empty idr
- idr_get_empty_slot()

Nadia: Could you explain how updates to idp->layers and idp->top are 
synchronized?

 > static int idr_get_empty_slot(struct idr *idp, int starting_id,
 >                               struct idr_layer **pa)
[snip]
 >        rcu_assign_pointer(idp->top, p);
 >        idp->layers = layers;

rcu_assign_pointer() performs an smb_wmb() before assigning the pointer.
But there is no smp_wmb() after assigning the pointer.
Thus a cpu could update rdp->layers before or after modifying idp->top.

> void *idr_find(struct idr *idp, int id)
> {
>         int n;
>         struct idr_layer *p;
>
>         n = idp->layers * IDR_BITS;
>         p = rcu_dereference(idp->top);
Same problem here:
rcu_dereference is only an implied memory barrier that only protects 
data referenced by p.
We could get any combination from the old and the new idp->layers, 
idp->top values.

Thus:
idr_find probably reads 0 for idp->layers.
idr_find probably reads a non-NULL idp->top.
Then idr_find returns idp->top.
the content of idp->top is 0x00, because the constructor contains a 
memset().

Clement: could you try the attached patch? It should oops.
That would prove that my analysis is correct.

Nadia: Do you have an idea how the bug could be fixed? Perhaps 
idp->layers could be embedded into idp->top
i.e.
idr_find would
    p=rcu_dereference(idp->top)
    if (p == NULL)
        return NULL;
    layers = p->layers.

But I simply don't know enough about the idr code to figure out if that 
would work.
Perhaps memory barriers are sufficient.

--
    Manfred
diff --git a/lib/idr.c b/lib/idr.c
index e728c7f..b6af844 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -501,6 +508,7 @@ void *idr_find(struct idr *idp, int id)
 
 	if (id >= (1 << n))
 		return NULL;
+	BUG_ON(n == 0);
 
 	while (n > 0 && p) {
 		n -= IDR_BITS;
Comment 10 Anonymous Emailer 2008-11-18 01:35:52 UTC
Reply-To: Nadia.Derbey@bull.net

On Mon, 2008-11-17 at 20:18 +0100, Manfred Spraul wrote:
> cboulte@gmail.com wrote:
> > Sorry for the delay, but I was not available before. I tried your
> > patch on top of Nadya's one on a 2.6.27 kernel and got:
> >
> >   
> Thank you!
> > [ 2981.113314] INFO: trying to register non-static key.
> > [ 2981.116008] the code is fine but needs lockdep annotation.
> > [ 2981.116008] turning off the locking correctness validator.
> > [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20
> > [ 2981.116008]
> > [snip]
> > [ 2981.116008] lock ffff880144598b88 key ffff880144598b88
> > [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88
> ff ff
> > [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> > [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> > [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> > [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> > [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> > [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> > [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 00 00
> >   
> Ok.
> basically, everything is 0x00. lock->key points to it self.
> But that was done by lookup_lock_class():
>  >       if (unlikely(!lock->key))
>  >                lock->key = (void *)lock;
> Thus lock->key was 0x00, too.
> 
> How is that possible? the msg queue structure is allocated with 
> ipc_rcu_alloc.
> It doesn't perform a memset.
> 
> Perhaps the problem is in idr_find? what happens if we have concurrently
> - idr_find() on an empty idr
> - idr_get_empty_slot()
> 
> Nadia: Could you explain how updates to idp->layers and idp->top are 
> synchronized?
> 
>  > static int idr_get_empty_slot(struct idr *idp, int starting_id,
>  >                               struct idr_layer **pa)
> [snip]
>  >        rcu_assign_pointer(idp->top, p);
>  >        idp->layers = layers;
> 
> rcu_assign_pointer() performs an smb_wmb() before assigning the pointer.
> But there is no smp_wmb() after assigning the pointer.
> Thus a cpu could update rdp->layers before or after modifying idp->top.
> 
> > void *idr_find(struct idr *idp, int id)
> > {
> >         int n;
> >         struct idr_layer *p;
> >
> >         n = idp->layers * IDR_BITS;
> >         p = rcu_dereference(idp->top);
> Same problem here:
> rcu_dereference is only an implied memory barrier that only protects 
> data referenced by p.
> We could get any combination from the old and the new idp->layers, 
> idp->top values.


Manfred,

First thanks for the analysis and I'm afraid you might be right.

Changes into the idr tree are supposed to be done under a lock, but it's
true that we might be accessing 2 incoherent fields, since idr_find on
its side may be called without a lock.

One solution here would be to have a .layers field inside each idr_layer
structure. Actually that's how the radix tree code manages the height:
when doing a lookup it's the .height inside the radix_tree_node that is
read after the radix_tree_node has been rcu dereferenced.

So I think that could be a way to go. I'll try to see how big could be
the impact. Unfortunatly, I don't know the name of an "official"
maintainer for the idr...

Regards,
Nadia

> 
> Thus:
> idr_find probably reads 0 for idp->layers.
> idr_find probably reads a non-NULL idp->top.
> Then idr_find returns idp->top.
> the content of idp->top is 0x00, because the constructor contains a 
> memset().
> 
> Clement: could you try the attached patch? It should oops.
> That would prove that my analysis is correct.
> 
> Nadia: Do you have an idea how the bug could be fixed? Perhaps 
> idp->layers could be embedded into idp->top
> i.e.
> idr_find would
>     p=rcu_dereference(idp->top)
>     if (p == NULL)
>         return NULL;
>     layers = p->layers.
> 
> But I simply don't know enough about the idr code to figure out if that 
> would work.
> Perhaps memory barriers are sufficient.
> --
>     Manfred
> plain text document attachment (patch-lib-idr)
> diff --git a/lib/idr.c b/lib/idr.c
> index e728c7f..b6af844 100644
> --- a/lib/idr.c
> +++ b/lib/idr.c
> @@ -501,6 +508,7 @@ void *idr_find(struct idr *idp, int id)
>  
>       if (id >= (1 << n))
>               return NULL;
> +     BUG_ON(n == 0);
>  
>       while (n > 0 && p) {
>               n -= IDR_BITS;
Comment 11 Clement Calmels 2008-11-18 02:41:05 UTC
2008/11/17 Manfred Spraul <manfred@colorfullife.com>:
> cboulte@gmail.com wrote:
>>
>> Sorry for the delay, but I was not available before. I tried your
>> patch on top of Nadya's one on a 2.6.27 kernel and got:
>>
>>
>
> Thank you!
>>
>> [ 2981.113314] INFO: trying to register non-static key.
>> [ 2981.116008] the code is fine but needs lockdep annotation.
>> [ 2981.116008] turning off the locking correctness validator.
>> [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20
>> [ 2981.116008]
>> [snip]
>> [ 2981.116008] lock ffff880144598b88 key ffff880144598b88
>> [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88
>> ff ff
>> [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>> [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>> [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>> [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>> [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>> [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>> [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> 00 00
>>
>
> Ok.
> basically, everything is 0x00. lock->key points to it self.
> But that was done by lookup_lock_class():
>>       if (unlikely(!lock->key))
>>                lock->key = (void *)lock;
> Thus lock->key was 0x00, too.
>
> How is that possible? the msg queue structure is allocated with
> ipc_rcu_alloc.
> It doesn't perform a memset.
>
> Perhaps the problem is in idr_find? what happens if we have concurrently
> - idr_find() on an empty idr
> - idr_get_empty_slot()
>
> Nadia: Could you explain how updates to idp->layers and idp->top are
> synchronized?
>
>> static int idr_get_empty_slot(struct idr *idp, int starting_id,
>>                               struct idr_layer **pa)
> [snip]
>>        rcu_assign_pointer(idp->top, p);
>>        idp->layers = layers;
>
> rcu_assign_pointer() performs an smb_wmb() before assigning the pointer.
> But there is no smp_wmb() after assigning the pointer.
> Thus a cpu could update rdp->layers before or after modifying idp->top.
>
>> void *idr_find(struct idr *idp, int id)
>> {
>>        int n;
>>        struct idr_layer *p;
>>
>>        n = idp->layers * IDR_BITS;
>>        p = rcu_dereference(idp->top);
>
> Same problem here:
> rcu_dereference is only an implied memory barrier that only protects data
> referenced by p.
> We could get any combination from the old and the new idp->layers, idp->top
> values.
>
> Thus:
> idr_find probably reads 0 for idp->layers.
> idr_find probably reads a non-NULL idp->top.
> Then idr_find returns idp->top.
> the content of idp->top is 0x00, because the constructor contains a
> memset().
>
> Clement: could you try the attached patch? It should oops.
> That would prove that my analysis is correct.
>
> Nadia: Do you have an idea how the bug could be fixed? Perhaps idp->layers
> could be embedded into idp->top
> i.e.
> idr_find would
>   p=rcu_dereference(idp->top)
>   if (p == NULL)
>       return NULL;
>   layers = p->layers.
>
> But I simply don't know enough about the idr code to figure out if that
> would work.
> Perhaps memory barriers are sufficient.
>
> --
>   Manfred
>
> diff --git a/lib/idr.c b/lib/idr.c
> index e728c7f..b6af844 100644
> --- a/lib/idr.c
> +++ b/lib/idr.c
> @@ -501,6 +508,7 @@ void *idr_find(struct idr *idp, int id)
>
>        if (id >= (1 << n))
>                return NULL;
> +       BUG_ON(n == 0);
>
>        while (n > 0 && p) {
>                n -= IDR_BITS;
>
>

It looks like you're right...

[  403.683434] kernel BUG at /home/ccalmels/WORK/bug-sysv/lib/idr.c:504!
[  403.683434] invalid opcode: 0000 [121] SMP
[  403.683434] CPU 3
[  403.683434] Modules linked in: autofs4 video output ac battery ipv6
nfs lockd nfs_acl sunrpc loop i2c_i801 serio_raw pcspkr snd_pcm
snd_timer psmouse rng_core i2c_core snd soundcore snd_page_alloc
iTCO_wdt button e752x_edac edac_core shpchp pci_hotplug joydev evdev
ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid
ff_memless ata_piix libata dock mptspi mptscsih mptbase
ide_pci_generic scsi_transport_spi ehci_hcd uhci_hcd ide_core scsi_mod
tg3 libphy thermal processor fan thermal_sys
[  403.683434] Pid: 2922, comm: sysv_test2 Tainted: G      D
2.6.27-ipc_lock #21
[  403.683434] RIP: 0010:[<ffffffff80339859>]  [<ffffffff80339859>]
idr_find+0x21/0x44
[  403.683434] RSP: 0018:ffff880145513e60  EFLAGS: 00010246
[  403.683434] RAX: 0000000000000001 RBX: ffffffff805a3328 RCX:
0000000000000000[  403.683434] RDX: 0000000000000000 RSI:
0000000000000000 RDI: 0000000000000000[  403.683434] RBP:
0000000000000000 R08: 0000000000000001 R09: 0000000000000000[
403.683434] R10: 0000000000000000 R11: 0000000000000206 R12:
0000000000000000[  403.683434] R13: ffffffff805a31e0 R14:
00007fff45700b50 R15: 0000000000000000[  403.683434] FS:
00007fc83d6f76e0(0000) GS:ffff88014707fb90(0000)
knlGS:0000000000000000
[  403.683434] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  403.683434] CR2: 00007fc83d262c70 CR3: 0000000144102000 CR4:
00000000000006e0[  403.683434] DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000[  403.683434] DR3:
0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[
403.683434] Process sysv_test2 (pid: 2922, threadinfo
ffff880145512000, task ffff880144832680)
[  403.683434] Stack:  ffffffff803037e3 ffffffff80303797
ffffffff8033d8db 0000000000000001
[  403.683434]  0000000000000000 0000000000400650 ffffffff8030383e
0000000000000403
[  403.683434]  0000000000000001 0000000000000002 ffffffff80304c41
ffff88014444adb8
[  403.683434] Call Trace:
[  403.683434]  [<ffffffff803037e3>] ipc_lock+0x4c/0x9b
[  403.683434]  [<ffffffff80303797>] ipc_lock+0x0/0x9b
[  403.683434]  [<ffffffff8033d8db>] __up_read+0x13/0x8a
[  403.683434]  [<ffffffff8030383e>] ipc_lock_check+0xc/0x57
[  403.683434]  [<ffffffff80304c41>] sys_msgctl+0x18e/0x502
[  403.683434]  [<ffffffff8024fb64>] hrtimer_get_next_event+0x30/0xa8
[  403.683434]  [<ffffffff804598d3>] lockdep_sys_exit_thunk+0x35/0x67
[  403.683434]  [<ffffffff804598d3>] lockdep_sys_exit_thunk+0x35/0x67
[  403.683434]  [<ffffffff8033d8db>] __up_read+0x13/0x8a
[  403.683434]  [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f
[  403.683434]  [<ffffffff8033d8db>] __up_read+0x13/0x8a
[  403.683434]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b
[  403.683434]
[  403.683434]
[  403.683434] Code: 0f bc c0 48 01 c2 48 89 d0 c3 6b 4f 10 06 b8 01
00 00 00 81 e6 ff ff ff 7f 31 d2 48 8b 3f d3 e0 39 c6 7d 26 85 c9 48
89 fa 75 16 <0f> 0b eb fe 83 e9 06 89 f0 d3 f8 83 e0 3f 48 8d 44 c2 08
48 8b
[  403.683434] RIP  [<ffffffff80339859>] idr_find+0x21/0x44
[  403.683434]  RSP <ffff880145513e60>

Regards, c.