Latest working kernel version: Earliest failing kernel version: Distribution: Debian Lenny Hardware Environment: HP Proliant DL140 Software Environment: Problem Description: When playing with SYSV ipc, lock happens. Steps to reproduce: I will attach a "simple" program. Launch it several times. The first weird kernel output looks like: [ 390.327265] general protection fault: 0000 [1] SMP [ 390.329004] CPU 2 [ 390.329004] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys [ 390.329004] Pid: 2858, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #2 [ 390.329004] RIP: 0010:[<ffffffff8025b021>] [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a [ 390.329004] RSP: 0018:ffff8801455e7d18 EFLAGS: 00010006 [ 390.329004] RAX: ffff8801451dada8 RBX: 0000000000000246 RCX: 0000000000000000[ 390.329004] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8801451dada8[ 390.329004] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 390.329004] R10: 0000000000000000 R11: ffff880144d08200 R12: 0038004000000000[ 390.329004] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880144d08200[ 390.329004] FS: 00007fe439afd6e0(0000) GS:ffff8801470eebc0(0000) knlGS:0000000000000000 [ 390.329004] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 390.329004] CR2: 00007fe439668c70 CR3: 00000001455a3000 CR4: 00000000000006e0[ 390.329004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 390.329004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 390.329004] Process sysv_test2 (pid: 2858, threadinfo ffff8801455e6000, task ffff880144d08200) [ 390.329004] Stack: ffff880144d08200 000000000000003f 0000000080993830 0000000000000002 [ 390.329004] 0000000000000000 ffff8801451dada8 0000000041b06f50 0000000200000001 [ 390.329004] 0000000000000000 ffffffff80598980 0000000000000000 0000000180255e60 [ 390.329004] Call Trace: [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 [ 390.329004] [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2 [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 390.329004] [<ffffffff80458637>] _spin_lock+0x2d/0x59 [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 390.329004] [<ffffffff80301da7>] ipc_lock+0x0/0x9b [ 390.329004] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 [ 390.329004] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 390.329004] [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 390.329004] [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 [ 390.329004] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 390.329004] [ 390.329004] [ 390.329004] Code: 41 39 54 24 28 74 24 e8 7e 42 0e 00 85 c0 0f 84 d6 0f 00 00 83 3d a7 8c c8 00 00 be 47 03 00 00 0f 85 c4 0f 00 00 e9 eb 0e 00 00 <f0> 41 ff 84 24 38 01 00 00 45 8b b7 10 06 00 00 41 83 fe 2f 76 [ 390.329004] RIP [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a [ 390.329004] RSP <ffff8801455e7d18> [ 390.329004] ---[ end trace 507bf06c1bae10b7 ]--- If the program is launched again and again,the cpu finaly locks: [ 1332.517502] BUG: soft lockup - CPU#3 stuck for 61s! [sysv_test2:3224] [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys [ 1332.517505] irq event stamp: 0 [ 1332.517505] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 1332.517505] hardirqs last disabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4 [ 1332.517505] softirqs last enabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4 [ 1332.517505] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 1332.517505] CPU 3: [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys [ 1332.517505] Pid: 3224, comm: sysv_test2 Tainted: G D 2.6.27-ipc_lock #2 [ 1332.517505] RIP: 0010:[<ffffffff80213529>] [<ffffffff80213529>] native_read_tsc+0x8/0x1d [ 1332.517505] RSP: 0018:ffff88014502bdb8 EFLAGS: 00000206 [ 1332.517505] RAX: 00000000e234ed82 RBX: ffffffffe234ecb7 RCX: 00000000e234ecb7[ 1332.517505] RDX: 00000000000003bd RSI: ffffffff80301e0a RDI: 0000000000000001[ 1332.517505] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 1332.517505] R10: 0000000000000000 R11: ffffffff8033e409 R12: ffff88002f62a160[ 1332.517505] R13: ffff8800aee01000 R14: ffff88014502a000 R15: ffffffff80829810[ 1332.517505] FS: 00007f4ab15246e0(0000) GS:ffff8801470ee340(0000) knlGS:0000000000000000 [ 1332.517505] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1332.517505] CR2: 00007f4ab108fc70 CR3: 00000001441cf000 CR4: 00000000000006e0[ 1332.517505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 1332.517505] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 1332.517505] [ 1332.517505] Call Trace: [ 1332.517505] [<ffffffff8033e42a>] delay_tsc+0x21/0x51 [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 1332.517505] [<ffffffff80220ec9>] __ticket_spin_trylock+0x11/0x18 [ 1332.517505] [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102 [ 1332.517505] [<ffffffff80458658>] _spin_lock+0x4e/0x59 [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 1332.517505] [<ffffffff80301da7>] ipc_lock+0x0/0x9b [ 1332.517505] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 [ 1332.517505] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f [ 1332.517505] [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95 [ 1332.517505] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 1332.517505] [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95 [ 1332.517505] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 1332.517505] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 1332.517505] Every 'ipcs' command will lock a cpu and the node become completely unusable. A different output: [ 1473.176004] BUG: soft lockup - CPU#0 stuck for 61s! [ipcs:3287] [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys [ 1473.176008] irq event stamp: 0 [ 1473.176008] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 1473.176008] hardirqs last disabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4 [ 1473.176008] softirqs last enabled at (0): [<ffffffff80238b37>] copy_process+0x55c/0x12e4 [ 1473.176008] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 1473.176008] CPU 0: [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm snd_timer snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan thermal_sys [ 1473.176008] Pid: 3287, comm: ipcs Tainted: G D 2.6.27-ipc_lock #2 [ 1473.176008] RIP: 0010:[<ffffffff80213527>] [<ffffffff80213527>] native_read_tsc+0x6/0x1d [ 1473.176008] RSP: 0018:ffff880141033dd8 EFLAGS: 00000202 [ 1473.176008] RAX: 00000419948a6e9b RBX: ffffffff948a6e9b RCX: 00000000948a6e9b[ 1473.176008] RDX: 00000000948a6e9b RSI: ffffffff80301e0a RDI: 0000000000000001[ 1473.176008] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 1473.176008] R10: 8101010101010100 R11: ffffffff8033e409 R12: 0000000000000001[ 1473.176008] R13: ffff8800ae95d000 R14: ffff880141032000 R15: ffffffff80829810[ 1473.176008] FS: 00007ff8a89866e0(0000) GS:ffffffff80624a80(0000) knlGS:0000000000000000 [ 1473.176008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 1473.176008] CR2: 00007ff8a8486ac0 CR3: 000000014102b000 CR4: 00000000000006e0[ 1473.176008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 1473.176008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 1473.176008] [ 1473.176008] Call Trace: [ 1473.176008] [<ffffffff8033e42a>] delay_tsc+0x21/0x51 [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 1473.176008] [<ffffffff80220ebb>] __ticket_spin_trylock+0x3/0x18 [ 1473.176008] [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102 [ 1473.176008] [<ffffffff80458658>] _spin_lock+0x4e/0x59 [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b [ 1473.176008] [<ffffffff80301da7>] ipc_lock+0x0/0x9b [ 1473.176008] [<ffffffff802ed00a>] dnotify_parent+0x1f/0x6d [ 1473.176008] [<ffffffff8030333d>] sys_msgctl+0x16f/0x42f [ 1473.176008] [<ffffffff8034163d>] _raw_spin_trylock+0xd/0x31 [ 1473.176008] [<ffffffff802b1c46>] vfs_write+0x121/0x156 [ 1473.176008] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 1473.176008] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 1473.176008] These locks are also in the kernel 2.6.27.2.
Created attachment 18380 [details] A simple test that create/delete and sysv message queue, during msgctl request. compile: $ gcc -g -O2 -Wall sysv_test2.c -o sysv_test2 launch $ while [ 1 ]; do ./sysv_test2 ; done
Reply-To: akpm@linux-foundation.org (switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). (Pierre and Nadia don't appear to have bugzilla accounts?) On Mon, 20 Oct 2008 11:33:08 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=11796 > > Summary: SYSV ipc lock > Product: Other > Version: 2.5 > KernelVersion: 2.6.27 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: high > Priority: P1 > Component: Other > AssignedTo: other_other@kernel-bugs.osdl.org > ReportedBy: cboulte@gmail.com > > > Latest working kernel version: > Earliest failing kernel version: > Distribution: Debian Lenny > Hardware Environment: HP Proliant DL140 > Software Environment: > Problem Description: > > When playing with SYSV ipc, lock happens. OK, this is bad. Thanks for reporting it, and thanks heaps for the test case. Nadia, Pierre: would you have time to look into this one? Thanks. > Steps to reproduce: > > I will attach a "simple" program. Launch it several times. The first weird > kernel output looks like: > > [ 390.327265] general protection fault: 0000 [1] SMP > [ 390.329004] CPU 2 > [ 390.329004] Modules linked in: autofs4 video output ac battery ipv6 nfs > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > thermal_sys > [ 390.329004] Pid: 2858, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #2 > [ 390.329004] RIP: 0010:[<ffffffff8025b021>] [<ffffffff8025b021>] > __lock_acquire+0x4f4/0x153a > [ 390.329004] RSP: 0018:ffff8801455e7d18 EFLAGS: 00010006 > [ 390.329004] RAX: ffff8801451dada8 RBX: 0000000000000246 RCX: > 0000000000000000[ 390.329004] RDX: 0000000000000000 RSI: 0000000000000000 > RDI: > ffff8801451dada8[ 390.329004] RBP: 0000000000000000 R08: 0000000000000002 > R09: > 0000000000000000[ 390.329004] R10: 0000000000000000 R11: ffff880144d08200 > R12: > 0038004000000000[ 390.329004] R13: 0000000000000000 R14: 0000000000000000 > R15: > ffff880144d08200[ 390.329004] FS: 00007fe439afd6e0(0000) > GS:ffff8801470eebc0(0000) knlGS:0000000000000000 > [ 390.329004] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 390.329004] CR2: 00007fe439668c70 CR3: 00000001455a3000 CR4: > 00000000000006e0[ 390.329004] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: > 0000000000000000[ 390.329004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 > DR7: > 0000000000000400[ 390.329004] Process sysv_test2 (pid: 2858, threadinfo > ffff8801455e6000, task ffff880144d08200) > [ 390.329004] Stack: ffff880144d08200 000000000000003f 0000000080993830 > 0000000000000002 > [ 390.329004] 0000000000000000 ffff8801451dada8 0000000041b06f50 > 0000000200000001 > [ 390.329004] 0000000000000000 ffffffff80598980 0000000000000000 > 0000000180255e60 > [ 390.329004] Call Trace: > [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 > [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 > [ 390.329004] [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2 > [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 390.329004] [<ffffffff80458637>] _spin_lock+0x2d/0x59 > [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 390.329004] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > [ 390.329004] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 > [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 > [ 390.329004] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f > [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > [ 390.329004] [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a > [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > [ 390.329004] [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a > [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 > [ 390.329004] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b > [ 390.329004] > [ 390.329004] > [ 390.329004] Code: 41 39 54 24 28 74 24 e8 7e 42 0e 00 85 c0 0f 84 d6 0f 00 > 00 83 3d a7 8c c8 00 00 be 47 03 00 00 0f 85 c4 0f 00 00 e9 eb 0e 00 00 <f0> > 41 > ff 84 24 38 01 00 00 45 8b b7 10 06 00 00 41 83 fe 2f 76 > [ 390.329004] RIP [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a > [ 390.329004] RSP <ffff8801455e7d18> > [ 390.329004] ---[ end trace 507bf06c1bae10b7 ]--- > > If the program is launched again and again,the cpu finaly locks: > > [ 1332.517502] BUG: soft lockup - CPU#3 stuck for 61s! [sysv_test2:3224] > [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > thermal_sys > [ 1332.517505] irq event stamp: 0 > [ 1332.517505] hardirqs last enabled at (0): [<0000000000000000>] 0x0 > [ 1332.517505] hardirqs last disabled at (0): [<ffffffff80238b37>] > copy_process+0x55c/0x12e4 > [ 1332.517505] softirqs last enabled at (0): [<ffffffff80238b37>] > copy_process+0x55c/0x12e4 > [ 1332.517505] softirqs last disabled at (0): [<0000000000000000>] 0x0 > [ 1332.517505] CPU 3: > [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > thermal_sys > [ 1332.517505] Pid: 3224, comm: sysv_test2 Tainted: G D > 2.6.27-ipc_lock > #2 > [ 1332.517505] RIP: 0010:[<ffffffff80213529>] [<ffffffff80213529>] > native_read_tsc+0x8/0x1d > [ 1332.517505] RSP: 0018:ffff88014502bdb8 EFLAGS: 00000206 > [ 1332.517505] RAX: 00000000e234ed82 RBX: ffffffffe234ecb7 RCX: > 00000000e234ecb7[ 1332.517505] RDX: 00000000000003bd RSI: ffffffff80301e0a > RDI: > 0000000000000001[ 1332.517505] RBP: 0000000000000000 R08: 0000000000000002 > R09: > 0000000000000000[ 1332.517505] R10: 0000000000000000 R11: ffffffff8033e409 > R12: > ffff88002f62a160[ 1332.517505] R13: ffff8800aee01000 R14: ffff88014502a000 > R15: > ffffffff80829810[ 1332.517505] FS: 00007f4ab15246e0(0000) > GS:ffff8801470ee340(0000) knlGS:0000000000000000 > [ 1332.517505] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 1332.517505] CR2: 00007f4ab108fc70 CR3: 00000001441cf000 CR4: > 00000000000006e0[ 1332.517505] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: > 0000000000000000[ 1332.517505] DR3: 0000000000000000 DR6: 00000000ffff0ff0 > DR7: > 0000000000000400[ 1332.517505] > [ 1332.517505] Call Trace: > [ 1332.517505] [<ffffffff8033e42a>] delay_tsc+0x21/0x51 > [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 1332.517505] [<ffffffff80220ec9>] __ticket_spin_trylock+0x11/0x18 > [ 1332.517505] [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102 > [ 1332.517505] [<ffffffff80458658>] _spin_lock+0x4e/0x59 > [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 1332.517505] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > [ 1332.517505] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 > [ 1332.517505] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f > [ 1332.517505] [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95 > [ 1332.517505] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > [ 1332.517505] [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95 > [ 1332.517505] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > [ 1332.517505] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b > [ 1332.517505] > > Every 'ipcs' command will lock a cpu and the node become completely unusable. > A different output: > > [ 1473.176004] BUG: soft lockup - CPU#0 stuck for 61s! [ipcs:3287] > [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > thermal_sys > [ 1473.176008] irq event stamp: 0 > [ 1473.176008] hardirqs last enabled at (0): [<0000000000000000>] 0x0 > [ 1473.176008] hardirqs last disabled at (0): [<ffffffff80238b37>] > copy_process+0x55c/0x12e4 > [ 1473.176008] softirqs last enabled at (0): [<ffffffff80238b37>] > copy_process+0x55c/0x12e4 > [ 1473.176008] softirqs last disabled at (0): [<0000000000000000>] 0x0 > [ 1473.176008] CPU 0: > [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev evdev > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > thermal_sys > [ 1473.176008] Pid: 3287, comm: ipcs Tainted: G D 2.6.27-ipc_lock #2 > [ 1473.176008] RIP: 0010:[<ffffffff80213527>] [<ffffffff80213527>] > native_read_tsc+0x6/0x1d > [ 1473.176008] RSP: 0018:ffff880141033dd8 EFLAGS: 00000202 > [ 1473.176008] RAX: 00000419948a6e9b RBX: ffffffff948a6e9b RCX: > 00000000948a6e9b[ 1473.176008] RDX: 00000000948a6e9b RSI: ffffffff80301e0a > RDI: > 0000000000000001[ 1473.176008] RBP: 0000000000000000 R08: 0000000000000002 > R09: > 0000000000000000[ 1473.176008] R10: 8101010101010100 R11: ffffffff8033e409 > R12: > 0000000000000001[ 1473.176008] R13: ffff8800ae95d000 R14: ffff880141032000 > R15: > ffffffff80829810[ 1473.176008] FS: 00007ff8a89866e0(0000) > GS:ffffffff80624a80(0000) knlGS:0000000000000000 > [ 1473.176008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 1473.176008] CR2: 00007ff8a8486ac0 CR3: 000000014102b000 CR4: > 00000000000006e0[ 1473.176008] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: > 0000000000000000[ 1473.176008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 > DR7: > 0000000000000400[ 1473.176008] > [ 1473.176008] Call Trace: > [ 1473.176008] [<ffffffff8033e42a>] delay_tsc+0x21/0x51 > [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 1473.176008] [<ffffffff80220ebb>] __ticket_spin_trylock+0x3/0x18 > [ 1473.176008] [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102 > [ 1473.176008] [<ffffffff80458658>] _spin_lock+0x4e/0x59 > [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > [ 1473.176008] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > [ 1473.176008] [<ffffffff802ed00a>] dnotify_parent+0x1f/0x6d > [ 1473.176008] [<ffffffff8030333d>] sys_msgctl+0x16f/0x42f > [ 1473.176008] [<ffffffff8034163d>] _raw_spin_trylock+0xd/0x31 > [ 1473.176008] [<ffffffff802b1c46>] vfs_write+0x121/0x156 > [ 1473.176008] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > [ 1473.176008] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b > [ 1473.176008] > > These locks are also in the kernel 2.6.27.2. >
Reply-To: Nadia.Derbey@bull.net On Mon, 2008-10-20 at 12:26 -0700, Andrew Morton wrote: > (switched to email. Please respond via emailed reply-to-all, not via the > bugzilla web interface). > > (Pierre and Nadia don't appear to have bugzilla accounts?) > > > On Mon, 20 Oct 2008 11:33:08 -0700 (PDT) > bugme-daemon@bugzilla.kernel.org wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=11796 > > > > Summary: SYSV ipc lock > > Product: Other > > Version: 2.5 > > KernelVersion: 2.6.27 > > Platform: All > > OS/Version: Linux > > Tree: Mainline > > Status: NEW > > Severity: high > > Priority: P1 > > Component: Other > > AssignedTo: other_other@kernel-bugs.osdl.org > > ReportedBy: cboulte@gmail.com > > > > > > Latest working kernel version: > > Earliest failing kernel version: > > Distribution: Debian Lenny > > Hardware Environment: HP Proliant DL140 > > Software Environment: > > Problem Description: > > > > When playing with SYSV ipc, lock happens. > > OK, this is bad. Thanks for reporting it, and thanks heaps for the > test case. > > Nadia, Pierre: would you have time to look into this one? I will have a look at it. Regards, Nadia > > Thanks. > > > Steps to reproduce: > > > > I will attach a "simple" program. Launch it several times. The first weird > > kernel output looks like: > > > > [ 390.327265] general protection fault: 0000 [1] SMP > > [ 390.329004] CPU 2 > > [ 390.329004] Modules linked in: autofs4 video output ac battery ipv6 nfs > > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev > evdev > > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > > thermal_sys > > [ 390.329004] Pid: 2858, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #2 > > [ 390.329004] RIP: 0010:[<ffffffff8025b021>] [<ffffffff8025b021>] > > __lock_acquire+0x4f4/0x153a > > [ 390.329004] RSP: 0018:ffff8801455e7d18 EFLAGS: 00010006 > > [ 390.329004] RAX: ffff8801451dada8 RBX: 0000000000000246 RCX: > > 0000000000000000[ 390.329004] RDX: 0000000000000000 RSI: 0000000000000000 > RDI: > > ffff8801451dada8[ 390.329004] RBP: 0000000000000000 R08: 0000000000000002 > R09: > > 0000000000000000[ 390.329004] R10: 0000000000000000 R11: ffff880144d08200 > R12: > > 0038004000000000[ 390.329004] R13: 0000000000000000 R14: 0000000000000000 > R15: > > ffff880144d08200[ 390.329004] FS: 00007fe439afd6e0(0000) > > GS:ffff8801470eebc0(0000) knlGS:0000000000000000 > > [ 390.329004] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > [ 390.329004] CR2: 00007fe439668c70 CR3: 00000001455a3000 CR4: > > 00000000000006e0[ 390.329004] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: > > 0000000000000000[ 390.329004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 > DR7: > > 0000000000000400[ 390.329004] Process sysv_test2 (pid: 2858, threadinfo > > ffff8801455e6000, task ffff880144d08200) > > [ 390.329004] Stack: ffff880144d08200 000000000000003f 0000000080993830 > > 0000000000000002 > > [ 390.329004] 0000000000000000 ffff8801451dada8 0000000041b06f50 > > 0000000200000001 > > [ 390.329004] 0000000000000000 ffffffff80598980 0000000000000000 > > 0000000180255e60 > > [ 390.329004] Call Trace: > > [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 > > [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 > > [ 390.329004] [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2 > > [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 390.329004] [<ffffffff80458637>] _spin_lock+0x2d/0x59 > > [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 390.329004] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > > [ 390.329004] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 > > [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 > > [ 390.329004] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f > > [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > > [ 390.329004] [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a > > [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > > [ 390.329004] [<ffffffff8025a225>] trace_hardirqs_on_caller+0xff/0x12a > > [ 390.329004] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > > [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 > > [ 390.329004] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b > > [ 390.329004] > > [ 390.329004] > > [ 390.329004] Code: 41 39 54 24 28 74 24 e8 7e 42 0e 00 85 c0 0f 84 d6 0f > 00 > > 00 83 3d a7 8c c8 00 00 be 47 03 00 00 0f 85 c4 0f 00 00 e9 eb 0e 00 00 > <f0> 41 > > ff 84 24 38 01 00 00 45 8b b7 10 06 00 00 41 83 fe 2f 76 > > [ 390.329004] RIP [<ffffffff8025b021>] __lock_acquire+0x4f4/0x153a > > [ 390.329004] RSP <ffff8801455e7d18> > > [ 390.329004] ---[ end trace 507bf06c1bae10b7 ]--- > > > > If the program is launched again and again,the cpu finaly locks: > > > > [ 1332.517502] BUG: soft lockup - CPU#3 stuck for 61s! [sysv_test2:3224] > > [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs > > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev > evdev > > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > > thermal_sys > > [ 1332.517505] irq event stamp: 0 > > [ 1332.517505] hardirqs last enabled at (0): [<0000000000000000>] 0x0 > > [ 1332.517505] hardirqs last disabled at (0): [<ffffffff80238b37>] > > copy_process+0x55c/0x12e4 > > [ 1332.517505] softirqs last enabled at (0): [<ffffffff80238b37>] > > copy_process+0x55c/0x12e4 > > [ 1332.517505] softirqs last disabled at (0): [<0000000000000000>] 0x0 > > [ 1332.517505] CPU 3: > > [ 1332.517505] Modules linked in: autofs4 video output ac battery ipv6 nfs > > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev > evdev > > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > > thermal_sys > > [ 1332.517505] Pid: 3224, comm: sysv_test2 Tainted: G D > 2.6.27-ipc_lock > > #2 > > [ 1332.517505] RIP: 0010:[<ffffffff80213529>] [<ffffffff80213529>] > > native_read_tsc+0x8/0x1d > > [ 1332.517505] RSP: 0018:ffff88014502bdb8 EFLAGS: 00000206 > > [ 1332.517505] RAX: 00000000e234ed82 RBX: ffffffffe234ecb7 RCX: > > 00000000e234ecb7[ 1332.517505] RDX: 00000000000003bd RSI: ffffffff80301e0a > RDI: > > 0000000000000001[ 1332.517505] RBP: 0000000000000000 R08: 0000000000000002 > R09: > > 0000000000000000[ 1332.517505] R10: 0000000000000000 R11: ffffffff8033e409 > R12: > > ffff88002f62a160[ 1332.517505] R13: ffff8800aee01000 R14: ffff88014502a000 > R15: > > ffffffff80829810[ 1332.517505] FS: 00007f4ab15246e0(0000) > > GS:ffff8801470ee340(0000) knlGS:0000000000000000 > > [ 1332.517505] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > [ 1332.517505] CR2: 00007f4ab108fc70 CR3: 00000001441cf000 CR4: > > 00000000000006e0[ 1332.517505] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: > > 0000000000000000[ 1332.517505] DR3: 0000000000000000 DR6: 00000000ffff0ff0 > DR7: > > 0000000000000400[ 1332.517505] > > [ 1332.517505] Call Trace: > > [ 1332.517505] [<ffffffff8033e42a>] delay_tsc+0x21/0x51 > > [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 1332.517505] [<ffffffff80220ec9>] __ticket_spin_trylock+0x11/0x18 > > [ 1332.517505] [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102 > > [ 1332.517505] [<ffffffff80458658>] _spin_lock+0x4e/0x59 > > [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 1332.517505] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 1332.517505] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > > [ 1332.517505] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 > > [ 1332.517505] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f > > [ 1332.517505] [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95 > > [ 1332.517505] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > > [ 1332.517505] [<ffffffff80255e60>] tick_dev_program_event+0x2d/0x95 > > [ 1332.517505] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > > [ 1332.517505] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b > > [ 1332.517505] > > > > Every 'ipcs' command will lock a cpu and the node become completely > unusable. > > A different output: > > > > [ 1473.176004] BUG: soft lockup - CPU#0 stuck for 61s! [ipcs:3287] > > [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs > > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev > evdev > > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > > thermal_sys > > [ 1473.176008] irq event stamp: 0 > > [ 1473.176008] hardirqs last enabled at (0): [<0000000000000000>] 0x0 > > [ 1473.176008] hardirqs last disabled at (0): [<ffffffff80238b37>] > > copy_process+0x55c/0x12e4 > > [ 1473.176008] softirqs last enabled at (0): [<ffffffff80238b37>] > > copy_process+0x55c/0x12e4 > > [ 1473.176008] softirqs last disabled at (0): [<0000000000000000>] 0x0 > > [ 1473.176008] CPU 0: > > [ 1473.176008] Modules linked in: autofs4 video output ac battery ipv6 nfs > > lockd nfs_acl sunrpc dm_snapshot dm_mirror dm_log dm_mod loop snd_pcm > snd_timer > > snd soundcore i2c_i801 snd_page_alloc pcspkr serio_raw i2c_core psmouse > > rng_core iTCO_wdt button e752x_edac shpchp pci_hotplug edac_core joydev > evdev > > ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless > > ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic > scsi_transport_spi > > uhci_hcd ehci_hcd tg3 ide_core scsi_mod libphy thermal processor fan > > thermal_sys > > [ 1473.176008] Pid: 3287, comm: ipcs Tainted: G D 2.6.27-ipc_lock #2 > > [ 1473.176008] RIP: 0010:[<ffffffff80213527>] [<ffffffff80213527>] > > native_read_tsc+0x6/0x1d > > [ 1473.176008] RSP: 0018:ffff880141033dd8 EFLAGS: 00000202 > > [ 1473.176008] RAX: 00000419948a6e9b RBX: ffffffff948a6e9b RCX: > > 00000000948a6e9b[ 1473.176008] RDX: 00000000948a6e9b RSI: ffffffff80301e0a > RDI: > > 0000000000000001[ 1473.176008] RBP: 0000000000000000 R08: 0000000000000002 > R09: > > 0000000000000000[ 1473.176008] R10: 8101010101010100 R11: ffffffff8033e409 > R12: > > 0000000000000001[ 1473.176008] R13: ffff8800ae95d000 R14: ffff880141032000 > R15: > > ffffffff80829810[ 1473.176008] FS: 00007ff8a89866e0(0000) > > GS:ffffffff80624a80(0000) knlGS:0000000000000000 > > [ 1473.176008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > [ 1473.176008] CR2: 00007ff8a8486ac0 CR3: 000000014102b000 CR4: > > 00000000000006e0[ 1473.176008] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: > > 0000000000000000[ 1473.176008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 > DR7: > > 0000000000000400[ 1473.176008] > > [ 1473.176008] Call Trace: > > [ 1473.176008] [<ffffffff8033e42a>] delay_tsc+0x21/0x51 > > [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 1473.176008] [<ffffffff80220ebb>] __ticket_spin_trylock+0x3/0x18 > > [ 1473.176008] [<ffffffff803419eb>] _raw_spin_lock+0x96/0x102 > > [ 1473.176008] [<ffffffff80458658>] _spin_lock+0x4e/0x59 > > [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 1473.176008] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > > [ 1473.176008] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > > [ 1473.176008] [<ffffffff802ed00a>] dnotify_parent+0x1f/0x6d > > [ 1473.176008] [<ffffffff8030333d>] sys_msgctl+0x16f/0x42f > > [ 1473.176008] [<ffffffff8034163d>] _raw_spin_trylock+0xd/0x31 > > [ 1473.176008] [<ffffffff802b1c46>] vfs_write+0x121/0x156 > > [ 1473.176008] [<ffffffff8045809d>] trace_hardirqs_on_thunk+0x3a/0x3f > > [ 1473.176008] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b > > [ 1473.176008] > > > > These locks are also in the kernel 2.6.27.2. > > > >
Nadia Derbey wrote: >>> [ 390.329004] Call Trace: >>> [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 >>> [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 >>> [ 390.329004] [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2 >>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b >>> [ 390.329004] [<ffffffff80458637>] _spin_lock+0x2d/0x59 >>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b >>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b >>> [ 390.329004] [<ffffffff80301da7>] ipc_lock+0x0/0x9b >>> [ 390.329004] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 >>> [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 >>> [ 390.329004] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f >>> IPC_STAT is very simple, it just does ipc_lock(), copy a few numbers, ipc_unlock(). Perhaps the idr rcu code (or rcu by itself or idr by itself) causes a problem. A few questions: - Do you have preempt enabled? - Do you have CONFIG_NOHZ enabled? - Do you have any CONFIG_DEBUG options enabled? If you have some time: Could you enable CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK and perhaps CONFIG_DEBUG_SLAB? -- Manfred
On Wed, Oct 22, 2008 at 8:12 PM, Manfred Spraul <manfred@colorfullife.com> wrote: > Nadia Derbey wrote: >>>> >>>> [ 390.329004] Call Trace: >>>> [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 >>>> [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 >>>> [ 390.329004] [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2 >>>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b >>>> [ 390.329004] [<ffffffff80458637>] _spin_lock+0x2d/0x59 >>>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b >>>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b >>>> [ 390.329004] [<ffffffff80301da7>] ipc_lock+0x0/0x9b >>>> [ 390.329004] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 >>>> [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 >>>> [ 390.329004] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f >>>> > > IPC_STAT is very simple, it just does ipc_lock(), copy a few numbers, > ipc_unlock(). > Perhaps the idr rcu code (or rcu by itself or idr by itself) causes a > problem. > A few questions: > - Do you have preempt enabled? > - Do you have CONFIG_NOHZ enabled? > - Do you have any CONFIG_DEBUG options enabled? > > If you have some time: > Could you enable CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK and > perhaps CONFIG_DEBUG_SLAB? > > -- > Manfred > I attach my config file, it will be cleaner. gcc version: gcc (Debian 4.3.2-1) 4.3.2 I add CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SLAB but the kernel output remane the same. I can copy paste it if needed. I also tried with IPC namespace disable: same behaviour. c.
Reply-To: Nadia.Derbey@bull.net On Wed, 2008-10-22 at 20:12 +0200, Manfred Spraul wrote: > Nadia Derbey wrote: > >>> [ 390.329004] Call Trace: > >>> [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 > >>> [ 390.329004] [<ffffffff80212dd1>] sched_clock+0x5/0x7 > >>> [ 390.329004] [<ffffffff8025c0fc>] lock_acquire+0x95/0xc2 > >>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > >>> [ 390.329004] [<ffffffff80458637>] _spin_lock+0x2d/0x59 > >>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > >>> [ 390.329004] [<ffffffff80301e0a>] ipc_lock+0x63/0x9b > >>> [ 390.329004] [<ffffffff80301da7>] ipc_lock+0x0/0x9b > >>> [ 390.329004] [<ffffffff80301e4e>] ipc_lock_check+0xc/0x57 > >>> [ 390.329004] [<ffffffff8020c83e>] restore_args+0x0/0x30 > >>> [ 390.329004] [<ffffffff8030335c>] sys_msgctl+0x18e/0x42f > >>> > > IPC_STAT is very simple, it just does ipc_lock(), copy a few numbers, > ipc_unlock(). > Perhaps the idr rcu code (or rcu by itself or idr by itself) causes a > problem. > A few questions: > - Do you have preempt enabled? > - Do you have CONFIG_NOHZ enabled? > - Do you have any CONFIG_DEBUG options enabled? > > If you have some time: > Could you enable CONFIG_DEBUG_LOCK_ALLOC and CONFIG_DEBUG_SPINLOCK and > perhaps CONFIG_DEBUG_SLAB? > Hi, I think, I've got an idea of what's happening: the problem, I think, comes from msgget() and msgctl() running in parallel: when we are calling ipc_addid() from msgget() the msq structure is not completely initialized. So idr_get_new() is inserting a pointer into the idr tree, and the structure which is pointed to has, among other fields, its lock uninitialized. Since msgctl(IPC_STAT) is looping while (1), idr_find() returns the pointer as soon as it is inserted into the IDR tree. And ipc_lock() calls spin_lock(&mqs->lock), while we have not initilaized that lock yet. I have changed the order in which things are initilaized in msget() (atctually newque() and ipc_addid()) and the test referenced in bugzilla while [ 1 ]; do ./sysv_test2 ; done has not failed yet (while I used to reproduce the problem before) Regards, Nadia
cboulte@gmail.com wrote: > I attach my config file, it will be cleaner. > gcc version: gcc (Debian 4.3.2-1) 4.3.2 > > Thanks. - classic rcu - no preempt - HP Proliant DL140, i.e. Intel Xeon, 64-bit - with Nadia's patch applied, kernel/lockdep.c complains about a non-static key. Clement, could you apply the attached patch [on top of Nadia's] patch and send us the kernel output? The patch adds a hexdump of the lock structure. Perhaps that helps us to figure out what's happening. Thanks, -- Manfred diff --git a/kernel/lockdep.c b/kernel/lockdep.c index dbda475..289e476 100644 --- a/kernel/lockdep.c +++ b/kernel/lockdep.c @@ -755,11 +755,26 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force) * Debug-check: all keys must be persistent! */ if (!static_obj(lock->key)) { + int i; + unsigned char *p; + debug_locks_off(); printk("INFO: trying to register non-static key.\n"); printk("the code is fine but needs lockdep annotation.\n"); printk("turning off the locking correctness validator.\n"); dump_stack(); + printk(KERN_INFO "lock %p key %p \n",lock, lock->key); + p = (unsigned char*)lock; + p = (char*) (((unsigned long)lock)&~0x3f); + + for(i=0;i<128;i++) { + if (i%16==0) + printk(KERN_ERR "%p:", &p[i]); + printk(" %02x", p[i]); + if (i%16 == 15) + printk("\n"); + } + return NULL; }
On Sun, Nov 16, 2008 at 2:18 PM, Manfred Spraul <manfred@colorfullife.com> wrote: > cboulte@gmail.com wrote: >> >> I attach my config file, it will be cleaner. >> gcc version: gcc (Debian 4.3.2-1) 4.3.2 >> >> > > Thanks. > - classic rcu > - no preempt > - HP Proliant DL140, i.e. Intel Xeon, 64-bit > - with Nadia's patch applied, kernel/lockdep.c complains about a non-static > key. > > Clement, could you apply the attached patch [on top of Nadia's] patch and > send us the kernel output? > The patch adds a hexdump of the lock structure. Perhaps that helps us to > figure out what's happening. > > Thanks, > -- > Manfred > > diff --git a/kernel/lockdep.c b/kernel/lockdep.c > index dbda475..289e476 100644 > --- a/kernel/lockdep.c > +++ b/kernel/lockdep.c > @@ -755,11 +755,26 @@ register_lock_class(struct lockdep_map *lock, unsigned > int subclass, int force) > * Debug-check: all keys must be persistent! > */ > if (!static_obj(lock->key)) { > + int i; > + unsigned char *p; > + > debug_locks_off(); > printk("INFO: trying to register non-static key.\n"); > printk("the code is fine but needs lockdep annotation.\n"); > printk("turning off the locking correctness validator.\n"); > dump_stack(); > + printk(KERN_INFO "lock %p key %p \n",lock, lock->key); > + p = (unsigned char*)lock; > + p = (char*) (((unsigned long)lock)&~0x3f); > + > + for(i=0;i<128;i++) { > + if (i%16==0) > + printk(KERN_ERR "%p:", &p[i]); > + printk(" %02x", p[i]); > + if (i%16 == 15) > + printk("\n"); > + } > + > > return NULL; > } > > Hi, Sorry for the delay, but I was not available before. I tried your patch on top of Nadya's one on a 2.6.27 kernel and got: [ 2981.113314] INFO: trying to register non-static key. [ 2981.116008] the code is fine but needs lockdep annotation. [ 2981.116008] turning off the locking correctness validator. [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20 [ 2981.116008] [ 2981.116008] Call Trace: [ 2981.116008] [<ffffffff802585d2>] static_obj+0x51/0x68 [ 2981.116008] [<ffffffff8025ad54>] __lock_acquire+0x1e6/0x15bb [ 2981.116008] [<ffffffff80212f29>] native_sched_clock+0x8c/0xa5 [ 2981.116008] [<ffffffff8020c83e>] restore_args+0x0/0x30 [ 2981.116008] [<ffffffff8025c1be>] lock_acquire+0x95/0xc2 [ 2981.116008] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 2981.116008] [<ffffffff80459df7>] _spin_lock+0x2d/0x59 [ 2981.116008] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 2981.116008] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 2981.116008] [<ffffffff80303797>] ipc_lock+0x0/0x9b [ 2981.116008] [<ffffffff8030383e>] ipc_lock_check+0xc/0x57 [ 2981.116008] [<ffffffff8020c83e>] restore_args+0x0/0x30 [ 2981.116008] [<ffffffff80304c41>] sys_msgctl+0x18e/0x502 [ 2981.116008] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 2981.116008] [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a [ 2981.116008] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 2981.116008] [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a [ 2981.116008] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 2981.116008] [<ffffffff80212dd1>] sched_clock+0x5/0x7 [ 2981.116008] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 2981.116008] [ 2981.116008] lock ffff880144598b88 key ffff880144598b88 [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88 ff ff [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 3046.208003] BUG: soft lockup - CPU#0 stuck for 61s! [sysv_test2:5381] [ 3046.208006] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy thermal processor fan thermal_sys [ 3046.208006] irq event stamp: 2111863 [ 3046.208006] hardirqs last enabled at (2111863): [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3046.208006] hardirqs last disabled at (2111862): [<ffffffff8045989c>] trace_hardirqs_off_thunk+0x3a/0x3c [ 3046.208006] softirqs last enabled at (2109196): [<ffffffff8020da0c>] call_softirq+0x1c/0x28 [ 3046.208006] softirqs last disabled at (2109191): [<ffffffff8020da0c>] call_softirq+0x1c/0x28 [ 3046.208006] CPU 0: [ 3046.208006] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy thermal processor fan thermal_sys [ 3046.208006] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20 [ 3046.208006] RIP: 0010:[<ffffffff80213531>] [<ffffffff80213531>] native_read_tsc+0x10/0x1d [ 3046.208006] RSP: 0018:ffff880144945db8 EFLAGS: 00000286 [ 3046.208006] RAX: 000000001e96df9f RBX: ffff880144598b70 RCX: 000000001e96df9f[ 3046.208006] RDX: 000000000000081b RSI: ffffffff805425d7 RDI: 0000000000000001[ 3046.208006] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 3046.208006] R10: ffffffff80250954 R11: ffffffff8033fbb9 R12: ffff88002f186160[ 3046.208006] R13: ffff8800ae95a000 R14: ffff880144944000 R15: ffffffff8082c810[ 3046.208006] FS: 00007fbd589216e0(0000) GS:ffffffff80625a80(0000) knlGS:0000000000000000 [ 3046.208006] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 3046.208006] CR2: 00007fbd5848cc70 CR3: 000000014416a000 CR4: 00000000000006e0[ 3046.208006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 3046.208006] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 3046.208006] [ 3046.208006] Call Trace: [ 3046.208006] [<ffffffff8033fbd1>] delay_tsc+0x18/0x51 [ 3046.208006] [<ffffffff80220eb8>] __ticket_spin_trylock+0x0/0x18 [ 3046.208006] [<ffffffff8034319b>] _raw_spin_lock+0x96/0x102 [ 3046.208006] [<ffffffff80459e18>] _spin_lock+0x4e/0x59 [ 3046.208006] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 3046.208006] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 3046.208006] [<ffffffff80303797>] ipc_lock+0x0/0x9b [ 3046.208006] [<ffffffff8030383e>] ipc_lock_check+0xc/0x57 [ 3046.208006] [<ffffffff8020c83e>] restore_args+0x0/0x30 [ 3046.208006] [<ffffffff80304c41>] sys_msgctl+0x18e/0x502 [ 3046.208006] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3046.208006] [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a [ 3046.208006] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3046.208006] [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a [ 3046.208006] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3046.208006] [<ffffffff80212dd1>] sched_clock+0x5/0x7 [ 3046.208006] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 3046.208006] [ 3111.704002] BUG: soft lockup - CPU#0 stuck for 61s! [sysv_test2:5381] [ 3111.704007] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy thermal processor fan thermal_sys [ 3111.704007] irq event stamp: 2111863 [ 3111.704007] hardirqs last enabled at (2111863): [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3111.704007] hardirqs last disabled at (2111862): [<ffffffff8045989c>] trace_hardirqs_off_thunk+0x3a/0x3c [ 3111.704007] softirqs last enabled at (2109196): [<ffffffff8020da0c>] call_softirq+0x1c/0x28 [ 3111.704007] softirqs last disabled at (2109191): [<ffffffff8020da0c>] call_softirq+0x1c/0x28 [ 3111.704007] CPU 0: [ 3111.704007] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc loop rng_core iTCO_wdt i2c_i801 i2c_core snd_pcsp serio_raw snd_pcm snd_timer snd soundcore snd_page_alloc psmouse shpchp pci_hotplug button e752x_edac joydev edac_core evdev ext3 jbd mbcache sg sr_mod cdrom ata_generic sd_mod ata_piix libata dock usbhid hid ff_memless ide_pci_generic ide_core mptspi ehci_hcd mptscsih mptbase scsi_transport_spi scsi_mod uhci_hcd tg3 libphy thermal processor fan thermal_sys [ 3111.704007] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20 [ 3111.704007] RIP: 0010:[<ffffffff80213529>] [<ffffffff80213529>] native_read_tsc+0x8/0x1d [ 3111.704007] RSP: 0018:ffff880144945db8 EFLAGS: 00000206 [ 3111.704007] RAX: 00000000d15828f8 RBX: ffffffffd158282d RCX: 00000000d158282d[ 3111.704007] RDX: 0000000000000845 RSI: ffffffff805425d7 RDI: 0000000000000001[ 3111.704007] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000[ 3111.704007] R10: ffffffff80250954 R11: ffffffff8033fbb9 R12: ffff88002f186160[ 3111.704007] R13: ffff8800ae95a000 R14: ffff880144944000 R15: ffffffff8082c810[ 3111.704007] FS: 00007fbd589216e0(0000) GS:ffffffff80625a80(0000) knlGS:0000000000000000 [ 3111.704007] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 3111.704007] CR2: 00007fbd5848cc70 CR3: 000000014416a000 CR4: 00000000000006e0[ 3111.704007] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 3111.704007] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 3111.704007] [ 3111.704007] Call Trace: [ 3111.704007] [<ffffffff8033fbda>] delay_tsc+0x21/0x51 [ 3111.704007] [<ffffffff8033fc16>] __delay+0x0/0xa [ 3111.704007] [<ffffffff8034319b>] _raw_spin_lock+0x96/0x102 [ 3111.704007] [<ffffffff80459e18>] _spin_lock+0x4e/0x59 [ 3111.704007] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 3111.704007] [<ffffffff803037fa>] ipc_lock+0x63/0x9b [ 3111.704007] [<ffffffff80303797>] ipc_lock+0x0/0x9b [ 3111.704007] [<ffffffff8030383e>] ipc_lock_check+0xc/0x57 [ 3111.704007] [<ffffffff8020c83e>] restore_args+0x0/0x30 [ 3111.704007] [<ffffffff80304c41>] sys_msgctl+0x18e/0x502 [ 3111.704007] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3111.704007] [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a [ 3111.704007] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3111.704007] [<ffffffff8025a266>] trace_hardirqs_on_caller+0xff/0x12a [ 3111.704007] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 3111.704007] [<ffffffff80212dd1>] sched_clock+0x5/0x7 [ 3111.704007] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 3111.704007] Hope this can help. Best regards, Clement.
cboulte@gmail.com wrote: > Sorry for the delay, but I was not available before. I tried your > patch on top of Nadya's one on a 2.6.27 kernel and got: > > Thank you! > [ 2981.113314] INFO: trying to register non-static key. > [ 2981.116008] the code is fine but needs lockdep annotation. > [ 2981.116008] turning off the locking correctness validator. > [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20 > [ 2981.116008] > [snip] > [ 2981.116008] lock ffff880144598b88 key ffff880144598b88 > [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88 ff > ff > [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 > Ok. basically, everything is 0x00. lock->key points to it self. But that was done by lookup_lock_class(): > if (unlikely(!lock->key)) > lock->key = (void *)lock; Thus lock->key was 0x00, too. How is that possible? the msg queue structure is allocated with ipc_rcu_alloc. It doesn't perform a memset. Perhaps the problem is in idr_find? what happens if we have concurrently - idr_find() on an empty idr - idr_get_empty_slot() Nadia: Could you explain how updates to idp->layers and idp->top are synchronized? > static int idr_get_empty_slot(struct idr *idp, int starting_id, > struct idr_layer **pa) [snip] > rcu_assign_pointer(idp->top, p); > idp->layers = layers; rcu_assign_pointer() performs an smb_wmb() before assigning the pointer. But there is no smp_wmb() after assigning the pointer. Thus a cpu could update rdp->layers before or after modifying idp->top. > void *idr_find(struct idr *idp, int id) > { > int n; > struct idr_layer *p; > > n = idp->layers * IDR_BITS; > p = rcu_dereference(idp->top); Same problem here: rcu_dereference is only an implied memory barrier that only protects data referenced by p. We could get any combination from the old and the new idp->layers, idp->top values. Thus: idr_find probably reads 0 for idp->layers. idr_find probably reads a non-NULL idp->top. Then idr_find returns idp->top. the content of idp->top is 0x00, because the constructor contains a memset(). Clement: could you try the attached patch? It should oops. That would prove that my analysis is correct. Nadia: Do you have an idea how the bug could be fixed? Perhaps idp->layers could be embedded into idp->top i.e. idr_find would p=rcu_dereference(idp->top) if (p == NULL) return NULL; layers = p->layers. But I simply don't know enough about the idr code to figure out if that would work. Perhaps memory barriers are sufficient. -- Manfred diff --git a/lib/idr.c b/lib/idr.c index e728c7f..b6af844 100644 --- a/lib/idr.c +++ b/lib/idr.c @@ -501,6 +508,7 @@ void *idr_find(struct idr *idp, int id) if (id >= (1 << n)) return NULL; + BUG_ON(n == 0); while (n > 0 && p) { n -= IDR_BITS;
Reply-To: Nadia.Derbey@bull.net On Mon, 2008-11-17 at 20:18 +0100, Manfred Spraul wrote: > cboulte@gmail.com wrote: > > Sorry for the delay, but I was not available before. I tried your > > patch on top of Nadya's one on a 2.6.27 kernel and got: > > > > > Thank you! > > [ 2981.113314] INFO: trying to register non-static key. > > [ 2981.116008] the code is fine but needs lockdep annotation. > > [ 2981.116008] turning off the locking correctness validator. > > [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20 > > [ 2981.116008] > > [snip] > > [ 2981.116008] lock ffff880144598b88 key ffff880144598b88 > > [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88 > ff ff > > [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 > 00 00 > > > Ok. > basically, everything is 0x00. lock->key points to it self. > But that was done by lookup_lock_class(): > > if (unlikely(!lock->key)) > > lock->key = (void *)lock; > Thus lock->key was 0x00, too. > > How is that possible? the msg queue structure is allocated with > ipc_rcu_alloc. > It doesn't perform a memset. > > Perhaps the problem is in idr_find? what happens if we have concurrently > - idr_find() on an empty idr > - idr_get_empty_slot() > > Nadia: Could you explain how updates to idp->layers and idp->top are > synchronized? > > > static int idr_get_empty_slot(struct idr *idp, int starting_id, > > struct idr_layer **pa) > [snip] > > rcu_assign_pointer(idp->top, p); > > idp->layers = layers; > > rcu_assign_pointer() performs an smb_wmb() before assigning the pointer. > But there is no smp_wmb() after assigning the pointer. > Thus a cpu could update rdp->layers before or after modifying idp->top. > > > void *idr_find(struct idr *idp, int id) > > { > > int n; > > struct idr_layer *p; > > > > n = idp->layers * IDR_BITS; > > p = rcu_dereference(idp->top); > Same problem here: > rcu_dereference is only an implied memory barrier that only protects > data referenced by p. > We could get any combination from the old and the new idp->layers, > idp->top values. Manfred, First thanks for the analysis and I'm afraid you might be right. Changes into the idr tree are supposed to be done under a lock, but it's true that we might be accessing 2 incoherent fields, since idr_find on its side may be called without a lock. One solution here would be to have a .layers field inside each idr_layer structure. Actually that's how the radix tree code manages the height: when doing a lookup it's the .height inside the radix_tree_node that is read after the radix_tree_node has been rcu dereferenced. So I think that could be a way to go. I'll try to see how big could be the impact. Unfortunatly, I don't know the name of an "official" maintainer for the idr... Regards, Nadia > > Thus: > idr_find probably reads 0 for idp->layers. > idr_find probably reads a non-NULL idp->top. > Then idr_find returns idp->top. > the content of idp->top is 0x00, because the constructor contains a > memset(). > > Clement: could you try the attached patch? It should oops. > That would prove that my analysis is correct. > > Nadia: Do you have an idea how the bug could be fixed? Perhaps > idp->layers could be embedded into idp->top > i.e. > idr_find would > p=rcu_dereference(idp->top) > if (p == NULL) > return NULL; > layers = p->layers. > > But I simply don't know enough about the idr code to figure out if that > would work. > Perhaps memory barriers are sufficient. > -- > Manfred > plain text document attachment (patch-lib-idr) > diff --git a/lib/idr.c b/lib/idr.c > index e728c7f..b6af844 100644 > --- a/lib/idr.c > +++ b/lib/idr.c > @@ -501,6 +508,7 @@ void *idr_find(struct idr *idp, int id) > > if (id >= (1 << n)) > return NULL; > + BUG_ON(n == 0); > > while (n > 0 && p) { > n -= IDR_BITS;
2008/11/17 Manfred Spraul <manfred@colorfullife.com>: > cboulte@gmail.com wrote: >> >> Sorry for the delay, but I was not available before. I tried your >> patch on top of Nadya's one on a 2.6.27 kernel and got: >> >> > > Thank you! >> >> [ 2981.113314] INFO: trying to register non-static key. >> [ 2981.116008] the code is fine but needs lockdep annotation. >> [ 2981.116008] turning off the locking correctness validator. >> [ 2981.116008] Pid: 5381, comm: sysv_test2 Not tainted 2.6.27-ipc_lock #20 >> [ 2981.116008] >> [snip] >> [ 2981.116008] lock ffff880144598b88 key ffff880144598b88 >> [ 2981.116008] ffff880144598b80: 00 00 00 00 00 00 00 00 88 8b 59 44 01 88 >> ff ff >> [ 2981.116008] ffff880144598b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> [ 2981.116008] ffff880144598ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> [ 2981.116008] ffff880144598bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> [ 2981.116008] ffff880144598bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> [ 2981.116008] ffff880144598bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> [ 2981.116008] ffff880144598be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> [ 2981.116008] ffff880144598bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> 00 00 >> > > Ok. > basically, everything is 0x00. lock->key points to it self. > But that was done by lookup_lock_class(): >> if (unlikely(!lock->key)) >> lock->key = (void *)lock; > Thus lock->key was 0x00, too. > > How is that possible? the msg queue structure is allocated with > ipc_rcu_alloc. > It doesn't perform a memset. > > Perhaps the problem is in idr_find? what happens if we have concurrently > - idr_find() on an empty idr > - idr_get_empty_slot() > > Nadia: Could you explain how updates to idp->layers and idp->top are > synchronized? > >> static int idr_get_empty_slot(struct idr *idp, int starting_id, >> struct idr_layer **pa) > [snip] >> rcu_assign_pointer(idp->top, p); >> idp->layers = layers; > > rcu_assign_pointer() performs an smb_wmb() before assigning the pointer. > But there is no smp_wmb() after assigning the pointer. > Thus a cpu could update rdp->layers before or after modifying idp->top. > >> void *idr_find(struct idr *idp, int id) >> { >> int n; >> struct idr_layer *p; >> >> n = idp->layers * IDR_BITS; >> p = rcu_dereference(idp->top); > > Same problem here: > rcu_dereference is only an implied memory barrier that only protects data > referenced by p. > We could get any combination from the old and the new idp->layers, idp->top > values. > > Thus: > idr_find probably reads 0 for idp->layers. > idr_find probably reads a non-NULL idp->top. > Then idr_find returns idp->top. > the content of idp->top is 0x00, because the constructor contains a > memset(). > > Clement: could you try the attached patch? It should oops. > That would prove that my analysis is correct. > > Nadia: Do you have an idea how the bug could be fixed? Perhaps idp->layers > could be embedded into idp->top > i.e. > idr_find would > p=rcu_dereference(idp->top) > if (p == NULL) > return NULL; > layers = p->layers. > > But I simply don't know enough about the idr code to figure out if that > would work. > Perhaps memory barriers are sufficient. > > -- > Manfred > > diff --git a/lib/idr.c b/lib/idr.c > index e728c7f..b6af844 100644 > --- a/lib/idr.c > +++ b/lib/idr.c > @@ -501,6 +508,7 @@ void *idr_find(struct idr *idp, int id) > > if (id >= (1 << n)) > return NULL; > + BUG_ON(n == 0); > > while (n > 0 && p) { > n -= IDR_BITS; > > It looks like you're right... [ 403.683434] kernel BUG at /home/ccalmels/WORK/bug-sysv/lib/idr.c:504! [ 403.683434] invalid opcode: 0000 [121] SMP [ 403.683434] CPU 3 [ 403.683434] Modules linked in: autofs4 video output ac battery ipv6 nfs lockd nfs_acl sunrpc loop i2c_i801 serio_raw pcspkr snd_pcm snd_timer psmouse rng_core i2c_core snd soundcore snd_page_alloc iTCO_wdt button e752x_edac edac_core shpchp pci_hotplug joydev evdev ext3 jbd mbcache sd_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock mptspi mptscsih mptbase ide_pci_generic scsi_transport_spi ehci_hcd uhci_hcd ide_core scsi_mod tg3 libphy thermal processor fan thermal_sys [ 403.683434] Pid: 2922, comm: sysv_test2 Tainted: G D 2.6.27-ipc_lock #21 [ 403.683434] RIP: 0010:[<ffffffff80339859>] [<ffffffff80339859>] idr_find+0x21/0x44 [ 403.683434] RSP: 0018:ffff880145513e60 EFLAGS: 00010246 [ 403.683434] RAX: 0000000000000001 RBX: ffffffff805a3328 RCX: 0000000000000000[ 403.683434] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000[ 403.683434] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000[ 403.683434] R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000[ 403.683434] R13: ffffffff805a31e0 R14: 00007fff45700b50 R15: 0000000000000000[ 403.683434] FS: 00007fc83d6f76e0(0000) GS:ffff88014707fb90(0000) knlGS:0000000000000000 [ 403.683434] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 403.683434] CR2: 00007fc83d262c70 CR3: 0000000144102000 CR4: 00000000000006e0[ 403.683434] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000[ 403.683434] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400[ 403.683434] Process sysv_test2 (pid: 2922, threadinfo ffff880145512000, task ffff880144832680) [ 403.683434] Stack: ffffffff803037e3 ffffffff80303797 ffffffff8033d8db 0000000000000001 [ 403.683434] 0000000000000000 0000000000400650 ffffffff8030383e 0000000000000403 [ 403.683434] 0000000000000001 0000000000000002 ffffffff80304c41 ffff88014444adb8 [ 403.683434] Call Trace: [ 403.683434] [<ffffffff803037e3>] ipc_lock+0x4c/0x9b [ 403.683434] [<ffffffff80303797>] ipc_lock+0x0/0x9b [ 403.683434] [<ffffffff8033d8db>] __up_read+0x13/0x8a [ 403.683434] [<ffffffff8030383e>] ipc_lock_check+0xc/0x57 [ 403.683434] [<ffffffff80304c41>] sys_msgctl+0x18e/0x502 [ 403.683434] [<ffffffff8024fb64>] hrtimer_get_next_event+0x30/0xa8 [ 403.683434] [<ffffffff804598d3>] lockdep_sys_exit_thunk+0x35/0x67 [ 403.683434] [<ffffffff804598d3>] lockdep_sys_exit_thunk+0x35/0x67 [ 403.683434] [<ffffffff8033d8db>] __up_read+0x13/0x8a [ 403.683434] [<ffffffff8045985d>] trace_hardirqs_on_thunk+0x3a/0x3f [ 403.683434] [<ffffffff8033d8db>] __up_read+0x13/0x8a [ 403.683434] [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b [ 403.683434] [ 403.683434] [ 403.683434] Code: 0f bc c0 48 01 c2 48 89 d0 c3 6b 4f 10 06 b8 01 00 00 00 81 e6 ff ff ff 7f 31 d2 48 8b 3f d3 e0 39 c6 7d 26 85 c9 48 89 fa 75 16 <0f> 0b eb fe 83 e9 06 89 f0 d3 f8 83 e0 3f 48 8d 44 c2 08 48 8b [ 403.683434] RIP [<ffffffff80339859>] idr_find+0x21/0x44 [ 403.683434] RSP <ffff880145513e60> Regards, c.