Most recent kernel where this bug did not occur: 2.6.17 Hardware Environment: ARM PXA270, TL16C754 UART Problem Description: When having the console on the 8250 UART, a SysRQ over the serial line causes the following deadlock: --> create break (e.g. ^A f s) | serial8250_handle_port() | spin_lock(&up->port.lock); | receive_chars(...); | uart_handle_sysrq_char(...); | ... | serial8250_console_write(...) | spin_lock_irqsave(&up->port.lock, flags); <<<<<<< Applying | --- a/drivers/serial/8250.c | +++ b/drivers/serial/8250.c | @@ -2241,7 +2241,7 @@ serial8250_console_write(struct console | | touch_nmi_watchdog(); | | - if (oops_in_progress) { | + if (oops_in_progress || up->port.sysrq) { | locked = spin_trylock_irqsave(&up->port.lock, flags); | } else | spin_lock_irqsave(&up->port.lock, flags); performs the wanted action but creates a | / # SysRq : <0>BUG: spinlock trylock failure on UP on CPU#0, swapper/0 | lock: c029df84, .magic: dead4ead, .owner: swapper/0, .owner_cpu: 0 | [<c00250a0>] (dump_stack+0x0/0x14) from [<c00f396c>] (spin_bug+0x90/0xd0) | [<c00f38dc>] (spin_bug+0x0/0xd0) from [<c00f39f4>] (_raw_spin_trylock+0x48/0x54) | r6 = C02850D2 r5 = 00000008 r4 = 00000000 | [<c00f39ac>] (_raw_spin_trylock+0x0/0x54) from [<c01e716c>] (_spin_trylock+0x10/0x1c) | r4 = C029DF84 | [<c01e715c>] (_spin_trylock+0x0/0x1c) from [<c0122518>] (serial8250_console_write+0x58/0x174) | [<c01224c0>] (serial8250_console_write+0x0/0x174) from [<c0035cbc>] (__call_console_drivers+0x60/0x7c) | [<c0035c5c>] (__call_console_drivers+0x0/0x7c) from [<c0035d54>] (_call_console_drivers+0x7c/0x8c) | r6 = 0000174E r5 = C022CE68 r4 = 0000174E | [<c0035cd8>] (_call_console_drivers+0x0/0x8c) from [<c0035fd8>] (release_console_sem+0x180/0x248) | r5 = 00001746 r4 = 0000174E | [<c0035e58>] (release_console_sem+0x0/0x248) from [<c003685c>] (vprintk+0x2bc/0x330) | [<c00365a0>] (vprintk+0x0/0x330) from [<c00368f8>] (printk+0x28/0x30) | [<c00368d0>] (printk+0x0/0x30) from [<c011d79c>] (__handle_sysrq+0x40/0x14c) | r3 = 00000007 r2 = C022CE50 r1 = C0227F5C r0 = C01FF6C0 | [<c011d75c>] (__handle_sysrq+0x0/0x14c) from [<c011d8cc>] (handle_sysrq+0x24/0x2c) | [<c011d8a8>] (handle_sysrq+0x0/0x2c) from [<c0123ce4>] (receive_chars+0x17c/0x2ec) | [<c0123b68>] (receive_chars+0x0/0x2ec) from [<c0124b5c>] (serial8250_interrupt+0x7c/0x100) | [<c0124ae0>] (serial8250_interrupt+0x0/0x100) from [<c0020e74>] (__do_irq+0x54/0x94) | [<c0020e20>] (__do_irq+0x0/0x94) from [<c0020f68>] (do_edge_IRQ+0xb4/0x130) stacktrace. A better patch (which avoids this spin_bug() message but contains races) would be --- a/include/linux/serial_core.h +++ b/include/linux/serial_core.h @@ -411,7 +411,12 @@ uart_handle_sysrq_char(struct uart_port #ifdef SUPPORT_SYSRQ if (port->sysrq) { if (ch && time_before(jiffies, port->sysrq)) { + int is_locked = spin_is_locked(&port->lock); + if (is_locked) + spin_unlock(&port->lock); handle_sysrq(ch, regs, NULL); + if (is_locked) + spin_lock(&port->lock); port->sysrq = 0; return 1; } ------
How about this? --- a/drivers/serial/8250.c~serial-8250-sysrq-deadlock-fix +++ a/drivers/serial/8250.c @@ -2254,7 +2254,10 @@ serial8250_console_write(struct console touch_nmi_watchdog(); - if (oops_in_progress) { + if (port->sysrq) { + /* serial8250_handle_port() already took the lock */ + locked = 0; + } else if (oops_in_progress) { locked = spin_trylock_irqsave(&up->port.lock, flags); } else spin_lock_irqsave(&up->port.lock, flags);
That should solve it - Enrico, care to test Andrew's patch?
Yes; patch works. But whole locking looks somehow unclean; e.g. | receive_chars(struct uart_8250_port *up, int *status, struct pt_regs *regs) | ... | DEBUG_INTR("handling break...."); or manual debug printk's would trigger the deadlock too. But these are debugging situations only which can be probably ignored.
A similar fix is in -rc2. Can this bug be closed now?
Ping?
Please reopen this bug if it's still present in 2.6.18-rc5.