Bug 5958

Summary: CF bluetooth card oopses machine when removed while in use
Product: Networking Reporter: Pavel Machek (pavel)
Component: OtherAssignee: Arnaldo Carvalho de Melo (acme)
Status: RESOLVED CODE_FIX    
Severity: normal CC: akpm, alan, marcel, matt
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.16-rc1 Subsystem:
Regression: --- Bisected commit-id:

Description Pavel Machek 2006-01-25 11:45:19 UTC
Most recent kernel where this bug did not occur: it occured even in 2.6.9 or
something that old
Distribution: Debian
Hardware Environment: i386
Software Environment:


Steps to reproduce:
sleep .1
setserial /dev/ttyBT baud_base 921600
hciattach -s 921600 /dev/ttyBT bcsp
hciconfig hci0 up
hciconfig hci0 name billionton
hciconfig /dev/ttyUB1
<then launch pppd on /dev/ttyUB0, and yank the CF card>

Following patch can be used to work around the problem, but it is not proper
solution.

--- a/drivers/serial/serial_core.c
+++ b/drivers/serial/serial_core.c
@@ -524,7 +524,9 @@ static void uart_flush_buffer(struct tty
        DPRINTK("uart_flush_buffer(%d) called\n", tty->index);

        spin_lock_irqsave(&port->lock, flags);
-       uart_circ_clear(&state->info->xmit);
+       if (!state->info)
+               printk(KERN_CRIT "no state->info\n");
+       else uart_circ_clear(&state->info->xmit);
        spin_unlock_irqrestore(&port->lock, flags);
        tty_wakeup(tty);
 }
Comment 1 Pavel Machek 2006-01-25 11:46:30 UTC
(Alan Cox claimed this is not the right fix, but bugzilla refuses
alan@redhat.com address)
Comment 2 Russell King 2006-01-25 14:30:22 UTC
Alan's correct.  It is incorrect for the kernel to call the driver
flush_buffer method after the serial port has been hung up or closed.

Here's a copy of the mail I sent in response to this bug:
| Date: Thu, 29 Sep 2005 23:26:38 +0100
| From: Russell King
| Subject: Re: Problems with CF bluetooth (fwd)
| 
| On Fri, Sep 30, 2005 at 12:04:28AM +0200, Pavel Machek wrote:
| > (I had to hack serial driver a bit to default to 921600 bitrate; it
| > does not have setserial).
| >
| > # hciattach -s 921600 /dev/ttyS0 bcsp
| > BCSP initialization timed out
| >                                                               Pavel
| >
| > Sep 29 15:42:34 amd kernel: EIP:    0060:[<c02d4e2c>]    Not tainted
| > VLI
| > Sep 29 15:42:34 amd kernel: EFLAGS: 00010082   (2.6.14-rc2-g5fb2493e)
| > Sep 29 15:42:34 amd kernel: EIP is at uart_flush_buffer+0xc/0x30
|                                                                                 
| Ok, you're in uart_flush_buffer(), and we got there through the
| ldisc being closed.
|                                                                                 
| > Sep 29 15:42:34 amd kernel: Call Trace:
| > Sep 29 15:42:34 amd kernel:  [<c03f9df6>] hci_uart_flush+0x66/0x80
| > Sep 29 15:42:34 amd kernel:  [<c03f9e27>] hci_uart_close+0x17/0x20
| > Sep 29 15:42:34 amd kernel:  [<c03f9f86>] hci_uart_tty_close+0x26/0x60
| > Sep 29 15:42:34 amd kernel:  [<c02a5ed7>] release_dev+0x587/0x670
|  
| Looking at release_dev(), it does things in the following order:
|  
| 1. various sanity checks
| 2. close the driver
| 3. more sanity checks and other housekeeping
| 4. close the ldisc
| 5. reset the ldisc for the tty to N_TTY
|                                                                                 
| This means that bluetooth is calling into a TTY driver with a tty
| for which the driver has already shut down.  No surprise it oopses.
| Bluetooth should not call the driver during it's closedown.

And as you can see, I said back then it's a bluetooth problem, and it
remains a bluetooth problem.
                                                                                 
I went on to say:

| (There also appears to be an interesting race condition between the
| driver shutting down and the ldisc write functions getting to know
| about it... Alan?)

to which I got the response:

) Yes. Its on "the list". The ldisc_ref stuff provides the framework for
) fixing it in theory as we can use that to hold off a writer.

So, in the first instance, someone needs to report this to the bluetooth
folk rather than repeatedly trying to get serial folk involved in fixing
their bugs.
Comment 3 Pavel Machek 2006-02-04 14:33:20 UTC
Any idea who knows bluetooth subsystem and can be added to Cc?
Comment 4 Russell King 2006-02-26 01:48:13 UTC
*** Bug 5944 has been marked as a duplicate of this bug. ***
Comment 5 Andrew Morton 2007-01-31 00:35:56 UTC
[added Marcel]

I think some bluetooth tty fixes have gone in recently.  Can you test 2.6.20-rc7?
Comment 6 Marcel Holtmann 2007-01-31 00:40:17 UTC
The TTY fixes where only for the virtual TTY over RFCOMM. This is the line
discipline for UART or serial port based Bluetooth dongles.
Comment 7 Pavel Machek 2007-01-31 05:33:06 UTC
This is from latest kernel; so I guess this problem is worked around... We'll
still get ugly backtrace, but at least not crash.

So... someone claims that WARN_ON() is still a BUG, but... let's close this one.

        /*
         * This means you called this function _after_ the port was
         * closed.  No cookie for you.
         */
        if (!state || !state->info) {
                WARN_ON(1);
                return;
        }