Most recent kernel where this bug did not occur: UNKNOWN Distribution: any Hardware Environment: x86 Software Environment: n/a Problem Description: kernel oopses or crashes can occur in the n_r3964 line discipline when kmalloc(..GFP_KERNEL..) is invoked in code paths which run in interrupt contexts. I suspect two places of kmalloc where GFP_KERNEL should be replaced by GFP_ATOMIC: 1. function on_receive_block after the comment "/* prepare struct r3964_block_header */" 2. function add_msg after the label "queue_the_message" Steps to reproduce: contact chw@ch-werner.de for test prog.
Created attachment 8269 [details] n_r3964, use GPF_ATOMIC in some kmalloc()s
Does that patch fix everything up? If so, then would I be correct in believing that the "oopses" were in fact "sleeping function called from invalid context" warnings? Thanks.
Well I don't understand the on_receive_block() change - as far as I can tell GFP_KERNEL is legal there. At least, when the caller is from tty_io.c It would really help to understand this problem if we could see the dmesg output showing the call stack up to these two sites, please.
Created attachment 8277 [details] 2 oopses from /var/log/messages These oopses occured on FC5 2.6.16-1.2122_FC5 i686 when two serial lines with n_r3964 where run against each other using a null modem cable some seconds after breaking the serial connection.
OK, thanks. add_msg() is called from timer handler. But what about on_receive_block()?
I think the one in on_receive_block() shouldn't be necessary. Even with tty->low_latency set, we should never end up here in a context where we can't sleep... I think. The other change (in add_msg()) means that we're allocating with GFP_ATOMIC even in the _common_ case, for received data packets. Which is worse? Using GFP_ATOMIC when we don't need to, or something ugly like this... pMsg = kmalloc(sizeof(struct r3964_message), (arg==R3964_MSG_DATA)?GFP_KERNEL:GFP_ATOMIC);
GFP_ATOMIC is unreliable. Given a choice between ugly and unreliable I guess we get ugly.
Created attachment 8279 [details] Patch to use GFP_ATOMIC conditionally Agreed. Let's do it like this then. Christian, please could you confirm that this also fixes the problem? And that you've never actually seen the problem occur in on_receive_block()?
The attachment (id=8279) cured my problem. Tested with two pl2303 USB-serial converters against each other, ran for 30 minutes with null modem plugged and sometimes unplugged. No further oopses were observed.
Thanks for the confirmation.