Bug 14605

Summary: NULL Pointer Dereference drivers/char/n_tty.c
Product: Drivers Reporter: Kyle Bader (kyle.bader)
Component: SerialAssignee: Greg Kroah-Hartman (greg)
Status: RESOLVED CODE_FIX    
Severity: normal CC: akpm, alan, eugeneteo, greg, joe, kyle.bader, rjw, sockpuppet, vbraun
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31.5 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 13615    
Attachments: raw trace from serial console buffer
Decoded trace file
n_tty.c patch provided by Brad Spengler.

Description Kyle Bader 2009-11-14 22:23:29 UTC
I personally am not a kernel programmer however when I showed the attached trace to Brad Spengler he suggested I file a report, he also had this to say:

"read_buf needs to be properly synchronized -- right now, only read_head
is synchronized with a spinlock.  What's happening here is that spinlock is getting held and released in n_tty_flush_buffer() from n_tty_close(), then the read happens (getting past the NULL check, as read_buf hasn't been NULLed yet, though could be in the process of being freed at this point).  Then read_buf gets set to NULL and the tty->read_buf[tty_read_head] = c executes, causing the NULL deref."

I added Joe Peterson as a cc because he was listed by get_maintainer.pl alongside Alan Cox, who to my understanding is not working on tty anymore.  

kernel version: 

2.6.31.5 from git e2984cbfddd5c8fac88b24d7e5f28e1cfb6f3838

additional patches: 

aufs2-2.6/aufs2-31 from git 122bf939bcc705083bd5a43defcd0c3249f7fe88
vserver patch-2.6.31.5-vs2.3.0.36.21.diff

I've tried to be complete however this is my first filing on the kernel bugzilla so if I forgot to include something kindly let me know and I'll get back to you.
Comment 1 Kyle Bader 2009-11-14 22:24:19 UTC
Created attachment 23786 [details]
raw trace from serial console buffer
Comment 2 Kyle Bader 2009-11-14 22:24:52 UTC
Created attachment 23787 [details]
Decoded trace file
Comment 3 Andrew Morton 2009-11-15 00:28:17 UTC
hm, we don't have tty or ldisc categories in bugzilla, so I reassigned this to drivers/serial.

I assume this is really a regression.

I wonder if the bug is still present in 2.6.32.

Perhaps you can persuade Mr Spengler to send us a patch ;)
Comment 4 Sock Puppet 2009-11-30 05:02:28 UTC
Hi Andrew,

I think he already has a patch right here:
http://grsecurity.net/test/grsecurity-2.1.14-2.6.31.6-200911151724.patch
Comment 5 Alan 2009-11-30 08:06:46 UTC
#4 is nothing at all related


The bug looks like a regression, possibly added by some chap called Torvalds when he hacked up the n_tty stuff to call back into the ldisc on read ;) Brad's diagnosis looks odd though - the ldisc is locked down at that point so won't get closed (a set_ldisc would stall)

Assigned to Greg the tty maintainer.
Comment 6 Kyle Bader 2009-12-20 04:00:24 UTC
I've caught this happening one or two more times since filing the bug, traces are available if needed. Brad Spengler has written a patch, If Alan/Greg could review it and make comments it would be greatly appreciated.
Comment 7 Kyle Bader 2009-12-20 04:02:22 UTC
Created attachment 24235 [details]
n_tty.c patch provided by Brad Spengler.
Comment 8 Alan 2009-12-20 11:37:50 UTC
I already did

We fix bugs by understanding them not by randomly applying gunge that tries to hide it.
Comment 9 Kyle Bader 2009-12-20 13:31:38 UTC
I wasn't aware that you had already reviewed the patch, thanks for looking Alan.
Comment 10 Greg Kroah-Hartman 2009-12-20 16:23:03 UTC
I had read it as well, if someone wants a patch reviewed, please have them submit it in the proper format as documented in the file, Documentation/SubmittingPatches so that it could be applied.

That includes a description of what the patch does, and why it does it.

Also, as Alan said, the patch just looks like it papers over the real problem, which
isn't acceptable.
Comment 11 Volker Braun 2010-01-27 21:25:38 UTC
While trying to write a Python program using pexpect to communicate with a subprocess I found a testcase that triggers this bug reliably. It takes only two or three attempts to hit the NULL pointer dereference in put_tty_queue(). 

I don't have a serial console, but took photos of the call trace:

http://www.stp.dias.ie/~vbraun/Oops-tty-1.JPG
http://www.stp.dias.ie/~vbraun/Oops-tty-2.JPG

This is on a Core i7-920 running Fedora 12 kernel 2.6.31.12-174.2.3.fc12.x86_64.

I don't know anything about the tty subsystem but I'd be happy to help in debugging this. Any special logging that I should turn on? Patches to try? Or do I have to git-bisect until I find the offender?
Comment 12 Greg Kroah-Hartman 2010-01-27 21:42:26 UTC
Can you try duplicating this on the 2.6.32.6 kernel release?
Comment 13 Volker Braun 2010-01-27 22:00:04 UTC
Kernel 2.6.32.6 (more precisely, 2.6.32.6-36.fc12.x86_64) indeed fixes it! 

I had tried 2.6.31.9 (oopses), but I wasn't aware that 2.6.32 is out.
Comment 14 Greg Kroah-Hartman 2010-01-27 22:10:21 UTC
Great, thanks for testing, marking closed now.