Bug 5094

Summary: Get an oops message from the kernel with version 2.6.13.x
Product: File System
Reporter: Peter Poulsen (peter)
Component: SysFS
Assignee: Greg Kroah-Hartman (greg)
Status: REJECTED INSUFFICIENT_DATA
Severity: normal
CC: dipankar, greg, kernel, protasnb
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.13-rc6
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: My .config for 2.16.13-rc6

Description Peter Poulsen 2005-08-19 12:28:19 UTC
Distribution: Gentoo
Hardware Environment: Dell Inspiron 8150
Software Environment:
Problem Description: 
When I boot the system, I get the following oops message:
Unable to handle kernel NULL pointer dereference at virtual address 00000010
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU:     0
EIP:     0060:[<c019d0f3>]   Not tainted VLI
EFLAGS:  00010286     (2.6.13-rc6)
EIP is at create_dir+0x13/0x1b0
Process swapper
Call Trace:
  __kernel_text_address
  show_trace
  sysfs_create_dir
  create_dir
  kobject_add
  class_device_add
  class_device_create
  vcs_make_devfs
  con_open
  tty_open
  tty_open
  chrdev_open
  dentry_open
  vprintk
  filp_open
  get_unused_fd
  sys_open
  free_initmem
  init
  init
  kernel_thread_helper

I have had this problem with all versions of 2.6.13 (I have tried three or four
of them). I did not have this problem with the 2.6.11.x or 2.6.12.x (I cannot
exactly remember which version of the 2.6.12 I have tried).

Steps to reproduce:

Boot the system
Comment 1 Andrew Morton 2005-08-19 13:01:46 UTC
Greg, Dipankar; this looks a bit like Keith's use-after-free thing.

Is the fix for that in 2.6.13-rc6-mm1?
Comment 2 Greg Kroah-Hartman 2005-08-19 13:08:20 UTC
On Fri, Aug 19, 2005 at 01:01:51PM -0700, bugme-daemon@kernel-bugs.osdl.org wrote:
> Is the fix for that in 2.6.13-rc6-mm1?

Yes it is.
Comment 3 Greg Kroah-Hartman 2005-08-19 13:10:31 UTC
This is something else.

Care to attach your .config?
Comment 4 Peter Poulsen 2005-08-19 13:14:09 UTC
Created attachment 5690 [details]
My .config for 2.16.13-rc6

Here it is
Comment 5 Dipankar Sarma 2005-08-19 13:22:37 UTC
Doesn't look like Keith's attr use after free problem. This one seems to get a
stale dentry or something in create_dir().
Comment 6 Maneesh Soni 2005-08-21 22:36:45 UTC
This looks like it is crashing due to a NULL d_inode in the parent directory; it
is not the use-after-free case of devt_attr.

Considering vcs_make_devfs() in the call trace, I think we are creating the
/sys/class/vc/vcs* or /sys/class/vc/vcsa* directories here, and probably the
parent directory /sys/class/vc does not exist. Directories are always pinned, so
a directory's dentry and inode are not expected to go away once created, unless
the directory is deleted. So there are two possibilities here: either there
was some error in creating /sys/class/vc that was ignored, and the caller
went ahead and tried to create /sys/class/vc/vcs*, or there is some race
between deleting /sys/class/vc and creating /sys/class/vc/vcs*.
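
The failure mode described above can be sketched in user-space C. This is only an illustration, not the kernel's code: toy_inode, toy_dentry and toy_create_dir are hypothetical stand-ins, and the assumption is that create_dir() touches a field of the parent's d_inode (for example its lock) before doing anything else. A faulting address as small as 00000010 is consistent with reading a field at a small offset from a NULL pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct inode / struct dentry; only the
 * fields needed for the illustration are modelled. */
struct toy_inode {
    int lock_depth;              /* stands in for the inode's lock */
};

struct toy_dentry {
    const char *name;
    struct toy_inode *d_inode;   /* NULL if the directory was never set up */
};

/*
 * Create "child" under "parent". The check on parent->d_inode is the
 * guard whose absence would turn a missing parent directory into a
 * NULL pointer dereference on the first access to the parent's inode.
 */
int toy_create_dir(struct toy_dentry *parent, struct toy_dentry *child,
                   struct toy_inode *storage)
{
    assert(child != NULL && storage != NULL);

    if (parent == NULL || parent->d_inode == NULL)
        return -ENOENT;          /* parent missing or never created */

    parent->d_inode->lock_depth++;   /* "lock" the parent */
    child->d_inode = storage;        /* attach an inode to the new entry */
    parent->d_inode->lock_depth--;   /* "unlock" the parent */
    return 0;
}
```

Calling toy_create_dir() with a parent whose d_inode is NULL returns -ENOENT instead of crashing; a caller that ignored an earlier error while creating the parent would reach exactly this path.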

Re-building and running the kernel with the debug messages (pr_debug) for
lib/kobject.c and drivers/base/class.c could tell us more about the events
happening.
Comment 7 Peter Poulsen 2005-08-22 07:03:29 UTC
If I could get some instructions on how to do that, it would be awesome. 

PS I do know how to code C, but I have never done kernel hacking.
Comment 8 Maneesh Soni 2005-08-22 07:36:14 UTC
I think you need to enable the following kernel config options

Device Drivers --> Generic Driver options --> Driver Core verbose debug messages

and 

Kernel Hacking --> Kernel Debugging --> kobject Debugging
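
For anyone editing .config by hand rather than through menuconfig, the two menu entries above should correspond to the symbols below. The symbol names are an assumption based on the mainline Kconfig files of that era; verify them against your own tree (for example by searching for them in menuconfig) before relying on them.

```
# Assumed .config symbols for the two options named above
CONFIG_DEBUG_KERNEL=y    # "Kernel debugging", prerequisite for the other two
CONFIG_DEBUG_DRIVER=y    # "Driver Core verbose debug messages"
CONFIG_DEBUG_KOBJECT=y   # "kobject debugging"
```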

Also try to capture all boot/console messages, you can use a serial console to
do that.

Comment 9 Peter Poulsen 2005-08-22 07:50:29 UTC
> Also try to capture all boot/console messages, you can use a serial console
> to
> do that.

I'm sorry for asking this, but what is a serial console, and how do I use it?
Comment 10 Maneesh Soni 2005-08-22 09:58:09 UTC
Using a serial console you can electronically log all the console or boot
messages. There is simple documentation for the serial console in the kernel
source tree:

Documentation/serial-console.txt

and a more detailed one at

http://www.tldp.org/HOWTO/Remote-Serial-Console-HOWTO/
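
As a rough sketch of what those documents describe, the kernel is told to use a serial port as a console through boot parameters such as the ones below. The port (ttyS0) and the speed (9600 baud, no parity, 8 bits) are assumptions for illustration; they must match the actual hardware and the terminal program on the receiving machine.

```
# Appended to the kernel command line in the boot loader configuration.
# Messages go to both the serial port and the normal screen.
console=ttyS0,9600n8 console=tty0
```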
Comment 11 Peter Poulsen 2005-09-03 10:56:02 UTC
I finally got the time to look at this "serial-console" (sorry for the delay).
Unfortunately my laptop has neither a serial nor a parallel port :-(

Any ideas?
Comment 12 Maneesh Soni 2005-09-03 19:44:11 UTC
The serial console is for logging boot messages that you might otherwise miss.
But with some luck you may not need it and could "see" the debug messages, in
addition to the oops messages, directly on the screen. If you use the frame
buffer, you can fit more messages on the screen than with plain VGA. Just try
to reproduce the problem with the debug options mentioned earlier.
Comment 13 Peter Poulsen 2005-09-07 23:38:09 UTC
I have now tried enabling the two debugging options. Unfortunately the screen I
end up with is the same as without those options.
Comment 14 Daniel Drake 2005-09-17 10:08:50 UTC
Downstream bug report: http://bugs.gentoo.org/show_bug.cgi?id=101884
Comment 15 Natalie Protasevich 2007-08-25 23:17:02 UTC
The bug mentioned in #14 specifies that the problem has been resolved upstream.
If this is the case and the issue has been addressed then we can close the bug.
Thanks.
Comment 16 Natalie Protasevich 2008-03-10 20:33:24 UTC
Actually it wasn't resolved, I was wrong there. But without more data from the reporter it is not possible to resolve, and the reported kernel is hopelessly outdated now.
Closing the bug. Please reopen if the problem is still there.