Bug 1671

Summary: when mount smbfs, gnome filemanager die.
Product: File System Reporter: Young-Ho Cha (ganadist)
Component: Samba/SMBAssignee: Zwane Mwaikambo (zwane)
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: high CC: christophe, nkukard
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.7-rc3 Subsystem:
Regression: --- Bisected commit-id:
Attachments: Urban's NULL ops readdir patch
readdir patch #2
readdir patch take #3
fix wait_event_condition
readdir oops fix
readdir oops fix

Description Young-Ho Cha 2003-12-11 02:14:03 UTC
Distribution: Gentoo 1.4
Hardware Environment: Athlon-XP 2500+ Nforce2 M/B 1GB ram
Software Environment: Glibc 2.3.2, gcc 3.3.2, binutils 2.14.90.0.7, GNOME cvs HEAD,
Problem Description: when I mount smbfs, nautilus crashed and kernel says like this.
 <5>smb_lookup: find //.Trash-ganadist failed, error=-5
Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
00000000
*pde = 00000000
Oops: 0000 [#2]
CPU:    0
EIP:    0060:[<00000000>]    Tainted: P
EFLAGS: 00010246
EIP is at 0x0
eax: e739c280   ebx: f5b63f34   ecx: c0169780   edx: d7e7a280
esi: f5b63fa0   edi: c181adf8   ebp: d5dffc80   esp: f5b63f00
ds: 007b   es: 007b   ss: 0068
Process nautilus (pid: 24522, threadinfo=f5b62000 task=f5b68cc0)
Stack: f8e44c8d d5dffc80 f5b63fa0 c0169780 f5b63f34 00000000 00000002 00000004
       d0cae124 00000000 f3df3000 d7e7a280 d0df2180 007b7e7b 017656f2 007c7f7c
       00000000 00000000 f3df3000 00000002 00000000 00000000 00000001 00000004
Call Trace:
 [<f8e44c8d>] smb_readdir+0x4cd/0x6b0 [smbfs]
 [<c0169780>] filldir64+0x0/0x110
 [<c016949c>] vfs_readdir+0x9c/0xa0
 [<c0169780>] filldir64+0x0/0x110
 [<c01698ff>] sys_getdents64+0x6f/0xa9
 [<c0169780>] filldir64+0x0/0x110
 [<c010abc7>] syscall_call+0x7/0xb
 
Code:  Bad EIP value.

Steps to reproduce:
1. login gnome desktop
2. open terminal
3. mount smbfs
Comment 1 Zwane Mwaikambo 2004-03-10 09:58:58 UTC
Created attachment 2310 [details]
Urban's NULL ops readdir patch
Comment 2 Zwane Mwaikambo 2004-03-10 11:09:45 UTC
Created attachment 2311 [details]
readdir patch #2
Comment 3 Zwane Mwaikambo 2004-03-10 21:40:00 UTC
Created attachment 2314 [details]
readdir patch take #3
Comment 4 Zwane Mwaikambo 2004-03-10 22:09:31 UTC
Created attachment 2315 [details]
fix wait_event_condition
Comment 5 Christophe Saout 2004-06-18 12:27:14 UTC
1. mount /mnt/server
2. mount /mnt/server/nw/c

mount hangs, but can be aborted using Ctrl-C

SysRQ-T shows hald and nautilus waiting for something:

Jun 18 22:21:07 leto hald          S C36C498C     0  7799      1          7856 
7798 (NOTLB)
Jun 18 22:21:07 leto ce089d34 00000082 00000000 c36c498c c018cc79 00000096
ce089d24 00000096
Jun 18 22:21:07 leto 00000000 00000000 ce4577f0 00000000 337c19c0 000f4239
cf5c2928 0000031f
Jun 18 22:21:07 leto ce089d48 cfed88b8 cfed8928 c037fd60 ce088000 00000000
00000286 c043abb8
Jun 18 22:21:07 leto Call Trace:
Jun 18 22:21:07 leto [<c018cc79>] init_inode+0x1a9/0x3e0
Jun 18 22:21:07 leto [<c037fd60>] schedule_timeout+0x60/0xc0
Jun 18 22:21:07 leto [<c0123ca0>] process_timeout+0x0/0x10
Jun 18 22:21:07 leto [<d2cdb909>] smb_add_request+0x249/0x2e0 [smbfs]
Jun 18 22:21:07 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:21:07 leto [<d2cd3689>] smb_encode_path+0x29/0x60 [smbfs]
Jun 18 22:21:07 leto [<d2cd6329>] smb_proc_getattr_trans2+0x79/0xe0 [smbfs]
Jun 18 22:21:07 leto [<d2cd65e2>] smb_proc_getattr_unix+0x42/0x80 [smbfs]
Jun 18 22:21:07 leto [<d2cd6895>] smb_proc_getattr+0x35/0x60 [smbfs]
Jun 18 22:21:07 leto [<d2cd7c71>] smb_lookup+0x41/0x130 [smbfs]
Jun 18 22:21:07 leto [<c016642b>] d_lookup+0x1b/0x40
Jun 18 22:21:07 leto [<c015c8a3>] real_lookup+0xd3/0x100
Jun 18 22:21:07 leto [<c015cafe>] do_lookup+0x7e/0x90
Jun 18 22:21:07 leto [<c015d025>] link_path_walk+0x515/0x9e0
Jun 18 22:21:07 leto [<c015d732>] path_lookup+0x72/0x130
Jun 18 22:21:07 leto [<c015d969>] __user_walk+0x49/0x60
Jun 18 22:21:07 leto [<c0158bfd>] vfs_stat+0x1d/0x50
Jun 18 22:21:07 leto [<c0159292>] sys_stat64+0x12/0x30
Jun 18 22:21:07 leto [<c0105f09>] sysenter_past_esp+0x52/0x71
Jun 18 22:21:07 leto
Jun 18 22:21:07 leto nautilus      S 00000092     0  8988   8835          8989 
8958 (NOTLB)
Jun 18 22:21:07 leto c6da7dd0 00000082 00000000 00000092 c92dc0b4 00000092
c6da7dc0 00000092
Jun 18 22:21:07 leto 337c19c0 000f4239 cf5c3830 00000000 337c19c0 000f4239
cbf858d8 0000031f
Jun 18 22:21:07 leto c6da7de4 cfed88b8 cfed8928 c037fd60 c6da6000 00000000
00000292 ce089d48
Jun 18 22:21:07 leto Call Trace:
Jun 18 22:21:07 leto [<c037fd60>] schedule_timeout+0x60/0xc0
Jun 18 22:21:07 leto [<c0123ca0>] process_timeout+0x0/0x10
Jun 18 22:21:07 leto [<d2cdb909>] smb_add_request+0x249/0x2e0 [smbfs]
Jun 18 22:21:07 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:21:07 leto [<d2cd3689>] smb_encode_path+0x29/0x60 [smbfs]
Jun 18 22:21:07 leto [<d2cd6329>] smb_proc_getattr_trans2+0x79/0xe0 [smbfs]
Jun 18 22:21:07 leto [<d2cd65e2>] smb_proc_getattr_unix+0x42/0x80 [smbfs]
Jun 18 22:21:07 leto [<d2cd6895>] smb_proc_getattr+0x35/0x60 [smbfs]
Jun 18 22:21:07 leto [<d2cd93e3>] smb_refresh_inode+0x23/0x120 [smbfs]
Jun 18 22:21:07 leto [<c01654cc>] dput+0x1c/0x200
Jun 18 22:21:07 leto [<d2cd952a>] smb_revalidate_inode+0x4a/0x80 [smbfs]
Jun 18 22:21:07 leto [<d2cd9dc9>] smb_getattr+0x19/0x40 [smbfs]
Jun 18 22:21:07 leto [<d2cd9db0>] smb_getattr+0x0/0x40 [smbfs]
Jun 18 22:21:07 leto [<c0158b7a>] vfs_getattr+0x2a/0x90
Jun 18 22:21:07 leto [<c0158cad>] vfs_fstat+0x2d/0x50
Jun 18 22:21:07 leto [<c01592f2>] sys_fstat64+0x12/0x30
Jun 18 22:21:07 leto [<c014f82b>] sys_open+0x6b/0x80
Jun 18 22:21:07 leto [<c0105f09>] sysenter_past_esp+0x52/0x71
Jun 18 22:21:07 leto
Jun 18 22:21:07 leto nautilus      S C7038C5C     0  8990   8835          8991 
8989 (NOTLB)
Jun 18 22:21:07 leto c6de1e98 00000082 0000000e c7038c5c c6b56e0c c01613d0
c6de1fa0 c6ddacc0
Jun 18 22:21:07 leto c7038c5c 0000000e cbe8b1e0 00000000 80d28600 000f422a
cbd76e78 fffffff5
Jun 18 22:21:07 leto 7fffffff c6de0000 08209630 c037fdb1 c6de1f04 c012d492
c03e3624 00000000
Jun 18 22:21:07 leto Call Trace:
Jun 18 22:21:07 leto [<c01613d0>] filldir64+0x0/0xf0
Jun 18 22:21:07 leto [<c037fdb1>] schedule_timeout+0xb1/0xc0
Jun 18 22:21:07 leto [<c012d492>] get_futex_key+0x42/0x170
Jun 18 22:21:07 leto [<c012d995>] queue_me+0x35/0x80
Jun 18 22:21:07 leto [<c012dbda>] futex_wait+0x13a/0x1b0
Jun 18 22:21:07 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:21:07 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:21:07 leto [<c012dec5>] do_futex+0x35/0x90
Jun 18 22:21:07 leto [<c012e00a>] sys_futex+0xea/0x100
Jun 18 22:21:07 leto [<c014f8bf>] filp_close+0x4f/0x80
Jun 18 22:21:07 leto [<c0105f09>] sysenter_past_esp+0x52/0x71
Jun 18 22:21:07 leto
Jun 18 22:21:07 leto smbiod        S CEF82200     0  9053      1               
9024 (L-TLB)
Jun 18 22:21:07 leto c3633fb0 00000046 00000000 cef82200 c014f8bf 00000000
00000001 c03e2820
Jun 18 22:21:07 leto c3632000 c01240f1 cc1f23f0 00000000 337c19c0 000f4239
cbba09a8 c3632000
Jun 18 22:21:07 leto c3633fc0 c3632000 cfed88b8 d2cdb232 d2cdcab5 c3632000
00000000 cbba0800
Jun 18 22:21:07 leto Call Trace:
Jun 18 22:21:07 leto [<c014f8bf>] filp_close+0x4f/0x80
Jun 18 22:21:07 leto [<c01240f1>] free_uid+0x11/0x60
Jun 18 22:21:07 leto [<d2cdb232>] smbiod+0x192/0x1b1 [smbfs]
Jun 18 22:21:07 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:21:07 leto [<d2cdb0a0>] smbiod+0x0/0x1b1 [smbfs]
Jun 18 22:21:07 leto [<c010429d>] kernel_thread_helper+0x5/0x18
Jun 18 22:21:07 leto
Jun 18 22:21:07 leto mount         D C03E3624     0  9057   9039               
     (NOTLB)
Jun 18 22:21:07 leto c37dfe84 00000086 c03e3624 c03e3624 c37de000 00000000
00000001 c0138563
Jun 18 22:21:07 leto 337c19c0 000f4239 cbba0800 00000000 337c19c0 000f4239
ce457998 c366be88
Jun 18 22:21:07 leto 00000286 c37de000 ce4577f0 c037f255 c366be90 00000001
ce4577f0 c0119250
Jun 18 22:21:07 leto Call Trace:
Jun 18 22:21:07 leto [<c0138563>] __alloc_pages+0x2f3/0x340
Jun 18 22:21:07 leto [<c037f255>] __down+0x95/0x110
Jun 18 22:21:07 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:21:07 leto [<c037f418>] __down_failed+0x8/0xc
Jun 18 22:21:07 leto [<c015fe49>] .text.lock.namei+0x5/0x16c
Jun 18 22:21:07 leto [<c015cafe>] do_lookup+0x7e/0x90
Jun 18 22:21:07 leto [<c015d025>] link_path_walk+0x515/0x9e0
Jun 18 22:21:07 leto [<c0117a54>] do_page_fault+0x344/0x52e
Jun 18 22:21:07 leto [<c015d732>] path_lookup+0x72/0x130
Jun 18 22:21:07 leto [<c015d969>] __user_walk+0x49/0x60
Jun 18 22:21:07 leto [<c015910f>] sys_readlink+0x2f/0x90
Jun 18 22:21:07 leto [<c01592db>] sys_lstat64+0x2b/0x30
Jun 18 22:21:07 leto [<c0105f09>] sysenter_past_esp+0x52/0x71

-------------------------------------------------------------------------

After some seconds the waiting process get out of the kernel with a timeout:

Jun 18 21:09:31 leto smb_add_request: request [c3698e40, mid=1] timed out!
Jun 18 21:09:31 leto smb_lookup: find //.Trash-chtephan failed, error=-5

There's also a message that finding //nw fails some seconds later (probably the
second mount).

Trying to mount /mnt/server/nw/c a second time shows:

Jun 18 22:22:06 leto nautilus      S C7038C5C     0  8990   8835          8991 
8989 (NOTLB)
Jun 18 22:22:06 leto c6de1e98 00000082 0000000e c7038c5c c6b56e0c c01613d0
c6de1fa0 c6ddacc0
Jun 18 22:22:06 leto c7038c5c 0000000e cbe8b1e0 00000000 80d28600 000f422a
cbd76e78 fffffff5
Jun 18 22:22:06 leto 7fffffff c6de0000 08209630 c037fdb1 c6de1f04 c012d492
c03e3624 00000000
Jun 18 22:22:06 leto Call Trace:
Jun 18 22:22:06 leto [<c01613d0>] filldir64+0x0/0xf0
Jun 18 22:22:06 leto [<c037fdb1>] schedule_timeout+0xb1/0xc0
Jun 18 22:22:06 leto [<c012d492>] get_futex_key+0x42/0x170
Jun 18 22:22:06 leto [<c012d995>] queue_me+0x35/0x80
Jun 18 22:22:06 leto [<c012dbda>] futex_wait+0x13a/0x1b0
Jun 18 22:22:06 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:22:06 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:22:06 leto [<c012dec5>] do_futex+0x35/0x90
Jun 18 22:22:06 leto [<c012e00a>] sys_futex+0xea/0x100
Jun 18 22:22:06 leto [<c014f8bf>] filp_close+0x4f/0x80
Jun 18 22:22:06 leto [<c0105f09>] sysenter_past_esp+0x52/0x71
Jun 18 22:22:06 leto
Jun 18 22:22:06 leto smbiod        S CEF82200     0  9053      1               
9024 (L-TLB)
Jun 18 22:22:06 leto c3633fb0 00000046 00000000 cef82200 c014f8bf 00000000
00000001 c03e2820
Jun 18 22:22:06 leto 2f9fc5c0 000f4240 cf5c2780 00000000 f79b0180 000f4244
cbba09a8 c3632000
Jun 18 22:22:06 leto c3633fc0 c3632000 cfed88b8 d2cdb232 d2cdcab5 c3632000
00000000 cbba0800
Jun 18 22:22:06 leto Call Trace:
Jun 18 22:22:06 leto [<c014f8bf>] filp_close+0x4f/0x80
Jun 18 22:22:06 leto [<d2cdb232>] smbiod+0x192/0x1b1 [smbfs]
Jun 18 22:22:06 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:22:06 leto [<d2cdb0a0>] smbiod+0x0/0x1b1 [smbfs]
Jun 18 22:22:06 leto [<c010429d>] kernel_thread_helper+0x5/0x18
Jun 18 22:22:06 leto
Jun 18 22:22:06 leto mount         S 00000001     0  9137   9039               
     (NOTLB)
Jun 18 22:22:06 leto c37dfd78 00000086 c04b03e0 00000001 cbba0800 00000001
c37dfd68 c0118967
Jun 18 22:22:06 leto f79b0180 000f4244 cbba0800 00000000 f79b0180 000f4244
ce457998 0000c886
Jun 18 22:22:06 leto c37dfd8c cfed88b8 cfed8928 c037fd60 c37de000 00000000
00000282 c04d2420
Jun 18 22:22:06 leto Call Trace:
Jun 18 22:22:06 leto [<c0118967>] try_to_wake_up+0xa7/0xc0
Jun 18 22:22:06 leto [<c037fd60>] schedule_timeout+0x60/0xc0
Jun 18 22:22:06 leto [<c0123ca0>] process_timeout+0x0/0x10
Jun 18 22:22:06 leto [<d2cdb909>] smb_add_request+0x249/0x2e0 [smbfs]
Jun 18 22:22:06 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 22:22:06 leto [<d2cd3689>] smb_encode_path+0x29/0x60 [smbfs]
Jun 18 22:22:06 leto [<d2cd6329>] smb_proc_getattr_trans2+0x79/0xe0 [smbfs]
Jun 18 22:22:06 leto [<d2cd65e2>] smb_proc_getattr_unix+0x42/0x80 [smbfs]
Jun 18 22:22:06 leto [<d2cd6895>] smb_proc_getattr+0x35/0x60 [smbfs]
Jun 18 22:22:06 leto [<d2cd7c71>] smb_lookup+0x41/0x130 [smbfs]
Jun 18 22:22:06 leto [<c016642b>] d_lookup+0x1b/0x40
Jun 18 22:22:06 leto [<c015c8a3>] real_lookup+0xd3/0x100
Jun 18 22:22:06 leto [<c015cafe>] do_lookup+0x7e/0x90
Jun 18 22:22:06 leto [<c015d025>] link_path_walk+0x515/0x9e0
Jun 18 22:22:06 leto [<c0117a54>] do_page_fault+0x344/0x52e
Jun 18 22:22:06 leto [<c015d732>] path_lookup+0x72/0x130
Jun 18 22:22:06 leto [<c015d969>] __user_walk+0x49/0x60
Jun 18 22:22:06 leto [<c015910f>] sys_readlink+0x2f/0x90
Jun 18 22:22:06 leto [<c01592db>] sys_lstat64+0x2b/0x30
Jun 18 22:22:06 leto [<c0105f09>] sysenter_past_esp+0x52/0x71
Comment 6 Zwane Mwaikambo 2004-06-18 12:38:52 UTC
Those look ok, they are the timeouts added which give the server a bit of time
to setup the connection. So does Nautilus ever recover after the timeout? Is it
usable afterwards?
Comment 7 Christophe Saout 2004-06-18 13:00:32 UTC
Yes, all clients applications recover after some seconds.

But they still hang. As you can see smbiod is always stuck in filp_close
(reproducable).

This has BTW nothing to do with trying to mount a subdirectory, it always
happens when trying to mount /mnt/server the first time.

I even managed to get an oops this time. smbiod died:

Jun 18 21:45:48 leto Unable to handle kernel NULL pointer dereference at virtual
address 00000008
Jun 18 21:45:48 leto printing eip:
Jun 18 21:45:48 leto d2cd8b7c
Jun 18 21:45:48 leto *pde = 00000000
Jun 18 21:45:48 leto Oops: 0000 [#1]
Jun 18 21:45:48 leto PREEMPT
Jun 18 21:45:48 leto Modules linked in: smbfs ipv6 radeon ipt_REJECT ipt_state
iptable_filter ip_tables snd_intel8x0 snd_ac97_codec snd_mpu401_uart snd_rawmidi
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm
snd_page_alloc snd_timer snd_mixer_oss snd ehci_hcd uhci_hcd ip_conntrack_irc
ip_conntrack_ftp ip_conntrack aes_i586 airo eepro100 mii
Jun 18 21:45:48 leto CPU:    0
Jun 18 21:45:48 leto EIP:    0060:[<d2cd8b7c>]    Not tainted
Jun 18 21:45:48 leto EFLAGS: 00010282   (2.6.7-rc3)
Jun 18 21:45:48 leto EIP is at smb_recv_available+0x2c/0x50 [smbfs]
Jun 18 21:45:48 leto eax: 00000000   ebx: c38a2000   ecx: c38a3f80   edx: 0000541b
Jun 18 21:45:48 leto esi: 00000000   edi: c0000000   ebp: d2ce07d8   esp: c38a3f80
Jun 18 21:45:48 leto ds: 007b   es: 007b   ss: 0068
Jun 18 21:45:48 leto Process smbiod (pid: 9584, threadinfo=c38a2000 task=ce2552e0)
Jun 18 21:45:48 leto Stack: 00000001 cfeb5738 00000000 00000000 d2cdbf3a
00000246 00000000 cfeb5738
Jun 18 21:45:48 leto 00000007 d2cdb027 cfeb57a8 cfeb5738 c38a2000 d2cdb165
d2cdcab5 c38a2000
Jun 18 21:45:48 leto 00000000 ce2552e0 c0119250 00100100 00200200 00000000
00000000 00000000
Jun 18 21:45:48 leto Call Trace:
Jun 18 21:45:48 leto [<d2cdbf3a>] smb_request_recv+0x1a/0x1ed [smbfs]
Jun 18 21:45:48 leto [<d2cdb027>] smbiod_doio+0x27/0xa0 [smbfs]
Jun 18 21:45:48 leto [<d2cdb165>] smbiod+0xc5/0x1b1 [smbfs]
Jun 18 21:45:48 leto [<c0119250>] default_wake_function+0x0/0x10
Jun 18 21:45:48 leto [<d2cdb0a0>] smbiod+0x0/0x1b1 [smbfs]
Jun 18 21:45:48 leto [<c010429d>] kernel_thread_helper+0x5/0x18
Jun 18 21:45:48 leto
Jun 18 21:45:48 leto Code: 8b 70 08 ff 56 24 89 7b 18 8b 5c 24 04 8b 74 24 08 85
c0 8b
Comment 8 Zwane Mwaikambo 2004-06-19 13:26:49 UTC
Created attachment 3204 [details]
readdir oops fix

null getattr doesn't wait but returns -EIO
Comment 9 Zwane Mwaikambo 2004-06-19 17:22:54 UTC
Created attachment 3205 [details]
readdir oops fix