[Previously posted at http://article.gmane.org/gmane.linux.nfs/38949 including a patch]

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa05b5e08>] xs_tcp_setup_socket+0x348/0x4a0 [sunrpc]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:00.0/irq
CPU 0
Modules linked in: netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 sunrpc be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi cxgb3 ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ext2 dm_mirror dm_region_hash dm_log dm_mod vhost_net macvtap macvlan tun kvm_intel kvm uinput bnx2 sg dcdbas serio_raw pcspkr iTCO_wdt iTCO_vendor_support i5k_amb i5000_edac edac_core ioatdma dca sfc mtd mdio shpchp ext3 jbd mbcache sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptsas mptscsih mptbase scsi_transport_sas radeon ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core [last unloaded: speedstep_lib]

Pid: 10, comm: kworker/0:1 Not tainted 2.6.38-rc5 #3 Dell Inc. PowerEdge 2950/0CX396
RIP: 0010:[<ffffffffa05b5e08>] [<ffffffffa05b5e08>] xs_tcp_setup_socket+0x348/0x4a0 [sunrpc]
RSP: 0018:ffff880126c11da0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8801220ee000 RCX: 0000000100016909
RDX: 000000000000001e RSI: ffff880123065a80 RDI: 0000000000000000
RBP: ffff880126c11df0 R08: f018000000000000 R09: febef9edc98abe03
R10: 0000000000000480 R11: 0000000000000000 R12: ffff8801220ee680
R13: ffffe8ffffc0dd00 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8800cf800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000012191f000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/0:1 (pid: 10, threadinfo ffff880126c10000, task ffff880126c0f560)
Stack:
 0000000000014e80 ffff880126c0f560 ffff880126c0faf0 00000000c17d7567
 ffff880126c0faf8 ffff8801273cbc40 ffff8800cf811040 ffffe8ffffc0dd00
 ffffffffa05b5ac0 0000000000000000 ffff880126c11e50 ffffffff8107b884
Call Trace:
 [<ffffffffa05b5ac0>] ? xs_tcp_setup_socket+0x0/0x4a0 [sunrpc]
 [<ffffffff8107b884>] process_one_work+0x124/0x430
 [<ffffffff8107e1d1>] worker_thread+0x181/0x3c0
 [<ffffffff8107e050>] ? worker_thread+0x0/0x3c0
 [<ffffffff810828c6>] kthread+0x96/0xa0
 [<ffffffff8100cdc4>] kernel_thread_helper+0x4/0x10
 [<ffffffff81082830>] ? kthread+0x0/0xa0
 [<ffffffff8100cdc0>] ? kernel_thread_helper+0x0/0x10
Code: 0f 1f 00 0f 84 3a ff ff ff e9 4b fe ff ff 0f 1f 44 00 00 41 83 fd 91 0f 85 3c fe ff ff 66 0f 1f 44 00 00 e9 1b ff ff ff 0f 1f 00 <4d> 8b 6e 20 4d 8d bd 68 01 00 00 4c 89 ff e8 45 99 f0 e0 49 8b
RIP [<ffffffffa05b5e08>] xs_tcp_setup_socket+0x348/0x4a0 [sunrpc]
 RSP <ffff880126c11da0>
CR2: 0000000000000020
---[ end trace 6efc43bb9b1264f8 ]---

The code dump alone is pretty useless as the IP is at the start of a block, but having disassembled the entire sunrpc module it appears that it corresponds to:

static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
{
	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);

	if (!transport->inet) {
		struct sock *sk = sock->sk; /* <-- this line */

		write_lock_bh(&sk->sk_callback_lock);

I think the bug is that xs_create_sock() returns 0 if xs_bind() fails.

This bug appears to have been introduced in 2.6.37 by:

commit b65c0310611af73569f94c526a1e2323d99b380a
Author: Pavel Emelyanov <xemul@parallels.com>
Date:   Mon Oct 4 16:53:46 2010 +0400

    sunrpc: Factor out udp sockets creation

commit 22f793268de3b4dff8abfcd873ba7afc1f34224f
Author: Pavel Emelyanov <xemul@parallels.com>
Date:   Mon Oct 4 16:54:26 2010 +0400

    sunrpc: Factor out v4 sockets creation

commit 22d44a7d8a03456aa6d0a047c051aa28728e6ecd
Author: Pavel Emelyanov <xemul@parallels.com>
Date:   Mon Oct 4 16:54:55 2010 +0400

    sunrpc: Factor out v6 sockets creation
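To illustrate the suspected failure mode (a create function that reports success even though its bind step failed, leaving the caller with a NULL socket), here is a minimal user-space sketch. It is not the sunrpc code: struct fake_sock, fake_bind(), create_sock_buggy(), create_sock_fixed() and would_deref_null() are all hypothetical names made up for this example, and the "fixed" variant only shows the general idea of propagating the bind error.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket object; not the kernel's struct socket. */
struct fake_sock {
	int bound;
};

/* Stand-in for a bind step that always fails, to exercise the error path. */
static int fake_bind(struct fake_sock *s)
{
	(void)s;
	return -EADDRINUSE;
}

/* Buggy shape: the bind result is tested but never stored in err. */
static int create_sock_buggy(struct fake_sock **out)
{
	int err = 0;
	struct fake_sock *s = calloc(1, sizeof(*s));

	*out = NULL;
	if (!s)
		return -ENOMEM;
	if (fake_bind(s)) {
		free(s);
		goto out;	/* err is still 0 here */
	}
	*out = s;
out:
	return err;
}

/* Fixed shape: keep the bind result so the caller sees the failure. */
static int create_sock_fixed(struct fake_sock **out)
{
	int err;
	struct fake_sock *s = calloc(1, sizeof(*s));

	*out = NULL;
	if (!s)
		return -ENOMEM;
	err = fake_bind(s);
	if (err) {
		free(s);
		return err;
	}
	*out = s;
	return 0;
}

/*
 * Mimics a caller that trusts the status code before touching the socket.
 * Returns 1 if it would go on to use a NULL socket, 0 otherwise.
 */
static int would_deref_null(int (*create)(struct fake_sock **))
{
	struct fake_sock *s = NULL;
	int err = create(&s);

	if (err < 0)
		return 0;	/* failure reported, caller bails out safely */
	if (s == NULL)
		return 1;	/* "success" but nothing to use */
	free(s);
	return 0;
}
```

In this sketch would_deref_null(create_sock_buggy) returns 1 and would_deref_null(create_sock_fixed) returns 0, which is the difference between hitting the NULL dereference and bailing out on the bind error.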
Created attachment 49842
SUNRPC: Fix a bug in xs_create_sock()
Does the above patch suffice to fix the Oops?
(In reply to comment #2)
> Does the above patch suffice to fix the Oops?

WTF, you take a week to respond and then send back a patch I already wrote?
Note, this bug report was copied from mail purely so that it can be tracked as a regression.
Sorry I missed the deadline. I've been traveling for 2 weeks.

That patch was one I wrote a week ago for a different bug report. If you already had a patch, then why didn't you attach it to the bug report?

bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=30322
>
> --- Comment #3 from Ben Hutchings <bhutchings@solarflare.com> 2011-03-01 23:45:47 ---
> (In reply to comment #2)
>> Does the above patch suffice to fix the Oops?
>
> WTF, you take a week to respond and then send back a patch I already wrote?
(In reply to comment #5)
> Sorry I missed the deadline. I've been traveling for 2 weeks.
>
> That patch was one I wrote a week ago for a different bug report. If you
> already had a patch, then why didn't you attach it to the bug report?

Because I've been told (I forget who by) that patches shouldn't be posted on Bz if they have already been posted on the appropriate mailing list.
As concerns the mailing list, I won't start scouring that for patches until I get back home, which won't be for another few days. Posting just part of a report wastes everybody's time. If you have a fix, then attach the damned thing, so that I don't have to go looking for it.
Patch: http://article.gmane.org/gmane.linux.nfs/38949
*** Bug 30222 has been marked as a duplicate of this bug. ***
*** Bug 30232 has been marked as a duplicate of this bug. ***
*** Bug 30252 has been marked as a duplicate of this bug. ***
*** Bug 30272 has been marked as a duplicate of this bug. ***
*** Bug 30292 has been marked as a duplicate of this bug. ***
The fix has been merged in mainline for v2.6.38:

commit 4cea288aaf0e11647880cc487350b1dc45d9febc
Author: Ben Hutchings <bhutchings@solarflare.com>
Date:   Tue Feb 22 21:54:34 2011 +0000

    sunrpc: Propagate errors from xs_bind() through xs_create_sock()