Bug 13893 - NULL pointer dereference by SRP initiator after restarting SRP target followed by SCSI reset of initiator
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Infiniband/RDMA
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_infiniband-rdma
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-02 17:57 UTC by Bart Van Assche
Modified: 2011-01-22 11:05 UTC

See Also:
Kernel Version: 2.6.30.3
Subsystem:
Regression: No
Bisected commit-id:



Description Bart Van Assche 2009-08-02 17:57:41 UTC
Setup of the target system:
- SCST revision 1000.
- Contents of /etc/scst.conf on the target:
[HANDLER vdisk]
DEVICE disk01,/dev/exported-block,NV_CACHE,512           
[HANDLER vcdrom]
[GROUP Default]
[ASSIGNMENT Default]
DEVICE disk01,0
[TARGETS enable]
[TARGETS disable]
- After having installed SCST, start it as follows:
dd if=/dev/zero of=/dev/exported-block bs=1M count=1000
/etc/init.d/scst restart

Setup of the initiator system:
- Vanilla 2.6.30.3 kernel.
- Once the target has been set up, import the SRP target as follows:
rmmod ib_srp; modprobe ib_srp; ibsrpdm -c | while read target_info; do echo "${target_info}"; echo "${target_info}" > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target; done
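
For readability, the import loop in the one-liner above could be spelled out as a small shell function, roughly as sketched below. The function name `import_srp_targets` and its argument are made up for this illustration and are not part of ib_srp or srptools. The sysfs path in the usage comment is the one from this report and will differ with another HCA or port.

```shell
#!/bin/sh
# Sketch only: import_srp_targets is a hypothetical helper name.
# Reads one target descriptor per line on stdin (as printed by `ibsrpdm -c`)
# and writes each descriptor to the add_target file passed as $1.
import_srp_targets() {
    add_target=$1
    while read -r target_info; do
        # Show which target is being imported.
        echo "${target_info}"
        # One write to add_target per target descriptor.
        echo "${target_info}" > "${add_target}"
    done
}

# Usage on the initiator (path as in this report; adjust HCA name and port):
#   rmmod ib_srp; modprobe ib_srp
#   ibsrpdm -c | import_srp_targets /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
```

Each descriptor is written separately because the loop in the original command issues one write to add_target per line.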

How to reproduce the NULL pointer dereference:
- Run the following command on the target:
/etc/init.d/scst restart
- Run the following command on the initiator:
sg_reset -d /dev/sdb

Result:
scsi host7: SRP reset_device called                  
BUG: unable to handle kernel NULL pointer dereference at 0000000000000074
IP: [<ffffffffa03f2db2>] srp_send_tsk_mgmt+0xb4/0x130 [ib_srp]           
PGD 51e7067 PUD 48543067 PMD 0                                           
Oops: 0000 [1] SMP                                                       
last sysfs file: /sys/devices/system/cpu/cpu1/cache/index2/shared_cpu_map
CPU 0                                                                    
Modules linked in: ib_srp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_filter ip_tables x_tables vboxnetflt(N) vboxdrv(N) snd_pcm_oss snd_mixer_oss binfmt_misc snd_seq snd_seq_device rdma_ucm scsi_transport_srp scsi_tgt ib_ipoib ib_uverbs ib_umad ib_iser rdma_cm ib_cm iw_cm mlx4_ib ib_sa ipv6 ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq fuse loop dm_mod coretemp(N) snd_hda_intel snd_pcm snd_timer snd_page_alloc snd_hwdep ohci1394 i2c_i801 snd rtc_cmos mlx4_core sr_mod serio_raw pcspkr ieee1394 i2c_core intel_agp pata_marvell rtc_core skge soundcore button rtc_lib sky2 cdrom sg floppy uhci_hcd ehci_hcd sd_mod crc_t10dif usbcore edd ext3 mbcache jbd fan ide_pci_generic ide_core ata_generic ata_piix thermal processor thermal_sys hwmon pata_jmicron ahci libata scsi_mod dock [last unloaded: ib_srp]                                             
Supported: No                                                                   
Pid: 17736, comm: sg_reset Tainted: G          2.6.27.25-0.1-default #1         
RIP: 0010:[<ffffffffa03f2db2>]  [<ffffffffa03f2db2>] srp_send_tsk_mgmt+0xb4/0x130 [ib_srp]                                                                      
RSP: 0018:ffff88005e4ddbc8  EFLAGS: 00010046                                    
RAX: 0000000000000000 RBX: ffff8800623d8620 RCX: 0000000000000000               
RDX: ffff8800778d2000 RSI: ffff88006f088d80 RDI: ffff8800623d8620               
RBP: ffff8800623d8b40 R08: ffffffff806e2c70 R09: 0000000100000000               
R10: 0000000000000046 R11: 0000000000000000 R12: ffff88006f088d80               
R13: 0000000000000008 R14: ffff8800623d8000 R15: ffff88007e7d3c00               
FS:  00007f3cab09f6f0(0000) GS:ffffffff80a43080(0000) knlGS:0000000000000000    
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b                               
CR2: 0000000000000074 CR3: 00000000069b6000 CR4: 00000000000006e0               
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000               
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400               
Process sg_reset (pid: 17736, threadinfo ffff88005e4dc000, task ffff8800095ca0c0)                                                                               
Stack:  ffff8800623d82a8 0000000000000000 ffff8800623d8620 ffff8800623d8000     
 ffff8800381fd380 ffffffffa03f2ea5 ffff88005e4ddc38 ffff8800381fd380            
 ffff8800623d8000 0000000000000000 00007fff39b51144 ffffffffa0008351            
Call Trace:
 [<ffffffffa03f2ea5>] srp_reset_device+0x77/0x101 [ib_srp]
 [<ffffffffa0008351>] scsi_reset_provider+0xc8/0x18d [scsi_mod]
 [<ffffffffa00069d8>] scsi_nonblockable_ioctl+0x90/0xb5 [scsi_mod]
 [<ffffffffa012a869>] sd_ioctl+0x61/0xc6 [sd_mod]
 [<ffffffff8033ec81>] blkdev_driver_ioctl+0x5d/0x72
 [<ffffffff8033f4ee>] blkdev_ioctl+0x1f5/0x217
 [<ffffffff802d71aa>] block_ioctl+0x1b/0x20
 [<ffffffff802bd275>] vfs_ioctl+0x21/0x6c
 [<ffffffff802bd4e2>] do_vfs_ioctl+0x222/0x231
 [<ffffffff802bd542>] sys_ioctl+0x51/0x73
 [<ffffffff8020bfbb>] system_call_fastpath+0x16/0x1b
 [<00007f3caac19b77>] 0x7f3caac19b77


Code: 00 4d 85 e4 0f 84 85 00 00 00 49 8b 54 24 08 31 c0 b9 0c 00 00 00 4c 89 e6 48 89 d7 f3 ab c6 02 01 48 89 df 48 8b 45 10 48 8b 00 <8b> 40 74 48 c1 e0 30 48 0f c8 48 89 42 14 8b 45 50 44 88 6a 1e
RIP  [<ffffffffa03f2db2>] srp_send_tsk_mgmt+0xb4/0x130 [ib_srp]
 RSP <ffff88005e4ddbc8>
CR2: 0000000000000074
---[ end trace 4cec2e39421a0374 ]---
Comment 1 Bart Van Assche 2009-08-03 13:22:56 UTC
See also http://lists.openfabrics.org/pipermail/general/2009-August/061221.html for a proposed patch.
Comment 2 Bart Van Assche 2009-08-03 14:27:10 UTC
Note: the steps explained above do not always trigger this phenomenon. The following sequence always triggers the NULL pointer dereference:
[ target  ] /etc/init.d/scst restart
[initiator] rmmod ib_srp; modprobe ib_srp; ibsrpdm -c | while read target_info; do echo "${target_info}"; echo "${target_info}" >/sys/class/infiniband_srp/srp-mlx4_0-1/add_target; done
[ target  ] rmmod ib_srpt
[initiator] sg_reset -d ${srp_device}
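
The four steps above could be driven from the initiator with a script along the following lines. This is only a sketch under several assumptions: the target is reachable over ssh as root, and the variables `TARGET_HOST`, `SRP_DEVICE`, `ADD_TARGET` and `DRY_RUN` as well as the helper functions are hypothetical names introduced here. By default it only prints the commands, since running the sequence for real is expected to oops the initiator.

```shell
#!/bin/sh
# Sketch only. TARGET_HOST, SRP_DEVICE, ADD_TARGET and DRY_RUN are
# hypothetical variables introduced for this example.
TARGET_HOST=${TARGET_HOST:-target}
SRP_DEVICE=${SRP_DEVICE:-/dev/sdb}
ADD_TARGET=${ADD_TARGET:-/sys/class/infiniband_srp/srp-mlx4_0-1/add_target}
DRY_RUN=${DRY_RUN:-1}

# Print the command; execute it only when DRY_RUN=0.
run() {
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN}" = 0 ]; then
        "$@"
    fi
}

# Run a command on the target system (assumes root ssh access).
on_target() {
    run ssh "root@${TARGET_HOST}" "$@"
}

reproduce() {
    # [ target  ] restart SCST
    on_target /etc/init.d/scst restart
    # [initiator] reload ib_srp and import all discovered SRP targets
    run rmmod ib_srp
    run modprobe ib_srp
    if [ "${DRY_RUN}" = 0 ]; then
        ibsrpdm -c | while read -r target_info; do
            echo "${target_info}" > "${ADD_TARGET}"
        done
    else
        printf '%s\n' "+ ibsrpdm -c | (write each line to ${ADD_TARGET})"
    fi
    # [ target  ] unload the SRP target driver
    on_target rmmod ib_srpt
    # [initiator] issue the device reset that triggers the oops
    run sg_reset -d "${SRP_DEVICE}"
}

# Usage: call `reproduce` to print the steps, or set DRY_RUN=0 first to run them.
```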
Comment 3 Florian Mickler 2011-01-22 10:58:56 UTC
Fixed in 2.6.38-rc1:
commit f8b6e31e4e46bf514c27fce38783ed5615cca01d
Author: David Dillow <dillowda@ornl.gov>
Date:   Fri Nov 26 13:02:21 2010 -0500

    IB/srp: allow task management without a previous request
