Bug 69201

Summary: qla2xxx: Low-latency storage triggers lock contention
Product: SCSI Drivers
Reporter: Bart Van Assche (bvanassche)
Component: QLOGIC QLA2XXX
Assignee: scsi_drivers-qla2xxx
Status: NEW
Severity: enhancement
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.12.7
Subsystem:
Regression: No
Bisected commit-id:
Attachments: perf report --stdio output

Description Bart Van Assche 2014-01-22 09:45:44 UTC
Running a fio test on an initiator system with an 8 Gb/s QLogic FC adapter revealed a bottleneck in the qla2xxx initiator driver: lock contention on ha->hardware_lock.

The test that revealed this is as follows:
- On a target system with 4 CPU threads (Intel i5), an 8 Gb/s QLogic FC HBA and kernel 3.12.7, download the SCST trunk r5194, build it in release mode, load the brd kernel module and configure SCST such that it exports /dev/ram[0123] via the vdisk_blockio driver. Set the vdisk_blockio parameter threads_num to 2. Export these four RAM disks as LUNs 0..3.
- On an initiator system with 12 CPU threads (Intel Core i7 with hyperthreading enabled), an 8 Gb/s QLogic HBA and kernel 3.12.7, run the following fio job (where /dev/sd[cdef] corresponds to the SCST LUNs):

fio --bs=4K --ioengine=libaio --rw=randrw --buffered=0 --numjobs=12 \
    --iodepth=16 --iodepth_batch=8 --iodepth_batch_complete=8 \
    --thread --loops=$((2**31)) --runtime=60 --group_reporting \
    --gtod_reduce=1 --invalidate=1 \
    $(for d in /dev/sd[cdef]; do echo --name=$d --filename=$d; done)

- While this fio job is running, run the following commands:

perf record -ag sleep 10
perf report --stdio >perf-report-fc.txt
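
For reference, the $(for ...) substitution at the end of the fio command above expands to one --name/--filename pair per device, so fio runs one job section per LUN. The sketch below only illustrates that expansion; fio_job_args is a hypothetical helper name, not something fio provides, and the device names are passed explicitly so that no real block devices are needed.

```shell
#!/bin/sh
# Hypothetical helper (not part of fio): print one --name/--filename pair
# per device, mirroring the $(for ...) substitution in the fio command.
fio_job_args() {
    for d in "$@"; do
        printf '%s\n' "--name=$d --filename=$d"
    done
}

# Example with explicit device names instead of the /dev/sd[cdef] glob.
fio_job_args /dev/sdc /dev/sdd
# prints:
#   --name=/dev/sdc --filename=/dev/sdc
#   --name=/dev/sdd --filename=/dev/sdd
```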

The perf report (attached) shows that a significant amount of time is spent in the spin_lock_irqsave() call invoked from qla24xx_dif_start_scsi(). Does this mean that this test revealed lock contention on ha->hardware_lock?
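
A quick way to locate the relevant entries in the saved report is to filter it for the two symbols involved. This is a minimal sketch: lock_hits is a hypothetical helper name, and it assumes that spin_lock_irqsave() shows up in the report under the kernel symbol _raw_spin_lock_irqsave, which is how it is usually listed.

```shell
#!/bin/sh
# Hypothetical helper: list report lines (with line numbers) that mention
# the spinlock symbol or the qla2xxx function named in this report.
lock_hits() {
    grep -n -E '_raw_spin_lock_irqsave|qla24xx_dif_start_scsi' "$1"
}

# Only run against the report if it exists in the current directory.
if [ -f perf-report-fc.txt ]; then
    lock_hits perf-report-fc.txt
fi
```

Matching lines only show where the time is attributed; they do not by themselves prove which lock is contended.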
Comment 1 Bart Van Assche 2014-01-22 09:47:51 UTC
Created attachment 123011
perf report --stdio output