Bug 156321

Summary: mpt3sas timeout with xen, works only with mpt3sas.msix_disable=1 (SAS3008 card)
Product: SCSI Drivers Reporter: jwzr (john.wyzer)
Component: Other Assignee: scsi_drivers-other
Status: RESOLVED OBSOLETE    
Severity: normal CC: sreekanth.reddy
Priority: P1    
Hardware: x86-64   
OS: Linux   
Kernel Version: 4.6.0-1-amd64 (debian-kernel@lists.debian.org) Subsystem:
Regression: No Bisected commit-id:

Description jwzr 2016-09-08 07:39:17 UTC
At the moment I've only tested the kernel version cited above. The hardware in question is a SAS3008 card.

When I start the machine without loading Xen, the mpt3sas driver works just fine.
When Xen is loaded, different timeouts occur during I/O.

For simplicity I'm citing the trace from
https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5099120 :

[  618.160182] mpt3sas0: _base_event_notification
[  648.160097] mpt3sas0: _base_event_notification: timeout
[  648.160103] mf:
07000000 00000000 00000000 00000000 00000000 0f2f7fff ffffff7c ffffffff
ffffffff 00000000 00000000
[  648.160117] mpt3sas0: mpt3sas_base_free_resources
[  648.160122] mpt3sas0: _base_make_ioc_ready
[  648.160125] mpt3sas0: sending message unit reset !!
[  648.168091] mpt3sas0: message unit reset: SUCCESS
[  648.168098] mpt3sas0: mpt3sas_base_unmap_resources
[  648.169040] mpt3sas0: _base_release_memory_pools
[  648.170854] mpt3sas0: request_pool(0xffff8800eae00000): free
[  648.171770] mpt3sas0: sense_pool(0xffff8800f8400000): free
[  648.173718] mpt3sas0: reply_pool(0xffff8801bba00000): free
[  648.173786] mpt3sas0: reply_free_pool(0xffff8800eeeb0000): free
[  648.173789] mpt3sas0: reply_post_free_pool(0xffff8800f9400000): free
[  648.173792] mpt3sas0: reply_post_free_pool(0xffff8801bb900000): free
[  648.176634] mpt3sas0: config_page(0xffff8801cc620000): free
[  649.490617] mpt3sas0: failure at mpt3sas_scsih.c:11439/_scsih_probe()!

I don't know if the trace is exactly the same but I remember the "base_event_notification: timeout".

As the bug report on ibm.com cited above mentions, booting with mpt3sas.msix_disable=1 solves the problem for the time being.
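
For reference, a minimal sketch of making that workaround persistent through modprobe configuration instead of the kernel command line. The file name below is an assumption (any .conf file under /etc/modprobe.d is read), and I have not verified this exact setup on the affected machine:

```
# /etc/modprobe.d/mpt3sas.conf  (hypothetical file name)
# Disable MSI-X interrupts in the mpt3sas driver,
# equivalent to booting with mpt3sas.msix_disable=1
options mpt3sas msix_disable=1
```

If the driver is loaded from the initramfs, the initramfs probably has to be regenerated (update-initramfs -u on Debian) before the option takes effect.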

https://forums.servethehome.com/index.php?threads/lsi9211-8i-on-ubuntu-15-10-timeouts.8820/page-2 mentions the same problem in connection with VMware ESXi.
Comment 1 Sreekanth Reddy 2016-09-08 11:49:53 UTC
Hi John,

Can you please try once with the latest Phase13 SAS3008 HBA firmware?

Thanks,
Sreekanth
Comment 2 jwzr 2016-09-08 12:20:08 UTC
Hi, Sreekanth!

I'm sorry, I wasn't specific enough:

The physical card in question is a "LSI 9300-8i". I took the SAS3008 from the lspci output which I considered to be more generic.

The firmware in use is the latest available from http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9300-8i#downloads which is P12 from April 2016.
Comment 3 Sreekanth Reddy 2016-09-12 07:59:55 UTC
Hi John,

Our FW team recently fixed this type of issue in the firmware. The fix may have gone into the Phase13 FW, so you may need to wait a while longer until the Phase13 FW is publicly available.

Thanks,
Sreekanth
Comment 4 jwzr 2016-09-12 19:12:06 UTC
Thanks. I will keep an eye out for the new firmware and report back here once it is published.
Comment 5 jwzr 2016-10-02 06:32:19 UTC
With the latest revision of the firmware (P13) this is no longer an issue.