Bug 156321 - mpt3sas timeout with xen, works only with mpt3sas.msix_disable=1 (SAS3008 card)
Status: RESOLVED OBSOLETE
Alias: None
Product: SCSI Drivers
Classification: Unclassified
Component: Other
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: scsi_drivers-other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-08 07:39 UTC by jwzr
Modified: 2016-10-02 06:32 UTC

See Also:
Kernel Version: 4.6.0-1-amd64 (debian-kernel@lists.debian.org)
Subsystem:
Regression: No
Bisected commit-id:


Description jwzr 2016-09-08 07:39:17 UTC
At the moment I've only tested the kernel version cited above. The hardware in question is a SAS3008 card.

When I start the machine without loading Xen, the mpt3sas driver works just fine.
When Xen is loaded, various timeouts occur during I/O.

For simplicity I'm citing the trace from
https://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5099120 :

[  618.160182] mpt3sas0: _base_event_notification
[  648.160097] mpt3sas0: _base_event_notification: timeout
[  648.160103] mf:
07000000 00000000 00000000 00000000 00000000 0f2f7fff ffffff7c ffffffff
ffffffff 00000000 00000000
[  648.160117] mpt3sas0: mpt3sas_base_free_resources
[  648.160122] mpt3sas0: _base_make_ioc_ready
[  648.160125] mpt3sas0: sending message unit reset !!
[  648.168091] mpt3sas0: message unit reset: SUCCESS
[  648.168098] mpt3sas0: mpt3sas_base_unmap_resources
[  648.169040] mpt3sas0: _base_release_memory_pools
[  648.170854] mpt3sas0: request_pool(0xffff8800eae00000): free
[  648.171770] mpt3sas0: sense_pool(0xffff8800f8400000): free
[  648.173718] mpt3sas0: reply_pool(0xffff8801bba00000): free
[  648.173786] mpt3sas0: reply_free_pool(0xffff8800eeeb0000): free
[  648.173789] mpt3sas0: reply_post_free_pool(0xffff8800f9400000): free
[  648.173792] mpt3sas0: reply_post_free_pool(0xffff8801bb900000): free
[  648.176634] mpt3sas0: config_page(0xffff8801cc620000): free
[  649.490617] mpt3sas0: failure at mpt3sas_scsih.c:11439/_scsih_probe()!

I don't know whether my trace is exactly the same, but I do remember the "_base_event_notification: timeout" message.

As the bug report on ibm.com cited above mentions, booting with mpt3sas.msix_disable=1 solves the problem for the time being.
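
To check whether a given machine shows the same symptom, and whether the workaround is active, something like the following could be used. This is only a minimal sketch: the function names are made up for illustration, and it assumes that the timeout line looks like the one in the trace above and that the workaround is passed on the kernel command line as mpt3sas.msix_disable=1.

```python
import re

# Assumption: the symptom is the "_base_event_notification: timeout" line
# from the trace above, logged by any mpt3sas controller (mpt3sas0, mpt3sas1, ...).
TIMEOUT_RE = re.compile(r"mpt3sas\d+: _base_event_notification: timeout")


def has_event_notification_timeout(log_text: str) -> bool:
    """Return True if the kernel log text contains the timeout line."""
    return TIMEOUT_RE.search(log_text) is not None


def msix_disabled_on_cmdline(cmdline: str) -> bool:
    """Return True if mpt3sas.msix_disable=1 is one of the boot parameters."""
    return "mpt3sas.msix_disable=1" in cmdline.split()
```

The first function would be fed the saved output of dmesg, the second the contents of /proc/cmdline.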

https://forums.servethehome.com/index.php?threads/lsi9211-8i-on-ubuntu-15-10-timeouts.8820/page-2 mentions the same problem in connection with VMware ESXi.
Comment 1 Sreekanth Reddy 2016-09-08 11:49:53 UTC
Hi John,

Can you please try again with the latest Phase13 SAS3008 HBA firmware?

Thanks,
Sreekanth
Comment 2 jwzr 2016-09-08 12:20:08 UTC
Hi, Sreekanth!

I'm sorry, I wasn't specific enough:

The physical card in question is an "LSI 9300-8i". I took "SAS3008" from the lspci output, which I considered to be more generic.

The firmware in use is the latest available from http://www.avagotech.com/products/server-storage/host-bus-adapters/sas-9300-8i#downloads, which is P12 from April 2016.
Comment 3 Sreekanth Reddy 2016-09-12 07:59:55 UTC
Hi John,

Our FW team recently fixed this type of issue in the firmware. The fix may have gone into the Phase13 FW, so you may need to wait some more time until the Phase13 FW is publicly available.

Thanks,
Sreekanth
Comment 4 jwzr 2016-09-12 19:12:06 UTC
Thanks. I will keep an eye out for the new firmware to be published and will report back here.
Comment 5 jwzr 2016-10-02 06:32:19 UTC
With the latest revision of the firmware (P13) this is no longer an issue.
