Bug 110681
Summary: | System hangs while trying to access vpd data on LSI Logic / Symbios Logic MegaRAID SAS 2108 controllers | ||
---|---|---|---|
Product: | Drivers | Reporter: | Babu Moger (babu.moger) |
Component: | PCI | Assignee: | drivers_pci (drivers_pci) |
Status: | NEW --- | ||
Severity: | normal | CC: | ajb, bjorn, mrmansfield, toracat |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 4.4 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg file taken from crash
RFC patch disable vpd access to few buggy devices |
Description
Babu Moger
2016-01-11 19:09:08 UTC
Comment 1 (Bjorn Helgaas):
Please attach a complete dmesg log, including the panic.

Can you try accessing VPD on this device in a Windows system? If Windows crashes too, it's a pretty good indication that the LSI device is defective and maybe we should just completely blacklist it. If Windows shows some VPD data, that's a clue that we should be able to do at least that much.

Created attachment 199361 [details]
dmesg file taken from crash
(In reply to Bjorn Helgaas from comment #1)
> Please attach a complete dmesg log, including the panic.

Done.

> Can you try accessing VPD on this device in a Windows system? If Windows
> crashes too, it's a pretty good indication that the LSI device is defective

Sorry. We don't test on Windows machines.

> and maybe we should just completely blacklist it. If Windows shows some VPD
> data, that's a clue that we should be able to do at least that much.

Looks like multiple vendors have this problem. Blacklisting may not be a good option.

Comment 4 (Bjorn Helgaas):
Thanks for the log. Do you test this card with any OS other than Linux? Solaris?

By "blacklist", I meant a patch like what you proposed (http://lkml.kernel.org/r/1449174319-52798-1-git-send-email-babu.moger@oracle.com) that disables or limits VPD on that particular device.

Blacklisting is a last resort because it's hard to populate the list in the first place (somebody has to trip over the broken device, report it, and we have to figure out what's wrong and how to handle it), and it's hard to maintain over time.

(In reply to Bjorn Helgaas from comment #4)
> Thanks for the log. Do you test this card with any OS other than Linux?
> Solaris?

No.

> By "blacklist", I meant a patch like what you proposed

Ok. Got it.

> (http://lkml.kernel.org/r/1449174319-52798-1-git-send-email-babu.moger@oracle.com)
> that disables or limits VPD on that particular device.
>
> Blacklisting is a last resort because it's hard to populate the list in the
> first place (somebody has to trip over the broken device, report it, and we
> have to figure out what's wrong and how to handle it), and it's hard to
> maintain over time.

I have posted an RFC patch. Please take a look. I will attach it here as well.

Created attachment 199371 [details]
RFC patch disable vpd access to few buggy devices
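
For readers following the discussion, the per-device blacklisting idea could be modeled roughly as below. This is a minimal userspace sketch in C, not the attached RFC patch and not kernel code: the struct, table, and function names are hypothetical, and the device ID 0x0079 is an assumed value for the SAS 2108 that should be checked against `lspci -nn` output. 0x1000 is the LSI Logic PCI vendor ID.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical blacklist entry: one PCI vendor/device pair. */
struct vpd_blacklist_entry {
    uint16_t vendor;
    uint16_t device;
};

/*
 * Hypothetical table of devices whose VPD must not be read.
 * 0x1000 is the LSI Logic vendor ID; 0x0079 is an ASSUMED device ID
 * for the MegaRAID SAS 2108 and is used here for illustration only.
 */
static const struct vpd_blacklist_entry vpd_blacklist[] = {
    { 0x1000, 0x0079 },
};

/* Return false if VPD access should be refused for this device. */
static bool vpd_access_allowed(uint16_t vendor, uint16_t device)
{
    size_t n = sizeof(vpd_blacklist) / sizeof(vpd_blacklist[0]);

    for (size_t i = 0; i < n; i++) {
        if (vpd_blacklist[i].vendor == vendor &&
            vpd_blacklist[i].device == device)
            return false;
    }
    return true;
}
```

The maintenance cost raised in comment #4 is visible here: every newly discovered broken device needs its own table entry, and nothing protects a device until someone has tripped over it and reported it.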
Comment 7 (Martin Mansfield):
I have encountered this problem with the 3.10 kernel on CentOS 7 with an LSI 9240-4i, where "lspci -vv" would hang the PC if the megasas driver failed to init correctly. If the driver did init, "lspci -vv" would read a single zero byte of VPD and output an error. The link to the CentOS 7 bug report is

https://bugs.centos.org/view.php?id=10818

This patch to blacklist the device may be the best way to go, but another patch that limits the size of the VPD read has recently been committed. Has anyone tested that patch with a megasas card, without the blacklisting patch, to be certain that the card really is broken in that area?

If there is valid VPD in there, then the next question is: how should access to VPD in a device be prevented if the driver failing to init causes VPD access to hang the PC?

(In reply to Martin Mansfield from comment #7)
> I have encountered this problem with the 3.10 on Centos 7 with a LSI 9240-4i
> where "lspci -vv" would hang the PC if the megasas driver failed to init
> correctly but if it did then "lspci -vv" would read a single zero byte of
> VPD and output an error. The link to the Centos 7 bug report is
>
> https://bugs.centos.org/view.php?id=10818
>
> This patch to blacklist the device may be the best way to go but there is
> another patch which has been recently committed which limits the size of VPD

Yes. I have tested these patches. They read the VPD data and try to figure out the actual length.

> read. Has anyone tested that patch with a megasas card without the
> blacklisting patch to be certain that the card really is broken in that area
> ?

What I have seen is that this device causes the system to hang as soon as we attempt to read the VPD. The only way is to blacklist VPD access. I did not see any other alternative.

> If there is valid VPD in there then the next question is, how should access
> to VPD in a device be prevented if the driver failing to init causes VPD
> access to hang the PC ?

There is no other way as far as I can tell.
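
The other approach mentioned above, reading the VPD data and working out its actual length, can be illustrated with a small userspace sketch. It is not the committed kernel patch. It assumes the standard PCI VPD layout: large resource tags 0x82 (identifier string), 0x90 (read-only data) and 0x91 (read/write data), each followed by a 2-byte little-endian length, and a closing end tag 0x78. The function name `vpd_actual_size` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Tag bytes assumed from the PCI VPD resource format. */
#define VPD_TAG_ID_STRING 0x82u /* large resource: identifier string */
#define VPD_TAG_RO_DATA   0x90u /* large resource: read-only fields  */
#define VPD_TAG_RW_DATA   0x91u /* large resource: read/write fields */
#define VPD_TAG_END       0x78u /* small resource: end tag, length 0 */

/*
 * Walk the resource tags in buf and return the number of bytes up to and
 * including the end tag. Return 0 if an unknown tag is seen, a resource
 * runs past len, or no end tag is found within len bytes.
 */
size_t vpd_actual_size(const uint8_t *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        uint8_t tag = buf[off];
        size_t body;

        if (tag == VPD_TAG_END)
            return off + 1;

        /* Anything other than the three known large tags is invalid. */
        if (tag != VPD_TAG_ID_STRING && tag != VPD_TAG_RO_DATA &&
            tag != VPD_TAG_RW_DATA)
            return 0;

        /* A large resource needs a tag byte plus two length bytes. */
        if (len - off < 3)
            return 0;

        body = (size_t)buf[off + 1] | ((size_t)buf[off + 2] << 8);
        off += 3;

        /* Reject a resource whose declared body exceeds the buffer. */
        if (body > len - off)
            return 0;
        off += body;
    }
    return 0; /* ran out of data before an end tag */
}
```

In this sketch, a device that returns a single zero byte, as described in comment #7, yields a size of 0 because 0x00 is not a recognized tag. A size check of this kind only helps once a read returns; it does nothing for a device that hangs the system on the first VPD access, which is the case reported in this bug.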