Bug 15888

Summary: Unsafe PCIe assumptions in fixed_bar_cap
Product: Drivers Reporter: Petr Vandrovec (petr)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: RESOLVED OBSOLETE    
Severity: normal CC: akpm, alan
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.34-rc6 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: Stop fixed_bar_cap if offset 0x100 is aliased to offset 0x000

Description Petr Vandrovec 2010-04-30 23:41:25 UTC
Created attachment 26190 [details]
Stop fixed_bar_cap if offset 0x100 is aliased to offset 0x000

fixed_bar_cap seems to assume that all Intel devices return zero or 0xFFFFFFFF when their PCIe configuration space is accessed.  I'm not sure whether it is really true on real Intel pre-PCIe hardware (I doubt, but I have no real world test case).

But what I know is that it breaks in the VMware's VMs.  Couple of pre-requisities are necessary to hit this:

(1) One has to create ESX3.5 or older VM, which does not have PCIe support.  ESX4.0 & newer VMs do have PCIe, have mmconfig, and our platform returns zero for >0x100 accesses to legacy PCI devices.

(2) Virtual machine must run on AMD rev 10h or newer processor.  arch/x86/pci/amd_bus.c believes that these processors always have only PCIe devices connected, switch legacy port 0xCF8/0xCFC to new AMD-only mode to access PCIe registers.  It does so by write to northbridge MSR.  We do ignore write (we have no PCIe, so nothing to do), and keep port 0xCF8 capable of accessing only first 256 bytes of config space.

(3) And VM has PCI device with vendor ID 0x8086:7191.

When all three conditions are satisfied, fixed_bar_cap reads dword from 0x100 - which is aliased to 0x000 - so it gets back 0x71918086.  It cuts out 0x719, and reads next capability from device at offset 0x718 (aliased 0x18).  There it finds 0x40010100.  Still nonzero, and reads next capability from offset 0x400.  There it finds device/vendor ID, and enters endless loop of reads from 0x718 and 0x400.

Attached is a patch which seems to do the trick.  I think that more correct change would be to check pci_is_pcie() on the device, but this information is not known at the time this quirk runs, and I did not felt like parsing non-extended capabilities here.

Thanks.

Patch is Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Patch was tested in a VM.  Was not tested on real Moorestown platform.
Comment 1 Andrew Morton 2010-04-30 23:43:59 UTC
Please don't send patches via bugzilla - it causes lots of problems with
our usual patch management and review processes.

Please send this patch via email as per Documentation/SubmittingPatches. 
Suitable recipients may be found via scripts/get_maintainer.pl.  Please
also cc myself on the email.

Thanks.
Comment 2 Andrew Morton 2010-04-30 23:44:44 UTC
Since which kernel version did this regress, btw?
Comment 3 Petr Vandrovec 2010-05-01 01:05:04 UTC
It was introduced between 2.6.33 and 2.6.34-rc1, when Moorestown platform support was added.  Original checkin is:

commit a712ffbc199849364c46e9112b93b66de08e2c26
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Thu Feb 4 10:59:27 2010 -0800

    x86/PCI: Moorestown PCI support

and there was one fix to stop crashing on non-PCIe systems... which unfortunately does not catch AMD's ECS in PCIe-less systems:

commit c54113823c777f035fa7444f8841fbccda4a5cc0
Author: Jacob Pan <jacob.jun.pan@intel.com>
Date:   Wed Feb 24 09:42:50 2010 -0800

    x86, pci: Add sanity check for PCI fixed bar probing


I was not sure whether my approach with comparing vendor/device ID is acceptable one.  If you say it is acceptable approach, I'll send it for inclusion.