Bug 204057

Summary: Enabling ASPM on 02:00:00 NVMe device causes RX CRC errors on 00:1f.6 ethernet
Product: Drivers
Reporter: Kai-Heng Feng (kai.heng.feng)
Component: PCI
Assignee: drivers_pci (drivers_pci)
Status: RESOLVED CODE_FIX
Severity: normal
CC: bjorn, ypwong
Priority: P1
Hardware: All
OS: Linux
Kernel Version: v5.2-rc7
Tree: Mainline
Regression: No
Attachments: dmesg, lspci -t, lspci -vvv

Description Kai-Heng Feng 2019-07-03 11:24:23 UTC

    
Comment 1 Kai-Heng Feng 2019-07-03 11:27:06 UTC
Created attachment 283527
dmesg
Comment 2 Kai-Heng Feng 2019-07-03 11:27:26 UTC
Created attachment 283529
lspci -t
Comment 3 Kai-Heng Feng 2019-07-03 11:27:47 UTC
Created attachment 283531
lspci -vvv
Comment 4 Bjorn Helgaas 2019-07-03 12:49:39 UTC
Thanks for the logs.

Per [1], this issue doesn't occur on the out-of-tree e1000e driver, presumably from Intel:

  Same behavior can be observed on both mainline kernel and on your
  dev-queue branch.  OTOH, the same issue can’t be observed on
  out-of-tree e1000e.

  Is there any plan to close the gap between upstream and out-of-tree
  version?

So somebody needs to figure out the difference between the two and get the fix upstream.  This is just a heads-up that the solution is there, waiting to be discovered, so there's no need to try to debug it from scratch.

[1] https://lore.kernel.org/lkml/C4036C54-EEEB-47F3-9200-4DD1B22B4280@canonical.com/
Comment 5 Kai-Heng Feng 2019-10-25 06:59:40 UTC
Fixed by commit e5e9a2ecfe780975820e157b922edee715710b66 ("e1000e: add workaround for possible stalled packet").