Bug 209321 - DMAR: [DMA Read] Request device [03:00.0] PASID ffffffff fault addr fffd3000 [fault reason 06] PTE Read access is not set
Summary: DMAR: [DMA Read] Request device [03:00.0] PASID ffffffff fault addr fffd3000 ...
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks: 178231
Reported: 2020-09-18 20:33 UTC by Todd Brandt
Modified: 2020-11-19 23:21 UTC
CC List: 4 users

See Also:
Kernel Version: 5.8.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
otcpl-dell-9370-kbl_mem.html (565.51 KB, text/html)
2020-09-18 20:33 UTC, Todd Brandt
issue.def (459 bytes, text/plain)
2020-09-18 20:34 UTC, Todd Brandt
otcpl-aml-y_mem.html (721.06 KB, text/html)
2020-09-23 15:13 UTC, Todd Brandt
otcpl-dell-7390-cmlu_mem.html (575.31 KB, text/html)
2020-09-23 15:13 UTC, Todd Brandt
otcpl-kbl-rvp-7_mem.html (1004.71 KB, text/html)
2020-09-23 15:13 UTC, Todd Brandt
otcpl-whl-u_mem.html (650.87 KB, text/html)
2020-09-23 15:14 UTC, Todd Brandt
otcpl-yoga-920-kblr_mem.html (612.84 KB, text/html)
2020-09-23 15:14 UTC, Todd Brandt
otcpl-dell-7390-cmlu-lspci-vv.txt (62.34 KB, text/plain)
2020-11-19 23:21 UTC, Todd Brandt

Description Todd Brandt 2020-09-18 20:33:46 UTC
Created attachment 292537 [details]
otcpl-dell-9370-kbl_mem.html

This issue has been showing up on several different platforms for the past few releases:

DMAR: [DMA Read] Request device [03:00.0] PASID ffffffff fault addr fffd3000 [fault reason 06] PTE Read access is not set

So far it doesn't seem to cause any functional, power, or performance issues, but it's worth tracking. I'm gathering a histogram of the machines and kernel versions it happens on.
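
For building that histogram, the fault line can be split into its fields. This is only a minimal sketch: the pattern is written against the exact line quoted above, parse_dmar_fault is a made-up helper name, and other kernel versions may format DMAR faults differently. The fault reason is kept as the raw string because this report does not say whether it is printed in decimal or hex.

```python
import re

# Pattern for the fault line quoted in this report. Assumption: other kernel
# versions may print DMAR faults in a different layout.
DMAR_FAULT_RE = re.compile(
    r"DMAR: \[(?P<kind>DMA (?:Read|Write))\] "
    r"Request device \[(?P<device>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] "
    r"PASID (?P<pasid>[0-9a-f]+) "
    r"fault addr (?P<addr>[0-9a-f]+) "
    r"\[fault reason (?P<reason>[0-9a-f]+)\] (?P<text>.*)"
)


def parse_dmar_fault(line):
    """Return the fields of a DMAR fault line as a dict, or None if no match."""
    match = DMAR_FAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The fault address is printed in hex without a 0x prefix.
    fields["addr"] = int(fields["addr"], 16)
    return fields
```

Feeding each dmesg line through parse_dmar_fault and counting by the "device" and "text" fields would give a per-machine tally.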
Comment 1 Todd Brandt 2020-09-18 20:34:04 UTC
Created attachment 292539 [details]
issue.def
Comment 2 Todd Brandt 2020-09-23 14:55:59 UTC
This issue appears to have started in 5.8.0-rc1 on several machines, such as:

otcpl-aml-y
otcpl-cml-h
otcpl-cml-u
otcpl-dell-7390-cmlu
otcpl-dell-9370-kbl
otcpl-icl-u-pnp
otcpl-kbl-rvp-7
otcpl-tgl-h
otcpl-whl-u
otcpl-yoga-920-kblr

It usually occurs in about 6% of the tests in a batch run.
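
For anyone trying to bisect, the 6% figure gives a rough idea of how many suspend/resume cycles a kernel needs before a clean run means anything. The sketch below assumes each test hits the fault independently with the same probability, which this report has not established; the function names are illustrative only.

```python
import math


def chance_of_at_least_one(fault_rate, runs):
    """Probability of seeing the fault at least once in `runs` independent tests."""
    return 1.0 - (1.0 - fault_rate) ** runs


def runs_needed(fault_rate, confidence):
    """Smallest run count whose chance of at least one fault reaches `confidence`."""
    # Solve 1 - (1 - p)^n >= c for n, then round up to a whole run.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fault_rate))
```

Under that independence assumption, runs_needed(0.06, 0.95) comes out to 49, so a kernel that survives roughly 50 cycles without the message is probably (not certainly) unaffected.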
Comment 3 Todd Brandt 2020-09-23 15:13:20 UTC
Created attachment 292585 [details]
otcpl-aml-y_mem.html
Comment 4 Todd Brandt 2020-09-23 15:13:37 UTC
Created attachment 292587 [details]
otcpl-dell-7390-cmlu_mem.html
Comment 5 Todd Brandt 2020-09-23 15:13:56 UTC
Created attachment 292589 [details]
otcpl-kbl-rvp-7_mem.html
Comment 6 Todd Brandt 2020-09-23 15:14:13 UTC
Created attachment 292591 [details]
otcpl-whl-u_mem.html
Comment 7 Todd Brandt 2020-09-23 15:14:33 UTC
Created attachment 292593 [details]
otcpl-yoga-920-kblr_mem.html
Comment 8 Bjorn Helgaas 2020-10-07 15:34:43 UTC
Can you attach a complete dmesg log and "sudo lspci -vv" output for one of the affected systems?  Also please describe a way to reproduce it (even if it's not 100% reliable).  If it's practical to bisect it, that would be very helpful.
Comment 9 Sean V Kelley 2020-10-07 17:30:27 UTC
Todd,

I can also take a look at reproducing it.  Is it a system that I can access?  If so, I can give a hand with triaging.
Comment 10 Joerg Roedel 2020-10-08 08:21:47 UTC
(In reply to Bjorn Helgaas from comment #8)
> Can you attach a complete dmesg log and "sudo lspci -vv" output for one of
> the affected systems?  Also please describe a way to reproduce it (even if
> it's not 100% reliable).  If it's practical to bisect it, that would be very
> helpful.

Yes, please let us know what kind of device 03:00.0 is and which driver is attached to it. The output of lspci -vv contains that information.

Thanks, Joerg
Comment 11 Todd Brandt 2020-11-19 23:15:04 UTC
(In reply to Bjorn Helgaas from comment #8)
> Can you attach a complete dmesg log and "sudo lspci -vv" output for one of
> the affected systems?  Also please describe a way to reproduce it (even if
> it's not 100% reliable).  If it's practical to bisect it, that would be very
> helpful.

The HTML timelines have the dmesg output and logs embedded. The buttons are in the upper right-hand corner: click "dmesg" to see the dmesg log, and click "log" to see the log. The log includes the output of several basic system info calls, such as lspci -tv and lsusb -t.

There's nothing special about reproducing it. You just run "echo mem > /sys/power/state" in a loop and it occurs in about 6% of the timelines. All the systems are running Ubuntu 19.04.

Try:
%> sleepgraph -m mem -rtcwake 15 -multi 6h 0

That'll run it over and over for 6 hours.
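
Without sleepgraph, the plain "echo mem" loop described above could be sketched roughly as follows. This is not what sleepgraph does internally. It assumes root, a util-linux rtcwake binary (its "-m no" mode arms the alarm without suspending), and that the fault text matches the line in this report. All function names and the 5 second settle time are illustrative choices, and the count can be thrown off if the kernel ring buffer wraps.

```python
import subprocess
import time

# Text taken from the fault line in this report.
FAULT_MARKER = "PTE Read access is not set"


def count_dmar_faults(log_text):
    """Count log lines that look like the DMAR fault from this report."""
    return sum(
        1 for line in log_text.splitlines()
        if "DMAR:" in line and FAULT_MARKER in line
    )


def read_dmesg():
    # Reading the kernel log may need root, depending on system settings.
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout


def suspend_once(wake_after_s=15):
    # Arm the RTC alarm without suspending, then suspend via sysfs (needs root).
    subprocess.run(["rtcwake", "-m", "no", "-s", str(wake_after_s)], check=True)
    with open("/sys/power/state", "w") as state:
        state.write("mem")


def run_cycles(cycles):
    """Suspend/resume `cycles` times; return how many cycles logged a new fault."""
    hits = 0
    for _ in range(cycles):
        before = count_dmar_faults(read_dmesg())
        suspend_once()
        time.sleep(5)  # arbitrary settle time after resume
        if count_dmar_faults(read_dmesg()) > before:
            hits += 1
    return hits
```

Calling run_cycles(50) as root would suspend the machine 50 times, so only run it on a test system.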
Comment 12 Todd Brandt 2020-11-19 23:21:26 UTC
Created attachment 293739 [details]
otcpl-dell-7390-cmlu-lspci-vv.txt

lspci -vv on otcpl-dell-7390-cmlu
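
To answer the question from comment 10 (which driver is attached to 03:00.0), the "Kernel driver in use" line can be pulled out of the lspci -vv text. A minimal sketch, assuming the usual lspci layout where each device starts on an unindented line and its details are indented below it; driver_in_use is a hypothetical helper, not part of any tool mentioned here.

```python
def driver_in_use(lspci_vv_text, slot):
    """Return the 'Kernel driver in use' value for a slot such as '03:00.0', or None."""
    in_block = False
    for line in lspci_vv_text.splitlines():
        if line and not line[0].isspace():
            # Device blocks start with an unindented "<slot> <class>: <name>" line.
            first = line.split(" ", 1)[0]
            # Also accept a domain-prefixed slot such as 0000:03:00.0.
            in_block = first == slot or first.endswith(":" + slot)
        elif in_block and line.strip().startswith("Kernel driver in use:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, driver_in_use(open("otcpl-dell-7390-cmlu-lspci-vv.txt").read(), "03:00.0") would return the driver name for that device, or None if no driver is bound.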
