Created attachment 290179 [details]
dmesg on 4.19.132

OS: Debian 10.4 Buster
CPU: Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz
Hardware: Supermicro Super Server
Mainboard: Supermicro X10SDV
DVB card: Digital Devices Cine S2 V7 Advanced DVB adapter

Issue:
=====
Loading the kernel module ddbridge fails with i2c timeouts, see attached dmesg. The DVB media adapter is unusable.

This happened after a Linux kernel upgrade from 4.19.98-1+deb10u1 to 4.19.118-2+deb10u1.

A git bisect based on the Debian kernel repo on branch buster identified as first bad commit:
[1fb0eb795661ab9e697c3a053b35aa4dc3b81165] Update to 4.19.116.

Another git bisect based on the upstream Linux kernel repo on branch v4.19.y identified as first bad commit:
[d2345d1231d80ecbea5fb764eb43123440861462] PCI: Add boot interrupt quirk mechanism for Xeon chipsets.

Other affected Debian kernel version: 5.6.14+2~bpo10+1
I tested this version via buster-backports, because so far I have been unable to build my own kernel from 5.6.y or even 5.7.y.

Workaround:
==========
Reverting the mentioned commit d2345d1231d80ecbea5fb764eb43123440861462 on top of 4.19.132 fixes the problem.
Reverting the same commit on 4.19.118 or 4.19.116 also fixes the problem.

It seems I can only add one attachment now, so I will add more attachments after the bug is submitted.

Thanks and Regards
Berni
Created attachment 290181 [details] dmesg on 4.19.132 with reverted commit
Created attachment 290183 [details] git bisect log on debian buster kernel repo
Created attachment 290185 [details] git bisect log on linux 4.19.y repo
Created attachment 290187 [details] lscpu output
Created attachment 290189 [details] lspci -vvv output
Debian Bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964734
Created attachment 290193 [details] dmesg on 5.7.8
Created attachment 290195 [details] dmesg on 5.7.8 with reverted commit
I am now able to build newer kernel versions as well.

v5.7.8 test results:
BAD: 5.7.8
GOOD: 5.7.8 with reverted commit b88bf6c3b6ff ("PCI: Add boot interrupt quirk mechanism for Xeon chipsets")
Created attachment 290199 [details] dmesg on 5.8.0-rc4
Created attachment 290201 [details] dmesg on 5.8.0-rc4 with reverted commit
Tests with master (5.8.0-rc4) show the same results:
BAD: 5.8.0-rc4
GOOD: 5.8.0-rc4 with reverted commit b88bf6c3b6ff ("PCI: Add boot interrupt quirk mechanism for Xeon chipsets")
Interesting, the platform is a D-1500 based Xeon, which makes it a Broadwell based Xeon. I don't see it in the logs, but I'm assuming the device ID is 0x6f28.

These Xeons have the capability to disable the route to the ICH:

Xeon D1500 Data sheet, Volume 2 (Registers), #5.6.41 cipintrc
Coherent Interface Protocol Interrupt Control.

  Type: CFG  PortID: N/A
  Bus: 0  Device: 5  Function: 0
  Offset: 0x14c

  25:25 RW 0x0 dis_intx_route2ich

Writing to the above will disable the legacy intx forwarding to the ICH.

Looking at the lspci output (assuming this is without the revert):

05:00.0 Multimedia controller: Digital Devices GmbH Cine V7
	Subsystem: Digital Devices GmbH Cine V7
	Physical Slot: 2
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 32 bytes
	Interrupt: pin A routed to IRQ 16

I'd be curious about your output of lspci -vvvnn with the revert.

I'm just wondering whether, with this particular vendor card, there is some sort of expectation that the IRQ will always be routed to the PCH...

Thanks,
Sean
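For anyone who wants to check the register quoted above on their own board, here is a minimal Python sketch. It only decodes bit 25 at offset 0x14c from a raw config-space dump; the sysfs path and the 0000:00:05.0 address are assumptions derived from the "Bus: 0 Device: 5 Function: 0" line, the function names are made up for illustration, and reading past the first 64 bytes of config space normally requires root.

```python
import struct

CIPINTRC_OFFSET = 0x14C        # offset quoted from the data sheet excerpt
DIS_INTX_ROUTE2ICH_BIT = 25    # field 25:25, dis_intx_route2ich


def dis_intx_route2ich(config_space: bytes) -> bool:
    """Return True if bit 25 of cipintrc is set in a raw config-space dump."""
    if len(config_space) < CIPINTRC_OFFSET + 4:
        raise ValueError("config space dump too short (extended space needs root)")
    # PCI config space is little-endian; read one 32-bit register.
    (value,) = struct.unpack_from("<I", config_space, CIPINTRC_OFFSET)
    return bool((value >> DIS_INTX_ROUTE2ICH_BIT) & 1)


def read_config(bdf: str = "0000:00:05.0") -> bytes:
    """Read the raw config space via sysfs (assumed path, hypothetical helper)."""
    with open(f"/sys/bus/pci/devices/{bdf}/config", "rb") as f:
        return f.read()
```

Calling dis_intx_route2ich(read_config()) as root on a kernel with and without the revert should show whether the quirk has set the bit.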
Created attachment 290649 [details] lspci -vvvnn v5.7.10 plain
Created attachment 290651 [details] lspci -vvvnn v5.7.10 with revert
Created attachment 290653 [details] diff between lspci -vvvnn v5.7.10 plain and with revert
Hello Sean,

yes, you are right, the platform is a Broadwell based D-1500 Xeon and the device ID is indeed 0x6f28.

I have added three more files:
a) lspci -vvvnn on v5.7.10 plain
b) lspci -vvvnn on v5.7.10 with revert
c) diff between the two

Many Thanks
Berni
Thanks Berni.

The only difference that I can see is that when you revert the patch the DD Cine v7 card no longer shows INTx status as pending. When the patch is not reverted, the card is forced to follow INTx emulation.

This driver is not handling INTx emulation and on top of that the MSI support is marked "experimental":

config DVB_DDBRIDGE_MSIENABLE
	bool "Enable Message Signaled Interrupts (MSI) per default (EXPERIMENTAL)"

I'm willing to wager that Debian's kernel configuration is *not* enabling MSI. Further, if you were to enable DVB_DDBRIDGE_MSIENABLE, I do not think you would see this issue requiring the revert. The alternative is to plumb support for proper INTx emulation handling.

Could you enable DVB_DDBRIDGE_MSIENABLE in your system and see if you get the timeouts without the revert?

Thanks,
Sean
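The INTx difference described in this comment can be checked without reading the whole diff. The following Python sketch parses the "Name+" / "Name-" flags from one field of saved lspci -vv output; pci_flags is a hypothetical helper name, and it assumes the text has been cut down to a single device, since it returns the first line that starts with the given field name.

```python
import re


def pci_flags(lspci_text: str, field: str) -> dict:
    """Parse 'Name+' / 'Name-' flags from one lspci -vv field such as 'Status'."""
    for line in lspci_text.splitlines():
        line = line.strip()
        if line.startswith(field + ":"):
            rest = line[len(field) + 1:]
            # A flag is a token ending in + or -; tokens like DEVSEL=fast are skipped.
            return {
                m.group(1): m.group(2) == "+"
                for m in re.finditer(r"(\S+?)([+-])(?=\s|$)", rest)
            }
    raise KeyError(field)
```

For example, pci_flags(text, "Status")["INTx"] reports whether an INTx interrupt is pending, and pci_flags(text, "Control")["DisINTx"] reports whether the device has legacy interrupts disabled.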
Hello Sean,

sorry for not answering until today.

Indeed, MSI is not enabled by default, so I enabled it via a file in /etc/modprobe.d/:

options ddbridge msi=1

With that setting, I can confirm the card is working fine with the current standard kernel version 4.19.132-1 in Debian 10 Buster as well as the following versions taken from Buster Backports:

5.7.10-1~bpo10+1
5.6.14-2~bpo10+1
5.5.17-1~bpo10+1
5.4.19-1~bpo10+1

Many Thanks for the help and Best Regards
Berni
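To confirm that the msi=1 option actually reached the loaded module, the value can be read back from sysfs. This is a small Python sketch under the assumption that ddbridge exposes the parameter as readable under /sys/module/ddbridge/parameters/msi, which depends on the permissions the driver declares for it; module_param here is a hypothetical helper, not a kernel or library API.

```python
from pathlib import Path
from typing import Optional


def module_param(module: str, param: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return a loaded module's parameter value, or None if it is not exposed."""
    path = Path(sysfs_root) / "module" / module / "parameters" / param
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # Module not loaded, or the parameter is not visible in sysfs.
        return None
```

With the modprobe.d option in place and the module reloaded, module_param("ddbridge", "msi") would be expected to return "1"; None means the check is inconclusive and dmesg is the better source.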
Mark as Resolved - Invalid.
Having to specify a module option doesn't *seem* like a real resolution. It's certainly a workaround, but it seems like other Debian users will likely trip over this, and it's a lot of hassle to find the workaround. Sean, you mention "plumbing support for INTx emulation" above. Is that in the ddbridge driver? Elsewhere? Is that really the only way to fix this without requiring a module parameter or kernel config change?
Agreed, I was a bit too fast. Reopen it.
I've more time to look at this now with some PCI/RCEC patches close to merging hopefully! Bjorn, I believe it is in the driver and the way they have implemented it. What I asked to test was just to confirm it was a driver issue and not per-se the quirk itself. I will take a closer look now. Sean
Hello Sean, I saw some patches regarding PCI merged in Dec 2020. So I just made another test with 5.12.0-11146-g8ca5297e7e3 which still has the same issue. Let me know if you have any other news or a patch to try out. Thanks Berni
Hi Sean,

Did you have a chance to look into this?

Regards,
Salvatore
Is this still a problem? If so, how can we make progress on it? I don't think Sean is available any more, but I guess a dmesg log from a current kernel, e.g., v6.1, could be a start if somebody else can work on it.
I will try to test with a recent kernel asap and report back the result including dmesg log.
Created attachment 303645 [details] dmesg with kernel 6.1.4
Created attachment 303646 [details] dmesg with kernel 6.1.4 with options ddbridge msi=1
Created attachment 303647 [details] lspci -vvvnn with kernel 6.1.4
Created attachment 303648 [details] lspci -vvvnn with kernel 6.1.4 with options ddbridge msi=1
This is still a problem with kernel 6.1.4, see attachments for dmesg and lspci output.
I'm not sure we can find anybody with the right combination of knowledge about Xeon INTx handling, ddbridge, i2c, etc, to work on this. But I would suggest replying to this email thread: https://lore.kernel.org/all/20200709191722.GA6054@bjorn-Precision-5520/ because I think the people who *might* be able to help only pay attention to email, not to bugzilla.