Bug 211957
Summary: | IRQ storm triggered by I2C touchpad on Tigerlake H | ||
---|---|---|---|
Product: | Drivers | Reporter: | Kai-Heng Feng (kai.heng.feng) |
Component: | GPIO/Pin Control | Assignee: | Jarkko Nikula (jarkko.nikula) |
Status: | RESOLVED INVALID | ||
Severity: | normal | CC: | acelan, andy.shevchenko, jarkko.nikula, kaichuan.hsieh, mika.westerberg |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | mainline, linux-next | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | dmesg; lspci -vvnn; /proc/interrupts; acpidump; dmesg with debugs enabled; i2c designware controller keeps receiving interrupt; A test patch for change irq flag; acpidump of TPD0 resource template; Full dsdt.dsl |
Description
Kai-Heng Feng
2021-02-26 12:06:41 UTC
Created attachment 295475 [details]
dmesg
Created attachment 295477 [details]
lspci -vvnn
Created attachment 295479 [details]
/proc/interrupts
Thanks for the report. I have a few questions here:
- Since it's a Dell machine, have you installed the latest firmware on it?
- Are you able to also share the ACPI tables, `acpidump -o dell-$MODEL.dat` (replace $MODEL with the actual one)?
- Can you enable pin control and I²C HID debug? (`i2c_hid.debug=1` on the kernel command line for the latter and CONFIG_DEBUG_PINCTRL=y in the kernel configuration for the former.)

From what I see, the controller with the IRQ storm is the second one (00:15.1), which has DELL0A66:00 without any sign of an IRQ handler registered.

Also, check if bug #207189 has anything to do with your case (I think not, but just to be sure).

(In reply to Andy Shevchenko from comment #4)
> From what I see that the controller with IRQ storm is the second one
> (00:15.1) which has DELL0A66:00 without any sign of IRQ handler registered.

Okay, I found a line in the /proc/interrupts.

I am also wondering what `apic=debug` will add, and perhaps GPIO debug with CONFIG_DEBUG_GPIO=y.

Created attachment 295487 [details]
acpidump
The platform is not on the market yet, so I can't disclose its model name.
Created attachment 295489 [details]
dmesg with debugs enabled
(In reply to Andy Shevchenko from comment #4)
> Thanks for the report. I have a few questions here:
> - Since it's Dell machine, have you installed latest firmware on it?

Yes, the firmware is the latest.

> - Are you able to share also ACPI tables `acpidump -o dell-$MODEL.dat`
> (replace $MODEL with the actual one)?
> - Can you enable pin control and I²C HID debug? (`i2c_hid.debug=1` in the
> kernel command line for the latter one and CONFIG_DEBUG_PINCTRL=y in the
> kernel configuration for the former one)
>
> From what I see that the controller with IRQ storm is the second one
> (00:15.1) which has DELL0A66:00 without any sign of IRQ handler registered.
>
> Also, check if bug #207189 has anything to do with your case (I think not,
> but just to be sure).

No, the platform doesn't have an Nvidia GPU.

I don't see any smoking gun in the logs. The HID doesn't flood the logs, so there is no wrong communication with the touchpad. The Interrupt resource of the controller and its configuration seem sane.

Is it possible that another peripheral is connected to the same bus and has no driver or registered handler? Can you run `i2cdetect -y 1` (choose the right bus number) and check what is connected there?

It might be a firmware issue as well; I would recommend talking to Dell to see if they have any insights.

Also, we may try to debug the interrupt handler of the I²C controller by adding something like `dev_info_ratelimited(...);` there to print the IRQ status. I believe the driver recognizes the interrupt as one that doesn't belong to the I²C controller.

$ sudo i2cdetect -y 1
Warning: Can't use SMBus Quick Write command, will skip some addresses
     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:
10:
20:
30: -- -- -- -- -- -- -- --
40:
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
60:
70:

The IRQs are indeed from i2c-hid; however, the report read is 0, which causes an early return in i2c_hid_get_input().

Asking vendors to provide more insights.

Created attachment 295663 [details]
i2c designware controller keeps receiving interrupt
I enabled the i2c_designware_master dyndbg log, and I can see that it keeps receiving interrupts even though there is no interaction with the touchpad.
There is only one slave device, the touchpad, attached to the controller. May I know whether the interrupt is caused by messages transferred by the touchpad?
The touchpad vendor said the touchpad keeps sending messages because the host keeps sending read commands to it. The data from the touchpad's perspective is dumped below. I didn't see the I2C command in the i2c_hid log; is it sent by the designware controller directly?
Time [s],Packet ID,Address,Data,Read/Write,ACK/NAK
2.982085300000000,0,0x58,0x20,Write,ACK
2.982190100000000,0,0x58,0x00,Write,ACK
2.982406500000000,1,0x59,0x1E,Read,ACK
2.982504300000000,1,0x59,0x00,Read,ACK
2.982601800000000,1,0x59,0x00,Read,ACK
2.982715600000000,1,0x59,0x01,Read,ACK
2.982812900000000,1,0x59,0x39,Read,ACK
2.982910500000000,1,0x59,0x01,Read,ACK
2.983008900000000,1,0x59,0x02,Read,ACK
2.983107700000000,1,0x59,0x00,Read,ACK
2.983205200000000,1,0x59,0x03,Read,ACK
2.983302800000000,1,0x59,0x00,Read,ACK
2.983400500000000,1,0x59,0x0C,Read,ACK
2.983499000000000,1,0x59,0x00,Read,ACK
2.983596900000000,1,0x59,0x04,Read,ACK
2.983695800000000,1,0x59,0x00,Read,ACK
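The Address column in the trace above appears to hold the 8-bit on-wire address byte (7-bit address plus R/W bit), which would make 0x58 and 0x59 the write and read forms of the same 7-bit address 0x2C. The Python sketch below, under that assumption, groups the bytes by packet ID and pairs them into little-endian 16-bit words, the byte order HID over I2C uses for its fields. The helper names `parse_trace` and `le16_words` are illustrative only and not part of any tool used in this report, and only the first rows of the trace are embedded.

```python
import csv
import io

# First rows of the vendor trace quoted above (truncated for brevity).
TRACE = """\
Time [s],Packet ID,Address,Data,Read/Write,ACK/NAK
2.982085300000000,0,0x58,0x20,Write,ACK
2.982190100000000,0,0x58,0x00,Write,ACK
2.982406500000000,1,0x59,0x1E,Read,ACK
2.982504300000000,1,0x59,0x00,Read,ACK
2.982601800000000,1,0x59,0x00,Read,ACK
2.982715600000000,1,0x59,0x01,Read,ACK
"""


def parse_trace(text):
    """Group data bytes by packet ID.

    Returns {packet_id: (addr7, direction, [data bytes])}.
    """
    packets = {}
    for row in csv.DictReader(io.StringIO(text)):
        pid = int(row["Packet ID"])
        addr8 = int(row["Address"], 16)
        # Assumption: the analyzer shows the on-wire address byte, so the
        # 7-bit address is the upper seven bits and bit 0 is the R/W flag.
        addr7 = addr8 >> 1
        direction = "read" if addr8 & 1 else "write"
        entry = packets.setdefault(pid, (addr7, direction, []))
        entry[2].append(int(row["Data"], 16))
    return packets


def le16_words(data):
    """Pair bytes into little-endian 16-bit words (low byte first)."""
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data) - 1, 2)]


packets = parse_trace(TRACE)
for pid, (addr7, direction, data) in sorted(packets.items()):
    print(pid, hex(addr7), direction, [hex(w) for w in le16_words(data)])
```

Read this way, the host writes the word 0x0020 and then reads data whose first word is 0x001E (30). That pattern resembles a register read rather than a bare input-report read, but the thread does not confirm that interpretation.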
The i2c-hid driver keeps reading because the IRQ isn't de-asserted...

Should the IRQ be de-asserted by the touchpad or by the host?

HID Over I2C Protocol Specification, 6.1.3 Retrieval of Input Reports: "If the DEVICE has no more Input Reports to send, it de-asserts the interrupt line."
However, it still _could_ be an issue in intel-pinctrl, since the same touchpad works fine on older platforms.

(In reply to Kai-Heng Feng from comment #16)
> However it still _could_ be an issue on intel-pinctrl, since the same
> touchpad works fine on older platforms.

It could very well be an IRQ line misconfiguration (edge vs. level, etc.).

Created attachment 295675 [details]
A test patch for change irq flag

(In reply to Andy Shevchenko from comment #17)
> (In reply to Kai-Heng Feng from comment #16)
> > However it still _could_ be an issue on intel-pinctrl, since the same
> > touchpad works fine on older platforms.
>
> It could be very well the IRQ line misconfiguration (edge vs. level, etc).

The vendor replied that the touchpad uses a low level trigger to issue its interrupt. However, I tried modifying i2c_hid as in the attached patch, and the IRQ storm still happens. Do you have any suggestion for configuring the IRQ line correctly?

By the way, the touchpad vendor has measured the interrupt pin's signal. They say it stays high when there is no interaction with the touchpad, but the interrupt count of i2c-designware.1 is still increasing.

(In reply to KaiChuan-Hsieh from comment #18)
> Created attachment 295675 [details]
> A test patch for change irq flag

It is not correct. The IRQ we are talking about a) doesn't have anything to do with pin control (it's IOxAPIC); b) the modification in the I²C HID driver obviously has no relation to the controller's IRQ handler.

> (In reply to Andy Shevchenko from comment #17)
> > (In reply to Kai-Heng Feng from comment #16)
> > > However it still _could_ be an issue on intel-pinctrl, since the same
> > > touchpad works fine on older platforms.
> >
> > It could be very well the IRQ line misconfiguration (edge vs. level, etc).
>
> The vendor reply that the touchpad uses level trigger low to issue interrupt.
> However, I try to modify the i2c_hid as attached patch, the IRQ storm is
> still happened. Do you have any suggestion to configure the IRQ line
> correctly?

The question is why the I²C controller got the interrupt flood.

> By the way, touchpad vendor has measured the interrupt pin's signal, they
> say it keeps high when there is no interaction with the touchpad, but the
> i2c-designware.1's interrupt counts is still increasing.

I moved it to Jarkko and the I²C subsystem, but it might as well be something in the HID driver related to the touchpad firmware (no clue here).

The only clue we have is that touchpads with the same firmware and hardware behave differently on TGL-H platforms manufactured by different ODMs. One does not show the IRQ storm, the other does. I already confirmed that the touchpad connects to the pin with the same board name in both designs, but I wonder if there is any instruction for them to check the BIOS implementation of the IRQ line configuration. If you can suggest something, it would be helpful.

Thanks,

Hello,

I checked the problem platform. Its touchpad device IRQ in /proc/interrupts has type INT34C6:00, but on the working platform the touchpad device IRQ has type IR-IO-APIC. The full entries look like this:

Fail: INT34C6:00 291 DELL0A69:00
Pass: IR-IO-APIC 96-fasteoi DELL0A68:00

May I know what makes the interrupt become like this? I don't know if INT34C6:00 is a valid interrupt type. Will it cause the host not to use level trigger to wake up the host?

Thanks,

(In reply to KaiChuan-Hsieh from comment #21)
> Fail: INT34C6:00 291 DELL0A69:00
> Pass: IR-IO-APIC 96-fasteoi DELL0A68:00
>
> May I know what makes the interrupt become like this? I don't know if
> INT34C6:00 is a valid interrupt type. will it cause the host not using level
> trigger to wakeup host?

The former one (Fail) is GPIO, while the latter is IOxAPIC.

I just realized that the BIOS may have a bug (no validation?) when it is using GpioInt() instead of Interrupt() in the DSDT. It may simply be that the pin numbering is wrong in the ACPI tables. Note, ACPI expects a GPIO # which differs from the actual pin number (thanks to Microsoft :-).

Created attachment 295729 [details]
acpidump of TPD0 resource template
Hello,
The TPD0 resource from the acpidump is attached. The ODM confirms that the hardware is connected to the board pins shown below:
DATA: PCH: GPP_C18/I2C1_SDA
CKL: PCH: GPP_C19/I2C1_SCL
INT: PCH: GPP_E3/CPU_GP0
It seems to connect to GP0, and it is 0x0000 in the resource pin list. Can you help indicate what might be wrong?
Thanks a lot,
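The Fail/Pass comparison of /proc/interrupts entries earlier in this thread can be made mechanical. The Python sketch below classifies the trailing "chip hwirq action" fields of a row by the IRQ chip name. It assumes that a chip column shaped like an ACPI device instance (for example `INT34C6:00`) means the interrupt is routed through a GPIO/pin controller, which matches how the two entries were interpreted in this thread; the function name `classify_irq` and the regular expression are illustrative, not taken from any kernel tool.

```python
import re

# Assumption: a chip column that looks like an ACPI device instance
# (3-4 character vendor prefix, 4 hex digits, ":" and a 2-digit instance,
# e.g. "INT34C6:00") is a GPIO/pin controller acting as the IRQ chip.
ACPI_INSTANCE = re.compile(r"^[A-Z0-9]{3,4}[0-9A-F]{4}:[0-9A-F]{2}$")


def classify_irq(tail):
    """Classify the trailing 'chip hwirq action' fields of a /proc/interrupts row."""
    chip, hwirq, action = tail.split()[-3:]
    if "IO-APIC" in chip:
        route = "ioapic"
    elif ACPI_INSTANCE.match(chip):
        route = "gpio"
    else:
        route = "unknown"
    return {"chip": chip, "hwirq": hwirq, "action": action, "route": route}


# The two entries quoted in this report.
fail = classify_irq("INT34C6:00 291 DELL0A69:00")
ok = classify_irq("IR-IO-APIC 96-fasteoi DELL0A68:00")
print(fail["route"], fail["hwirq"])
print(ok["route"], ok["hwirq"])
```

The sketch only handles rows with a single action name; shared interrupts list several comma-separated actions and would need extra parsing.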
Created attachment 295731 [details]
Full dsdt.dsl
Attaching the full dsdt.dsl.
(In reply to KaiChuan-Hsieh from comment #23)
> Created attachment 295729 [details]
> acpidump of TPD0 resource template
>
> Hello,
>
> The TPD0 resource in acpidump is attached. And ODM confirm the hardware is
> connected to the board name shown below:
>
> DATA: PCH: GPP_C18/I2C1_SDA
> CKL: PCH: GPP_C19/I2C1_SCL
> INT: PCH: GPP_E3/CPU_GP0
>
> It seems it connects to GP0, and it is 0x0000 in the resource pin list, can
> you help to indicate what might be wrong.

Ha! It seems somebody confused E and F (yes, the capital letters are quite similar): 291 is GPP_F3! According to the above, GPP_E3 must be 259. Seems like a BIOS bug.

(In reply to KaiChuan-Hsieh from comment #23)
> It seems it connects to GP0, and it is 0x0000 in the resource pin list, can
> you help to indicate what might be wrong.

There is GNUM() that is called on top of GPDI. GPDI is provided (filled) by the BIOS.

Thanks for your reply. May I ask how you knew from the dsdt.dsl that the current setting is 291?

(In reply to KaiChuan-Hsieh from comment #27)
> Thanks for your reply. May I ask how did you know the current setting is 291
> from the dsdt.dsl?

No, I can't know that from the DSDT. The plain Linux logs and files are the key (see the output of /proc/interrupts which you cited in comment #21).

Ah, I see. Thanks for your explanation. I'll ask the ODM to check.

(In reply to Andy Shevchenko from comment #28)
> (In reply to KaiChuan-Hsieh from comment #27)
> > Thanks for your reply. May I ask how did you know the current setting is 291
> > from the dsdt.dsl?
>
> No, I may not know that from DSDT. The pure Linux logs and files is the key
> (see output of /proc/interrupts which you cited in comment #21).

Hello Andy,

Thanks for your help. The ODM confirms that the touchpad interrupt works normally after setting PchI2cTouchPadIrqMode = 1. I'll close the bug.

Thanks,
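The E-versus-F observation above comes down to simple arithmetic, which the Python sketch below illustrates by computing a pad's GPIO number as a per-group base plus the pad index. The bases for GPP_E (256) and GPP_F (288), and the 32-number spacing between them, are inferred only from the two numbers stated in this thread (259 and 291) and taken at face value. They are not copied from the pin controller driver or a datasheet and should be verified there before being relied on. The names `GROUP_BASE`, `gpio_number`, and `pad_name` are hypothetical.

```python
import re

# Hypothetical bases inferred from the thread (GPP_E3 -> 259, GPP_F3 -> 291).
# Not taken from the driver source; verify against the real pin tables.
GROUP_BASE = {"GPP_E": 256, "GPP_F": 288}
GROUP_SPAN = 32  # assumed spacing between consecutive group bases


def gpio_number(pad):
    """Map a pad name such as 'GPP_E3' to its GPIO number (base + index)."""
    match = re.fullmatch(r"(GPP_[A-Z])(\d+)", pad)
    if match is None:
        raise ValueError(f"unrecognized pad name: {pad!r}")
    group, index = match.group(1), int(match.group(2))
    return GROUP_BASE[group] + index


def pad_name(number):
    """Map a GPIO number back to a pad name within the known groups."""
    for group, base in GROUP_BASE.items():
        if base <= number < base + GROUP_SPAN:
            return f"{group}{number - base}"
    raise ValueError(f"GPIO number {number} is outside the known groups")


print(gpio_number("GPP_E3"), pad_name(291))
```

Under these assumed bases, the hardware IRQ number 291 seen in /proc/interrupts maps back to GPP_F3, while the board wiring (GPP_E3) would correspond to 259.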