Bug 199207 - GPD Pocket / Cherry Trail designware i2c timeouts on resume from disk with failure of touchscreen, battery charging, etc.
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: I2C
Hardware: Intel Linux
Importance: P1 normal
Assignee: Drivers/I2C virtual user
Reported: 2018-03-25 20:37 UTC by ich
Modified: 2020-01-01 14:59 UTC
CC List: 7 users

Kernel Version: 4.18.0-rc8
Regression: No

Attachments
kernel log (81.56 KB, text/plain)
2018-03-25 20:37 UTC, ich
Kernel log excerpt from suspend to error messages after resume (5.23 KB, text/plain)
2018-05-01 13:05 UTC, Teer Sandal
Kernel log excerpt from removing to restoring (1.69 KB, text/plain)
2018-05-01 13:15 UTC, Teer Sandal
log with errors using 4.18-rc8 (60.40 KB, text/plain)
2018-08-13 07:16 UTC, ich

Description ich 2018-03-25 20:37:56 UTC
Created attachment 274931
kernel log

The touchscreen on the GPD Pocket device (Intel Cherry Trail SoC with a tablet screen in a laptop form factor) occasionally stops working after a suspend/resume cycle. This happens with both connected standby and suspend to disk.

The core symptom is an endless stream of these messages:

[ 1194.623845] i2c_designware 808622C1:05: timeout waiting for bus ready
[ 1194.623856] Goodix-TS i2c-GDIX1001:00: I2C write end_cmd error
[ 1194.646926] i2c_designware 808622C1:05: timeout waiting for bus ready
[ 1194.646943] Goodix-TS i2c-GDIX1001:00: I2C transfer error: -110
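
When triaging a log like the one above, it can help to see which controller instance produces the timeouts and how many there are. The following is only a minimal sketch, not part of the original report: it assumes the message format shown in the excerpt, and the function name count_bus_timeouts is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as (format taken from the excerpt above):
# [ 1194.623845] i2c_designware 808622C1:05: timeout waiting for bus ready
TIMEOUT_RE = re.compile(
    r"i2c_designware (?P<dev>\S+): timeout waiting for bus ready"
)


def count_bus_timeouts(log_text):
    """Count 'timeout waiting for bus ready' messages per controller.

    Hypothetical helper: returns a Counter keyed by the controller name
    as printed in the kernel log (for example '808622C1:05').
    """
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts
```

Feeding it the output of dmesg (or the attached log) gives one count per i2c_designware instance, which shows whether a single bus or several are affected.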

The kernel in use is the tree by Hans de Goede from https://github.com/jwrdegoede/linux-sunxi, as he is working on support for the GPD there. The issue has existed for several releases and was most recently seen on my device with commit ae718bc. It occurs only occasionally, so it is tricky to debug.

There is a discussion at https://github.com/nexus511/gpd-ubuntu-packages/issues/10, where it was requested that someone file a bug report here against the designware i2c driver.

I am attaching a full log of kernel messages up to the beginning of the endless stream of errors from a recent boot and suspend cycle.
Comment 1 Teer Sandal 2018-05-01 13:05:48 UTC
Created attachment 275685
Kernel log excerpt from suspend to error messages after resume

I have the same issue on a Chuwi Hi13 tablet (Intel Apollo Lake N3450).
After suspend/resume, the Goodix touchscreen behind the Synopsys DesignWare I2C controller becomes unavailable and Linux floods the system log with the following:
kernel: i2c_designware i2c_designware.3: timeout waiting for bus ready
kernel: Goodix-TS i2c-GDIX1002:00: I2C transfer error: -110

A kernel log excerpt from suspend to the error messages after resume is attached.
Comment 2 Teer Sandal 2018-05-01 13:15:56 UTC
Created attachment 275687
Kernel log excerpt from removing to restoring

Touchscreen operation can be restored by removing and rediscovering the PCI device as root:
echo 1 > /sys/bus/pci/devices/0000\:00\:16.3/remove
echo 1 > /sys/devices/pci0000\:00/pci_bus/0000\:00/rescan

The PCI address of the device may vary.

A kernel log excerpt from removing to restoring is attached.
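
The same workaround could be wrapped in a small script, for example to run from a resume hook. This is a minimal sketch under stated assumptions, not a tested fix: the function name rebind_pci_device and the sysfs_root parameter are hypothetical, and instead of the per-bus rescan node used above it writes to the global /sys/bus/pci/rescan node, which rescans all PCI buses.

```python
from pathlib import Path


def rebind_pci_device(pci_address, sysfs_root="/sys"):
    """Remove one PCI device and trigger a rescan (needs root on real sysfs).

    pci_address is the full address such as '0000:00:16.3'; it differs
    between machines. sysfs_root exists only so the function can be
    exercised against a fake directory tree.
    """
    root = Path(sysfs_root)
    remove_node = root / "bus" / "pci" / "devices" / pci_address / "remove"
    # Assumption: use the global rescan node rather than the per-bus one.
    rescan_node = root / "bus" / "pci" / "rescan"

    remove_node.write_text("1")  # detach the device from the PCI core
    rescan_node.write_text("1")  # rediscover it, which re-probes the driver
    return remove_node, rescan_node
```

Calling rebind_pci_device("0000:00:16.3") as root corresponds to the two echo commands above, with the address taken from this particular tablet.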
Comment 3 Jarkko Nikula 2018-05-02 13:44:38 UTC
It looks like in both cases i2c-designware was without power after resume: "Unknown Synopsys component type: 0xffffffff". Do you have a kernel log after a successful resume?
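
To check other logs for the same signature, one could scan for controllers whose component type register read back as all ones. The sketch below is only an illustration: the helper name controllers_reading_all_ones is hypothetical, and the full line format (with the device name before the message) is assumed from the log lines quoted elsewhere in this report.

```python
import re

# Assumed line format, as seen in this report:
# i2c_designware 808622C1:00: Unknown Synopsys component type: 0xffffffff
COMPONENT_RE = re.compile(
    r"i2c_designware (?P<dev>\S+): Unknown Synopsys component type: "
    r"(?P<value>0x[0-9a-fA-F]+)"
)


def controllers_reading_all_ones(log_text):
    """Return controllers whose component type register read 0xffffffff.

    An all-ones read is the signature pointed out above for a controller
    that was without power when the driver accessed it.
    """
    devices = []
    for line in log_text.splitlines():
        match = COMPONENT_RE.search(line)
        if not match:
            continue
        if int(match.group("value"), 16) != 0xFFFFFFFF:
            continue
        if match.group("dev") not in devices:
            devices.append(match.group("dev"))
    return devices
```

An empty result for a resume that otherwise shows timeouts would suggest a different failure mode than the one described here.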
Comment 4 Bastien Nocera 2018-05-02 13:55:04 UTC
You might also want to make sure that this patch:
faec44b6838312484d63e82286087cf2d5ebb891
is in your tree. It's called "Input: goodix - disable IRQs while suspended".

I don't know if it's in the kernel you're using.
Comment 5 Teer Sandal 2018-05-02 18:06:34 UTC
(In reply to Jarkko Nikula from comment #3)
> Looks like in both cases i2c-designware was without power after resume:
> "Unknown Synopsys component type: 0xffffffff". Do you have kernel log after
> successful resume?

Unfortunately, I don't, since a "successful resume" has never happened on my tablet. The issue is 100% reproducible.


(In reply to Bastien Nocera from comment #4)
> You might also want to make sure that this patch:
> faec44b6838312484d63e82286087cf2d5ebb891
> is in your tree. It's called "Input: goodix - disable IRQs while suspended".
> 
> I don't know if it's in the kernel you're using.

Yes, patch faec44b6838312484d63e82286087cf2d5ebb891 is in my tree. I use kernel 4.16.3 compiled from source, and in goodix.c I can see the lines:
	if (!ts->gpiod_int || !ts->gpiod_rst) {
		disable_irq(client->irq);
corresponding to the patch.


Where should I "dig" to provide you with additional info?
Maybe I should increase the logging level or use dynamic_debug for the goodix/i2c_designware* modules?
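
For the dynamic_debug idea, the kernel accepts queries of the form "module NAME +p" written to the dynamic debug control file. The sketch below is a hedged illustration only: it assumes a kernel built with CONFIG_DYNAMIC_DEBUG, debugfs mounted at /sys/kernel/debug, and root privileges. The function name enable_module_debug is hypothetical, and the module names in the usage note are assumptions that should be checked against lsmod.

```python
DEFAULT_CONTROL = "/sys/kernel/debug/dynamic_debug/control"


def enable_module_debug(modules, control_path=DEFAULT_CONTROL):
    """Enable dev_dbg()/pr_debug() output for the given kernel modules.

    Writes one 'module NAME +p' query per open/write cycle and returns
    the list of queries that were written. control_path is a parameter
    only so the function can be tried against an ordinary file.
    """
    queries = ["module {} +p".format(name) for name in modules]
    for query in queries:
        with open(control_path, "w") as control:
            control.write(query + "\n")
    return queries
```

A call such as enable_module_debug(["goodix", "i2c_designware_core", "i2c_designware_platform"]) before suspending should make the debug messages of those modules appear in dmesg after resume, provided the names match the modules actually loaded.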
Comment 6 ich 2018-06-29 10:41:24 UTC
An update from the original reporter: I am using a different GPD Pocket unit now and an updated OS (Ubuntu 18.04), currently with kernel 4.18-rc1 from the linux-sunxi branch by Hans de Goede.

So far I have not observed a missing touchscreen, but the timeout messages are still there. What I have not seen yet is the I2C transfer error.

These are still present:


Jun 29 10:47:55: i2c_designware 808622C1:00: Unknown Synopsys component type: 0xffffffff

Jun 29 10:47:57: i2c_designware 808622C1:01: timeout in disabling adapter

[ 1554.922768] i2c_designware 808622C1:01: controller timed out
[ 1555.946911] i2c_designware 808622C1:01: controller timed out

[ 4555.908518] i2c_designware 808622C1:00: timeout waiting for bus ready
[ 4555.936961] i2c_designware 808622C1:00: timeout waiting for bus ready


They occur mostly after a resume cycle. Sometimes they occur at a rate that drowns out everything else in the kernel log. Maybe we can just experiment with some increased timeouts?
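
To put a number on the message rate, the dmesg timestamps can be used directly. This is a rough sketch, not part of the original comment: the helper name designware_message_rate is hypothetical, and it only handles lines carrying a "[ seconds.microseconds]" prefix like the ones above (journal-style lines without that prefix are ignored).

```python
import re

# Matches dmesg-style lines such as:
# [ 4555.908518] i2c_designware 808622C1:00: timeout waiting for bus ready
STAMPED_RE = re.compile(r"\[\s*(?P<stamp>\d+\.\d+)\]\s+i2c_designware\s")


def designware_message_rate(log_text):
    """Return i2c_designware messages per second, or None if undefined.

    The rate is (number of messages - 1) divided by the time between the
    earliest and latest timestamp, so at least two stamped lines are needed.
    """
    stamps = []
    for line in log_text.splitlines():
        match = STAMPED_RE.search(line)
        if match:
            stamps.append(float(match.group("stamp")))
    if len(stamps) < 2:
        return None
    span = max(stamps) - min(stamps)
    if span == 0:
        return None
    return (len(stamps) - 1) / span
```

Comparing this figure before and after changing a timeout would show whether the change reduces the flood or only slows it down.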
Comment 7 RussianNeuroMancer 2018-07-27 08:44:00 UTC
The same i2c timeouts and Goodix error messages occur on the GPD Win 2.

> currently with kernel 4.18-rc1 from the linux-sunxi branch by Hans de Goede

Did you test this issue with an upstream 4.18 build? The necessary patches may be upstream by now.
Comment 8 ich 2018-07-27 08:59:26 UTC
I tested up to kernel.org 4.18-rc5. I switched to 4.17.8 for now, as I experienced a high likelihood of a blank screen after grub2 (where the initramfs waits for me to enter the password for LUKS). Occasional intel gfx troubles are nothing new to me and I hoped that gets ironed out before 4.18 proper.

Anyway, are there specific changes in 4.18-rc6 that concern i2c_designware?

So far, it runs rather well with 4.17.8 as long as I do not attempt suspend to disk. On resume from hibernation, I usually don't lose the touchscreen, but the charger and/or battery monitoring instead.

PS: Also with 4.17.8, I get GPU errors, but they don't seem to have long-lasting effects. On resume from s2idle, I see this rather often:

[209268.085465] PM: suspend devices took 1.453 seconds
[212306.645823] ------------[ cut here ]------------
[212306.645833] Unclaimed write to register 0x1e0100
[212306.646215] WARNING: CPU: 0 PID: 23726 at drivers/gpu/drm/i915/intel_uncore.c:1070 __unclaimed_reg_debug+0x46/0x60 [i915]

I figured that I'd first get the basic trouble with i2c settled before battling the Intel GPU again, which I will probably have to do if I switch to 4.18 :-(
Comment 9 RussianNeuroMancer 2018-07-28 07:05:06 UTC
> I switched to 4.17.8 for now, as I experienced a high likelihood of a blank
> screen after grub2 (where the initramfs waits for me to enter the password
> for LUKS)

Did you report this issue anywhere?

> Also with 4.17.8, I get GPU errors, but they don't seem to have long-lasting
> effects. 

Please follow these steps to report this issue:

https://01.org/linuxgraphics/documentation/how-report-bugs

drm-tip packages: http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/
Comment 10 ich 2018-07-28 10:14:53 UTC
I know I should report the issues. For the blank screen on boot, I wanted to pin down the commit that caused it first, but that takes an unnerving amount of testing time. The GPU error, yeah, that should be easier, just copying the dmesg ...
Comment 11 ich 2018-08-03 06:41:05 UTC
About the blank screen issue: This might be related to some surrounding machinery (boot loader?). I cannot reproduce it right now, after some Ubuntu updates. I will report if it reappears in a consistent manner.

I will test the most current 4.18 rc again, but I repeat my question: is there something specific after rc5 that should fix the i2c trouble?
Comment 12 RussianNeuroMancer 2018-08-08 16:11:45 UTC
> is there something specific after rc5 that should fix the i2c trouble?

I just wanted to draw attention to the fact that most of Hans's patches are already part of the upstream kernel.
Comment 13 ich 2018-08-13 07:16:31 UTC
Created attachment 277829
log with errors using 4.18-rc8

Yes, I am using the upstream kernel since 4.17 now. I have two little patches on top that do not change the mechanics of resuming device drivers.

I updated my git checkout and tested with 4.18-rc8 now. While s2idle now works fine, apart from user space processes trying to write to my USB-connected /home too early (adding a delay after starting kernel threads helps), resume from hibernation still results in the driver errors.

In my case, with this specific GPD Pocket device, the touchscreen is not affected, but the analog audio output broke. Battery monitoring still works, but charging is broken (it only gives 5 V 500 mA, with no negotiation of charging modes that would not still deplete the battery).
Comment 14 RussianNeuroMancer 2018-08-13 07:35:17 UTC
Regarding the hibernation issues, I think two separate reports are necessary (one for audio output and one for charging).

Reverting this commit https://github.com/torvalds/linux/commit/12864ff8545f6b8144fdf1bb89b5663357f29ec4 or just trying 4.18-rc7 could also make a difference.
Comment 15 ich 2018-08-13 11:11:59 UTC
I just booted rc5 again (before the named commit, I presume) and the audio is also broken after resume from disk. This has probably always been the case; I only became more aware of it recently, after blacklisting the HDMI LPE audio module so that pulseaudio starts up at _all_.

The audio being flaky in general is a separate issue, but I do think that it being specifically broken at resume from disk could have a common cause with the touchscreen and i2c issues. So far I see the charging issue as another symptom of the same problem as the touchscreen failure. Aren't they both caused by a failing i2c bus?

I can of course create another bug report for the charger, but I wonder if that is of any help. I expect that problem to disappear once the basic i2c functionality is ensured.
Comment 16 Teer Sandal 2018-11-03 11:09:15 UTC
I rechecked suspend on my Chuwi Hi13 under Linux 4.19.0 and the problem seems to be gone.
Comment 17 RussianNeuroMancer 2018-11-04 06:40:43 UTC
On the GPD Win 2 the issue is still reproducible with Linux 4.19.0.
Comment 18 Hans de Goede 2020-01-01 14:59:18 UTC
I have just posted a series of Goodix touchscreen driver patches upstream:
https://lore.kernel.org/linux-input/20200101145429.16185-1-hdegoede@redhat.com/T/#t

These fix some suspend/resume issues with Goodix touchscreens and might help here. I'm not sure they fix this bug though, as some of the debugging done here so far suggests that this is an i2c controller or suspend-ordering issue, rather than an issue with the touchscreen driver.

These patches are also available in my personal kernel git repo:
https://github.com/jwrdegoede/linux-sunxi/commits/master
