Created attachment 184601 [details] Log from `lshw` before and after suspend I'm using Asus T200TA with Ubuntu 15.04 amd64, kernel 4.1.2 and updated linux-firmware. The wireless drivers work after operations as described in #102531. Battery and touchscreen work out of the box. Then the device is suspended. After waking up the notebook, the battery is not detected anymore. KDE5 displayed information that no battery was found, Unity simply didn't display the applet. The battery is detected again after a charges has been plugged in (and the status is updated properly when removing the charger). The touchscreen stops working, no input can be done through it. The network card isn't detected anymore, see the log from `lshw`. The first command was issued before suspend, the other - after. The problem is 100% reproducible. I'm eager to provide any other feedback or even create a patch if I am up to it and if I could be guided a little.
Please attach dmesg output after system resumed, and acpidump too.
Created attachment 185211 [details] Output of acpidump before suspend
Created attachment 185221 [details] Output of acpidump after suspend
Created attachment 185231 [details] Output of dmesg after suspend
[ 201.283355] PM: resume from suspend-to-idle [ 201.385503] i2c_designware 80860F41:00: timeout waiting for bus ready [ 201.401599] PM: noirq resume of devices complete after 118.040 msecs [ 201.758302] PM: early resume of devices complete after 356.530 msecs [ 201.766355] i2c i2c-0: i2c read failed [ 201.778279] i2c i2c-0: i2c read failed [ 202.055521] PM: resume of devices complete after 297.155 msecs [ 202.056211] PM: Finishing wakeup. I guess the problem is that after resume, i2c controller stops working and battery information is got from it, so battery information is lost too. I have no idea why plug in AC will solve the problem though. Mike, do you have any ideas on the above error messages?
My theory: (newbie here, so might be wrong) i2c read fails and it's not retried automatically. When the status changes, a read is requested, it succeeds. Or maybe the read occurs too early? I'll attach a dmesg after AC plug-in today.
I guess at that point (during resume cycle) the I2C bus is powered off and that's why the adapter times out. I don't see anything obvious in the DSDT related to I2C1 power.
Aaron, actually I think what is happening: [ 201.385503] i2c_designware 80860F41:00: timeout waiting for bus ready [ 201.401599] PM: noirq resume of devices complete after 118.040 msecs The I2C driver has not been resumed yet (it is done after _noirq and _late hooks have been run) and something is trying to access the bus already. Since it cannot be the I2C children devices, I suspect that it is battery or similar through the I2C Operation Region.
Created attachment 185321 [details] Output of dmesg after attaching the AC power
Created attachment 185351 [details] Print I2C operation region errors
Marcin, Can you try the attached patch and attach dmesg after resume. It should print out something if it was I2C operation region access that caused the I2C error.
Kernel build fails for me: CC drivers/i2c/i2c-core.o In file included from include/linux/kernel.h:13:0, from include/linux/list.h:8, from include/linux/module.h:9, from drivers/i2c/i2c-core.c:30: drivers/i2c/i2c-core.c: In function ‘acpi_i2c_space_handler’: drivers/i2c/i2c-core.c:363:54: error: invalid type argument of ‘->’ (have ‘struct i2c_client’) pr_err("ACPI I2C (%u): %u failed with %d\n", client->addr, ^ include/linux/printk.h:250:33: note: in definition of macro ‘pr_err’ printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__) ^ CC drivers/idle/intel_idle.o Using gcc 4.8.4 (my build machine is running Linux Mint 17.2 - based on Ubuntu 14.04, and the target machine has Ubuntu 15.04, which uses gcc 4.9)
It may be of importance: since I'm building anyway, I took the source of 4.1.6
Right, it was changed with 7ef85f5fdd ("i2c: core: Reduce stack size of acpi_i2c_space_handler()"). Please replace "client->" with "client." and it should compile fine.
> Mike, do you have any ideas on the above error messages? I guess I'm typing too fast, sorry for mis-typing your name Mika. (In reply to Mika Westerberg from comment #8) > Aaron, actually I think what is happening: > > [ 201.385503] i2c_designware 80860F41:00: timeout waiting for bus ready > [ 201.401599] PM: noirq resume of devices complete after 118.040 msecs > > The I2C driver has not been resumed yet (it is done after _noirq and _late > hooks have been run) and something is trying to access the bus already. Makes sense. > Since it cannot be the I2C children devices, I suspect that it is battery or > similar through the I2C Operation Region. OK, let's see what that is. > [ 201.283355] PM: resume from suspend-to-idle > [ 201.385503] i2c_designware 80860F41:00: timeout waiting for bus ready > [ 201.401599] PM: noirq resume of devices complete after 118.040 msecs > [ 201.758302] PM: early resume of devices complete after 356.530 msecs > [ 201.766355] i2c i2c-0: i2c read failed > [ 201.778279] i2c i2c-0: i2c read failed > [ 202.055521] PM: resume of devices complete after 297.155 msecs > [ 202.056211] PM: Finishing wakeup. I just noticed that the "i2c read failed" message above is emitted by function acpi_gsb_i2c_read_bytes, which is used by the I2C operation region handler. And according to its position, it happened during the device resume phase. So the above two "i2c read failed" messages means some ASL code access I2C operation region fields(most probably battery related methods), but the bus is still not operational at that time, which is not normal?
The I2C host controller _PS0 is not yet executed at that point (since it has not been resumed yet).
I think acpi_lpss_resume_early will power these controllers up?
It will but the error happens in _noirq resume which happens earlier if I'm not mistaken.
The "timeout waiting for bus ready" error happens in _noirq resume phase, but do the two "i2c read failed" error also happen in that phase? I thought they happen in resume phase.
Yes, they happen in resume phase. The debug patch should reveal if I2C access happens from ASL (something else than acpi_gsb_i2c_read_bytes()).
Created attachment 185431 [details] Output of dmesg from the patched kernel 4.1.6
[ 60.767740] PM: resume from suspend-to-idle [ 60.990719] i2c_designware 80860F41:00: timeout waiting for bus ready [ 60.990723] ACPI I2C (102): 6 failed with -110 As Mika explained, this is expected, since the controller is still in powered off state. It's not clear to me why ASL needs to access the I2C operation region at such an early time. [ 61.006818] PM: noirq resume of devices complete after 238.850 msecs [ 61.363656] PM: early resume of devices complete after 356.665 msecs [ 61.371524] i2c i2c-0: i2c read failed [ 61.371537] ACPI I2C (102): 11 failed with -121 [ 61.379530] i2c i2c-0: i2c read failed [ 61.379544] ACPI I2C (102): 11 failed with -121 Perhaps this is due to the I2C controller's resume is not done yet? Marcin, Please add initcall_debug to kernel cmdline and attach dmesg again after resume, thanks. [ 61.760262] PM: resume of devices complete after 396.528 msecs [ 61.760816] PM: Finishing wakeup.
OK, so my command line in Grub would be now: linux /boot/vmlinuz-4.1.6 root=<UUID> ro quiet splash $vt_handoff initcall_debug I see that quiet removes most log messages. Should I disable this option too?
(In reply to Marcin Mielniczuk from comment #23) > OK, so my command line in Grub would be now: > > linux /boot/vmlinuz-4.1.6 root=<UUID> ro quiet splash $vt_handoff > initcall_debug > > I see that quiet removes most log messages. Should I disable this option too? Yes please.
Created attachment 185591 [details] Output of dmesg with initcall_debug
Something I noticed once or twice, IIRC when I getting the first log on the patched kernel: sometimes after a wakeup the computer suspends back again without user's will. Should I give you a dmesg when I experience this once again?
[ 42.036319] i2c_designware 80860F41:00: timeout waiting for bus ready [ 42.036324] ACPI I2C (102): 6 failed with -110 [ 42.052350] call 0000:00:02.0+ returned 0 after 170605 usecs The device in question seems to be GFX adapter: Device (GFX0) { Name (_ADR, 0x00020000) // _ADR: Address ... Method (_PS0, 0, NotSerialized) // _PS0: Power State 0 { If (^^^I2C1.GLID ()) { The driver itself does not do anything in _noirq phase but it's the PCI core which goes and powers on the device resulting _PS0 to be called.
I wonder if we can just uninstall the I2C OpReg when the I2C adapter is suspended and reinstall it again on resume?
I think there are two problems: 1 GFX0 access I2C when I2C is not available yet and a possible solution is provided by Mika in comment #28; 2 the battery device(PNP0C0A:00) doesn't wait for its parent I2C controller(80860F41:00) to finish its resume callback in resume phase: [ 42.409345] PM: early resume of devices complete after 356.715 msecs ... ... [ 42.409637] calling PNP0C0A:00+ @ 1468, parent: 80860F41:00 ... ... [ 42.424747] i2c i2c-0: i2c read failed [ 42.424750] ACPI I2C (102): 11 failed with -121 [ 42.424907] call PNP0C0A:00+ returned 0 after 14906 usecs ... ... [ 42.443969] calling 80860F41:00+ @ 1468, parent: platform [ 42.443977] call 80860F41:00+ returned 0 after 6 usecs It's not clear to me how this can happen, and this may contribute to the above "i2c read failed" error message, which again cause the battery information lost.
Marcin, Can you disable I2C Operation Region support from the kernel and see if things like touchscreen work better after suspend/resume cycle? You can do it by unselecting following from Kconfig: Device Drivers -> I2C Support -> ACPI I2C Operation region support (The symbol is CONFIG_ACPI_I2C_OPREGION).
The bug persists, as before. Attaching new dmesg.
Created attachment 186031 [details] Output of dmesg after disabled ACPI I2C opregion support Kernel options: root=<UUID> ro initcall_debug splash $vt_handoff
Hmm, I don't see any errors in the dmesg. Apart from the battery how about touch screen, is it working after resume?
Sorry, I didn't write clearly. Battery info is missing altogether, even before suspend. Wireless and touchscreen are lost after suspend. I managed to reproduce the aforementioned "suspend-back" problem: I suspend the device then choose to wake it up. Shortly after the wakeup, the computer suspends back again on its own. Maybe this is connected, see the new dmesg. I see some brcmfmac-related errors.
Created attachment 186261 [details] Output of dmesg - computer suspends back on its own. Unfortunately with quiet and without initcall_debug. I've just realized.
Can you open the touch screen input device directly and check if it outputs anything after suspend? Probably best way to test this is to boot to non-GUI mode and then run: # echo freeze > /sys/power/state (wake up the device) and then run # od -x /dev/input/event9 (or whatever is the correct device node for the touch screen). Also can you attach output of "/proc/interrupts" to this bug.
What does this `echo freeze` do? Suspend? Should I do it on the mainline 4.1.2 kernel or my patched 4.1.6?
It does suspend-to-idle (same thing that you do now with the GUI). Please try with your patched kernel.
My Ubuntu seems to refuse not to launch GUI. So I tried from tty `od -x /dev/input/event9` shows touchscreen taps before suspending. Then # echo freeze /sys/power/state [...] brcmf_sdio_bus_rxctl: resumed on timeout [...] brcmf_fil_cmd_data: bus is down. we have nothing to do [...] brcmf_cfg80211_get_station: Could not get rate (-5) [...] brcmf_fil_cmd_data: bus is down. we have nothing to do [...] brcmf_cfg80211_get_station: Could not get rate (-5) [...] brcmf_fil_cmd_data: bus is down. we have nothing to do [...] brcmf_link_down: WLC_DISASSOC failed (-5) [...] brcmf_cfg80211_reg_notifier: not a ISO3166 code Then `od -x /dev/input/event9` shows touchscreen taps (still!) Suspended from menu, woke up. `od -x ...` still works, touchscreen in the UI - not. Then I tried the old kernel; the behavior is identical. The battery applet disappears after suspending via echo. (in patched it's missing all the time)
Just to be clear: the first test was on the patched kernel.
So what comes to the touchscreen, I think it actually works just fine. Just the Ubuntu GUI somehow loses it. Does output of 'xinput' change after suspend?
Created attachment 186311 [details] /proc/interrupts
The Ubuntu GUI only uses it when using Suspend from GUI. The touchscreen works after suspend if the echo method is used. Besides, the brcmfmac output from comment #39 appears BEFORE the screen turns off.
Created attachment 186321 [details] I2C DW late suspend patch
Marcin, can you try the attached patch with unpatched kernel? It should resume I2C before battery tries to read from it.
On 4.1.6: $ git apply i2c-dw-late-suspend.diff error: patch failed: drivers/i2c/busses/i2c-designware-platdrv.c:343 error: drivers/i2c/busses/i2c-designware-platdrv.c: patch does not apply
Created attachment 186331 [details] I2C DW late suspend patch for v4.1 Late suspend patch that should apply to v4.1.
Did the first one apply to 4.2? I can build this as well, no problem, provided it's stable. Should I? The kernel.org website contradicts itself, once says 4.2 is mainline, once that it's stable
Yeah, it applies on top of v4.2. v4.2 was just released so it will become stable soon.
So, this patch fixed the issue with battery, it's properly detected after suspend. Wireless is lost as before, but no wonder - the problem seems to occur during suspend, cf. comment #43. Maybe the i2c read occurs after something has already suspended? The output from brcmfmac is the same as in comment #39 I noticed a few important things as for the touchscreen. 1. The touchscreen input fails if `echo freeze > /sys/power/state` is issued from the gnome-terminal, i.e. from the GUI session 2. When switching to tty, `od` manages to receive touchscreen info correctly. 3. When switching back to tty, the touchscreen starts working back again (simple switch GUI-tty-GUI suffices to work) I have no clue why switching back to tty magically fixes everything. I won't be fully available until the end of this week. No kernel rebuild will be possible for sure, debug info - I don't know whether I'll be able to provide it. Anyway, I'll reply ASAP, whenever this possible occurs.
(In reply to Marcin Mielniczuk from comment #50) > So, > > this patch fixed the issue with battery, it's properly detected after > suspend. Hi Mika, Thanks for the patch, any plan to send it upstream?
(In reply to Aaron Lu from comment #51) > (In reply to Marcin Mielniczuk from comment #50) > > So, > > > > this patch fixed the issue with battery, it's properly detected after > > suspend. > > Hi Mika, > > Thanks for the patch, any plan to send it upstream? I'm not sure if that is general enough solution for this. I think we should handle platform devices (battery) which use I2C OpRegs in such way that the I2C host controller gets resumed whenever the region is accessed. Unfortunately I have no idea how to do this. The battery device (BATC) seems to have I2C in its _DEP list but how can we take advantage of this is a good question.
Marcin, Can you next try to disable the Ubuntu GUI completely and do the suspend/resume cycle directly from the console? I'm interested if the touch screen works before and after (use od -x). If the touch screen works we can, at least, rule out kernel/driver issue.
Mika, I don't really know how I can do it. I've already tried booting with the text parameter on the kernel cmdline but this still boots me into the GUI.
Does this work? http://askubuntu.com/questions/16371/how-do-i-disable-x-at-boot-time-so-that-the-system-boots-in-text-mode
Adding Tianyu who did the initial _DEP support for ACPI battery device.
Aaron, Tianyu, The reason why I2C is not resumed before battery is that the ACPI battery driver is acpi_driver instead of a physical device driver (e.g a platform driver). So there is no parent <-> child relationship between the two.
Mika, Yes, the reason the usual text mode didn't work was the fact that Ubuntu 15.04 uses systemd. Thanks. I booted the system into text mode, and after suspend `od -x ...` prints stuff after touching the screen. So it seems it's not a kernel thing. I tried to find out the affected component. I did `xinit -- :1` to set up a new X server. Then I enter the newly created X server, suspended from the xterm inside. After suspending in this new environment `od -x ...` failed. Thus, I suspect that Xorg is the affected component. Should I report a bug against Xorg? One more observation: `rmmod brcmfmac && modprobe brcmfmac` doesn't bring wlan back
Yeah, I think you should file the touchscreen issue against Xorg. I'll try to look what can we do to the battery issue. I'm able to reproduce it here on Asus T100. For the WLAN issue, I think it may be better to file a separate buf since it is not related to the battery problem.
s/buf/bug/
Thanks! Filed a bug #104201 against kernel.
(In reply to Mika Westerberg from comment #59) > Yeah, I think you should file the touchscreen issue against Xorg. Sorry for the late reply and I'll also take a look at this battery to see if I can come up with something.
OK, thanks Aaron.
(In reply to Mika Westerberg from comment #57) > Aaron, Tianyu, > > The reason why I2C is not resumed before battery is that the ACPI battery > driver is acpi_driver instead of a physical device driver (e.g a platform > driver). So there is no parent <-> child relationship between the two. I see, thanks. I just noticed that the battery driver will return -EPROBE_DEFER on initial call due to the unavailability of the I2C operation region, its device node should be moved to last of the dpm_list by deferred_probe_work_func, thus maintain the order of parent/child with I2C controller 80860F41:00. And since both the I2C controller and battery are not async suspend enabled, they should be resumed one by one with battery being resumed first. I'll prepare a patch for Marcin to dump the device list of dpm_list on suspend.
Created attachment 187121 [details] Dump device names according to dpm_list Hi Marcin, Please add initcall_debug and no_console_suspend to kernel cmdline option, and please attach the dmesg after system resumed, thanks.
Of course I should apply the patch on a clean kernel, shouldn't I? Is it for 4.1.y?
(In reply to Marcin Mielniczuk from comment #66) > Of course I should apply the patch on a clean kernel, shouldn't I? Is it for > 4.1.y? Feel free to apply the patch on the tree you have been using, it's a one-line patch for a file that shouldn't have much changes. And yes, it can apply cleanly on 4.1.y.
Created attachment 187291 [details] dmesg after Aaron's patch
Thanks for the result, the battery device node isn't moved to the tail of dpm_list, I'll see what happened there.
BTW, I found following patch floating around: https://lkml.org/lkml/2015/9/10/218 Maybe it helps here?
(In reply to Mika Westerberg from comment #70) > BTW, I found following patch floating around: > > https://lkml.org/lkml/2015/9/10/218 > > Maybe it helps here? Yes, definitely worth a try.
BTW, please enable debug output for drivers/base/dd.c and drivers/base/power/main.c so that we can see if the battery device and I2C host controller device have deferred probe happened, thanks. https://www.kernel.org/doc/local/pr_debug.txt
Aaron, should I rebuild the kernel to enable pr_debug now and attach the dmesg on the new kernel or should I use a kernel compiled in such a way for all the future dmesgs?
(In reply to Marcin Mielniczuk from comment #73) > Aaron, > > should I rebuild the kernel to enable pr_debug now and attach the dmesg on > the new kernel or should I use a kernel compiled in such a way for all the > future dmesgs? For all the future dmesgs, thanks.
Hi Marcin, It looks like I mis-understood your comment #73. Can you please build a kernel according to comment #72 and attach dmesg? Thanks.
Hmmm... will it suffice to prepend `CFLAGS += -DDEBUG` to make? Or should I do `CFLAGS_drivers/base/dd.o := -DDEBUG` (and the same for the second)?
Like this: diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 6b2a84e7f2be..29e6f4fd61c6 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -26,3 +26,4 @@ obj-$(CONFIG_GENERIC_MSI_IRQ_DOMAIN) += platform-msi.o ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG +CFLAGS_dd.o :=-DDEBUG diff --git a/drivers/base/power/Makefile b/drivers/base/power/Makefile index f94a6ccfe787..b38bbf0399ea 100644 --- a/drivers/base/power/Makefile +++ b/drivers/base/power/Makefile @@ -6,3 +6,4 @@ obj-$(CONFIG_PM_GENERIC_DOMAINS) += domain.o domain_governor.o obj-$(CONFIG_HAVE_CLK) += clock_ops.o ccflags-$(CONFIG_DEBUG_DRIVER) := -DDEBUG +CFLAGS_main.o := -DDEBUG
Created attachment 191081 [details] dmesg with debug enabled in Makefiles Sorry it took me so long.
Created attachment 191151 [details] do not attach in dep_walk, instead, rely on defer probe to trigger it Marcin, It seems you missed the dd debug part. Never mind, I just prepared this patch to have all the required stuffs, please apply it on top of v4.3-rc7 and after boot, please attach dmesg, thanks.
Aaron, Sorry it takes me such a long time but I'm simply a busy student :) Can I apply it on top of v4.3 final? Thanks
Never mind and yes, it can be applied cleanly on top of v4.3.
Any update?
ping...
Created attachment 198381 [details] On patched 4.3 Sorry, I've been very busy but I finally have some free time after Christmas. So, the battery detection doesn't work altogether on the patched kernel, even before the reboot. The dmesg after suspend will follow in a while.
Created attachment 198391 [details] On patched 4.3, after suspend
(In reply to Marcin Mielniczuk from comment #84) > Created attachment 198381 [details] > On patched 4.3 > > Sorry, I've been very busy but I finally have some free time after Christmas. > > So, the battery detection doesn't work altogether on the patched kernel, > even before the reboot. > > The dmesg after suspend will follow in a while. Sorry for the long delay. Can you please set CONFIG_DEBUG_DRIVER to y and upload the dmesg after boot again?
please attach the output of "cat /sys/bus/acpi/devices/PNP0C0A\:0*/status" both before and after resume.
Should anything unusual than CONFIG_DEBUG_DRIVER=y be enabled?
Configs for ACPI battery/I2C designware host controllers/etc. should also be enabled and I believe you have already done that. The CONFIG_DEBUG_DRIVER=y will enable debug print from driver core, I didn't see battery driver gets loaded from your last dmesg so would like to know more. In you last dmesg, there is: [ 1.099413] calling acpi_battery_init+0x0/0x2c @ 1 [ 1.099428] initcall acpi_battery_init+0x0/0x2c returned 0 after 0 usecs which means the battery driver has loaded, but I don't see any output of: [ x.xxxxxx] bus: 'acpi': driver_probe_device: matched device PNP0C0A:00 with driver battery It could be due to the bettery device's status indicates it's not present, thus Rui has also asked for the status output in comment #87.
I doubt I still have the old sources on my computer. It's why I'm asking for the complete set of options.
No need to use the old sources, just use the latest upstream code and enable that config when building, then attach dmesg after boot is enough for now. The old patch is used to see the device order in dpm_list but I don't see battery device gets probed in your last dmesg so I need to make sure the battery driver works after boot first. And please list the status of device PNP0C0A as requested in comment #87.
I don't even have the old config with me. I can use the default config for my distro.
I'll have access to the affected computer only for two days, after that I won't be able to access it for 3 weeks. So if I don't build the kernel tomorrow, I won't be able to test it during these two days. So, can you please list all the non-standard kernel build options I should enable?
can you reproduce the problem for now? If yes, as the first step, please just attach the output of "cat /sys/bus/acpi/devices/PNP0C0A\:0*/status" both before and after resume.
Please tell me which config options have to be enabled for the results to be worth anything. I don't have my old config, I'll base on the default config for the Ubuntu kernel, some old version in fact. Because otherwise you'll soon tell me I should I compile with some kernel option I didn't remember to enable; the results will be worthless.
hmmm, for the battery, please make sure CONFIG_ACPI_BATTERY is enabled, for the others, I don't know what drivers you were using, but if you want to reproduce the problem (device not working after resume), you must enable those drivers, right?
Ok, I'm building the kernel, I'll hopefully report back today in the evening.
The system froze during upgrade, now is unbootable. (I reported the freezes some time ago but can't find the report anymore) I'll reinstall the system and try again. Expect a report at the beginning of July
Installed Ubuntu 16.04 + mainline kernel 4.6.3 (unpatched). The device does not wake up from suspend (pressing the power button seems to do nothing) What should I do next?
The original problem is about battery not usable after resume, but now the system failed to resume. So looks like we will need to make it resume first and then continue to track the battery problem. Just to make sure, do you have CONFIG_INPUT_SOC_BUTTON_ARRAY and CONFIG_GPIO_CRYSTAL_COVE set?
CONFIG_INPUT_SOC_BUTTON_ARRAY=m CONFIG_GPIO_CRYSTAL_COVE=m
Exactly which hardware do these options pertain? Why are they important in our case?
Based on my T100 knowledge, the CONFIG_GPIO_CRYSTAL_COVE is for the CrystalCove PMIC device where the power button is connected to and the CONFIG_INPUT_SOC_BUTTON_ARRAY is the driver for the 5 button keys around the tablet.
Marcin, Are you willing to continue debug this issue? If so, we will need to first figure out why resume breaks and then continue the debug of battery issue; if the problem doesn't affect you much and you do not want to spend time on it, we can close it. What do you think?
Aaron, of course. But you need to know: - I have continuous access to the device (every day) in July, August and September (possibly barring a week or two) - Later on, (from October to December) my physical access to the computer will be sparse (one or two weekends a month) Hopefully we'll manage to fix it during the vacation :) Please tell me what should I do next. One observation: the system doesn't really suspend! I called # echo freeze > /sys/power/state Then the display froze. (but stayed lit) The system didn't react to pressing the power button. I raised the elephant and the system successfully rebooted.
My above comment may be wrong. I've just tried it on my own laptop and pressing any key wakes the system up. The conclusion "the system doesn't really suspend" may simply be wrong. Anyway, the conclusion is: the system reacts to raising the elephant.
One more note: I'm using intel_idle.max_cstate=1 so as to avoid system freezes (see #109051) This doesn't seem to affect the suspend behavior
The easy way to see why freeze/resume stops working is git bisect, but that would cause much time, you may consider doing that if we run out of ideas. The display stayed lit seems to indicate the freeze failed, probably it's related to the GPU driver. Booting the system with "nomodeset" will get rid of the GPU driver, but then I suppose the display will not be turned on after resume since there is no driver for it. But anyway, it should be worth a try to see if adding "nomodeset" to kernel cmdline could at least make suspend work.
I'll do a simple check whether I can suspend at all with a kernel old enough. Some users have reported similar problems with suspend with T200TA on the Arch Linux forum. After using nomodeset and # echo freeze > /sys/power/state the display stays lit. The GUI session disappears, in some virtual console there can be seen some messages similar to the ones in comment 39. The system has not suspended (or has suspended and woken up instantly), since I can "switch" virtual consoles. Written in quotation marks, since pressing ctrl+alt+FN yields a black screen. for any N=1, 2, ..., 12 The system reacts to reisub.
(In reply to Marcin Mielniczuk from comment #110) > I'll do a simple check whether I can suspend at all with a kernel old > enough. That would be good to know. > > After using nomodeset and > > # echo freeze > /sys/power/state > > the display stays lit. The GUI session disappears, in some virtual console > there can be seen some messages similar to the ones in comment 39. Is it possible to get rid of the broadcom SDIO wifi driver when testing freeze/resume? > > The system has not suspended (or has suspended and woken up instantly), > since I can "switch" virtual consoles. Written in quotation marks, since > pressing ctrl+alt+FN yields a black screen. for any N=1, 2, ..., 12 Looks like the system is in a messed state... > > The system reacts to reisub. Thanks for the test.
Under kernel 4.1.2 the device wakes up from suspend as it used to. So we have a regression. (In reply to Aaron Lu from comment #111) > > Is it possible to get rid of the broadcom SDIO wifi driver when testing > freeze/resume? Will it suffice to blacklist brcmfmac? As for the git binsearch, this is maybe not the most efficient way, but it'll be feasible. There are Ubuntu builds for almost all 4.x kernel releases here: http://kernel.ubuntu.com/~kernel-ppa/mainline/
(In reply to Marcin Mielniczuk from comment #112) > Under kernel 4.1.2 the device wakes up from suspend as it used to. So we > have a regression. > > (In reply to Aaron Lu from comment #111) > > > > Is it possible to get rid of the broadcom SDIO wifi driver when testing > > freeze/resume? > > Will it suffice to blacklist brcmfmac? I think so, or you can also blacklist the sdio host controller driver, something like sdhci_acpi I guess. > > As for the git binsearch, this is maybe not the most efficient way, but > it'll be feasible. There are Ubuntu builds for almost all 4.x kernel > releases here: http://kernel.ubuntu.com/~kernel-ppa/mainline/ This is not enough to get the bad commit, only the first bad kernel version.
When I blacklisted sdhci_acpi, the system gets stuck in initramfs. I blacklisted brcmfmac, but the problem suspend persists.
I forgot about nomodeset then. Now with modprobe.blacklist=brcmfmac nomodeset, I called the command from tty1. After issuing it, I get a blank tty with a single underscore. I can freely switch ttys 1-6. I can see everything I typed into tty1. As soon as I enter tty7, I get a black screen forever. Reisub still responds.
(In reply to Marcin Mielniczuk from comment #114) > When I blacklisted sdhci_acpi, the system gets stuck in initramfs. I > blacklisted brcmfmac, but the problem suspend persists. My bad, I just realized the MMC storage, where the roofs lives, is also on sdhci host controller.
Aaron, since there were no new ideas from you, I tested two kernel versions. 4.4.0 is good 4.5.0 is tainted This may help make some hypotheses. I'll test 4.4.7 today and report back.
Thanks for the test, I do not have any idea what could be the cause for the freeze failure. I guess 4.4.7 still works, since stable kernel doesn't pick major changes normally. To know the exact commit that breaks freeze, only git bisect is possible I'm afraid.
Ok, so I'll test the last from the 4.4.x series. Can you take a look at the commits which enter kernel 4.5? Maybe you'll find something possibly related?
A small note: after `echo freeze` on 4.5+ I can still type into the tty!
(and ctrl+alt+del doesn't cause a reboot, reisub is needed)
(In reply to Marcin Mielniczuk from comment #119) > Ok, so I'll test the last from the 4.4.x series. > > Can you take a look at the commits which enter kernel 4.5? Maybe you'll find > something possibly related? It doesn't seem possible, there are too many commits for a release: [aaron@aaronlu linux]$ git log --no-merges v4.4..v4.5 |grep ^commit |wc -l 12080 Looking for a single subsystem may be possible, but we do not know which subsystem(e.g. gpu driver, wifi driver, etc.) makes the freeze fail yet.
Can you give me the lo and hi for the git binsearch?
v4.4 is the good(lo) one and v4.5 is the bad(hi) one. You can start by: $ git bisect start $ git bisect good v4.4 $ git bisect bad v4.5
Sorry that I reply after such a long time. I started bisecting, but until I finished, I had to move to another city and don't have physical access to the machine anymore. I guess I won't be of much use anymore. Sorry.
okay. Bug closed. Please feel free to reopen it if you can provide the bisect result.