Bug 103211

Summary: Unable to power off on MacbookPro11,4
Product: Drivers Reporter: Justin Dray (justin)
Component: PCI Assignee: Bjorn Helgaas (bjorn)
Status: CLOSED CODE_FIX    
Severity: normal CC: a3at.mail, aagaande, aaron.lu, arjen.veenhuizen, arjenzwerver, bastian.triller, berglh, berthold.crysmann, bjorn, bonzini, brian.wisti, bwinterton, chandlermelton, chkr, ChrisBroome7, clg, crew4ok, cruz, cyrille, djkessler, don.bucci, dustin.webber, dylan.kyle.powers, edouard.thuleau, ephemient, fortizc, gustaf.gunnarsson, hamza, hermann.mayer92, hugo, jadit2, jan, jgeboski, johannes.stuettgen, jschulz, junkmail-trash, junkmail9969, lufimtse, luis, mail, martin.klapetek, matt, nd, oliver.greg, oxynux, p.kernel, p.oliveira.castro, pablo.catalina, pickeringw, pkozlov.vrn, primetalk.github, ptitjes, rfkrocktk, robberphex, robert.abraham86, robin.marlow, ronald, rui.zhang, shein, simon.vanderveldt, soomebob, ss1ha3tw, stefan, tbo, thejoe, tim.sammut, tom, tonylambiris+kernel, travii23, uestclx, v, valentinrothberg, vamsi360, vitalatron, vlad_lesin, yinghai, zach
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 4.3.3 Subsystem:
Regression: No Bisected commit-id:
Attachments: lspci -vn at Macbook Pro 11'4
acpidump-MacBookPro11,4
acpidump-macbook-11,4
force efi poweroff
force efi poweroff - v2
force poweroff -v2
ACPI dump for MacBook Pro
ACPIDUMP in Macbook Pro 11,4
IMG_0939.JPG
IMG_0938.JPG
IMG_0937.JPG
dmesg MacBookPro11,4
dmesg macbook pro
Force a S5 directly
force s5
no force s5
force s5 patch applied - mem > /s/p/s
test halt at early stage in linux
bisect=15
find the driver who disable port writting
dmesg bisect-driver-patched without bisect parameter
/proc/ioports
/proc/iomem
lspci -vvxx
/proc/iomem
/proc/ioports
lspci -vvxx
Debug counter output
dmesg macbook pro
setpci results
screenshot on attempt to poweroff
bypass the enumeration for specific pci device
avoid io allocation for specific pci bridge
dmesg with kernel opts in comment 108
#comment_116 dmesg
#comment_116 lspci
#comment_116 lspci-t
#comment_116 lspci-vvxx
#comment_121 dmesg
#comment_121 setpci
#comment_121 poweroff output
mbp 11,4 dmesg with patch from comment #111
mbp 11,4 lspci --vvxx output with patch from comment #111
mbp 11,4 lspci -t output with patch from comment #111
iasl -tc dsdt.dsl recompile warnings from a mbp 11,4
debug patch to find the exact operation which broke io port space
Dmesg after patch from comment #111
lspci -vvxx after patch from comment #111
lspci -t after patch from comment #111
genkernel log for kernel gentoo-sources-4.6.0 with bisect patch
kernel build failure output on 4.5.4-1-ARCH
bisect pci operation
dmesg halt_count grep between 84 and 85
maximum halt count
patched mbp 11,4 dmesg output with dynbg and initcall_debug kernel opts
dmesg with debug cmdline
quirk to bypass mmio assignment for Mac Pro 11
dmesg without disable_mode
dmesg with disable_mode=1
suspend dmesg
mpb11,4 dmesg output, dynbg and initcall debug kernel opts with patch from comment 147
mpb11,4 dmesg output, dynbg and initcall debug kernel opts with patch from comment 147, plus disable_mode=1
default state /proc/acpi/wakeup on mbp 11,4
dmesg without disable_mode
dmesg with disable_mode=1 (wireless not working)
lspci -t with disable_mode=1 (wireless not working)
lspci -vvxx with disable_mode=1 (wireless not working)
dmesg with disable_mode=1 (wireless working)
lspci-vvxx-comment-172
lspci -vvxx with patch on #comment 172
attachment-25612-0.html
ioreg for MacBookPro11,5
ioreg -l output mbp 11,4
lscpi from MacOS
dmesg.txt
lspci-n.txt
lspci-vv.txt
lspci -vvxx Macbook Pro 11,4
dmidecode after setpci
MacBookPro11,4 lspci -vvxxx quirk_apple_mbp_poweroff patched
MacBookPro11,4 lspci -vvxxx vanilla
MacBookPro11,4 lspci -vvxxx quirk_apple_mbp_poweroff patched (after reboot)
MacBookPro11,5 lspci -vvxxx
MacBookPro11,5 lspci -vvxxx with the quirk patch
MBP 11,4 lscpi -vvxxx - patched kernel
MBP 11,4 lscpi -vvxxx - unpatched kernel
setpci/in* results on a MacBookPro11,5
rdwrmem results on a MacBookPro11,5
rdwrmem results on a MacBookPro11,5 test2
test5 results
VGA relocation test patch
attachment-28579-0.html
proposed v4.13 patch
proposed v4.13 patch, v2
disable XCH1 on systemd suspend

Description Justin Dray 2015-08-20 11:46:40 UTC
Hi,

I've been trying to get the new MacbookPro11,4 to be able to power down all day with no luck. When shutting down it gets to
reboot: power down

And hangs. There are no logs at this point, as it is already running in the systemd initrd image. I've tried reboot=pci (which seems to work for many previous macbooks) and reboot=acpi (which works on other older macbooks), and neither has worked. I'm currently running the latest commit (as of 2 hours ago) on 4.2.0-rc7 and the issue remains. It also occurs in 4.1.5, so I do not believe it to be a regression.

It does the same thing when going to sleep, or hibernating. In the case of hibernation if I hold in the power button until it powers off, then turn it back on it will resume cleanly. The screen turns off for sleep (unlike hibernate/shutdown) but it does not shut down; the fans stay running and it is impossible to wake it back up without forcing it to power off and back on.

Let me know if any more info, logs, or a photo of the point where it hangs is required, or if this should be two bugs: one for Power-Off and the other for Power-Sleep-Wake.
Comment 1 Aaron Lu 2015-08-24 05:17:58 UTC
This doesn't seem likely, but just in case, can you please follow this doc:
https://www.kernel.org/doc/Documentation/power/basic-pm-debugging.txt
and finish the "core" mode of testing for suspend?
Comment 2 Justin Dray 2015-08-24 07:01:24 UTC
Thanks, I went from the top down, and this is what I've observed:

# echo reboot > /sys/power/disk
# echo disk > /sys/power/state
worked; it hibernated, rebooted and went straight back to the terminal in ~6-7 seconds as expected. Doing it a second time, however, gave me a black unresponsive screen (with the keyboard backlight still on; this silly model has no power light). Forcing it to power off after a minute at this black screen and then powering it on, it resumed from hibernate! Doing this test a couple more times resulted in random instability. Even after only a single suspend, sometimes it would hang a short time after coming back from hibernation (30-60 seconds).

# echo platform > /sys/power/disk
# echo disk > /sys/power/state
Without pm_test set to core: Hangs when it should have rebooted. Keyboard backlight stays on and the display is blank. Holding in power for 5 seconds to shut down and then powering it back on resumes from where it was. Doing this a second time resulted in it sitting at the prompt without returning from the 'echo disk' command for about 30-40 seconds. Then it had the same outcome as the first time I ran it.

When I set pm_test to core for each of the above, they both appeared to work, but with a very high chance (read: every time except one) of it locking up completely after 60 seconds. I think this may have to do with the brcmfmac driver. Removing the brcmfmac module before doing the tests left it stable for some time. Reloading the brcmfmac driver later resulted in it not finding any wifi networks. But there is a separate bug for brcmfmac issues: https://bugzilla.kernel.org/show_bug.cgi?id=103201

Logs (with pm_test set to core):
reboot mode: https://gist.github.com/justin8/eef75db997219001a1e6
platform mode: https://gist.github.com/justin8/524547161e2e42d440f9
Comment 3 Aaron Lu 2015-08-24 07:34:24 UTC
(In reply to Justin Dray from comment #2)
> Thanks, I went from the top down, and this is what I've observed:
> 
> # echo reboot > /sys/power/disk
> # echo disk > /sys/power/state
> worked; it hibernated, rebooted and went straight back to the terminal in
> ~6-7 seconds as expected. Doing it a second time however, gave me a black
> unresponsive screen (with the keyboard backlight still on; this silly model
> has no power light). Forcing it to power off after a minute at this black
> screen and then powering it on, it resumed from hibernate! Doing this test a

This suggests that everything worked well, except the reboot.

> couple more times resulted in random instability. Even after only a single
> suspend, sometimes it would hang a short time after coming back from
> hibernation (30-60 seconds)

dmesg at this point would be useful, i.e. after the first resume, before it hangs.

> 
> # echo platform > /sys/power/disk
> # echo disk > /sys/power/state
> Without pm_test set to core: Hangs when it should have rebooted. Keyboard

The mode is set to platform, so it should power off instead of reboot. Anyway, it again means the system is not able to power off under Linux.

> backlight stays on and the display is blank. Holding in power for 5 seconds
> to shut down and then powering it back on resumes from where it was. Doing

Again, this means everything worked well except the power off.

> this a second time resulted in it sitting at the prompt without returning
> from the 'echo disk' command for about 30-40 seconds. Then it had the same
> outcome as the first time I ran it.

Seems something is broken here. And by "the same outcome as the first time I ran it", do you mean it returned back to normal state?

> 
> When I set pm_test to core for each of the above, they both appeared to
> work. But with a very high (read all except one time) chance of it locking
> up completely after 60 seconds. I think this may have to do with the

dmesg from before it completely locks up would be useful.

> brcmfmac driver. removing the brcmfmac module before doing the tests left it
> stable for some time. reloading the brcmfmac driver later resulted in it not
> finding any wifi networks. But there is a separate bug for brcmfmac issues:
> https://bugzilla.kernel.org/show_bug.cgi?id=103201

Can you please redo all these suspend/resume tests with this wifi driver removed/unloaded? i.e. do not build the driver or make sure it's not loaded before you attempt any of these tests.

> 
> Logs (with pm_test set to core):
> reboot mode: https://gist.github.com/justin8/eef75db997219001a1e6
> platform mode: https://gist.github.com/justin8/524547161e2e42d440f9
Comment 4 Justin Dray 2015-08-24 13:06:42 UTC
(In reply to Aaron Lu from comment #3)
> dmesg at this point would be useful, i.e. after the 1st time resume, before
> it hangs.
The dmesg logs are included in the full journals linked at the end:
> reboot mode: https://gist.github.com/justin8/eef75db997219001a1e6
> platform mode: https://gist.github.com/justin8/524547161e2e42d440f9
I can get dmesg by itself if you would like, but grepping for 'justinmacbook kernel:' would filter it out.

> > this a second time resulted in it sitting at the prompt without returning
> > from the 'echo disk' command for about 30-40 seconds. Then it had the same
> > outcome as the first time I ran it.
> 
> Seems something is broken here. And by "the same outcome as the first time I
> ran it", do you mean it returned back to normal state?

Sorry, I meant that once I forced it off again I could power it on and resume from hibernation; i.e. the hibernation worked, but not the power off


> > When I set pm_test to core for each of the above, they both appeared to
> > work. But with a very high (read all except one time) chance of it locking
> > up completely after 60 seconds. I think this may have to do with the
> 
> dmesg from before it completely locks up would be useful.

The journals before have all messages up until you can see wlp3s0 (the brcmfmac device) activating, at which point it locked up. I've run the same things again with brcmfmac blacklisted from startup though in case it shows something different. Please see these new logs below for brcmfmac blacklisted:

# echo core > /sys/power/pm_test
# echo platform > /sys/power/disk
# echo disk > /sys/power/state
Journal (including dmesg): https://gist.github.com/justin8/c53e3d54647aca41e56f

With the brcmfmac driver blacklisted and pm_test set to core hibernate worked and resumed back where it was without issue.

Without setting pm_test I got the same results as before, but without the hanging on resume (i.e. reboot worked, but platform would hang on the shutdown part). Multiple attempts to hibernate in a row introduced no extra issues for either reboot or platform modes.
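
For anyone repeating these tests, the three writes can be wrapped in a small helper. This is a minimal sketch, not part of the original report: the function name run_hibernate_test and the SYSFS_POWER override are made up for illustration, the override exists only so the sequence can be dry-run against a scratch directory, and on a real system the writes go to /sys/power and need root.

```shell
#!/bin/sh
# Sketch of the test sequence used in this thread.
# SYSFS_POWER is a hypothetical override for dry runs; it defaults to
# /sys/power, where the writes require root.

run_hibernate_test() {
    disk_mode=$1        # "reboot" or "platform", as tested above
    pm_test_mode=$2     # "core" for the test mode, "none" for a real hibernate
    power_dir=${SYSFS_POWER:-/sys/power}

    echo "$pm_test_mode" > "$power_dir/pm_test" || return 1
    echo "$disk_mode" > "$power_dir/disk" || return 1
    # This write starts the hibernation attempt (or the pm_test dry run).
    echo disk > "$power_dir/state"
}
```

Calling "run_hibernate_test platform core" as root corresponds to the pm_test=core sequence above; "run_hibernate_test platform none" corresponds to the run without pm_test set.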
Comment 5 Tormen 2015-08-24 22:19:10 UTC
Possibly related bug: https://bugzilla.kernel.org/show_bug.cgi?id=101681
Comment 6 Aaron Lu 2015-08-25 01:55:39 UTC
Then the only problem is the poweroff/reboot, which is not easy to solve :-(
Comment 7 Justin Dray 2015-08-25 03:16:16 UTC
Fun! Is there anything else I can do? More info to provide, tests to run, etc? I'm more than happy to help with anything around this bug; but I'm an admin not a programmer ;)

oh, and are we able to change the bug to verified yet?
Comment 8 Aaron Lu 2015-08-25 04:25:37 UTC
Yes, but it's not clear to me what verify means here...
Anyway, feel free to do that. As the reporter of the bug, you should be able to change the status.
Comment 9 Justin Dray 2015-08-25 04:33:21 UTC
Well I can. But I wasn't sure of the policy around what would be required to consider a bug as verified. From those tests I ran and what you've said as well I would personally consider it verified, but just wasn't sure around the process side of things and didn't want to mess around with it myself.
Comment 10 Zhang Rui 2015-08-31 05:39:54 UTC
(In reply to Justin Dray from comment #2)
> When I set pm_test to core for each of the above, they both appeared to
> work. But with a very high (read all except one time) chance of it locking
> up completely after 60 seconds. I think this may have to do with the
> brcmfmac driver. removing the brcmfmac module before doing the tests left it
> stable for some time. reloading the brcmfmac driver later resulted in it not
> finding any wifi networks. But there is a separate bug for brcmfmac issues:
> https://bugzilla.kernel.org/show_bug.cgi?id=103201
> 
> Logs (with pm_test set to core):
> reboot mode: https://gist.github.com/justin8/eef75db997219001a1e6
> platform mode: https://gist.github.com/justin8/524547161e2e42d440f9

First of all, this looks like a BIOS/hardware problem to me. To confirm this, is it possible for you to run other OSes like Windows and check if the problem also exists?

BTW, can you please build the kernel without the brcmfmac driver, or blacklist it, and check if power off works or not?
Comment 11 Justin Dray 2015-08-31 08:32:09 UTC
I would have to delete things and resize my partitions to be able to install windows and I don't have enough time to sort that out as I'm about to start a new job in a few days and need this laptop. I've seen no reports of windows having issues on this model macbook from my searches on google though, only issues with wifi not working after sleep (https://discussions.apple.com/thread/7065673) (which there is a separate bug open for brcmfmac issues). OS X sleeps fine, but that is to be expected when they make the hardware.

As for the brcmfmac question, please see my previous post:

Please see these new logs below for brcmfmac blacklisted:

> # echo core > /sys/power/pm_test
> # echo platform > /sys/power/disk
> # echo disk > /sys/power/state
> Journal (including dmesg):
> https://gist.github.com/justin8/c53e3d54647aca41e56f

> With the brcmfmac driver blacklisted and pm_test set to core hibernate worked
> and resumed back where it was without issue.

> Without setting pm_test I got the same results as before, but without the
> hanging on resume (i.e. reboot worked, but platform would hang on the
> shutdown part). Multiple attempts to hibernate in a row introduced no extra
> issues for either reboot or platform modes.
Comment 12 Tormen 2015-08-31 08:39:05 UTC
Hi,

I have MacOS, Win8.1 and Linux on my MacbookPro12,1. I have not yet had time to verify hibernation under Linux, and I am not sure whether I have already hibernated under Windows 8.1.
I'll try to do so this week and report back.

Tormen
Comment 13 Aaron Lu 2015-09-28 02:43:29 UTC
Comment #23 of this thread:
https://bbs.archlinux.org/viewtopic.php?pid=1551011#p1551011
said that reboot works for him with reboot=pci, and he has "Macbook Pro 11,4 (2015 Retina 15")". Are you also using this model?
Comment 14 Aaron Lu 2015-09-28 02:48:55 UTC
Please also attach lspci.
Comment 15 Shane Chen 2015-09-30 02:46:05 UTC
I'm the person who wrote #23 on the archlinux bbs.
reboot -p with reboot=pci worked for me on 4.1.X (I forget the actual version).
But since 4.2, reboot -p only reboots my Macbook Pro.
That makes me unsure whether my notebook really shut down on 4.1.X.
Maybe it was rebooting too, and the screen turning off made me believe it had shut down.
Comment 16 Shane Chen 2015-09-30 02:48:56 UTC
Created attachment 189051 [details]
lspci -vn at Macbook Pro 11'4

Here's my lspci -vn output on my Macbook Pro 11'4 (2015 Retina version without amd graphics card)
Comment 17 Felipe Ortiz 2015-09-30 15:52:54 UTC
(In reply to ss1ha3tw from comment #15)
> I'm the person who wrote #23 on the archlinux bbs.
> reboot -p with reboot=pci worked for me on 4.1.X (I forget the actual
> version).
> But since 4.2, reboot -p only reboots my Macbook Pro.
> That makes me unsure whether my notebook really shut down on 4.1.X.
> Maybe it was rebooting too, and the screen turning off made me believe
> it had shut down.

I have the same behaviour on my MBP 11,5 (2015 Retina version with AMD graphic card)
Comment 18 Niels Dettenbach 2015-10-23 13:47:11 UTC
+1
on MacBookPro 11,5
kernel 4.2.2
Comment 19 Jan Hilberath 2015-11-24 07:08:54 UTC
Same behavior here on a MacBookPro11,4 with the following kernel versions (all running on Ubuntu Wily):

4.2.0-17.21
4.3.0-040300.201511020949
4.4.0-040400rc2.201511231054

As someone else mentioned in bug #101681, running the "halt" command from within Grub powers the machine off (Grub version 2.02~beta2-29).
Comment 20 Vamsi Subhash Achanta 2015-12-07 20:37:46 UTC
Same here.

I am using Fedora23 with kernel 4.2.6 on a Macbook Pro 11,4 and suspend/poweroff don't work for me.

Also systemctl poweroff and systemctl suspend don't work as well.
Comment 21 Didier 'Ptitjes' 2015-12-23 20:19:19 UTC
Same problem here.
MacBookPro11,4 - Fedora 23, Kernel 4.4.0-rc4

When I poweroff, I get only those lines:
  brcmfmac: brcmf_cfg80211_reg_notifier: not a ISO3166 code
  systemd-shutdown[1]: Failed to finalize  DM devices, ignoring
  reboot: Power down
and the system appears to hang. If I wait there, heat continually rises and the fans get louder and louder. I have to force shutdown by keeping the power button pressed.

I tried to blacklist the brcmfmac, but it behaves the same (minus the brcmfmac error).

I tried the "reboot=pci" kernel parameter and:
- "reboot" (with or without "-p") and clicking on "Restart" in Gnome make the computer shut down correctly.
- "poweroff" and clicking on "Power Off" in Gnome still fails as before.

I did not try suspend/resume.

I'd be glad if someone could tell me how I can provide more useful information.
Comment 22 Tom B 2016-02-08 14:46:59 UTC
Same problem here on 11,5. I'd love to see this issue fixed, is there anything I can do to help get useful information to those who need it?
Comment 23 cameron.e.wood 2016-02-14 04:13:28 UTC
(In reply to Tom B from comment #22)
> Same problem here on 11,5. I'd love to see this issue fixed, is there
> anything I can do to help get useful information to those who need it?

+1, I'm also experiencing this issue on an 11,5 and would be happy to help provide more info or testing if needed.
Comment 24 pkozlov 2016-02-28 19:17:56 UTC
I experience the same issue: reboot works fine, but power down does not (I see the message "reboot: power down" and the console hangs).
Linux-4.4.3-gentoo, MacBook Pro 2015 15.4 retina.
Comment 25 berglh 2016-03-01 23:44:19 UTC
Same issue with the MacBookPro11,5 (2015 15" retina). Has been a problem in all 4.x kernel versions for me. Seems to be related to the same problem as the suspend/resume (close laptop lid off power) where it enters a state producing a lot of heat and the GPU/system fans increase to full speed and then slow down.

No problems rebooting.

The problem was repeatable using USB live boot environments. All power options work correctly in OS X as you would expect.
Comment 26 Chen Yu 2016-03-05 02:43:17 UTC
Does this patch work for you, and would someone please  upload his acpidump data?
https://patchwork.kernel.org/patch/8487861/
Comment 27 Bastian Triller 2016-03-05 07:41:47 UTC
Created attachment 207701 [details]
acpidump-MacBookPro11,4
Comment 28 Hermann Mayer 2016-03-05 08:28:42 UTC
Created attachment 207711 [details]
acpidump-macbook-11,4
Comment 29 Hermann Mayer 2016-03-05 09:17:32 UTC
Chen: I've tried your patch on a Macbook 11,4 but unfortunately without luck. The problem still exists for me - it keeps hanging on shutdown.
Comment 30 Chen Yu 2016-03-06 08:16:59 UTC
Well, please apply this debug patch instead and provide the messages printed on the screen at poweroff. Thanks.
Comment 31 Chen Yu 2016-03-06 08:18:18 UTC
Created attachment 207781 [details]
force efi poweroff
Comment 32 Hermann Mayer 2016-03-06 08:45:08 UTC
The patch seems to be broken:

  LINK    vmlinux
  LD      vmlinux.o
  MODPOST vmlinux.o
  GEN     .version
  CHK     include/generated/compile.h
  UPD     include/generated/compile.h
  CC      init/version.o
  LD      init/built-in.o
arch/x86/built-in.o: In function `native_machine_power_off':
reboot.c:(.text+0x4b442): undefined reference to `efi_power_off'
reboot.c:(.text+0x4b46b): undefined reference to `efi_power_off'
Makefile:929: die Regel für Ziel „vmlinux“ scheiterte
make: *** [vmlinux] Fehler 1
Comment 33 Hermann Mayer 2016-03-06 08:51:21 UTC
I'm trying to apply the patch on the vanilla 4.4.3 sources. (on Archlinux with gcc 5.3.0)
Comment 34 Chen Yu 2016-03-06 08:52:54 UTC
Created attachment 207791 [details]
force efi poweroff - v2
Comment 35 Chen Yu 2016-03-06 08:54:58 UTC
plz try Comment34 instead,  is CONFIG__EFI=y in your kernel config?
Comment 36 Hermann Mayer 2016-03-06 09:03:41 UTC
I guess double underscore is a typo. (No result for this) For only one underscore:

$ grep CONFIG_EFI config

CONFIG_EFI_PARTITION=y
CONFIG_EFI=y
CONFIG_EFI_STUB=y
# CONFIG_EFI_VARS is not set
CONFIG_EFI_ESRT=y
CONFIG_EFI_RUNTIME_MAP=y
# CONFIG_EFI_FAKE_MEMMAP is not set
CONFIG_EFI_RUNTIME_WRAPPERS=y
CONFIG_EFIVAR_FS=y
# CONFIG_EFI_PGT_DUMP is not set

--

Comment #34 / force efi poweroff - v2: fix_wakeup.diff? I can try it but sounds strange to me.
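
A plain grep for CONFIG_EFI, as used in this comment, matches every option sharing that prefix. To check a single option exactly, an anchored match can be used. A minimal sketch: the function name config_state is made up for illustration, and the config file path is whatever file was grepped above.

```shell
#!/bin/sh
# Report whether one kernel config option is built in (=y), a module (=m),
# or not set. The anchors (^ and $) keep CONFIG_EFI from also matching
# CONFIG_EFI_STUB and similar names.
config_state() {
    option=$1
    config_file=$2
    if grep -q "^${option}=y\$" "$config_file"; then
        echo builtin
    elif grep -q "^${option}=m\$" "$config_file"; then
        echo module
    else
        echo unset
    fi
}
```

Usage: "config_state CONFIG_EFI config" prints one word describing that option only.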
Comment 37 Chen Yu 2016-03-06 09:09:21 UTC
Created attachment 207801 [details]
force poweroff -v2

plz use this patch.
Comment 38 Hermann Mayer 2016-03-06 09:41:11 UTC
The patch works (it compiles w/o errors), but shutdown keeps hanging.
Debug output while shutdown:

[   26.478054] ACPI: Preparing to enter system sleep state S5
[   26.479539] reboot: Power down
[   26.480744] reboot: Using poweroff acpi_power_off+0x0/0x2f
[   26.482111] reboot: Using efi for poweroff...
_
Comment 39 Chen Yu 2016-03-06 11:28:49 UTC
thanks. As Jan mentioned in Comment19, does it work to 'halt' in grub shell?
Comment 40 Hermann Mayer 2016-03-06 12:57:43 UTC
Unfortunately the 'halt' command on the grub (2.02.beta2-6) shell also freezes my system.

There seem to be some patches applied on ArchLinux (https://projects.archlinux.org/svntogit/packages.git/tree/trunk?h=packages/grub), but they look harmless to me. Only the intel ucode patches/fw sound possibly related.
Comment 41 pkozlov 2016-03-06 16:36:13 UTC
Created attachment 207841 [details]
ACPI dump for MacBook Pro

Please find my ACPI dump attached. "halt" command from grub works fine for me (grub-2.02_beta2-r8). "reboot" works fine, "halt or poweroff" just hangs.
Comment 42 pkozlov 2016-03-06 17:02:37 UTC
Chen Yu, thank you for the patch. It was applied successfully - but still nothing. Halt is not working for me. I attached my ACPI dump in previous comment. I hope it will be helpful. My CONFIG_EFI is identical with Hermann Mayer. I have tried  gentoo-sources-4.4.4.
Comment 43 Shane Chen 2016-03-06 17:24:05 UTC
Created attachment 207851 [details]
ACPIDUMP in Macbook Pro 11,4

'halt' in grub 2.02.beta2-6 works well for me.
But after applying the patch to linux-mainline 4.5-rc6, shutdown -P still doesn't work for me, while reboot works well.
Here's my acpidump for my Macbook Pro 11,4
Comment 44 Bastian Triller 2016-03-06 19:06:46 UTC
'halt' in grub 2.02.beta2-36 (Debian) works for me, too.
Comment 45 pkozlov 2016-03-11 10:47:01 UTC
Chen Yu, do you have any updates? If you need some additional hardware info, kernel dumps or whatever - feel free to ask. Or I can test some other patches, even several at one time.
Comment 46 Chen Yu 2016-03-13 12:57:13 UTC
Hi, plz help to confirm which method grub2 is using to halt the system by:
"set debug=all" and then 'halt', thanks.
Comment 47 Robert Abraham 2016-03-13 13:21:49 UTC
Created attachment 208801 [details]
IMG_0939.JPG

Hello,

I tried to capture as much as possible with some images. If it is possible
to get the output to a file, please let me know.

Comment 48 Robert Abraham 2016-03-13 13:21:50 UTC
Created attachment 208811 [details]
IMG_0938.JPG
Comment 49 Robert Abraham 2016-03-13 13:21:50 UTC
Created attachment 208821 [details]
IMG_0937.JPG
Comment 50 Hermann Mayer 2016-03-13 13:30:42 UTC
My output is mostly like Robert's. For about half a minute I saw the acpihalt.c lines repeated endlessly, but then the repetition stopped, the efidisk line was printed, and then nothing more. It did not halt my system at all.

commands/acpihalt.c:195: Opcode 0x10
commands/acpihalt.c:196: Tell 87b
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell 888
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell 8a8
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell 8cb
commands/acpihalt.c:195: Opcode 0x10
commands/acpihalt.c:196: Tell 962
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell 96f
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell 98f
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell 9b2
commands/acpihalt.c:195: Opcode 0x10
commands/acpihalt.c:196: Tell a49
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell a56
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell a76
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell a99
commands/acpihalt.c:195: Opcode 0x10
commands/acpihalt.c:196: Tell b30
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell b3d
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell b5d
commands/acpihalt.c:195: Opcode 0x14
commands/acpihalt.c:196: Tell b80
commands/acpihalt.c:386: SLP_TYP = -2, port = 0x1804
disk/efi/efidisk.c:581: reading 0x40 sectors at the sector 0x7f700 from hd1
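
The debug capture is long, and the line of interest is the single one reporting SLP_TYP and the port. A filter like the following can pull those two values out of a saved capture. A sketch only: the function name slp_typ_from_log is made up here, and the pattern assumes the exact "SLP_TYP = N, port = 0xNNNN" wording shown in the output above.

```shell
#!/bin/sh
# Read grub debug output on stdin and print "<SLP_TYP> <port>" for each
# line that reports them; all other lines are dropped.
slp_typ_from_log() {
    sed -n 's/.*SLP_TYP = \(-\{0,1\}[0-9][0-9]*\), port = \(0x[0-9a-fA-F][0-9a-fA-F]*\).*/\1 \2/p'
}
```

Usage: "slp_typ_from_log < grub-debug.txt", where grub-debug.txt is a hypothetical file holding the transcribed output.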
Comment 51 Bastian Triller 2016-03-13 22:08:39 UTC
There is a patch [1] included in Debian's, but not in Arch's, grub package, which could be related to Hermann's halt issues.

[1] <http://git.savannah.gnu.org/cgit/grub.git/commit/?id=0f1f95c7b7bc72cfbeea2f6dc5986855738ad96d>
Comment 52 Bastian Triller 2016-03-13 22:18:55 UTC
Created attachment 208961 [details]
dmesg MacBookPro11,4
Comment 53 pkozlov 2016-03-14 09:07:04 UTC
Created attachment 209051 [details]
dmesg macbook pro

Just in case it is useful: my dmesg.
Comment 54 pkozlov 2016-03-14 09:08:35 UTC
I found the same warning as Bastian in my dmesg.

[   10.478013] ACPI Warning: SystemIO range 0x000000000000EFA0-0x000000000000EFBF conflicts with OpRegion 0x000000000000EFA0-0x000000000000EFAF (\_SB.PCI0.SBUS.SMBI) (20160108/utaddress-255)
[   10.478018] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

Could this be a possible cause of the issue?
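
The warning compares two I/O port ranges: 0xEFA0-0xEFBF (SystemIO) and 0xEFA0-0xEFAF (the SMBI OpRegion). The overlap it reports can be checked with simple arithmetic. A minimal sketch with a made-up function name; it only confirms that the two ranges intersect and says nothing about whether the warning is related to the power-off hang.

```shell
#!/bin/sh
# Succeed (exit status 0) if the inclusive ranges [a_start, a_end] and
# [b_start, b_end] overlap. Arguments may be hex (0x...) or decimal.
ranges_overlap() {
    a_start=$(($1)); a_end=$(($2))
    b_start=$(($3)); b_end=$(($4))
    [ "$a_start" -le "$b_end" ] && [ "$b_start" -le "$a_end" ]
}

# The two ranges from the warning above.
if ranges_overlap 0xEFA0 0xEFBF 0xEFA0 0xEFAF; then
    echo "ranges overlap"
fi
```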
Comment 55 Niels Dettenbach 2016-03-14 09:42:26 UTC
Hiho guys,

I've dug a bit more into my MacBookPro 11,5:

   DMI: Apple Inc. MacBookPro11,5/Mac-06F11F11946D27C5, BIOS MBP114.88Z.0172.B07.1510261437 10/26/2015
   ACPI: DSDT 0x0000000078D7F000 007331 (v03 APPLE  MacBookP 00110004 INTL 20140424)

and
   [    2.504622] ACPI Warning: SystemIO range 0x000000000000EFA0-0x000000000000EFBF conflicts with OpRegion 0x000000000000EFA0-0x000000000000EFAF (\_SB_.PCI0.SBUS.SMBI) (20150930/utaddress-254)


I filtered dmesg for ACPI messages and found this:

[0.219074] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150930/hwxface-580)
[0.219080] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150930/hwxface-580)

which suggests to me that the ACPI kernel drivers have some problems or are incompatible in some way. Here is the "full" (ACPI) output:

[    0.000000] BIOS-e820: [mem 0x0000000078d01000-0x0000000078d48fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000078d4d000-0x0000000078d8efff] ACPI data
[    0.000000] reserve setup_data: [mem 0x0000000078d01000-0x0000000078d48fff] ACPI NVS
[    0.000000] reserve setup_data: [mem 0x0000000078d4d000-0x0000000078d8efff] ACPI data
[    0.000000] efi:  ACPI=0x78d8e000  ACPI 2.0=0x78d8e014  SMBIOS=0x78f8c000 
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x0000000078D8E014 000024 (v02 APPLE )
[    0.000000] ACPI: XSDT 0x0000000078D8E1C0 0000B4 (v01 APPLE  Apple00  00000000      01000013)
[    0.000000] ACPI: FACP 0x0000000078D8C000 0000F4 (v05 APPLE  Apple00  00000000 Loki 0000005F)
[    0.000000] ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 128/0 (20150930/tbfadt-623)
[    0.000000] ACPI: DSDT 0x0000000078D7F000 007331 (v03 APPLE  MacBookP 00110004 INTL 20140424)
[    0.000000] ACPI: FACS 0x0000000078D06000 000040
[    0.000000] ACPI: FACS 0x0000000078D06000 000040
[    0.000000] ACPI: HPET 0x0000000078D8B000 000038 (v01 APPLE  Apple00  00000001 Loki 0000005F)
[    0.000000] ACPI: APIC 0x0000000078D8A000 0000BC (v02 APPLE  Apple00  00000001 Loki 0000005F)
[    0.000000] ACPI: SBST 0x0000000078D88000 000030 (v01 APPLE  Apple00  00000001 Loki 0000005F)
[    0.000000] ACPI: ECDT 0x0000000078D87000 000053 (v01 APPLE  Apple00  00000001 Loki 0000005F)
[    0.000000] ACPI: SSDT 0x0000000078D7E000 000024 (v01 APPLE  SataAhci 00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D7D000 000024 (v01 APPLE  SmcDppt  00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D7C000 000032 (v01 APPLE  SsdtS3   00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D68000 0086DA (v01 APPLE  TbtPEG11 00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D67000 0000B8 (v01 APPLE  Sdxc     00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D66000 000A7B (v02 APPLE  Xhci     00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D62000 000341 (v01 APPLE  PEG2SSD0 00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D60000 0019B9 (v01 APPLE  PEG0GFX0 00001000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D5F000 000639 (v01 PmRef  Cpu0Ist  00003000 INTL 20140424)
[    0.000000] ACPI: SSDT 0x0000000078D5E000 000C17 (v01 CpuRef CpuSsdt  00003000 INTL 20140424)
[    0.000000] ACPI: MCFG 0x0000000078D89000 00003C (v01 APPLE  Apple00  00000001 Loki 0000005F)
[    0.000000] ACPI: DMAR 0x0000000078D5D000 000088 (v01 APPLE  BDW      00000001 INTL 00000001)
[    0.000000] ACPI: VFCT 0x0000000078D4D000 00F284 (v01 APPLE  Apple00  00000001 AMD  31504F47)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: PM-Timer IO Port: 0x1808
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x04] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x05] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x06] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x07] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x08] high edge lint[0x1])
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.000027] ACPI: Core revision 20150930
[    0.015461] ACPI: 11 ACPI AML tables successfully acquired and loaded
[    0.190841] PM: Registering ACPI NVS region [mem 0x78d01000-0x78d48fff] (294912 bytes)
[    0.201726] ACPI: bus type PCI registered
[    0.201729] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.207986] ACPI: Added _OSI(Module Device)
[    0.207990] ACPI: Added _OSI(Processor Device)
[    0.207993] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.207996] ACPI: Added _OSI(Processor Aggregator Device)
[    0.210354] ACPI : EC: EC description table is found, configuring boot EC
[    0.210363] ACPI : EC: EC started
[    0.216460] [Firmware Bug]: ACPI: BIOS _OSI(Linux) query ignored
[    0.216661] ACPI: Dynamic OEM Table Load:
[    0.216668] ACPI: SSDT 0xFFFF88046BF12000 0004F0 (v01 PmRef  Cpu0Cst  00003001 INTL 20140424)
[    0.217271] ACPI: Dynamic OEM Table Load:
[    0.217277] ACPI: SSDT 0xFFFF88046BF12800 00067C (v01 PmRef  ApIst    00003000 INTL 20140424)
[    0.217895] ACPI: Dynamic OEM Table Load:
[    0.217900] ACPI: SSDT 0xFFFF88046B953400 000119 (v01 PmRef  ApCst    00003000 INTL 20140424)
[    0.219067] ACPI: Interpreter enabled
[    0.219074] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20150930/hwxface-580)
[    0.219080] ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20150930/hwxface-580)
[    0.219092] ACPI: (supports S0 S3 S4 S5)
[    0.219095] ACPI: Using IOAPIC for interrupt routing
[    0.219444] PCI: MMCONFIG at [mem 0xe0000000-0xe9cfffff] reserved in ACPI motherboard resources
[    0.219455] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.225708] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.226202] pci 0000:00:01.0: System wakeup disabled by ACPI
[    0.226318] pci 0000:00:01.1: System wakeup disabled by ACPI
[    0.226413] pci 0000:00:01.2: System wakeup disabled by ACPI
[    0.226564] pci 0000:00:14.0: System wakeup disabled by ACPI
[    0.226809] pci 0000:00:1b.0: System wakeup disabled by ACPI
[    0.227056] pci 0000:00:1c.2: System wakeup disabled by ACPI
[    0.227180] pci 0000:00:1c.3: System wakeup disabled by ACPI
[    0.227643] pci 0000:01:00.0: System wakeup disabled by ACPI
[    0.237265] pci 0000:04:00.0: System wakeup disabled by ACPI
[    0.241645] ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 12 14 15) *0, disabled.
[    0.241702] ACPI: PCI Interrupt Link [LNKB] (IRQs 1 3 4 5 6 7 11 12 14 15) *0, disabled.
[    0.241759] ACPI: PCI Interrupt Link [LNKC] (IRQs 1 3 4 5 6 7 10 12 14 15) *0, disabled.
[    0.241812] ACPI: PCI Interrupt Link [LNKD] (IRQs 1 3 4 5 6 7 11 12 14 15) *0, disabled.
[    0.241863] ACPI: PCI Interrupt Link [LNKE] (IRQs 1 3 4 5 6 7 10 12 14 15) *0, disabled.
[    0.241914] ACPI: PCI Interrupt Link [LNKF] (IRQs 1 3 4 5 6 7 11 12 14 15) *0, disabled.
[    0.241966] ACPI: PCI Interrupt Link [LNKG] (IRQs 1 3 4 5 6 7 10 12 14 15) *0, disabled.
[    0.242012] ACPI: PCI Interrupt Link [LNKH] (IRQs 1 3 4 5 6 7 11 12 14 15) *0, disabled.
[    0.242175] ACPI: Enabled 4 GPEs in block 00 to 3F
[    0.242295] ACPI : EC: GPE = 0x17, I/O: command/status = 0x66, data = 0x62
[    0.242661] ACPI: bus type USB registered
[    0.242861] PCI: Using ACPI for IRQ routing
[    0.249519] pnp: PnP ACPI init
[    0.250025] system 00:00: Plug and Play ACPI device, IDs PNP0103 PNP0c01 (active)
[    0.250058] system 00:01: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.250071] pnp 00:02: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.250112] pnp 00:03: Plug and Play ACPI device, IDs APP000b (active)
[    0.250223] system 00:04: Plug and Play ACPI device, IDs PNP0c02 (active)
[    0.250268] system 00:05: Plug and Play ACPI device, IDs PNP0c01 (active)
[    0.250369] pnp: PnP ACPI: found 6 devices
[    0.287266] ACPI: Deprecated procfs I/F for AC is loaded, please retry with CONFIG_ACPI_PROCFS_POWER cleared
[    0.287314] ACPI: AC Adapter [ADP1] (off-line)
[    0.287819] ACPI: Lid Switch [LID0]
[    0.287872] ACPI: Power Button [PWRB]
[    0.287922] ACPI: Sleep Button [SLPB]
[    0.287971] ACPI: Power Button [PWRF]
[    0.288024] [Firmware Bug]: ACPI(GFX0) defines _DOD but not _DOS
[    0.288051] ACPI: Video Device [GFX0] (multi-head: yes  rom: yes  post: no)
[    0.290944] ACPI: SBS HC: EC = 0xffff88046b837a00, offset = 0x20, query_bit = 0x10
[    0.336305] ACPI: Smart Battery System [SBS0]: Battery Slot [BAT0] (battery present)
[    2.504622] ACPI Warning: SystemIO range 0x000000000000EFA0-0x000000000000EFBF conflicts with OpRegion 0x000000000000EFA0-0x000000000000EFAF (\_SB_.PCI0.SBUS.SMBI) (20150930/utaddress-254)
[    2.504627] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver

If anyone needs a full dmesg output or similar details, please just ask.

many thanks,


Niels.
Comment 56 Hermann Mayer 2016-03-14 15:48:20 UTC
I can confirm that with grub2 patch [1] I'm able to halt my machine via the
grub shell. (Very good tip Bastian Triller, Comment #51) I hope this is useful
for a kernel patch.

[1] <http://git.savannah.gnu.org/cgit/grub.git/commit/?id=0f1f95c7b7bc72cfbeea2f6dc5986855738ad96d>
Comment 57 Chen Yu 2016-03-17 11:27:20 UTC
Thanks for your info. Although I'm not sure whether there is a hw r/w problem in acpica, I copied the code from grub to force the system to enter S5. Please help test with the following steps:

1. patch force_s5.diff, and recompile the kernel with CONFIG_ACPI_DEBUG=y

2. boot into kernel,
   echo 0x00000002 > /sys/module/acpi/parameters/debug_layer
   echo 0x04000001 > /sys/module/acpi/parameters/debug_level
   echo 7 > /proc/sys/kernel/printk
3. halt -p (this should hang), and collect as many of the messages as possible

4. reboot the system, repeat step 2
5. echo 1 > /proc/sys/kernel/force_s5
6. halt -p (does it work? if not, please also collect the messages)
Comment 58 Chen Yu 2016-03-17 11:28:42 UTC
Created attachment 209631 [details]
Force a S5 directly

Please do not apply any patches other than this one.
Comment 59 Bastian Triller 2016-03-17 12:39:46 UTC
Created attachment 209641 [details]
force s5
Comment 60 Bastian Triller 2016-03-17 12:40:24 UTC
Created attachment 209651 [details]
no force s5
Comment 61 Bastian Triller 2016-03-17 12:41:06 UTC
forcing s5 does not work :(
Comment 62 Chen Yu 2016-03-17 14:07:50 UTC
:(

How about boot the kernel with 'text nomodeset' appended?
Comment 63 Chen Yu 2016-03-17 14:10:16 UTC
(In reply to Chen Yu from comment #62)
> :(
> 
> How about boot the kernel with 'text nomodeset' appended?

plz ignore this, I'll provide another debug patch later.
Comment 64 Chen Yu 2016-03-17 14:12:55 UTC
Please check:
1. patch force_s5.diff
2. append:

init=/bin/bash nomodeset 
to your command line, and boot up the system
3. echo 7 > /proc/sys/kernel/printk
4. echo 1 > /proc/sys/kernel/force_s5
5. echo mem > /sys/power/state

thx
Comment 65 Bastian Triller 2016-03-17 14:21:46 UTC
hangs
Comment 66 Bastian Triller 2016-03-17 14:22:53 UTC
Created attachment 209661 [details]
force s5 patch applied - mem > /s/p/s
Comment 67 Chen Yu 2016-03-20 12:54:20 UTC
Created attachment 210001 [details]
test halt at early stage in linux
Comment 68 Chen Yu 2016-03-20 13:01:02 UTC
Comment 67 is a debug patch that tries to halt the system at an early stage.

1. boot linux with 'init=/bin/bash nomodeset bisect=1'
2. if the step above halts the system successfully, we have to do a bisect:
   the boot option bisect is an indicator which asks linux to halt at some
   special stage. Currently the max number of bisect is 20; if booting with
   bisect=1 can halt linux while bisect=20 cannot, we might have to test
   bisect=10, then maybe bisect=5, then bisect=2... hopefully we can narrow the scope.
3. if step 1 fails to halt the system, I might have to refine the debug patch.

Thanks
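
The halving described in step 2 is an ordinary binary search over the bisect value. A minimal sketch, assuming you keep track of the highest value known to halt and the lowest value known to hang; the function name next_bisect is made up for illustration and is not part of the debug patch:

```sh
# next_bisect GOOD BAD
#   GOOD: highest bisect value known to halt the system
#   BAD:  lowest bisect value known to hang
# Prints the next value to try. Prints nothing and returns 1 once GOOD and
# BAD are adjacent, because the stage between them is then the suspect.
next_bisect() {
    good=$1
    bad=$2
    if [ $((bad - good)) -le 1 ]; then
        return 1
    fi
    echo $(( (good + bad) / 2 ))
}
```

With the limits above, `next_bisect 1 20` prints 10; after each boot, replace GOOD or BAD with the value just tested and call it again.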
Comment 69 Bastian Triller 2016-03-20 16:51:55 UTC
bisect=14 works, bisect=15 not.
Comment 70 Bastian Triller 2016-03-20 16:52:28 UTC
Created attachment 210031 [details]
bisect=15
Comment 71 pkozlov 2016-03-20 20:55:59 UTC
Bastian, Chen, the same for me: works with bisect=14, hangs with bisect=15.

The system halted within a second of booting with options "init=/bin/bash nomodeset bisect=14". Is that expected?
Comment 72 Chen Yu 2016-03-21 09:23:34 UTC
Created attachment 210091 [details]
find the driver who disable port writting
Comment 73 Chen Yu 2016-03-21 09:24:37 UTC
Thanks Bastian, pkozlov.
Please try the patch from Comment 72, using the same steps, to bisect which driver disables writing to the port.
Comment 74 Chen Yu 2016-03-21 09:27:12 UTC
The max number of bisect depends on your system. You can find it by first booting without any params and checking the highest counter in dmesg:
"**halt_me debug counter x"
Comment 75 Bastian Triller 2016-03-21 18:23:36 UTC
bisect=233 halts the system, bisect=234 hangs
Comment 76 Bastian Triller 2016-03-21 18:25:01 UTC
Created attachment 210141 [details]
dmesg bisect-driver-patched without bisect parameter
Comment 77 Chen Yu 2016-03-24 14:29:14 UTC
(In reply to Bastian Triller from comment #76)
> Created attachment 210141 [details]
> dmesg bisect-driver-patched without bisect parameter

thanks, could you attach your /proc/ioport and /proc/iomem
lspci -vvxx
Comment 78 pkozlov 2016-03-24 15:13:48 UTC
Chen Yu, there is no /proc/ioport file (probably need to build kernel with some other options?)
/proc/ioports - https://bpaste.net/show/5c920d921c10
/proc/iomem - https://bpaste.net/show/681e74805a75
lspci -vvxx - https://bpaste.net/show/15d78e6d70f8
Comment 79 pkozlov 2016-03-24 15:15:26 UTC
Created attachment 210591 [details]
/proc/ioports
Comment 80 pkozlov 2016-03-24 15:15:47 UTC
Created attachment 210601 [details]
/proc/iomem
Comment 81 pkozlov 2016-03-24 15:16:08 UTC
Created attachment 210611 [details]
lspci -vvxx
Comment 82 pkozlov 2016-03-24 15:16:33 UTC
Attached three files to the bug (I think it's better than bpaste).
Comment 83 Bastian Triller 2016-03-24 15:26:32 UTC
Created attachment 210621 [details]
/proc/iomem
Comment 84 Bastian Triller 2016-03-24 15:26:51 UTC
Created attachment 210631 [details]
/proc/ioports
Comment 85 Bastian Triller 2016-03-24 15:27:20 UTC
Created attachment 210641 [details]
lspci -vvxx
Comment 86 pkozlov 2016-03-24 18:17:09 UTC
Bastian, please change your attachments content-type to text/plain.
Comment 87 pkozlov 2016-03-31 09:43:48 UTC
Chen Yu, what additional info do you need to confirm this bug? Do you have any ideas how to resolve this?
Comment 88 Christopher Broome 2016-04-03 04:00:02 UTC
Macbook Pro 11,5
Kernel 4.2.0

I just wanted to add that I also cannot fully suspend my machine nor fully poweroff the machine. Reboot works fine.

I'd like to help test some patches, but I'm having trouble booting my freshly built kernel (4.6-rc1) from rEFInd.
Comment 89 pkozlov 2016-04-03 07:11:56 UTC
(In reply to Christopher Broome from comment #88)
> Macbook Pro 11,5
> Kernel 4.2.0
> 
> I just wanted to add that I also cannot fully suspend my machine nor fully
> poweroff the machine. Reboot works fine.
> 
> I'd like to help test some patches, but I'm having trouble booting my
> freshly built kernel (4.6-rc1) from rEFInd.

Well, I booted linux without any rEFInd, I just installed grub to SSD and added macOS entry. Works fine.
Comment 90 Christopher Broome 2016-04-09 02:59:23 UTC
OK I figured out how to boot my kernel via rEFInd and pass params.

I checked my maximum number of bisects and the highest number I see is 637.

Here are my results so far:

176: halt
177: reboot
Comment 91 Christopher Broome 2016-04-09 04:14:51 UTC
Final results:

176: halt
177-299: reboot
300: hang

Unfortunately I'm not sure how to tell which driver was affected. My console isn't initialized until around module 450 or so. I tried passing earlyprintk=efi,keep but the console scrolls incredibly slowly - as in an entire second per line of output. I'll look into changing log levels next in order to filter out some noise to get the needed info.
Comment 92 Christopher Broome 2016-04-09 05:35:55 UTC
I made a couple of adjustments to the patch in the `halt_me` function that are specific to my bisect points listed in comment 91.

if ((count >= 171 && count <= 180) || (count >= 295))
        printk(KERN_ALERT "**halt_me debug counter:%d, checking driver:%pF\n",
                count, fn);

Basically I only print the debug counter if it's within my specific range of usefulness. I've also adjusted all the printk messages to be KERN_ALERT level.

After recompiling, I booted my kernel with the following options:

init=/bin/bash nomodeset earlyprintk=efi,keep loglevel=2 bisect=300

And I was able to take a photo of my results. I'll add that as an attachment.
Comment 93 Christopher Broome 2016-04-09 05:41:04 UTC
Created attachment 212271 [details]
Debug counter output
Comment 94 Chen Yu 2016-04-09 17:21:20 UTC
Port 0x1804 becomes unwritable after the pci subsystem re-assigns the ioport resources. According to previous logs, it looks like 0x1804 becomes unstable after the 00:1c.0 PCI bridge re-sets its io window to 2000-2fff. Someone maintaining pci suggested 'use setpci to clear that before access 0x1804'. I'm still trying to figure out why this happens, and maybe I'll send another debug patch tomorrow to confirm if it is 00:1c.0 or not.
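
To see which regions currently claim a given port, the nesting in /proc/ioports can be searched directly. A minimal sketch, assuming the usual "start-end : name" layout of that file with hex addresses; the function name ioport_owners is hypothetical:

```sh
# ioport_owners PORT   (PORT in hex, with or without a 0x prefix)
# Reads /proc/ioports-style text on stdin and prints every region whose
# range contains PORT, in file order (enclosing regions come first).
ioport_owners() {
    port=$((0x${1#0x}))
    while read -r range sep name; do
        # Only consider fields that look like "hex-hex".
        case "$range" in
            *-*) ;;
            *) continue ;;
        esac
        start=${range%-*}
        end=${range#*-}
        case "$start$end" in
            ''|*[!0-9a-fA-F]*) continue ;;
        esac
        if [ "$((0x$start))" -le "$port" ] && [ "$port" -le "$((0x$end))" ]; then
            printf '%s : %s\n' "$range" "$name"
        fi
    done
}
```

Usage would be `ioport_owners 0x1804 < /proc/ioports`, run once on a boot that can power off and once on a boot that hangs, to compare the owners.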
Comment 95 pkozlov 2016-04-18 14:48:29 UTC
> maybe I'll send another debug patch tomorrow to confirm if it is 00:1c.0 or
> not

Hi Chen Yu!

Do you have any new patch to try?
Comment 96 Chen Yu 2016-04-27 14:43:55 UTC
(In reply to pkozlov from comment #95)
> > maybe I'll send another debug patch tomorrow to confirm if it is 00:1c.0 or
> not
> 
> Hi Chen Yu!
> 
> Do you have any new patch to try?

Hi,
no new patch yet, but would you help test with no patch on top of latest mainline kernel:

1. build linux with CONFIG_DYNAMIC_DEBUG=y, and boot with

dyndbg='file drivers/pci/probe.c +p' ignore_loglevel initcall_debug 
then provide dmesg, and

2. after boot up, provide the output of:
# setpci -s 0000:00:1c.0 IO_BASE
# setpci -s 0000:00:1c.0 IO_LIMIT
# setpci -s 0000:00:1c.0 IO_BASE_UPPER16
# setpci -s 0000:00:1c.0 IO_LIMIT_UPPER16

3.
# setpci -s 0000:00:1c.0 IO_BASE.B=f0
# setpci -s 0000:00:1c.0 IO_LIMIT.B=0
# setpci -s 0000:00:1c.0 IO_BASE_UPPER16.W=0
# setpci -s 0000:00:1c.0 IO_LIMIT_UPPER16.W=0

4.
# setpci -s 0000:00:1c.0 IO_BASE
# setpci -s 0000:00:1c.0 IO_LIMIT
# setpci -s 0000:00:1c.0 IO_BASE_UPPER16
# setpci -s 0000:00:1c.0 IO_LIMIT_UPPER16

5.# poweroff or halt..
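
For reading the setpci output: in a PCI-to-PCI bridge, the upper four bits of IO_BASE and IO_LIMIT hold address bits 15:12, and the window is closed when the base ends up above the limit, which is what the values written in step 3 do. A small sketch of the decoding, assuming the UPPER16 registers are zero (16-bit I/O decoding); the function name io_window is made up for illustration:

```sh
# io_window BASE LIMIT
#   BASE, LIMIT: the hex bytes printed by `setpci ... IO_BASE` / `IO_LIMIT`
# The low four bits of each register only describe the addressing
# capability, so they are masked off. The limit covers its whole 4K block,
# so its low 12 bits are all ones.
io_window() {
    base=$(( (0x$1 & 0xf0) << 8 ))
    limit=$(( ((0x$2 & 0xf0) << 8) | 0xfff ))
    if [ "$base" -le "$limit" ]; then
        state=open
    else
        state=closed
    fi
    printf '0x%04x-0x%04x %s\n' "$base" "$limit" "$state"
}
```

For example, `io_window 20 20` describes the 2000-2fff window mentioned in comment 94, and `io_window f0 0` reports the window as closed.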
Comment 97 Christopher Broome 2016-04-27 14:58:46 UTC
(In reply to Chen Yu from comment #96)
> (In reply to pkozlov from comment #95)
> > > maybe I'll send another debug patch tomorrow to confirm if it is 00:1c.0
> or not
> > 
> > Hi Chen Yu!
> > 
> > Do you have any new patch to try?
> 
> Hi,
> no new patch yet, but would you help test with no patch on top of latest
> mainline kernel:
> 

By mainline kernel do you mean v4.6-rc5 or master?
Comment 98 pkozlov 2016-04-27 21:46:59 UTC
Created attachment 214551 [details]
dmesg macbook pro

Last dmesg (kernel 4.5.2-gentoo)
Comment 99 pkozlov 2016-04-27 21:47:47 UTC
Created attachment 214561 [details]
setpci results

output from setpci commands
Comment 100 pkozlov 2016-04-27 21:48:44 UTC
Created attachment 214571 [details]
screenshot on attempt to poweroff

Still hangs on poweroff
Comment 101 Chen Yu 2016-05-11 07:45:36 UTC
*** Bug 113831 has been marked as a duplicate of this bug. ***
Comment 102 Vlad Lesin 2016-05-13 20:47:31 UTC
I have absolutely the same issue with poweroff and suspend on MacBook Pro 11.5 and 4.2.22 kernel. Chen, if you need any additional info or any ideas how to get things moving I am ready to assist.
Comment 103 rockon999 2016-05-15 15:32:04 UTC
Chen Yu
What do you need to continue work on this bug?
Comment 104 Arnaud Astruc 2016-05-16 21:55:45 UTC
kernel 4.5.4 MacbookPro 11.4 same issue.
If I can help, just let me know
Comment 105 Chen Yu 2016-05-18 06:42:09 UTC
*** Bug 113751 has been marked as a duplicate of this bug. ***
Comment 106 Chen Yu 2016-05-19 02:55:50 UTC
@pkozlov According to your log at #Comment 81, let's try to bypass the enumeration of pci 0000:00:1c.0 and see if there is any difference. Please apply the patch below and boot with 'blacklist_pci=0x8c108086'
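
The value 0x8c108086 looks like the first dword of PCI config space: device ID in the high 16 bits, vendor ID in the low 16 bits (0x8086 is Intel). That layout is an assumption here, since the patch's parser is not shown in this comment. A sketch that splits such a value into the vendor:device form lspci uses; split_pci_id is a hypothetical helper name:

```sh
# split_pci_id VALUE
# Splits a 32-bit value with the device ID in the high 16 bits and the
# vendor ID in the low 16 bits, and prints it as vendor:device.
split_pci_id() {
    value=$(($1))
    printf '%04x:%04x\n' $((value & 0xffff)) $(( (value >> 16) & 0xffff ))
}
```

The result can be passed to `lspci -n -d` to check which device on your machine would be matched.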
Comment 107 Chen Yu 2016-05-19 02:57:02 UTC
Created attachment 216661 [details]
bypass the enumeration for specific pci device
Comment 108 Chen Yu 2016-05-19 03:02:59 UTC
Hi all, please provide the boot-up dmesg with the following options appended to the kernel command line in your grub:
dyndbg='file drivers/pci/probe.c +p' ignore_loglevel initcall_debug
Comment 109 pkozlov 2016-05-19 18:08:09 UTC
Chen Yu, indeed, your last patch helped! Thank you. The Macbook turned off correctly.

However, wifi is broken after that. Is it possible to turn off this PCI device at runtime, just before the shutdown attempt, rather than at bootup?
Comment 110 Dustin Willis Webber 2016-05-23 23:15:37 UTC
Confirmed - it will cut off wifi and facetime I believe.
Comment 111 Chen Yu 2016-05-25 03:57:41 UTC
Created attachment 217361 [details]
avoid io allocation for specific pci bridge
Comment 112 Chen Yu 2016-05-25 03:59:36 UTC
please try boot with :

dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p' blacklist_pci=0x8c108086

on top of the patch from #Comment 111.

and provide bootup dmesg.
Comment 113 Chen Yu 2016-05-25 04:00:21 UTC
lspci -vvxx 
lspci -t
Comment 114 Zach Norman 2016-05-26 05:15:16 UTC
Created attachment 217601 [details]
dmesg with kernel opts in comment 108

This is on an Apple MBP 11,4 I think. Same suspend/poweroff issues as others.

Patches don't seem to be merging for me on current arch but let me know what I can do contribute.
Comment 115 Chen Yu 2016-05-26 05:21:04 UTC
(In reply to Zach Norman from comment #114)
> Created attachment 217601 [details]
> dmesg with kernel opts in comment 108
> 
> This is on an Apple MBP 11,4 I think. Same suspend/poweroff issues as others.
> 
> Patches don't seem to be merging for me on current arch but let me know what
> I can do contribute.

Please try to apply patch from #Comment 111?
Comment 116 Pablo Catalina 2016-05-26 14:44:37 UTC
I applied the patch on #Comment_111 and booted with the options on #Comment_112.

It is a Macbook Pro 11,4

Attaching dmesg and lspci.
Comment 117 Pablo Catalina 2016-05-26 14:46:41 UTC
Created attachment 217621 [details]
#comment_116 dmesg
Comment 118 Pablo Catalina 2016-05-26 14:47:14 UTC
Created attachment 217631 [details]
#comment_116 lspci
Comment 119 Pablo Catalina 2016-05-26 14:47:37 UTC
Created attachment 217641 [details]
#comment_116 lspci-t
Comment 120 Pablo Catalina 2016-05-26 14:48:16 UTC
Created attachment 217651 [details]
#comment_116 lspci-vvxx
Comment 121 Pablo Catalina 2016-05-26 15:12:30 UTC
With the kernel patched with #Comment_111 I followed the instructions on the  #Comment_96.

The system did not poweroff. Attached the dmesg, output of setpci commands and the screen shown after trying to halt the system.
Comment 122 Pablo Catalina 2016-05-26 15:13:47 UTC
Created attachment 217661 [details]
#comment_121 dmesg
Comment 123 Pablo Catalina 2016-05-26 15:14:17 UTC
Created attachment 217671 [details]
#comment_121 setpci
Comment 124 Pablo Catalina 2016-05-26 15:21:28 UTC
Created attachment 217681 [details]
#comment_121 poweroff output
Comment 125 Zach Norman 2016-05-26 23:04:33 UTC
Created attachment 217771 [details]
mbp 11,4 dmesg with patch from comment #111

mbp 11,4 dmesg with patch from comment #111
Comment 126 Zach Norman 2016-05-26 23:05:56 UTC
Created attachment 217781 [details]
mbp 11,4 lspci --vvxx output with patch from comment #111

mbp 11,4 lspci --vvxx output with patch from comment #111
Comment 127 Zach Norman 2016-05-26 23:06:49 UTC
Created attachment 217791 [details]
mbp 11,4 lspci -t output with patch from comment #111

mbp 11,4 lspci -t output with patch from comment #111
Comment 128 Zach Norman 2016-05-26 23:34:22 UTC
Created attachment 217811 [details]
iasl -tc dsdt.dsl recompile warnings from a mbp 11,4

Doesnt appear to be any change on my side but I wasn't sure if I was supposed to be using the blacklist_pci options or not.

Is a recompiled DSDT summary useful at all? Only shows warnings (no errors) but I've attached it.
Comment 129 Chen Yu 2016-05-27 18:00:56 UTC
Thanks, the patch has taken effect although it did not fix the issue. I need your help to debug with the patch attached below.
Comment 130 Chen Yu 2016-05-27 18:11:22 UTC
Created attachment 217911 [details]
debug patch to find the exact  operation which broke io port space

This is a bisect debug patch to find the operation that broke the io port address.
1. boot up the kernel without any params
2. after booting up, check dmesg and find the max number of 'current halt_count',
such as:
No need to halt, resource 15 [??? 0x00000000 flags 0x0], current halt_count:66, pivot_halt:2147483647
then halt_count is 66
3. reboot the system with bisect=33 to see if the system powers off during bootup.
4. if it shuts down successfully during bootup, then reboot the system with bisect=50; otherwise try bisect=16, etc.
This method is essentially the same as in #Comment 68
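
The highest halt_count from step 2 can be pulled out of dmesg instead of being read by eye. A minimal sketch, assuming the log lines keep the "current halt_count:N" form quoted above; max_halt_count is a hypothetical helper name:

```sh
# max_halt_count
# Reads dmesg text on stdin and prints the largest N found in
# "current halt_count:N". Prints nothing if no such line exists.
max_halt_count() {
    grep -o 'current halt_count:[0-9]*' |
        cut -d: -f2 |
        sort -n |
        tail -n 1
}
```

Run it as `dmesg | max_halt_count`; half of the result is the first bisect value to try, as in step 3.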
Comment 131 pkozlov 2016-05-27 18:45:24 UTC
Created attachment 217921 [details]
Dmesg after patch from comment #111
Comment 132 pkozlov 2016-05-27 18:45:52 UTC
Created attachment 217931 [details]
lspci -vvxx after patch from comment #111
Comment 133 pkozlov 2016-05-27 18:46:12 UTC
Created attachment 217941 [details]
lspci -t after patch from comment #111
Comment 134 pkozlov 2016-05-27 19:07:45 UTC
Created attachment 217961 [details]
genkernel log for kernel gentoo-sources-4.6.0 with bisect patch

Chen Yu, I am trying to apply your last patch. It is applied successfully to the kernel gentoo-sources-4.6.0, however, I can't build it.

Should I choose another kernel version? Latest from git?
Comment 135 pkozlov 2016-05-27 19:12:40 UTC
Sorry, genkernel uses Russian locale for error messages, even if I try to run it with LC_ALL="C".
The issue is with drivers/pci/setup-bus.c file. It's not compiling.
Comment 136 Zach Norman 2016-05-28 00:02:47 UTC
Created attachment 217971 [details]
kernel build failure output on 4.5.4-1-ARCH

My build is failing also on 4.5.4-1-ARCH. Attached output of failed step.
Comment 137 Chen Yu 2016-05-28 02:28:41 UTC
Created attachment 217981 [details]
bisect pci operation

Sorry the patch is broken, please use this one instead and follow the steps in #Comment 130
Comment 138 Zach Norman 2016-05-28 05:25:50 UTC
Created attachment 218011 [details]
dmesg halt_count grep between 84 and 85

Thanks for the updated patch and continued support.

My system stopped halting on bisect=85

Looks like thats '00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5)' from my lspci? 

lspci --vvxx is in attachment 217781 [details]
lspci -t is in attachment 21791 [details]
Comment 139 Zach Norman 2016-05-28 05:37:08 UTC
correction: my lspci -t is in attachment 217791 [details]
Comment 140 pkozlov 2016-05-28 09:10:26 UTC
Created attachment 218021 [details]
maximum halt count

My maximum halt count from dmesg is 99 after last patch.
Comment 141 pkozlov 2016-05-28 09:17:38 UTC
> My system stopped halting on bisect=85
The same for me. Halted for bisect=84, hanging for bisect=85. I hope this will be helpful.
Comment 142 Chen Yu 2016-05-28 11:52:04 UTC
(In reply to pkozlov from comment #140)
> Created attachment 218021 [details]
> maximum halt count
> 
> My maximum halt count from dmesg is 99 after last patch.

OK.
please provide your full dmesg log without any bisect.

@Zach
Comment 143 Chen Yu 2016-05-28 11:54:30 UTC
I mean, with the patch from #Comment 137 applied and boot with
dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p;file drivers/pci/setup-res.c +p' initcall_debug
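
The dyndbg value is a list of "file PATH +p" queries separated by semicolons. A small sketch that builds the parameter from a list of source files, in case more files need to be added later; dyndbg_param is a made-up helper name:

```sh
# dyndbg_param FILE...
# Prints a dyndbg= kernel parameter that enables debug output (+p) for
# each listed source file, joined with ';'.
dyndbg_param() {
    query=
    for f in "$@"; do
        query="${query:+$query;}file $f +p"
    done
    printf "dyndbg='%s'\n" "$query"
}
```

Calling it with drivers/pci/probe.c, drivers/pci/setup-bus.c and drivers/pci/setup-res.c reproduces the option string above.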
Comment 144 Chen Yu 2016-05-28 11:56:20 UTC
So it is the reallocation of [mem 0x7fa00000-0x7fbfffff] of 00:1c.0 that broke the system. I'm trying to provide a quirk according to your input, based on the logs requested in #Comment 143
Comment 145 Zach Norman 2016-05-28 14:44:03 UTC
Created attachment 218031 [details]
patched mbp 11,4 dmesg output with dynbg and initcall_debug kernel opts
Comment 146 pkozlov 2016-05-28 21:28:23 UTC
Created attachment 218041 [details]
dmesg with debug cmdline

I attached my dmesg with patch from #Comment 137 applied and boot with
dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p;file drivers/pci/setup-res.c +p' initcall_debug

I hope it will be helpful.
Comment 147 Chen Yu 2016-05-29 08:23:14 UTC
Created attachment 218061 [details]
quirk to bypass mmio assignment for Mac Pro 11
Comment 148 Chen Yu 2016-05-29 08:26:54 UTC
@Zach @pkozlov 
please help try the patch at #Comment 147.
1. boot with
dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p;file drivers/pci/setup-res.c +p' initcall_debug
provide dmesg.
2. try poweroff/halt, etc
3. if step 2 does not work, try adding disable_mode=1 on top of step 1, and
provide dmesg
4. try poweroff
5. verify if wifi is broken
Comment 149 pkozlov 2016-05-29 09:39:18 UTC
Created attachment 218071 [details]
dmesg without disable_mode

@Chen Yu
If I boot without disable_mode=1, poweroff doesn't work. But if I provide disable_mode=1 - poweroff works! Wifi is not broken.

Attaching first dmesg (without disable_mode=1)
Comment 150 pkozlov 2016-05-29 09:39:45 UTC
Created attachment 218081 [details]
dmesg with disable_mode=1

And the second dmesg (disable_mode=1)
Comment 151 pkozlov 2016-05-29 09:52:42 UTC
Created attachment 218091 [details]
suspend dmesg

Poweroff works better. However, suspend (suspend to ram) doesn't work. It tries to suspend, but wakes up on the next few seconds (no more than 3-5). Attached dmesg messages after suspend try (please tell me if I need to configure kernel with some additional debugging for suspend).
Comment 152 Chen Yu 2016-05-29 10:29:06 UTC
(In reply to pkozlov from comment #149)
> Created attachment 218071 [details]
> dmesg without disable_mode
> 
> @Chen Yu
> If I boot without disable_mode=1, poweroff doesn't work. But if I provide
> disable_mode=1 - poweroff works! Wifi is not broken.
> 
> Attaching first dmesg (without disable_mode=1)

I see, so we should disable both mmio and mmio-pref region on 1c.0, I'll send a patch to the pci mailing list for review later.
Comment 153 Chen Yu 2016-05-29 10:33:28 UTC
(In reply to pkozlov from comment #151)
> Created attachment 218091 [details]
> suspend dmesg
> 
> Poweroff works better. However, suspend (suspend to ram) doesn't work. It
> tries to suspend, but wakes up on the next few seconds (no more than 3-5).
> Attached dmesg messages after suspend try (please tell me if I need to
> configure kernel with some additional debugging for suspend).

Before applying patch #Comment 147, the system can not suspend to ram and just hangs there, right? For the wakeup issue, please provide logs of 
 cat /proc/acpi/wakeup
Device  S-state   Status   Sysfs node
PEG0      S3    *disabled
EC        S4    *disabled  platform:PNP0C09:00
HDEF      S3    *disabled  pci:0000:00:1b.0
RP01      S3    *disabled  pci:0000:00:1c.0
RP02      S3    *disabled  pci:0000:00:1c.1
RP03      S3    *disabled  pci:0000:00:1c.2
ARPT      S4    *disabled  pci:0000:03:00.0
RP05      S3    *disabled  pci:0000:00:1c.4
RP06      S3    *disabled  pci:0000:00:1c.5
SPIT      S3    *disabled
XHC1      S3    *enabled   pci:0000:00:14.0
ADP1      S4    *disabled  platform:ACPI0003:00
LID0      S4    *enabled   platform:PNP0C0D:00

For example, if you see that LID0 is enabled, please type
echo LID0 > /proc/acpi/wakeup
then unplug any usb devices/AC and try again.
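
Note that writing a device name to /proc/acpi/wakeup toggles its state, so running the echo twice turns the wakeup source back on. A sketch that lists the enabled devices and only writes when the device is currently enabled; both function names are hypothetical:

```sh
# wakeup_enabled
# Reads /proc/acpi/wakeup-style text on stdin and prints the names of the
# devices whose Status column is "*enabled".
wakeup_enabled() {
    awk '$3 == "*enabled" { print $1 }'
}

# wakeup_disable DEVICE [FILE]
# Writes DEVICE to the wakeup file only if it is currently enabled, so a
# repeated call does not toggle it back on. FILE defaults to
# /proc/acpi/wakeup.
wakeup_disable() {
    file=${2:-/proc/acpi/wakeup}
    if wakeup_enabled < "$file" | grep -qx "$1"; then
        echo "$1" > "$file"
    fi
}
```

For the table above, `wakeup_enabled < /proc/acpi/wakeup` would list XHC1 and LID0, and `wakeup_disable LID0` (as root) would switch LID0 off once.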
Comment 154 Cédric Le Goater 2016-05-29 17:44:20 UTC
Hello, 

@Chen Yu

I tested your patch from #Comment 147 with a mainline 4.6 kernel on
a MacBookPro11,5. 

Booted with :

    dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p;file drivers/pci/setup-res.c +p' initcall_debug disable_mode=1

After power on, I ran "echo LID0 > /proc/acpi/wakeup" to disable LID0.

The system suspends and power-offs fine. The wifi is still operational
after resume.
Comment 155 pkozlov 2016-05-29 18:03:22 UTC
Yes, 'echo LID0 > /proc/acpi/wakeup' worked for me - suspend works fine after that.
Comment 156 Zach Norman 2016-05-29 18:29:51 UTC
Created attachment 218121 [details]
mpb11,4 dmesg output, dynbg and initcall debug kernel opts with patch from comment 147

poweroff did not work here
Comment 157 Zach Norman 2016-05-29 18:33:15 UTC
Created attachment 218131 [details]
mpb11,4 dmesg output, dynbg and initcall debug kernel opts with patch from comment 147, plus disable_mode=1

poweroff is working, comes out of suspend right away like others. will follow up with the results on the suspend issue with the acpi wakeup change suggestions
Comment 158 Zach Norman 2016-05-29 18:46:57 UTC
Created attachment 218141 [details]
default state /proc/acpi/wakeup on mbp 11,4

disabling LID0 in /proc/acpi/wakeup works for me also, as long as power is unplugged.
Comment 159 Pablo Catalina 2016-05-29 21:02:45 UTC
Hi,

Thank you very much!

I tested your patch from #Comment 147 with a mainline 4.6 kernel on
a MacBookPro11,4. 

Booted with :

    dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p;file drivers/pci/setup-res.c +p' initcall_debug disable_mode=1


After a reboot (not a full poweroff), the wireless does not work. See the dmesg attachment below.


Suspend works fine after disabling LID0, and wireless works fine after resuming from suspend.
Comment 160 Pablo Catalina 2016-05-29 21:03:29 UTC
Created attachment 218151 [details]
dmesg without disable_mode
Comment 161 Pablo Catalina 2016-05-29 21:04:32 UTC
Created attachment 218161 [details]
dmesg with disable_mode=1 (wireless not working)

Wireless not working. Boot from reboot not from full poweroff.
Comment 162 Pablo Catalina 2016-05-29 21:05:03 UTC
Created attachment 218171 [details]
lspci -t with disable_mode=1 (wireless not working)
Comment 163 Pablo Catalina 2016-05-29 21:05:36 UTC
Created attachment 218181 [details]
lspci -vvxx with disable_mode=1 (wireless not working)
Comment 164 Pablo Catalina 2016-05-29 21:06:28 UTC
Created attachment 218191 [details]
dmesg with disable_mode=1 (wireless working)
Comment 165 Chen Yu 2016-05-30 05:58:06 UTC
(In reply to Pablo Catalina from comment #161)
> Created attachment 218161 [details]
> dmesg with disable_mode=1 (wireless not working)
> 
> Wireless not working. Boot from reboot not from full poweroff.

Without the patch from #Comment 147 applied, does this problem still exist, or does it go away?
Comment 166 Chen Yu 2016-05-30 10:31:13 UTC
Anyway, I've sent out a RFC patch to the pci mailing list and wait for their comment/suggestions.
Comment 168 Pablo Catalina 2016-05-30 14:33:41 UTC
@Chen Yu

I have not been able to reproduce the wireless problem from #comment 162, so I'm not sure why the wireless stopped working. Probably another problem.


The patch in #comment 167 only works for powering off the system; I cannot suspend even with LID0 disabled in the wakeup events.

@pkozlov can you check it?
Comment 169 Chen Yu 2016-05-30 16:36:17 UTC
@Pablo Catalina 
For suspend verification, how about boot kernel with init=/bin/bash?
Comment 170 pkozlov 2016-05-30 16:39:53 UTC
@Pablo Catalina, well, suspend worked for me, but not perfectly. I think it's better to open a separate bug; I'm not sure it's related to this one.

Suspend/resume worked for me, but the thunderbolt-ethernet adapter stopped working afterwards. I suspect it is not the only thing that broke.
Comment 171 Chen Yu 2016-05-31 03:36:05 UTC
By the way, is there any tool like lspci for Mac OS? We could check how Mac OS configures this PCI bridge.
Comment 172 Chen Yu 2016-05-31 07:15:05 UTC
Hi guys, do you have a chance to also test patch from Yinghai:

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index ee72ebe..d3ec833 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)

 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);

+static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
+{
+       dev->is_hotplug_bridge = 0;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_hotplug_bridge_skip);
+
 /*
  * This is a quirk for the Ricoh MMC controller found as a part of
  * some mulifunction chips.
Comment 173 Cédric Le Goater 2016-05-31 07:17:03 UTC
As a replacement patch, or an add-on?
Comment 174 Chen Yu 2016-05-31 07:18:23 UTC
replacement
Comment 175 berglh 2016-05-31 08:25:54 UTC
@Chen Yu

Letting you know I tested patch from Comment #147 and dyndbg='file drivers/pci/probe.c +p;file drivers/pci/setup-bus.c +p;file drivers/pci/setup-res.c +p' initcall_debug disable_mode=1 on 4.7-rc1 kernel. I also "echo LID0 > /proc/acpi/wakeup".

I was able to shutdown and suspend to RAM also worked well. I have an issue with wifi but it seems to exist on a cold boot, I'm thinking it's a separate issue.

I will try the replacement. Just to clarify: should I drop the patch from #147 entirely and use only the patch from Yinghai? Or keep the #147 changes to the other files and replace only the quirks.c section with the patch from Yinghai?
Comment 176 Chen Yu 2016-05-31 08:29:52 UTC
Only use Yinghai's solution without any other patches applied.
Comment 177 Cédric Le Goater 2016-05-31 08:46:50 UTC
So, poweroff is fine. Suspend/resume worked once. I've had a 
couple of spontaneous resumes. That might also be the case with
the other patch, though, and it might be another issue. 

You need to disable LID0 in any case.
Comment 178 berglh 2016-05-31 10:31:05 UTC
@Chen Yu

I applied only the quirks.c patch from Yinghai (comment #172) to v4.7-rc1 with no kernel boot options.

It did allow both suspend to RAM as well as correct shutdown, this was with limited testing (suspend a couple of times and shutdown). I'm not sure how to retrieve my previous boot dmesg to include shutdown information, but can provide you my current one if you desire. I'll attach lspci output.

I did have some issues with gnome shell interface during this time, so it's possible that I'm fighting some issues with the rc1 release and wifi driver which is giving me strange behaviour (unsurprising).

I will try Yinghai's patch with 4.6 and 4.5 where I had stable wifi operation and report back.
Comment 179 berglh 2016-05-31 10:33:36 UTC
Created attachment 218361 [details]
lspci-vvxx-comment-172
Comment 180 berglh 2016-05-31 11:51:58 UTC
@Chen Yu

I applied the quirks.c patch from comment #172 to v4.5 with no kernel options. (There was a small hunk offset, but the patch applied fine.)

I have suspended a couple of times and shut down once, with no major issues detected. This build has a stable wifi driver which survives suspend to RAM. I might try 4.6, but I did have some problems with the wifi driver there as well while testing one of the previous patches.
Comment 181 Pablo Catalina 2016-05-31 12:24:07 UTC
Hi, the patch from comment #172 works if nothing is plugged into the laptop.

When I plug in a Thunderbolt Ethernet adapter, it comes back from suspend without fully entering S3.

I noticed that I have to disable both LID0 and XHC1 to suspend the MBP 11,4. With both disabled, it works fine even if Thunderbolt is enabled.

Wifi works fine after resuming from suspend.
Comment 182 Pablo Catalina 2016-05-31 12:37:39 UTC
Created attachment 218371 [details]
lspci -vvxx with patch on #comment 172
Comment 183 berglh 2016-05-31 21:07:01 UTC
I can confirm with USB ethernet plugged in without LID0 disabled, the laptop returns from suspend within 10 seconds, similar to Pablo.

If I disable LID0, it does sleep with the USB ethernet plugged in. I can try with thunderbolt later, but suspect the result will be the same.

I did manage to have the MacBook in suspend to ram over night and it resumed correctly with wifi returning.
Comment 184 Tony L. 2016-06-01 01:12:29 UTC
Confirmed: the latest patch is working for poweroff, suspend and resume. Just had another confirmation of success:
https://bbs.archlinux.org/viewtopic.php?pid=1631202#p1631202

Any readers running Arch Linux can find a turn-key package here:
https://aur.archlinux.org/packages/linux-macbook-pro

Thanks to everyone for their work on this!
Comment 185 Cédric Le Goater 2016-06-01 10:22:22 UTC
All is working fine on a MacBookPro11,5 with a 4.5 kernel + patch 
from Comment #172. I did not have to disable LID0 but I had to unplug 
thunderbolt to get suspend working.
Comment 186 Pablo Catalina 2016-06-01 10:48:00 UTC
Hi again,

Just another comment: if I suspend to disk (i.e. hibernate), the machine freezes after loading the image. The following messages are shown before the freeze:

smboot: CPU1 is now offline
smboot: CPU2 is now offline
smboot: CPU3 is now offline
smboot: CPU4 is now offline
smboot: CPU5 is now offline
smboot: CPU6 is now offline
smboot: CPU7 is now offline


I am not sure whether this is a problem with the disk encryption; I'll try to debug it. But I wanted to leave a note about it here, because it could be related to this bug.
Comment 187 Tom B 2016-06-01 13:06:56 UTC
Off topic but thank you so much for your work on this!
Comment 188 johannes.stuettgen 2016-06-01 13:09:38 UTC
Created attachment 218621 [details]
attachment-25612-0.html

Tested the arch linux package with these patches and power off and suspend
work like a charm on my mbp 11,5 thanks to everyone involved.

Comment 189 Chen Yu 2016-06-01 13:35:34 UTC
Could someone please boot into Mac OS and provide the output of:
ioreg -l
Comment 190 Cédric Le Goater 2016-06-01 13:45:32 UTC
Created attachment 218631 [details]
ioreg for MacBookPro11,5
Comment 191 Pablo Catalina 2016-06-01 13:57:17 UTC
Created attachment 218641 [details]
ioreg -l output mbp 11,4
Comment 192 Hermann Mayer 2016-06-01 14:09:09 UTC
I can also confirm the success of the comment #172 patch on an MBP 11,4! Poweroff and suspend are working well. I just had some random resumes after I unplugged everything (Thunderbolt Ethernet, DisplayPort monitor, USB mouse/keyboard on a hub, audio port). 

Many thanks also from me for your good work! :)
Comment 193 Christopher Broome 2016-06-01 16:23:46 UTC
I was able to successfully suspend and poweroff as well on my MacBook Pro 11,4.

Like some others have reported, my wired Thunderbolt Ethernet doesn't work if I suspend with it plugged in. If I unplug it before suspend, then suspend, then resume, then plug it back in, it works. Here's a diff of lspci output from before and after suspend with the Thunderbolt Ethernet adapter plugged in:

26,28c26,28
< 09:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge]
< 0a:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge]
< 0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM57762 Gigabit Ethernet PCIe
---
> 09:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge] (rev ff)
> 0a:00.0 PCI bridge: Intel Corporation DSL3510 Thunderbolt Controller [Cactus Ridge] (rev ff)
> 0b:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM57762 Gigabit Ethernet PCIe (rev ff)
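
A revision of ff in lspci output usually means that config space reads are returning all ones, i.e. the device is no longer answering on the bus. As a rough sketch (the function name find_rev_ff is made up for illustration), such devices can be listed after resume like this:

```shell
# Hypothetical helper: read plain `lspci` output on stdin and print the
# slot address (first field) of every device whose revision reads as ff.
find_rev_ff() {
    awk '/\(rev ff\)$/ { print $1 }'
}

# Typical use after resuming:
#   lspci | find_rev_ff
```

An empty result after resume would suggest that all listed devices are still reachable.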
Comment 194 Christopher Broome 2016-06-01 16:28:25 UTC
(In reply to Christopher Broome from comment #193)
> I was able to successfully suspend and poweroff as well on my MacBook Pro
> 11,4.

I should note that this was with the patch from comment #172 applied to git tag v4.6 and with all options except XHC1 in /proc/acpi/wakeup disabled.
Comment 195 Chen Yu 2016-06-01 17:04:02 UTC
@Pablo Catalina @Cédric Le Goater 
It seems ioreg does not show the I/O and memory resources of a PCI bridge,
so we have to use lspci.
Would you please download pciutils, compile it on Mac OS, and then provide the output of
lspci -vvxx

ftp://atrey.karlin.mff.cuni.cz/pub/linux/pci/pciutils-3.5.1.tar.gz
tar xvzf pciutils-3.5.1.tar.gz
cd pciutils-3.5.1
make OPT="-O2 -arch i386 -arch x86_64" LDFLAGS="-arch i386 -arch x86_64" install install-lib

I found how to use lspci on Mac at  http://blog.malchuk.ru/2015/09/26/260
Comment 196 Cédric Le Goater 2016-06-02 09:12:00 UTC
If someone knows a simple way to get gcc on Mac OS X, I would
be very happy to do so. Xcode looks much too big. 

Thanks,
Comment 197 o587914 2016-06-02 09:52:05 UTC
I installed pciutils, but it won't run.

$ sudo lspci -vvxx
pcilib: Cannot open AppleACPIPlatformExpert (add boot arg debug=0x144 & run as root)
lspci: Cannot find any working access method.

I set the boot argument by running the following command (I also rebooted, of course).

$ sudo nvram boot-args="debug=0x144"
$ sudo nvram boot-args              
boot-args	debug=0x144

Googling didn't help yet. Any hints welcome.
Comment 198 Niels Dettenbach 2016-06-02 11:00:41 UTC
Hiho,

I can confirm that the following now work on a MacBookPro11,5 (MBP114.88Z.0172.B07) with the Gentoo Linux patchset:
 - power off
 - suspend to RAM

For this I did the following:
 - in GRUB: CMDLINE += nomodeset disable_mode=1 
 - I have to unload the thunderbolt module and the Thunderbolt drivers, and unplug any devices, before suspend. Resuming Thunderbolt seems to be a long-known problem (even under Mac OS X there are many third-party Thunderbolt devices that do not handle suspend according to spec).
 - avoiding a black screen after resume with my FGLRX (ATI) works with:
  * vbetool
  * USuspendRamVbeSave no
  * USuspendRamVbePost no
  * USuspendRamAcpiSleep 3

(It may work with fewer settings; this is just what I have tested so far.)

WiFi seems to work, but I have another problem with the brcmfmac driver, which stalls for 60s on a firmware load timeout (the firmware group seems to be working on it at the moment), so for now I need to restart net.wlan0 after resume.

many thanks for your work!

best regards,


Niels.
Comment 199 Kamal Mostafa 2016-06-02 16:11:34 UTC
(In reply to Chen Yu from comment #172)
> Hi guys, do you have a chance to also test patch from Yinghai:

Chen Yu and Yinghai, would it be possible to repost the comment #172 patch with a From: author, title, description, and author's Signed-off-by line (so that distros could consider it for inclusion)?  Thanks!  -Kamal
Comment 200 Naftuli Tzvi Kay 2016-06-02 16:29:32 UTC
Yes, Chen and Yinghai, could we roll up the required changes and patches into a single description? I've been monitoring this thread for quite some time now and would like to backport the patch to other kernel versions for distributions.

My understanding is that we need to apply a patch and recompile the kernel, and then disable LID0 wakeups. Are there any more steps?
Comment 201 Niels Dettenbach 2016-06-03 11:22:17 UTC
@Naftuli Tzvi Kay: 
With LID0 disabled in /proc/acpi/wakeup, my MBP11,5 did not resume when I opened the lid. I had to enable it again with "echo LID0 > /proc/acpi/wakeup", which works fine for me: suspend to RAM and resume via the lid switch work as expected.

So I'm not sure how/why disabling it seems to be required for some others here...
Comment 202 Niels Dettenbach 2016-06-03 11:31:13 UTC
(In reply to Niels Dettenbach from comment #201)
> @Naftuli Tzvi Kay: 
> So i'm not shure how/why it seems required to some others here...

...but it seems I have to disable XHC1 to avoid problems with some USB devices on suspend. Here is my (working) /proc/acpi/wakeup, to clarify:

Device  S-state   Status   Sysfs node
PEG0      S3    *disabled  pci:0000:00:01.0
GFX0      S3    *disabled  pci:0000:01:00.0
PEG1      S3    *disabled  pci:0000:00:01.1
PEG2      S3    *disabled  pci:0000:00:01.2
EC        S4    *disabled  platform:PNP0C09:00
GMUX      S3    *disabled  pnp:00:03
HDEF      S3    *disabled  pci:0000:00:1b.0
RP03      S3    *disabled  pci:0000:00:1c.2
ARPT      S4    *enabled   pci:0000:04:00.0
RP04      S3    *disabled  pci:0000:00:1c.3
XHC1      S3    *disabled  pci:0000:00:14.0
ADP1      S4    *disabled  platform:ACPI0003:00
LID0      S4    *enabled   platform:PNP0C0D:00

...otherwise, sorry for the noise...
Comment 203 Vlad Lesin 2016-06-05 05:49:41 UTC
I have tested the patch from comment #172 on a MacBook Pro 11,5 with kernel 4.4.0. Suspend to RAM and power off work well, as does WiFi after resuming.
Comment 204 junkmail-trash 2016-06-05 15:41:06 UTC
Patch from #172 applied cleanly with some fuzz to kernel 4.5.3.  Suspend/Resume/Poweroff/Reboot works great on 11,4 with an Arch Linux change suggested here:

https://aur.archlinux.org/pkgbase/linux-macbook-pro/

> Create /etc/tmpfiles.d/wakeup.conf with the contents: w /proc/acpi/wakeup - - - - LID0

One problem I am running into is the tg3 thunderbolt ethernet adapter will not exit the D3 state after resume if it is left plugged in during suspend.  This persists even after removing/adding the adapter and modules.  Removing the adapter and modules and going through another suspend/resume cycle resolves the issue, as does removing the adapter prior to suspend.

Thanks for all your help Chen Yu!
Comment 205 arjen.veenhuizen 2016-06-07 08:17:13 UTC
Patch from #172 applied to kernel 4.5.0 on a MBP 11,5 running Xubuntu 14.04.4 x64 using this tutorial: http://www.cyberciti.biz/faq/debian-ubuntu-building-installing-a-custom-linux-kernel/

Glad to report that suspend, reboot and shutdown work now. Thanks Chen Yu and all who helped!
Comment 206 thejoe 2016-06-07 17:18:17 UTC
Created attachment 219321 [details]
lscpi from MacOS

lspci output from following these instructions to install lspci: https://rampagedev.wordpress.com/more-guides/use-lspci-for-info/
Comment 207 Naftuli Tzvi Kay 2016-06-09 02:04:24 UTC
Any idea on when this will get merged into mainline kernel?
Comment 208 arjen.veenhuizen 2016-06-10 05:48:18 UTC
In addition to my comment in #205, I had to disable ARPT for suspend to work consistently after the first suspend. Without it disabled, it would resume after 5 to 10 seconds (MBP 11,5, 4.5.0 patched with #172) after the first time it suspended.

$ cat /proc/acpi/wakeup

Device	S-state	  Status   Sysfs node
PEG0	  S3	*disabled  pci:0000:00:01.0
GFX0	  S3	*disabled  pci:0000:01:00.0
PEG1	  S3	*disabled  pci:0000:00:01.1
PEG2	  S3	*disabled  pci:0000:00:01.2
EC	  S4	*disabled  platform:PNP0C09:00
GMUX	  S3	*disabled  pnp:00:03
HDEF	  S3	*disabled  pci:0000:00:1b.0
RP03	  S3	*disabled  pci:0000:00:1c.2
ARPT	  S4	*disabled  pci:0000:04:00.0
RP04	  S3	*disabled  pci:0000:00:1c.3
XHC1	  S3	*disabled  pci:0000:00:14.0
ADP1	  S4	*disabled  platform:ACPI0003:00
LID0	  S4	*disabled  platform:PNP0C0D:00

Use sudo sh -c 'echo ARPT > /proc/acpi/wakeup' to toggle the property.
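
Because a write to /proc/acpi/wakeup toggles the entry rather than setting it, running the same command twice re-enables the device. A possible guard, sketched here with made-up helper names (wakeup_status, wakeup_disable), is to read the current status first and write only when the device is still enabled:

```shell
# Hypothetical helper: print "enabled" or "disabled" for device $1.
# The table file defaults to /proc/acpi/wakeup; $2 can override it.
wakeup_status() {
    awk -v dev="$1" '$1 == dev { sub(/^\*/, "", $3); print $3 }' \
        "${2:-/proc/acpi/wakeup}"
}

# Hypothetical helper: disable wakeup for device $1, but only if it is
# currently enabled, since each write flips the state.
wakeup_disable() {
    file="${2:-/proc/acpi/wakeup}"
    if [ "$(wakeup_status "$1" "$file")" = "enabled" ]; then
        echo "$1" > "$file"
    fi
}

# Typical use (as root):
#   wakeup_disable XHC1
#   wakeup_disable ARPT
```

This makes the setting safe to apply from a boot script or a resume hook that may run more than once.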
Comment 209 Berthold Crysmann 2016-06-10 12:04:51 UTC
I successfully applied the patch to kernel 4.6.1 on MBP 11,5. (Ubuntu 16.04 with mainline kernel). Thanks for the patch. I am now back to the OS I have been using ever since 1993.

There is a remaining problem with thunderbolt (see below).

(In reply to junkmail-trash from comment #204)
> Patch from #172 applied cleanly with some fuzzing to kernel 4.5.3. 
> Suspend/Resume/Poweroff/Reboot works great on 11,4 with a Arch linux change
> suggested here:
> 
> https://aur.archlinux.org/pkgbase/linux-macbook-pro/
> 
> > Create /etc/tmpfiles.d/wakeup.conf with the contents: w /proc/acpi/wakeup -
> - - - LID0
> 
> One problem I am running into is the tg3 thunderbolt ethernet adapter will
> not exit the D3 state after resume if it is left plugged in during suspend. 
> This persists even after removing/adding the adapter and modules.  Removing
> the adapter and modules and going through another suspend/resume cycle
> resolves the issue, as does removing the adapter prior to suspend.
> 
> Thanks for all your help Chen Yu!

I cannot suspend with the thunderbolt ethernet adapter plugged in. After resume the device is gone.

If I manually unload the kernel module (modprobe -r thunderbolt) and manually reinsert it after resume, the device is usable. 

Strangely enough, trying to automate this with the help of SUSPEND_MODULES produces a hang during resume. 

I also tried with an external monitor, and behaviour is similar. 

What works for external display is the following procedure:

1. Deactivate external display (e.g. System Setting|Displays)
2. Unload thunderbolt manually
3. Suspend

I can then cleanly resume, load the driver and activate the external display. 

Alternative for the network adapter is to yank it out before suspend. Works with display adapter when external display has been disabled. 


Cheers, 

B
Comment 210 Berthold Crysmann 2016-06-10 12:08:50 UTC
Just tested once more with the VGA adapter: yanking out prior to suspend worked once. Shall provide more info as I keep using it. 

B
Comment 211 Berthold Crysmann 2016-06-10 12:12:46 UTC
(In reply to Berthold Crysmann from comment #210)
> Just tested once more with the VGA adapter: yanking out prior to suspend
> worked once. Shall provide more info as I keep using it. 
> 
> B

I meant: just yank it out, without deactivating the external display beforehand.

B
Comment 212 Naftuli Tzvi Kay 2016-06-10 20:43:01 UTC
I have pm-utils unload the thunderbolt module on sleep:

/etc/pm/config.d/modules:
SUSPEND_MODULES="thunderbolt"

Be advised that if you disable any GPE ACPI interrupts, you'll need to restore them before suspending. If I do this:

echo disable | sudo tee /sys/firmware/acpi/interrupts/gpe17

The laptop will not suspend. The above command stops a certain ACPI interrupt from firing constantly and triggering a lot of CPU wakeups. In order to get the laptop to suspend, I have to enable the interrupt immediately before suspending and disable it again on resume:

/etc/pm/sleep.d/30_disable_gpe17:
#!/bin/bash

source "/usr/lib/pm-utils/pm-functions"

gpe17="/sys/firmware/acpi/interrupts/gpe17"

function suspend_action() {
    logger -s "Enabling GPE 17 before sleep."
    echo enable | tee $gpe17 >/dev/null
    return 0
}

function resume_action() {
    logger -s "Disabling GPE 17 on resume from sleep."
    echo disable | tee $gpe17 >/dev/null
    return 0
}

case "$1" in
    hibernate|suspend)
        suspend_action
        ;;
    thaw|resume)
        resume_action
        ;;
    *) 
        exit $NA
        ;;
esac
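
On distributions that suspend through systemd instead of pm-utils, the same enable-before-sleep, disable-after-resume logic could be expressed as a systemd-sleep hook. This is only a sketch: the install path /usr/lib/systemd/system-sleep/gpe17 and the function name gpe17_hook are assumptions, and the GPE number is the one from the script above. systemd runs such hooks with "pre" or "post" as the first argument.

```shell
#!/bin/sh
# Sketch of a systemd-sleep hook (assumed install path:
# /usr/lib/systemd/system-sleep/gpe17, marked executable).
# systemd calls it with $1 = pre|post and $2 = suspend|hibernate|...

# GPE_FILE can be overridden from the environment for testing.
GPE_FILE="${GPE_FILE:-/sys/firmware/acpi/interrupts/gpe17}"

gpe17_hook() {
    case "$1" in
        pre)  echo enable  > "$GPE_FILE" ;;   # re-enable before sleeping
        post) echo disable > "$GPE_FILE" ;;   # silence again after resume
        *)    return 0 ;;                     # ignore anything else
    esac
}

gpe17_hook "$@"
```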
Comment 213 Justin Dray 2016-06-10 21:10:19 UTC
I'm now up to 4 days of uptime, suspending 1-3 times a day without issue, both with and without monitors attached (triple screen: 2x DisplayPort to DVI and one HDMI to DVI).

I only disabled XHC1, nothing else. Only once did the laptop wake up right after I put it to sleep, and it went back to sleep after a few seconds anyway. 

USB Ethernet also works fine.  So it seems the only remaining issue is thunderbolt adapters that are broken during suspend.
Comment 214 Naftuli Tzvi Kay 2016-06-15 01:21:56 UTC
I've had WiFi break, but interestingly enough, another subsequent suspend/resume fixed it.
Comment 215 Pablo Catalina 2016-06-15 08:16:54 UTC
Hi,

After few days with the patch from #comment 172:

* Poweroff is working fine
* Suspend is working fine. I did not disable LID0 wakeups, and I suspend by closing the lid. The wakeup problem appears when you suspend the laptop without closing the lid.
* Wakeup from Thunderbolt only happens if the Thunderbolt device is Ethernet. I tried with Ethernet and DVI; the DVI adapter does not wake up the laptop.

* With kernel 4.6.0, the screen is sometimes black after resuming from suspend, and I have to switch TTYs to recover the X display. This is a separate i915 bug and seems to be fixed in 4.6.2 (I just compiled it and need to test it longer to confirm).

Thank you very much!

Pablo

NOTE: I attached some data from my laptop above, but I'm using a MacBook Pro 15'' (hardware version MacBookPro11,4)
Comment 216 Justin Dray 2016-06-15 12:10:42 UTC
@Pablo, are you sure it was a thunderbolt to DVI adapter? Display port is the same port, and as far as I know no-one makes thunderbolt to DVI adapters, so it wouldn't be using the thunderbolt bus. Unless it was one of those all-in-one thunderbolt to DVI/VGA/etc plugs.
Comment 217 Pablo Catalina 2016-06-15 12:53:25 UTC
@justin, sorry, yes, it is DVI, not Thunderbolt. I was mistaken because the connector is the same and it plugs into the same port on the laptop as Thunderbolt.

So I have not tested suspend with any Thunderbolt device except the Ethernet adapter, which is Thunderbolt and makes the system wake up after suspend.
Comment 218 Dj 2016-07-07 13:46:41 UTC
Just confirmed that this fixed issues for me (mostly). I have an MBP 11,4 and compiled the patch within v4.4.0-28-generic on Ubuntu 16.04. Shutdown works correctly, and standby works without freezing. When I did this last night, the wireless driver stopped working on resume from standby. However, I can't reproduce it again. I'll subscribe to this and provide an update if I notice anything else.

I'm also keeping track of this in an AskUbuntu question, with more details:
http://askubuntu.com/questions/672750/standby-and-shutdown-hang-on-macbook-pro-11-4

Thanks to everyone who worked on this!
Comment 219 Chandler Melton 2016-07-09 13:55:38 UTC
I can confirm this resolved my issue using a MBP. Applied the fix to 4.4.0-28-generic on Linux Mint 18.
Comment 220 Luis de Bethencourt 2016-07-22 11:01:46 UTC
Another success story after applying the patch in #comment 172

I don't see the patch in linux-next, which I build and run almost on a daily basis for testing different things. Can we push this to mainline?

Thanks Chen Yu and Yinghai for the patch, and everyone else who contributed :)
Comment 221 Chen Yu 2016-07-22 14:49:03 UTC
This patch is a workaround, not a fix for the root cause. I previously got some suggestions on how to debug this issue and might need your help later to continue. Thanks.
Comment 222 Luis de Bethencourt 2016-07-22 15:02:35 UTC
Sure! Happy to test this.

If you don't have time for a correct patch, I am happy to help with the code as well.
Comment 223 Jonathan Dunlap 2016-07-24 18:53:47 UTC
Patch in #comment 172 also fixed my sleep/suspend issue for my 2015 Macbook Pro 15" running Xubuntu with kernel 4.6.4 from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6.4/


It would be great to get this patch or the core fix in for 4.7 if possible.
Comment 224 Luis de Bethencourt 2016-07-24 19:46:35 UTC
(In reply to Jonathan Dunlap from comment #223)
> 
> It would be great to get this patch or the core fix in for 4.7 if possible.

That would be tough considering 4.7 is planned to come out today or tomorrow.
Comment 225 Paulo 2016-07-31 15:16:38 UTC
Now that we are into 4.8 window, was the patch in #comment 172 accepted into mainline?
Comment 226 Bjorn Helgaas 2016-08-01 13:31:05 UTC
(In reply to Paulo from comment #225)
> Now that we are into 4.8 window, was the patch in #comment 172 accepted into
> mainline?

No.  The comment 172 patch is not appropriate for mainline.  That patch disables hotplug for one of the root ports of an Intel Haswell chipset.  On the Macbook Pro, this port seems to be unused, so disabling hotplug doesn't seem to hurt anything.

But the same chipset is also used in other systems, and hotplug may be important in them, so we can't blindly disable hotplug on that port.

There was some discussion of this here: http://lkml.kernel.org/r/1464604404-11257-1-git-send-email-yu.c.chen@intel.com

That thread also included several suggestions about how to debug the real problem, but I haven't seen any indication that anybody looked into them.

If somebody comes up with a fix, we can merge it at any time; it's a bug fix, so we don't have to wait for a merge window.
Comment 227 Cyrille Bartholomée 2016-08-01 20:39:43 UTC
Applied patch in comment 172 to kernel 4.4.0-28 and 4.4.0-31 on Ubuntu 16.04. Issue remains. If additional info or help is needed/wanted please ask.
Comment 228 Jonathan Dunlap 2016-08-01 20:44:36 UTC
(In reply to Cyrille Bartholomée from comment #227)
> Applied patch in comment 172 to kernel 4.4.0-28 and 4.4.0-31 on Ubuntu
> 16.04. Issue remains. If additional info or help is needed/wanted please ask.

Try also using Comment #212's solution, and the below change...

Add to the file /etc/pm/config.d/00sleep_module (using sudo or as root) this line:

SUSPEND_MODULES="sky2"
Comment 229 Bjorn Helgaas 2016-08-01 20:46:36 UTC
(In reply to Cyrille Bartholomée from comment #227)
> Applied patch in comment 172 to kernel 4.4.0-28 and 4.4.0-31 on Ubuntu
> 16.04. Issue remains. If additional info or help is needed/wanted please ask.

Thanks, Cyrille.  Can you please attach a complete dmesg log and output of "sudo lspci -vv" and "lspci -n"?  The dmesg log might help tune a DMI-based quirk, and lspci might help us figure out why the comment 172 patch works on some machines but not on yours.
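
The idea behind a DMI-based quirk is to apply the workaround only when both the machine model and the PCI device match, so other systems with the same chipset are left alone. The real quirk would live in the kernel in C; the following userspace sketch only illustrates the matching logic. The function name is_affected_macbook is made up, and the model list is an assumption based on the models reported in this thread.

```shell
# Hypothetical check: succeed only if the DMI product name is one of the
# models reported here AND an Intel 8086:8c10 PCI device is present.
# $1 optionally overrides the sysfs root (default /sys) for testing.
is_affected_macbook() {
    root="${1:-/sys}"
    product=$(cat "$root/class/dmi/id/product_name" 2>/dev/null) || return 1
    case "$product" in
        MacBookPro11,4|MacBookPro11,5) ;;   # assumed model list
        *) return 1 ;;
    esac
    for dev in "$root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        if [ "$(cat "$dev/vendor")" = "0x8086" ] &&
           [ "$(cat "$dev/device")" = "0x8c10" ]; then
            return 0
        fi
    done
    return 1
}

# Typical use:
#   is_affected_macbook && echo "workaround applies to this machine"
```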
Comment 230 Chen Yu 2016-08-02 07:04:15 UTC
(In reply to Bjorn Helgaas from comment #226)
> (In reply to Paulo from comment #225)
> > Now that we are into 4.8 window, was the patch in #comment 172 accepted
> into
> > mainline?
> 
> No.  The comment 172 patch is not appropriate for mainline.  That patch
> disables hotplug for one of the root ports of an Intel Haswell chipset.  On
> the Macbook Pro, this port seems to be unused, so disabling hotplug doesn't
> seem to hurt anything.
> 
> But the same chipset is also used in other systems, and hotplug may be
> important in them, so we can't blindly disable hotplug on that port.
> 
> There was some discussion of this here:
> http://lkml.kernel.org/r/1464604404-11257-1-git-send-email-yu.c.chen@intel.
> com
> 
> That thread also included several suggestions about how to debug the real
> problem, but I haven't seen any indication that anybody looked into them.
> 
> If somebody comes up with a fix, we can merge it at any time; it's a bug
> fix, so we don't have to wait for a merge window.
Hi Bjorn, thanks for your suggestions. I was a little busy at that time; as I replied in comment #221, I will return to this ticket and dig into more detail as you suggested.
Comment 231 Chen Yu 2016-08-02 07:05:51 UTC
Paste the suggestion here for reference
https://patchwork.kernel.org/patch/9143637/
Comment 232 Cyrille Bartholomée 2016-08-02 13:12:47 UTC
Created attachment 227281 [details]
dmesg.txt

Hi Bjorn. You can find the logs attached. Hope it helps. Whenever 
something needs to be tested don't hesitate to ask.
Comment 233 Cyrille Bartholomée 2016-08-02 13:12:50 UTC
Created attachment 227291 [details]
lspci-n.txt
Comment 234 Cyrille Bartholomée 2016-08-02 13:12:50 UTC
Created attachment 227301 [details]
lspci-vv.txt
Comment 235 Cyrille Bartholomée 2016-08-03 18:52:57 UTC
I finally managed to get the patch working! Used tutorial mentioned in comment 205. (failed attempt: https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel) Apologies for the disturbance in this thread :-S
Comment 236 Leo Ufimtsev 2016-08-08 18:01:24 UTC
(In reply to Chen Yu from comment #172)
> Hi guys, do you have a chance to also test patch from Yinghai:
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index ee72ebe..d3ec833 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
> 
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> 
> +static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
> +{
> +       dev->is_hotplug_bridge = 0;
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> quirk_hotplug_bridge_skip);
> +
>  /*
>   * This is a quirk for the Ricoh MMC controller found as a part of
>   * some mulifunction chips.

I haven't read the whole thread, just fyi:

I have tested this patch:
http://copr-dist-git.fedorainfracloud.org/cgit/asamalik/MacBook-kernel/kernel.git/tree/macbook-sleep.patch?h=f24
On my macbook Pro retina display (2016), it fixed my shutdown /reboot problem.
Comment 237 Chen Yu 2016-08-09 07:02:06 UTC
(In reply to Leo Ufimtsev from comment #236)
> (In reply to Chen Yu from comment #172)
> > Hi guys, do you have a chance to also test patch from Yinghai:
> > 
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index ee72ebe..d3ec833 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -2775,6 +2775,13 @@ static void quirk_hotplug_bridge(struct pci_dev
> *dev)
> > 
> >  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020,
> quirk_hotplug_bridge);
> > 
> > +static void quirk_hotplug_bridge_skip(struct pci_dev *dev)
> > +{
> > +       dev->is_hotplug_bridge = 0;
> > +}
> > +
> > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> > quirk_hotplug_bridge_skip);
> > +
> >  /*
> >   * This is a quirk for the Ricoh MMC controller found as a part of
> >   * some mulifunction chips.
> 
> I haven't read the whole thread, just fyi:
> 
> I have tested this patch:
> http://copr-dist-git.fedorainfracloud.org/cgit/asamalik/MacBook-kernel/
> kernel.git/tree/macbook-sleep.patch?h=f24
> On my macbook Pro retina display (2016), it fixed my shutdown /reboot
> problem.

Hi, I cannot open the above link. Could you please provide your /proc/iomem with the patch applied?
Comment 238 Chen Yu 2016-08-09 07:41:57 UTC
Hi, all, could you help confirm if it still works after executing the following commands (with the patch applied):
1.
setpci -s 0000:00:1c.0 MEMORY_BASE
setpci -s 0000:00:1c.0 MEMORY_LIMIT
setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT

2.
setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.W=f040

3.
setpci -s 0000:00:1c.0 MEMORY_BASE
setpci -s 0000:00:1c.0 MEMORY_LIMIT
setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT

4.
Then check whether poweroff works for you.


The reason for this test:
Based on the previous discussion, there might be an unreported device that is using the memory apertures [7fa00000-7fbfffff] and [7fc00000-7fdfffff pref].
According to the iomem maps in
https://bugzilla.kernel.org/attachment.cgi?id=210621 and
https://bugzilla.kernel.org/attachment.cgi?id=210601
we can change the windows for 1c.0 to
[f0000000-f01fffff] and [f0200000-f03fffff pref] respectively, since no one is using this memory region. If poweroff still works, that demonstrates that some device is probably using the original addresses, resulting in a conflict. If it does not work, that implies there might be a firmware/hardware issue with this bridge, and we should not allocate memory for it.

Anyway, since this platform has many firmware issues, such as MMCFG conflicting with e820/_CRS for the root PCI bus, a DMI + quirk approach would be more acceptable. 

5. Please also upload your full dmidecode output.
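For anyone following along, here is a small sketch of how the 16-bit values used with setpci above map to address ranges. It assumes the standard PCI-to-PCI bridge layout, where bits 15:4 of MEMORY_BASE and MEMORY_LIMIT hold address bits 31:20 and the low four bits are flags. The function names are made up for illustration.

```python
from typing import Tuple


def decode_bridge_window(base_reg: int, limit_reg: int) -> Tuple[int, int]:
    """Decode 16-bit bridge base/limit register values into (start, end).

    Bits 15:4 of each register hold address bits 31:20. The low four bits
    are flags (for example the 64-bit indicator on the prefetchable pair)
    and are masked off. The base is 1 MiB aligned and the limit covers the
    whole last 1 MiB granule.
    """
    start = (base_reg & 0xFFF0) << 16
    end = ((limit_reg & 0xFFF0) << 16) | 0xFFFFF
    return start, end


def fmt(window: Tuple[int, int]) -> str:
    """Format a window the way /proc/iomem prints ranges."""
    return f"{window[0]:08x}-{window[1]:08x}"


if __name__ == "__main__":
    # Register values matching the original apertures quoted above.
    print(fmt(decode_bridge_window(0x7FA0, 0x7FB0)))  # 7fa00000-7fbfffff
    print(fmt(decode_bridge_window(0x7FC1, 0x7FD1)))  # 7fc00000-7fdfffff
    # Values written in step 2.
    print(fmt(decode_bridge_window(0xF000, 0xF020)))  # f0000000-f02fffff
    print(fmt(decode_bridge_window(0xF020, 0xF040)))  # f0200000-f04fffff
```

If this decoding is right, the values in step 2 open windows of f0000000-f02fffff and f0200000-f04fffff, slightly larger than the ranges described above. Limits of f010 and f030 would match those ranges exactly. For the purpose of the test (moving the windows away from 7fa00000) this probably does not matter.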
Comment 239 Tom B 2016-08-15 11:10:22 UTC
I can't seem to get this patch to work. I applied it using the `linux-macbook` archlinux package ( https://aur.archlinux.org/packages/linux-macbook/ ) and followed the instructions (added disable_mode=1 to the kernel boot options and `w /proc/acpi/wakeup - - - - LID0` to `/etc/tmpfiles.d/wakeup.conf`). Shutdown and reboot now work as expected, but suspend does not. It still hangs and overheats when either closing the lid or using systemd to suspend. Is there anything else I need to do to get this working?

This is a Macbook pro retina 11,4.
Comment 240 Chen Yu 2016-08-15 11:19:43 UTC
(In reply to Tom B from comment #239)
> I can't seem to get this patch to work. I applied it using the
> `linux-macbook` archlinux package (
> https://aur.archlinux.org/packages/linux-macbook/ ) and followed the
> instructions (added disable_mode=1 to the kernel boot option and `w
> /proc/acpi/wakeup - - - - LID0` to `/etc/tmpfiles.d/wakeup.conf` shutdown
> and reboot now work as expected but suspend does not. It still hangs and
> overheats when either closing the lid or using systemd to suspend. Is there
> anything else I need to do to get this working?
> 
> This is a Macbook pro retina 11,4.

Suspend might be a separate issue. Could you upload your lspci -vvxx and run through the instructions in comment #238, please?
Comment 241 Tom B 2016-08-15 11:28:52 UTC
Created attachment 228811 [details]
lspci -vvxx Macbook Pro 11,4
Comment 242 Bastian Triller 2016-08-15 11:33:01 UTC
(In reply to Chen Yu from comment #238)
> Hi, all, could you help confirm if it still works after executing the
> following commands (with the patch applied):
> 1.
> setpci -s 0000:00:1c.0 MEMORY_BASE
> setpci -s 0000:00:1c.0 MEMORY_LIMIT
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 

root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
fff0
root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
0000
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
fff1
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
0001

> 2.
> setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040
> 
> 3.
> setpci -s 0000:00:1c.0 MEMORY_BASE
> setpci -s 0000:00:1c.0 MEMORY_LIMIT
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 

root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
f000
root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
f020
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
f021
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
f041

> 4.
> then check if poweroff works for you?

poweroff still works
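A note on reading the first set of values (fff0/0000 and fff1/0001): as far as I understand the bridge registers, a window whose base is above its limit is closed, so with the patch applied the bridge had no memory window at all before step 2. A minimal sketch of that check follows; the function name is made up.

```python
def window_is_open(base_reg: int, limit_reg: int) -> bool:
    """Return True if a bridge memory window forwards any addresses.

    Only bits 15:4 of the 16-bit registers carry address bits, so the low
    nibble (flag bits) is masked off before comparing. A base above the
    limit means the window is closed.
    """
    return (base_reg & 0xFFF0) <= (limit_reg & 0xFFF0)


if __name__ == "__main__":
    print(window_is_open(0xFFF0, 0x0000))  # False: values read in step 1
    print(window_is_open(0xF000, 0xF020))  # True: values read in step 3
```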
Comment 243 Bastian Triller 2016-08-15 11:34:20 UTC
Created attachment 228821 [details]
dmidecode after setpci
Comment 244 Chen Yu 2016-08-15 11:36:14 UTC
(In reply to Bastian Triller from comment #242)
> (In reply to Chen Yu from comment #238)
> > Hi, all, could you help confirm if it still works after executing the
> > following commands (with the patch applied):
> > 1.
> > setpci -s 0000:00:1c.0 MEMORY_BASE
> > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > 
> 
> root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
> fff0
> root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
> 0000
> root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> fff1
> root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 0001
> 
> > 2.
> > setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> > setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040
> > 
> > 3.
> > setpci -s 0000:00:1c.0 MEMORY_BASE
> > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > 
> 
> root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
> f000
> root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
> f020
> root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> f021
> root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> f041
> 
> > 4.
> > then check if poweroff works for you?
> 
> poweroff still works

Great, thanks
Comment 245 Tom B 2016-08-15 11:40:23 UTC
> 1.
> setpci -s 0000:00:1c.0 MEMORY_BASE
> setpci -s 0000:00:1c.0 MEMORY_LIMIT
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT

Poweroff no longer works. It hangs on "Power down".


> 2.
> setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040

Poweroff works


> 3
> setpci -s 0000:00:1c.0 MEMORY_BASE
> setpci -s 0000:00:1c.0 MEMORY_LIMIT
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT

Poweroff no longer works. It hangs on "Power down".




If I run all 3 sets of commands, then poweroff works.
Comment 246 Chen Yu 2016-08-15 11:44:53 UTC
(In reply to Tom B from comment #245)
> > 1.
> > setpci -s 0000:00:1c.0 MEMORY_BASE
> > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 
> Poweroff no longer works. hangs on "Power down"
> 
> 
> > 2.
> > setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> > setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040
> 
> Poweroff works
> 
> 
> > 3
> > setpci -s 0000:00:1c.0 MEMORY_BASE
> > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 
> Poweroff no longer works. hangs on "Power down"
> 
> 
> 
> 
> If I run all 3 sets of commands and then Poweroff works

Thanks. Did you test without the patch applied?
If this was without the patch applied, then it demonstrates that there is a resource conflict at the original address.
Comment 247 Bastian Triller 2016-08-15 11:48:04 UTC
(In reply to Chen Yu from comment #244)
> (In reply to Bastian Triller from comment #242)
> > (In reply to Chen Yu from comment #238)
> > > Hi, all, could you help confirm if it still works after executing the
> > > following commands (with the patch applied):
> > > 1.
> > > setpci -s 0000:00:1c.0 MEMORY_BASE
> > > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > > 
> > 
> > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
> > fff0
> > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > 0000
> > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > fff1
> > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > 0001
> > 
> > > 2.
> > > setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> > > setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> > > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> > > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040
> > > 
> > > 3.
> > > setpci -s 0000:00:1c.0 MEMORY_BASE
> > > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > > 
> > 
> > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
> > f000
> > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > f020
> > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > f021
> > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > f041
> > 
> > > 4.
> > > then check if poweroff works for you?
> > 
> > poweroff still works
> 
> Great, thanks

Oops, I may have misunderstood your instructions. I only tested poweroff after executing all the commands.
Comment 248 Chen Yu 2016-08-15 11:53:11 UTC
(In reply to Bastian Triller from comment #247)
> (In reply to Chen Yu from comment #244)
> > (In reply to Bastian Triller from comment #242)
> > > (In reply to Chen Yu from comment #238)
> > > > Hi, all, could you help confirm if it still works after executing the
> > > > following commands (with the patch applied):
> > > > 1.
> > > > setpci -s 0000:00:1c.0 MEMORY_BASE
> > > > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > > > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > > > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > > > 
> > > 
> > > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
> > > fff0
> > > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > > 0000
> > > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > > fff1
> > > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > > 0001
> > > 
> > > > 2.
> > > > setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> > > > setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> > > > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> > > > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040
> > > > 
> > > > 3.
> > > > setpci -s 0000:00:1c.0 MEMORY_BASE
> > > > setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > > > setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > > > setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > > > 
> > > 
> > > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
> > > f000
> > > root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
> > > f020
> > > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> > > f021
> > > root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> > > f041
> > > 
> > > > 4.
> > > > then check if poweroff works for you?
> > > 
> > > poweroff still works
> > 
> > Great, thanks
> 
> Oops, I maybe misunderstood your instructions. I only tested poweroff after
> executing all commands.
Yes, you are right. I just wanted to test poweroff with all the commands executed and the patch applied.
Comment 249 Tom B 2016-08-15 11:56:42 UTC
> Thanks. Did you test without the patch applied?
>If without patch applied, then this demonstrate there is resource conflict at
>the  original address.

This is with the patch applied. If it means anything, the commands in groups 1 and 3 print some hex numbers to the screen; the commands in group 2 produce no output.

> Yes, you are right, I just want to test poweroff with all the command
> executed, with patch applied.

Sorry, I wasn't sure, so I provided the results from all stages.
Comment 250 Bastian Triller 2016-08-15 11:58:40 UTC
(In reply to Chen Yu from comment #238)
> Hi, all, could you help confirm if it still works after executing the
> following commands (with the patch applied):
> 1.
> setpci -s 0000:00:1c.0 MEMORY_BASE
> setpci -s 0000:00:1c.0 MEMORY_LIMIT
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 

root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
7fa0
root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
7fb0
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
7fc1
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
7fd1

> 2.
> setpci -s 0000:00:1c.0 MEMORY_BASE.W=f000
> setpci -s 0000:00:1c.0 MEMORY_LIMIT.W=f020
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE.W=f020
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT.w=f040
> 
> 3.
> setpci -s 0000:00:1c.0 MEMORY_BASE
> setpci -s 0000:00:1c.0 MEMORY_LIMIT
> setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
> setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
> 

root@grml:~# setpci -s 0000:00:1c.0 MEMORY_BASE
f000
root@grml:~# setpci -s 0000:00:1c.0 MEMORY_LIMIT
f020
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
f021
root@grml:~# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
f041

> 4.
> then check if poweroff works for you?
> 

poweroff also works if the patch isn't applied.
Comment 251 Chen Yu 2016-08-15 12:38:18 UTC
Could you please check if the following patch works?

Index: linux/drivers/pci/quirks.c
===================================================================
--- linux.orig/drivers/pci/quirks.c
+++ linux/drivers/pci/quirks.c
@@ -2775,6 +2775,15 @@ static void quirk_hotplug_bridge(struct
 
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
 
+static void quirk_apple_mbp_poweroff(struct pci_dev *dev)
+{
+	if (!dmi_match(DMI_BOARD_VENDOR, "Apple Inc.") &&
+	    !dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,4"))
+		dev->is_hotplug_bridge = 0;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
+
 /*
  * This is a quirk for the Ricoh MMC controller found as a part of
  * some mulifunction chips.
Comment 252 Chen Yu 2016-08-15 12:41:28 UTC
Oops, please ignore the patch above and check this one instead:

Index: linux/drivers/pci/quirks.c
===================================================================
--- linux.orig/drivers/pci/quirks.c
+++ linux/drivers/pci/quirks.c
@@ -2775,6 +2775,15 @@ static void quirk_hotplug_bridge(struct

 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);

+static void quirk_apple_mbp_poweroff(struct pci_dev *dev)
+{
+       if (dmi_match(DMI_BOARD_VENDOR, "Apple Inc.") &&
+           dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,4"))
+               dev->is_hotplug_bridge = 0;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
+
 /*
  * This is a quirk for the Ricoh MMC controller found as a part of
  * some mulifunction chips.
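The only difference from the patch in comment 251 is the condition. The toy model below, which is not kernel code, shows why it matters. `dmi_match` is modelled as an exact string comparison, and the dictionary keys and the non-Apple sample machine are made up for illustration.

```python
def dmi_match(dmi: dict, field: str, value: str) -> bool:
    """Toy stand-in for the kernel helper: exact string comparison."""
    return dmi.get(field) == value


def quirk_applies_v1(dmi: dict) -> bool:
    """Condition from comment 251: negated by mistake."""
    return (not dmi_match(dmi, "board_vendor", "Apple Inc.")
            and not dmi_match(dmi, "product_name", "MacBookPro11,4"))


def quirk_applies_v2(dmi: dict) -> bool:
    """Condition from comment 252: both strings must match."""
    return (dmi_match(dmi, "board_vendor", "Apple Inc.")
            and dmi_match(dmi, "product_name", "MacBookPro11,4"))


if __name__ == "__main__":
    mbp = {"board_vendor": "Apple Inc.", "product_name": "MacBookPro11,4"}
    other = {"board_vendor": "Example Corp", "product_name": "Laptop 1"}
    print(quirk_applies_v1(mbp), quirk_applies_v1(other))  # False True
    print(quirk_applies_v2(mbp), quirk_applies_v2(other))  # True False
```

In this model the first version skips the MacBookPro11,4 and clears is_hotplug_bridge on every non-Apple machine that has this root port, which is the opposite of what was intended. Because the match is exact, a MacBookPro11,5 is not covered by either version.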
Comment 253 Tom B 2016-08-15 16:11:20 UTC
Apologies, I double-checked and my Mac is an 11,5. I changed the patch to use `dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,5")`; unfortunately there's still no suspend.
Comment 254 Tom B 2016-08-15 16:25:54 UTC
I'm not sure if this is related but after unplugging the power lead after a clean boot, KDE still reports the battery as "Charging".

Is there any command I can run to display DMI_PRODUCT_NAME and DMI_BOARD_VENDOR to check that it's actually being applied?
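One possible way to look at those strings, assuming the kernel exposes DMI data under /sys/class/dmi/id (the usual location), where DMI_PRODUCT_NAME and DMI_BOARD_VENDOR show up as the files product_name and board_vendor. The helper name and its `root` parameter are made up. Note that this only shows what the quirk would compare against, not whether the quirk actually ran.

```python
from pathlib import Path


def read_dmi(field: str, root: str = "/sys/class/dmi/id") -> str:
    """Return one DMI identification string with trailing whitespace removed.

    `field` is a file name below `root`, for example "product_name" or
    "board_vendor". `root` can be pointed elsewhere for testing.
    """
    return (Path(root) / field).read_text().strip()
```

On the affected machine, `read_dmi("product_name")` and `read_dmi("board_vendor")` should return exactly the strings that the patch has to match.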
Comment 255 Chen Yu 2016-08-16 03:01:11 UTC
(In reply to Tom B from comment #254)
> I'm not sure if this is related but after unplugging the power lead after a
> clean boot, KDE still reports the battery as "Charging".
> 
> Is there any command I can run to display DMI_PRODUCT_NAME and
> DMI_BOARD_VENDOR to check that it's actually being applied?

Yes, could you attach your lspci -vvxx with this patch applied? How about poweroff?
Comment 256 Chen Yu 2016-08-16 03:02:44 UTC
(In reply to Tom B from comment #254)
> I'm not sure if this is related but after unplugging the power lead after a
> clean boot, KDE still reports the battery as "Charging".
> 
> Is there any command I can run to display DMI_PRODUCT_NAME and
> DMI_BOARD_VENDOR to check that it's actually being applied?

and please check
# setpci -s 0000:00:1c.0 MEMORY_BASE
# setpci -s 0000:00:1c.0 MEMORY_LIMIT
# setpci -s 0000:00:1c.0 PREF_MEMORY_BASE
# setpci -s 0000:00:1c.0 PREF_MEMORY_LIMIT
with patch applied.
Comment 257 thejoe 2016-08-19 02:36:26 UTC
I can confirm the updated patch works for both suspend & shutdown on a MacBookPro11,4
Comment 258 Chen Yu 2016-08-19 03:08:34 UTC
OK, thanks.
The patch has been posted at https://patchwork.kernel.org/patch/9288825/

@Bjorn Helgaas
Comment 259 Jonathan Dunlap 2016-08-29 01:05:23 UTC
(In reply to Chen Yu from comment #258)
> OK, thanks.
> The patch has been posted at https://patchwork.kernel.org/patch/9288825/
> 
> @Bjorn Helgaas

How does the process work from here? E.g. What are the steps required to get it from 'patchwork' into the kernel official release?
Comment 260 Luis de Bethencourt 2016-08-29 12:22:33 UTC
(In reply to Jonathan Dunlap from comment #259)
> (In reply to Chen Yu from comment #258)
> > OK, thanks.
> > The patch has been posted at https://patchwork.kernel.org/patch/9288825/
> > 
> > @Bjorn Helgaas
> 
> How does the process work from here? E.g. What are the steps required to get
> it from 'patchwork' into the kernel official release?

The patch has been sent to the proper mailing lists and maintainers.

These maintainers will review it soon. It depends on the subsystem but the normal window for review is a week. If they think it is OK and it doesn't need any fixes/cleanups, they will push it to the -next branch of the subsystem git tree.

This gets picked up by Stephen Rothwell's linux-next, where everything that could potentially land in the next Linux release cycle gets merged to test for cross-system conflicts, cross-hardware problems, regressions, bugs, and more. If no problem arises from this patch in linux-next, then when Linus Torvalds opens the merge window for the next release, the maintainers will push to the "for Linus" branch of the subsystem and ask Linus to pull all the reviewed and tested changes in that branch.

In other words, when this patch is OK'd, it will spend the next few weeks in linux-next and if all goes well it will be merged into mainline in a month or so from now. We are currently in 4.8-rc4, so if this patch isn't tested in linux-next soon it might not spend enough time in linux-next to be in 4.9.

To summarize your direct question of what steps are required, it is all in the hands of the maintainers now. If they haven't reviewed in a week, send them a little nudge :)
Comment 261 Kahlil Hodgson 2016-08-29 22:54:39 UTC
I seem to be experiencing a similar problem but with my MacBookPro12,1.  Is there a simple way to check if this is the same issue?
Comment 262 Chen Yu 2016-08-29 23:41:06 UTC
(In reply to Luis de Bethencourt from comment #260)
> (In reply to Jonathan Dunlap from comment #259)
> > (In reply to Chen Yu from comment #258)
> > > OK, thanks.
> > > The patch has been posted at https://patchwork.kernel.org/patch/9288825/
> > > 
> > > @Bjorn Helgaas
> > 
> > How does the process work from here? E.g. What are the steps required to
> get
> > it from 'patchwork' into the kernel official release?
> 
> The patch has been sent to the proper mailing lists and maintainers.
> 
> These maintainers will review it soon. It depends on the subsystem but the
> normal window for review is a week. If they think it is OK and it doesn't
> need any fixes/cleanups, they will push it to the -next branch of the
> subsystem git tree.
> 
> This gets picked up by Stephen Rothwell's linux-next, where all that could
> potentially land in the next Linux release cycle gets merged to test for
> cross-system conflicts, cross-hardware, regressions, bugs, and more. If
> there is no problem arising from this patch in linux-next. When Linus
> Torvalds opens the merge window for the next release, the maintainers will
> push to the "for Linus" branch of the subsystem, and ask Linus to pull all
> changes in this branch of reviewed and tested changes.
> 
> In other words, when this patch is OK'd, it will spend the next few weeks in
> linux-next and if all goes well it will be merged into mainline in a month
> or so from now. We are currently in 4.8-rc4, so if this patch isn't tested
> in linux-next soon it might not spend enough time in linux-next to be in 4.9.
> 
> To summarize your direct question of what steps are required, it is all in
> the hands of the maintainers now. If they haven't reviewed in a week, send
> them a little nudge :)

Thanks for your explanation, Luis. The latest patch is at:
https://patchwork.kernel.org/patch/9289777/
and it has received feedback from Lukas Wunner.
Comment 263 Chen Yu 2016-08-30 00:24:44 UTC
(In reply to Kahlil Hodgson from comment #261)
> I seem to be experiencing a similar problem but with my MacBookPro12,1.  Is
> there a simple way to check if this is the same issue?
The MBP 12,1 is probably a different issue, since it has different hardware.
May I know what the problem is? Alternatively, you can move to
https://bugzilla.kernel.org/show_bug.cgi?id=101681
and post there.
Comment 264 Kahlil Hodgson 2016-08-30 00:42:59 UTC
The issue is that when I suspend by closing the lid, pressing the power button, or using systemctl, the machine wakes up about 30 seconds later. It may then go back to sleep and wake up again.

Reboot and poweroff are also problematic. Something goes wrong while bringing the system down and I have to force the poweroff. This puts the system in a weird state where it takes an extremely long time to bring up the Mac boot menu (dual-boot system) and is extremely sluggish responding to key presses until I've managed to get past unlocking the root filesystem (LUKS encrypted). All up, a reboot can take more than 5 minutes.

https://bugzilla.kernel.org/show_bug.cgi?id=101681

May be a better place to discuss. Hope to get a chance to read that thread tonight. Thanks for the pointer :-)
Comment 265 Kahlil Hodgson 2016-08-30 00:50:07 UTC
Also note. I have had some limited success with getting systemd to unload brcmfmac before suspend and reboot, and with unmounting /boot/efi and /boot before poweroff and reboot. 

Curiously all these issues seem to disappear on kernel-4.6.5, but have come back with the two more recent kernels.
Comment 266 Jonathan Dunlap 2016-09-02 00:54:42 UTC
Lukas Wunner seems skeptical of merging this patch in due to a larger issue at play: https://patchwork.kernel.org/patch/9289777/

Thoughts on pressing to get the patch into mainline, or waiting for a proper core-issue fix?
Comment 267 Vlad Lesin 2016-09-13 09:07:55 UTC
(In reply to Chen Yu from comment #252)
> Oops, please ignore above patch, check this one below:
> 
> Index: linux/drivers/pci/quirks.c
> ===================================================================
> --- linux.orig/drivers/pci/quirks.c
> +++ linux/drivers/pci/quirks.c
> @@ -2775,6 +2775,15 @@ static void quirk_hotplug_bridge(struct
> 
>  DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
> 
> +static void quirk_apple_mbp_poweroff(struct pci_dev *dev)
> +{
> +       if (dmi_match(DMI_BOARD_VENDOR, "Apple Inc.") &&
> +           dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,4"))
> +               dev->is_hotplug_bridge = 0;
> +}
> +
> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10,
> quirk_apple_mbp_poweroff);
> +
>  /*
>   * This is a quirk for the Ricoh MMC controller found as a part of
>   * some mulifunction chips.


Please keep in mind that this issue is exactly the same on the 11,5. I have tried this patch on 4.4.0-36 with the following modification:

=============================================
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2749,6 +2749,16 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)

 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);

+static void quirk_apple_mbp_poweroff(struct pci_dev *dev)
+{
+  if (dmi_match(DMI_BOARD_VENDOR, "Apple Inc.") &&
+      (dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,4") ||
+       dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,5")))
+    dev->is_hotplug_bridge = 0;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
+
 /*
  * This is a quirk for the Ricoh MMC controller found as a part of
  * some mulifunction chips.
===============================================

and it works well. At least poweroff and suspend to sleep work now.

Thank you for your work.
Comment 268 Kahlil Hodgson 2016-09-13 20:47:08 UTC
(In reply to Jonathan Dunlap from comment #266)
> Lukas Wunner seems skeptical of merging this patch in due to a larger issue
> at play: https://patchwork.kernel.org/patch/9289777/
> 
> Thoughts on pressing to get the patch into mainline.. or wait out a proper
> core-issue fix?

A few times now, I've chucked my MBP into my bag thinking it was suspended, only to find that it had woken up sometime after and was very hot when I pulled it out. Fortunately it had only been in that state for 20 minutes or so. If it was in that state overnight it might damage the hardware. Not sure if it would actually catch on fire though.

I would press to get the patch into mainline as a temporary fix until the core issue can be resolved.
Comment 269 Luis de Bethencourt 2016-09-14 08:07:19 UTC
(In reply to Kahlil Hodgson from comment #268)
> (In reply to Jonathan Dunlap from comment #266)
> 
> A few times now, I've chucked my MBP into my bag thinking it was suspended,
> only to find that its woken up sometime after and is very hot when I pull it
> out. Fortunately it only been in that state for 20 minutes or so.  If it was
> in that state overnight it might damage the hardware. Not sure if it would
> actually catch on fire though.
> 
> I would press to get the patch into mainline as a temporary fix until the
> core issue can be resolved.

Hi Kahlil,

For the safety of your hardware, I recommend you use the workaround for now. You can disable wakeup on the lid sensor signal, so the machine won't wake up accidentally a few seconds after going to sleep.

Then you can suspend the machine with your desktop's UI, or with pm-suspend.

The workaround is explained somewhere in this thread. Sorry, I don't have time to look for it right now. I need to run but wanted to let you know it exists.

Luis
Comment 270 Kahlil Hodgson 2016-09-15 23:57:17 UTC
(In reply to Luis de Bethencourt from comment #269)

> For the safety of your hardware I recommend you do the work around for now.
> You can deactivate the lid sensor signal listening, so it won't wake up
> accidentally a few seconds after sleeping.
> 
> Then you can suspend the machine with your desktop's UI, or with pm-suspend.

Thanks for the pointer, Luis. I've opted for disabling wake from S3 on XHC1:

cat /etc/udev/rules.d/90-xhc_sleep.rules

# disable wake from S3 on XHC1
SUBSYSTEM=="pci", KERNEL=="0000:00:14.0", ATTR{power/wakeup}="disabled"


At least that way, my lid still works :-)
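For a one-off test without rebooting, the attribute that the udev rule above sets can also be written by hand. A minimal sketch, assuming the usual sysfs layout under /sys/bus/pci/devices; the function name and the `sysfs_root` parameter are made up. It needs root, and the setting does not persist across reboots.

```python
from pathlib import Path


def disable_pci_wakeup(address: str,
                       sysfs_root: str = "/sys/bus/pci/devices") -> str:
    """Write "disabled" to a PCI device's power/wakeup attribute.

    `address` is the full PCI address, e.g. "0000:00:14.0" for XHC1 on this
    machine. Returns the value read back so the caller can confirm that the
    change took effect.
    """
    attr = Path(sysfs_root) / address / "power" / "wakeup"
    attr.write_text("disabled\n")
    return attr.read_text().strip()
```

`disable_pci_wakeup("0000:00:14.0")` should return "disabled" if the write was accepted.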
Comment 271 Chen Yu 2016-09-18 03:10:30 UTC
(In reply to Jonathan Dunlap from comment #266)
> Lukas Wunner seems skeptical of merging this patch in due to a larger issue
> at play: https://patchwork.kernel.org/patch/9289777/
> 
> Thoughts on pressing to get the patch into mainline.. or wait out a proper
> core-issue fix?
Yes, I'm looking through the datasheet for this PCI bridge to find out whether there is any explanation of a memory address being mapped to an I/O address on this chipset, i.e. the Intel® 8 Series/C220 Series Chipset Family Platform Controller Hub, at
http://www.intel.com/content/www/us/en/chipsets/8-series-chipset-pch-datasheet.html
I've also sent an email to someone who works with this device, and I'm trying to address the problems raised by Lukas and Bjorn.
Comment 272 Bjorn Helgaas 2016-09-19 13:29:55 UTC
[The following is from Lukas Wunner <lukas@wunner.de> on Aug 24 on linux-pci.  I'm adding it here to make the bugzilla more complete and because several recent comments refer to it.]

On Fri, Aug 19, 2016 at 04:30:25PM +0800, Chen Yu wrote:
> People reported that they can not do a poweroff nor a
> suspend to ram on their Mac Pro 11. After some investigations
> it was found that, once the PCI bridge 0000:00:1c.0 reassigns its
> mm windows to ([mem 0x7fa00000-0x7fbfffff] and
> [mem 0x7fc00000-0x7fdfffff 64bit pref]), the region of ACPI
> io resource 0x1804 becomes unaccessible immediately, where the
> ACPI Sleep register is located, as a result neither poweroff(S5)
> nor suspend to ram(S3) works.

To provide a bit more context:

The root port in question (0000:00:1c.0) is not listed in the DSDT.
On macOS, only devices present in the ACPI namespace are incorporated
into the I/O Kit registry. Consequently macOS pretends that this root
port doesn't exist. It's not listed in the "ioreg -l" output and thus
no driver is attached to this device.

So what we're dealing with is sloppiness on the part of Apple:
Some engineer probably forgot to disable this unused root port
and they didn't notice it during testing because their OS ignores
such devices.

We could in principle achieve the same behaviour by adding a PCI
device only if it has an ACPI companion, perhaps quirk this only
to Macs. I'm not sure if that's the right thing to do though.
What if they hide devices from macOS but we want to access them
on Linux?

What's really odd is that changing *memory* windows affects
accessibility of *I/O ports*.

One theory would be that I/O ports are somehow mapped into memory.
The GPIO pins of Intel chipsets are usually accessible through
I/O ports, but I've recently looked at the DSDT of the newest
MacBook9,1 (2016) and it looks like they're now accessed through
SystemMemory instead of SystemIO. Perhaps someone at Intel knows
about these intricacies of their chipsets.

If I/O ports are indeed mapped into memory, we need to find a way
to identify and reserve that region. So while this patch seems
like a workable and sufficiently small fix, it might mask a larger
underlying issue. It's certainly a problem though that these
machines currently cannot power off or suspend.

FWIW, we have a somewhat similar issue with the Apple gmux
(a microcontroller built into dual GPU MacBook Pros). That chip
is attached to the LPC bus and accessed through I/O ports.
It turns out that once VGA IO is locked to the discrete GPU
using vgaarb, gmux' I/O ports suddenly become inaccessible.
Apparently its I/O ports are routed to the secondary PCI bus
to which the discrete GPU is connected, and no longer to the
root bus on which the LPC bridge resides. However gmux' I/O ports
are in the 0x700-0x7ff range, whereas the VGA registers are in
the 0x3b0-0x3bb and 0x3c0-0x3df range. So that's another oddity
of Intel chipsets with regards to I/O accessibility.
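As background to the patch description quoted at the top of this comment, here is a rough sketch of why losing access to a single I/O port would break both poweroff and suspend. It assumes that the "ACPI Sleep register" at 0x1804 is the PM1a control register, in which SLP_TYP occupies bits 12:10 and SLP_EN is bit 13 per the ACPI specification. The SLP_TYP numbers used below are placeholders; the real ones come from the _S3 and _S5 objects in the DSDT.

```python
SLP_TYP_SHIFT = 10                  # SLP_TYPx occupies bits 12:10
SLP_TYP_MASK = 0x7 << SLP_TYP_SHIFT
SLP_EN = 1 << 13                    # setting this bit starts the transition


def pm1_control_value(current: int, slp_typ: int) -> int:
    """Compute the value to write to PM1 control to enter a sleep state.

    `current` is the value read from the register, whose other bits are
    preserved. `slp_typ` is the 3-bit sleep type for the target state.
    """
    if not 0 <= slp_typ <= 7:
        raise ValueError("SLP_TYP is a 3-bit field")
    cleared = current & ~SLP_TYP_MASK & ~SLP_EN
    return cleared | (slp_typ << SLP_TYP_SHIFT) | SLP_EN


if __name__ == "__main__":
    # Placeholder sleep types; check _S3/_S5 in the DSDT for the real ones.
    print(hex(pm1_control_value(0x0001, 5)))  # 0x3401
    print(hex(pm1_control_value(0x0001, 7)))  # 0x3c01
```

S3 and S5 differ only in the SLP_TYP value written to the same register, so if writes to that port stop reaching the chipset, neither transition can start. The kernel's real sequence is more involved than this single write.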
Comment 273 Christopher Broome 2016-09-28 01:46:22 UTC
For the past week or two, I've been able to successfully power off my Macbook Pro 11,5 without it hanging. I am almost certain this coincided with the kernel update to 4.4.0-38-generic.

The machine is _also_ able to go into suspend, but it resumes immediately. After disabling wakeup for the usual suspects (XHC1 and ARPT) in /proc/acpi/wakeup, it still resumes immediately.

Interestingly, though, I've noticed that when resuming, the ARPT setting is set to enabled despite the fact that it was disabled before I suspended.

Some info about my system for reference:

$ sudo dmidecode -s system-product-name
MacBookPro11,5

$ uname -a
Linux chris-mbp-ubuntu 4.4.0-38-generic #57-Ubuntu SMP Tue Sep 6 15:42:33 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

# before suspend
$ cat /proc/acpi/wakeup 
Device	S-state	  Status   Sysfs node
PEG0	  S3	*disabled  pci:0000:00:01.0
GFX0	  S3	*disabled  pci:0000:01:00.0
PEG1	  S3	*disabled  pci:0000:00:01.1
PEG2	  S3	*disabled  pci:0000:00:01.2
EC	  S4	*disabled  platform:PNP0C09:00
GMUX	  S3	*disabled  pnp:00:03
HDEF	  S3	*disabled  pci:0000:00:1b.0
RP03	  S3	*disabled  pci:0000:00:1c.2
ARPT	  S4	*disabled  pci:0000:04:00.0
RP04	  S3	*disabled  pci:0000:00:1c.3
XHC1	  S3	*disabled  pci:0000:00:14.0
ADP1	  S4	*disabled  platform:ACPI0003:00
LID0	  S4	*enabled   platform:PNP0C0D:00

# after immediately resuming
$ cat /proc/acpi/wakeup 
Device	S-state	  Status   Sysfs node
PEG0	  S3	*disabled  pci:0000:00:01.0
GFX0	  S3	*disabled  pci:0000:01:00.0
PEG1	  S3	*disabled  pci:0000:00:01.1
PEG2	  S3	*disabled  pci:0000:00:01.2
EC	  S4	*disabled  platform:PNP0C09:00
GMUX	  S3	*disabled  pnp:00:03
HDEF	  S3	*disabled  pci:0000:00:1b.0
RP03	  S3	*disabled  pci:0000:00:1c.2
ARPT	  S4	*enabled   pci:0000:04:00.0
RP04	  S3	*disabled  pci:0000:00:1c.3
XHC1	  S3	*disabled  pci:0000:00:14.0
ADP1	  S4	*disabled  platform:ACPI0003:00
LID0	  S4	*enabled   platform:PNP0C0D:00
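To spot this kind of change without comparing the two listings by eye, the output can be parsed and diffed. A small sketch, assuming the four-column layout shown above; the function names are made up.

```python
from typing import Dict, Tuple


def parse_wakeup(text: str) -> Dict[str, str]:
    """Map device name to "enabled" or "disabled" from /proc/acpi/wakeup text."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: NAME S-state *enabled|*disabled [sysfs node].
        # The header row is skipped because its third field is "Status".
        if len(fields) >= 3 and fields[2].lstrip("*") in ("enabled", "disabled"):
            states[fields[0]] = fields[2].lstrip("*")
    return states


def changed_devices(before: str, after: str) -> Dict[str, Tuple[str, str]]:
    """Return {device: (old_state, new_state)} for devices whose state changed."""
    old, new = parse_wakeup(before), parse_wakeup(after)
    return {name: (old[name], new[name])
            for name in old
            if name in new and old[name] != new[name]}
```

Fed with the two listings above, `changed_devices` should report only ARPT, going from disabled to enabled.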
Comment 274 Bjorn Helgaas 2016-09-28 21:24:25 UTC
I think the reason you can power off with the Ubuntu 4.4.0-38-generic is because it includes the patch under discussion:

http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/commit/drivers/pci/quirks.c?id=5080ff61a438f3dd80b88b423e1a20791d8a774c

I'm still hoping that Chen Yu can find some information about how this Intel root port works.  The current patch is really a stop-gap and does not address the root cause of the problem.
Comment 275 Naftuli Tzvi Kay 2016-09-28 21:54:41 UTC
Bjorn Helgaas: Glad to see that they included it in the latest build, I didn't think they would do that as I thought their kernel releases were only security updates. This would explain why things are occasionally panicking here, as I have compiled in both patches :-|
Comment 276 Bjorn Helgaas 2016-09-28 22:49:11 UTC
Naftuli: My guess is that this affects a lot of people, and Canonical got tired of waiting for us to do something.  I'm really sick of it too, honestly, but the patch they included is completely unmaintainable -- it's possible other systems and future MacBooks could have similar problems, and we just don't know why.

Which two patches did you include?  I don't know why including both would cause panics, but of course it depends on what patches you included.
Comment 277 Naftuli Tzvi Kay 2016-09-28 22:59:59 UTC
I have a cron job that alerts me when a new kernel release has been dropped for the current LTS release. I then go apply the bypass patch and recompile the kernel and reboot. Therefore, I've got both of these bits of code running in the kernel on my MacBook on power actions, which is probably a bad idea. I'll have to test the stock kernel and see if the fix provided by Canonical fixes the issue.
Comment 278 Chen Yu 2016-09-29 01:59:15 UTC
(In reply to Bjorn Helgaas from comment #274)
> I think the reason you can power off with the Ubuntu 4.4.0-38-generic is
> because it includes the patch under discussion:
> 
> http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/commit/drivers/pci/quirks.c?id=5080ff61a438f3dd80b88b423e1a20791d8a774c
> 
> I'm still hoping that Chen Yu can find some information about how this Intel
> root port works.  The current patch is really a stop-gap and does not
> address the root cause of the problem.
Yup. For confidentiality reasons, I'm currently knocking on each possible chipset team's door to find out who's responsible for this PCI bridge, and whether there are any errata.
Comment 279 Bjorn Helgaas 2016-09-30 20:32:39 UTC
(In reply to Naftuli Tzvi Kay from comment #277)
> I'll have to test the stock kernel and see if the fix provided by
> Canonical fixes the issue.

Don't bother with that.  The "fix" provided by Canonical (http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/commit/drivers/pci/quirks.c?id=5080ff61a438f3dd80b88b423e1a20791d8a774c) is exactly the patch that has been discussed here.  It's essentially the same as comment #267.

We all agree this avoids the problem.  I have not applied it upstream because nobody can explain *why* it avoids the problem.  Applying voodoo patches like that does not lead to a robust system in the long term.
Comment 280 Naftuli Tzvi Kay 2016-09-30 20:35:46 UTC
Bjorn: I'm using _this_ patch: http://rfkrocktk.github.io/2016/07/macbook-suspend-bug
Comment 281 pkozlov 2016-10-03 05:55:08 UTC
Hi guys.

Linux 4.8 has just been released: https://lkml.org/lkml/2016/10/2/102
Have you included this patch there? If not, is it still compatible with Linux 4.8? And do you have any plans to include it in 4.9?
Comment 282 Bjorn Helgaas 2016-10-04 17:52:15 UTC
No, this patch is not in v4.8 for the reasons I mentioned in comment #279.

Can someone please collect and attach "lspci -vvxxx" output both from a kernel with the workaround (where poweroff works) and a kernel without the workaround (where poweroff hangs)?

Some of the relevant chipset config is in device-specific config space, which is not included in the "lspci -vvxx" output already attached.  I want lspci output for the entire system, because there are several devices involved.
Comment 283 Bastian Triller 2016-10-04 22:05:56 UTC
Created attachment 240701 [details]
MacBookPro11,4 lspci -vvxxx quirk_apple_mbp_poweroff patched
Comment 284 Bastian Triller 2016-10-04 22:10:41 UTC
Created attachment 240711 [details]
MacBookPro11,4 lspci -vvxxx vanilla
Comment 285 Bastian Triller 2016-10-04 22:22:26 UTC
Created attachment 240721 [details]
MacBookPro11,4 lspci -vvxxx quirk_apple_mbp_poweroff patched (after reboot)
Comment 286 Cédric Le Goater 2016-10-05 06:28:27 UTC
Created attachment 240771 [details]
MacBookPro11,5 lspci -vvxxx
Comment 287 Cédric Le Goater 2016-10-05 06:29:28 UTC
Created attachment 240781 [details]
MacBookPro11,5 lspci -vvxxx with the quirk patch
Comment 288 Pablo Catalina 2016-10-05 08:16:56 UTC
Created attachment 240791 [details]
MBP 11,4 lspci -vvxxx - patched kernel
Comment 289 Pablo Catalina 2016-10-05 08:17:38 UTC
Created attachment 240801 [details]
MBP 11,4 lspci -vvxxx - unpatched kernel
Comment 290 Bjorn Helgaas 2016-10-08 03:00:26 UTC
Thank you very much for the lspci output.  It all looks sensible to me and I don't see any major issue yet, so let's keep poking.

Please run the following commands as root on both a vanilla kernel and
on one containing the workaround patch
(https://patchwork.kernel.org/patch/9288825/).

  setpci -s1f.0 0x02.w  # sec 12.1.2   DID
  setpci -s1f.0 0x40.l  # sec 12.1.13  PMBASE
  setpci -s1f.0 0x44.b  # sec 12.1.14  ACPI_CNTL
  setpci -s1f.0 0x48.l  # sec 12.1.15  GPIOBASE
  setpci -s1f.0 0x4c.b  # sec 12.1.16  GC
  setpci -s1c.0 0x42.w  # sec 19.1.24  XCAP
  setpci -s1c.0 0x24.l  # sec 19.1.17  PMBL
  setpci -s1c.0 0x28.l  # sec 19.1.18  PMBU32

Then build this package:
https://people.redhat.com/~rjones/ioport/files/ioport-1.2.tar.gz,
which contains programs for doing I/O port accesses from userspace.
Please run the following commands, again as root, on both a vanilla
and a patched kernel:

  ./inl --hex 0x1804    # sec 12.8.3.3  PM1_CNT
  ./inl --hex 0x1808    # sec 12.8.3.4  PM1_TMR
  ./inl --hex 0x1828    # sec 12.8.3.6  GPE0_EN

  for X in `seq 1 10`; do ./inl 0x1808; done    # PM1_TMR, should be counting

  ./inl --hex 0x0804    # sec 12.10.2   GP_IO_SEL
  ./inw --hex 0x0828    # sec 12.10.8   GPI_NMI_EN
  ./inl --hex 0x0830    # sec 12.10.11  GPIO_USE_SEL2

The setpci output from the vanilla kernel should match the following,
which I extracted from the lspci output from Bastian, Cédric, and
Pablo.  The spec references are to the one mentioned in comment #271.

  DID           0x8c4b
    LPC Controller (HM87 SKU) per sec 1.4, so this should be the right
    spec for the part.

  PMBASE        0x00001801
    The Power Management registers (sec 12.8.3) and TCO I/O registers
    are at [io 0x1800-0x187f].

  ACPI_CNTL     0x80
    ACPI Enable == 1 and decode of the PMBASE range is enabled.

  GPIOBASE      0x00000801
    The GPIO space (sec 12.10) is at [io 0x0800-0x087f].

  GC            0x10
    GPIO Enable == 1 and decode of the GPIOBASE range is enabled.

  XCAP          0x142
    Slot Implemented == 1, Device/Port Type == 4 (Root Port), and
    Capability Version == 2 (PCIe 2.0).  The Slot Implemented bit is
    R/WO (Read/Write Once) with a default value of zero (sec 19.1.24),
    so BIOS must have written a 1 there, which is probably a bug since
    there doesn't seem to be a slot.  That explains why Linux assigns
    resources to the Root Port, but of course, it doesn't explain why
    doing so changes anything with respect to the Power Management
    registers.

  PMBL          0x7fd17fc1
  PMBU32        0
    Prefetchable memory window of [mem 0x7fc00000-0x7fdfffff pref],
    and the window supports 64-bit addressing (this is on the vanilla
    kernel, without the workaround patch).

The setpci values from the patched kernel should be the same, except
for these:

  PMBL          0x0001fff1
  PMBU32        0
    Prefetchable memory limit == 0x0001
    Prefetchable memory base  == 0xfff0
    There are many ways to disable the window, and I think this is a
    perfectly valid one.
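
As a cross-check of the decoding above, here is a minimal Python sketch that splits the XCAP and PMBL values into fields. It assumes the standard PCI Express Capabilities register layout and the standard PCI-to-PCI bridge prefetchable base/limit layout; the function names are made up for illustration, and nothing here touches hardware.

```python
def decode_xcap(xcap):
    """Split a PCIe Capabilities register value into its fields."""
    return {
        "version": xcap & 0xF,                # bits 3:0
        "port_type": (xcap >> 4) & 0xF,       # bits 7:4, 4 == Root Port
        "slot_implemented": (xcap >> 8) & 1,  # bit 8
    }


def decode_pref_window(pmbl, upper_base=0, upper_limit=0):
    """Decode a bridge prefetchable window from the 32-bit value at 0x24.

    The low 16 bits hold the base and the high 16 bits hold the limit.
    In each half, bits 15:4 are address bits 31:20 and bits 3:0 encode
    the addressing capability (1 == 64-bit).
    """
    base_field = pmbl & 0xFFFF
    limit_field = (pmbl >> 16) & 0xFFFF
    base = (upper_base << 32) | ((base_field & 0xFFF0) << 16)
    limit = (upper_limit << 32) | ((limit_field & 0xFFF0) << 16) | 0xFFFFF
    return {
        "base": base,
        "limit": limit,
        "is_64bit": (base_field & 0xF) == 1,
        "enabled": base <= limit,  # base above limit means the window is closed
    }


if __name__ == "__main__":
    print(decode_xcap(0x142))
    for name, value in (("vanilla", 0x7FD17FC1), ("patched", 0x0001FFF1)):
        w = decode_pref_window(value)
        print(name, hex(w["base"]), hex(w["limit"]), w["is_64bit"], w["enabled"])
```

With the values quoted above, the vanilla PMBL decodes to 0x7fc00000-0x7fdfffff (open) and the patched PMBL decodes to a base of 0xfff00000 above a limit of 0x000fffff (closed).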
Comment 291 Cédric Le Goater 2016-10-08 08:57:03 UTC
Created attachment 241171 [details]
setpci/in* results on a MacBookPro11,5
Comment 292 Bjorn Helgaas 2016-10-12 15:51:13 UTC
Thank you very much, Cédric.  Your setpci data matches what I expected exactly.  The Power Management registers appear to be at [io 0x1800] as expected.  It is interesting that they are readable and show the same values (except PM1_TMR, of course) for both kernels -- with and without the quirk.  So programming the 00:1c.0 windows doesn't directly affect those registers.

Can you also fetch this program: http://cmp.felk.cvut.cz/~pisa/linux/rdwrmem.c and run the following on a vanilla kernel (without the quirk patch)?  I expect the last command will power off your machine.

  setpci -s1c.0 0x20.l                  # sec 19.1.16  MBL (non-prefetch)
  setpci -s1c.0 0x24.l                  # sec 19.1.17  PMBL (prefetchable)
  setpci -s1c.0 0x28.l                  # sec 19.1.18  PMBU32

  ./rdwrmem -m -b4 -l256 -s 0x7fa00000  # non-prefetch window
  ./rdwrmem -m -b4 -l256 -s 0x7fc00000  # prefetch window
  ./rdwrmem -m -b4 -l256 -s 0xf0200000  # unused space

  setpci -s1c.0 0x24.l=0xf031f021       # sec 19.1.17  PMBL (prefetchable)
  lspci -vvs1c.0

  ./rdwrmem -m -b4 -l256 -s 0x7fc00000  # now-unused space
  ./rdwrmem -m -b4 -l256 -s 0xf0200000  # new prefetch window

  ./inl  --hex 0x1830                   # sec 12.8.3.7  SMI_EN  
  ./inl  --hex 0x1834                   # sec 12.8.3.8  SMI_STS
  ./outl --hex 0x1804 0x3c00            # sec 12.8.3.3  PM1_CNT
Comment 293 Cédric Le Goater 2016-10-12 16:43:05 UTC
Created attachment 241601 [details]
rdwrmem results on a MacBookPro11,5
Comment 294 Cédric Le Goater 2016-10-12 16:44:29 UTC
Hello Bjorn,

Here are the results. I used a vanilla 4.8 for the test. The system
hung at the end.

C.
Comment 295 Bjorn Helgaas 2016-10-12 18:24:06 UTC
Thank you, Cédric.  I'm surprised that it hung instead of powering off.  I guess it's not enough to move the prefetchable window.

  SMI_EN        0x00000033
    APMC_EN == 1
    SLP_SMI_EN == 1
    EOS == 1
    GBL_SMI_EN == 1

    I *think* SLP_SMI_EN == 1 means that when we write the SLP_EN bit in
    PM1_CNT, we'll take an SMI, and it's up to the SMI handler in BIOS to
    complete the sleep state transition.
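
For anyone repeating this on another machine, a small Python sketch of the SMI_EN decoding follows. Only the four bit names listed above come from this thread; the bit positions are an assumption based on my reading of the PCH datasheet (bit 4 for SLP_SMI_EN is confirmed elsewhere in this thread), and the helper name is hypothetical.

```python
# Assumed bit positions for the four flags named in this comment.
SMI_EN_BITS = {
    0: "GBL_SMI_EN",
    1: "EOS",
    4: "SLP_SMI_EN",
    5: "APMC_EN",
}


def decode_smi_en(value):
    """Return (names of known bits that are set, any remaining unknown bits)."""
    named = [name for bit, name in sorted(SMI_EN_BITS.items())
             if value & (1 << bit)]
    known_mask = sum(1 << bit for bit in SMI_EN_BITS)
    return named, value & ~known_mask


if __name__ == "__main__":
    names, leftover = decode_smi_en(0x00000033)
    print(names, hex(leftover))
```

A non-zero second element would mean some other SMI source is enabled that this table does not name.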

Can you try the following experiments, both on a vanilla kernel?  First, move only the non-prefetchable window:

  ./rdwrmem -m -b4 -l256 -s 0x7fa00000  # non-prefetch window
  ./rdwrmem -m -b4 -l256 -s 0x7fc00000  # prefetch window
  ./rdwrmem -m -b4 -l256 -s 0xf0000000  # unused space

  setpci -s1c.0 0x20.l=0xf011f001       # sec 19.1.16  MBL (non-prefetchable)
  lspci -vvs1c.0

  ./rdwrmem -m -b4 -l256 -s 0x7fa00000  # now-unused space
  ./rdwrmem -m -b4 -l256 -s 0xf0000000  # new non-prefetch window

  ./inl  --hex 0x1830                   # sec 12.8.3.7  SMI_EN  
  ./inl  --hex 0x1834                   # sec 12.8.3.8  SMI_STS
  ./outl --hex 0x1804 0x3c00            # sec 12.8.3.3  PM1_CNT

Then (on a new boot), move both the non-prefetchable and the prefetchable windows:

  ./rdwrmem -m -b4 -l256 -s 0x7fa00000  # non-prefetch window
  ./rdwrmem -m -b4 -l256 -s 0x7fc00000  # prefetch window
  ./rdwrmem -m -b4 -l256 -s 0xf0000000  # unused space
  ./rdwrmem -m -b4 -l256 -s 0xf0200000  # unused space

  setpci -s1c.0 0x20.l=0xf011f001       # sec 19.1.16  MBL (non-prefetchable)
  setpci -s1c.0 0x24.l=0xf031f021       # sec 19.1.17  PMBL (prefetchable)
  lspci -vvs1c.0

  ./rdwrmem -m -b4 -l256 -s 0x7fa00000  # now-unused space
  ./rdwrmem -m -b4 -l256 -s 0x7fc00000  # now-unused space
  ./rdwrmem -m -b4 -l256 -s 0xf0000000  # new non-prefetch window
  ./rdwrmem -m -b4 -l256 -s 0xf0200000  # new prefetch window

  ./inl  --hex 0x1830                   # sec 12.8.3.7  SMI_EN  
  ./inl  --hex 0x1834                   # sec 12.8.3.8  SMI_STS
  ./outl --hex 0x1804 0x3c00            # sec 12.8.3.3  PM1_CNT
Comment 296 Cédric Le Goater 2016-10-12 19:14:04 UTC
Created attachment 241611 [details]
rdwrmem results on a MacBookPro11,5 test2
Comment 297 Cédric Le Goater 2016-10-12 19:44:06 UTC
The system stopped twice with these tests.

Is that consistent with SLP_SMI_EN, bit 4 of "SMI_EN—SMI Control 
and Enable Register" ? 

  1 = A write of 1 to the SLP_EN bit (bit 13 in PM1_CNT register) 
  will generate an SMI#, and the system will not transition to the 
  sleep state based on that write to the SLP_EN bit.

Thanks,

C.
Comment 298 Bjorn Helgaas 2016-10-12 22:03:53 UTC
Thanks, Cédric.  When you say the system "stopped", I assume you mean that it hung and did not power off.  I expected it to power off when we moved both windows, since that's basically the same as the test from comment #238, and Bastian reported in comment #250 that it powered off.

I guess Bastian used the "poweroff" command instead of writing to PM1_CNT
directly, so what if you do this on a vanilla kernel instead:

  setpci -s1c.0 0x20.l=0xf011f001       # sec 19.1.16  MBL (non-prefetchable)
  setpci -s1c.0 0x24.l=0xf031f021       # sec 19.1.17  PMBL (prefetchable)
  lspci -vvs1c.0
  poweroff

If SLP_SMI_EN were 0, I would expect the hardware to transition to S5
(power off) immediately by itself when we write to ioport 0x1804.  Since
SLP_SMI_EN is 1, I think writing to ioport 0x1804 will cause an SMI, and the
BIOS can do whatever it wants before it turns off the power.  For example,
maybe the BIOS needs to do something with Thunderbolt during suspend-to-RAM.
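
To make the 0x3c00 value used in the earlier tests less opaque, here is a hedged Python sketch that splits a PM1_CNT value into its sleep fields. SLP_EN at bit 13 is quoted in comment #297; placing SLP_TYP in bits 12:10 is an assumption from the usual ACPI PM1 control layout, and the function name is made up.

```python
SLP_EN_BIT = 13     # quoted from the datasheet in comment #297
SLP_TYP_SHIFT = 10  # assumption: SLP_TYP occupies bits 12:10
SLP_TYP_MASK = 0x7


def decode_pm1_cnt(value):
    """Extract the sleep-enable bit and sleep-type field from PM1_CNT."""
    return {
        "slp_en": (value >> SLP_EN_BIT) & 1,
        "slp_typ": (value >> SLP_TYP_SHIFT) & SLP_TYP_MASK,
    }


if __name__ == "__main__":
    print(decode_pm1_cnt(0x3C00))
```

Under those assumptions, 0x3c00 sets SLP_EN with SLP_TYP == 7, which is consistent with the expectation in this thread that the write requests S5.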
Comment 299 Cédric Le Goater 2016-10-12 22:32:25 UTC
Sorry. I should be more precise. In this last test, the system "powered-off" 
instantly. No suspend, just a brutal off.

In the previous test, it hung. X still displayed some image but I could
not get anything out of the machine. I had to force a power-off with the
button.

The commands you pasted above work as expected and the system powers off 
cleanly.


As for Thunderbolt, I have one attached to the MacBook currently, so maybe
it is causing problems in this test scenario. It certainly does with the
quirk patch. Suspend does not necessarily work and, if it does, the Thunderbolt
device disappears at the next resume.

I will check without it tomorrow.

Thanks,

C.
Comment 300 Cédric Le Goater 2016-10-13 20:43:54 UTC
Here is a recap of the results on a MacBookPro11,5
running a vanilla 4.8 kernel

    test 1 - from comment #292
    test 2 - from comment #295, first sequence 
    test 3 - from comment #295, second sequence
    test 4 - from comment #298

Thunderbolt    With               Without

    test 1     freeze             freeze
    test 2     hard poweroff      hard reset
    test 3     hard poweroff      hard reset
    test 4     soft poweroff      soft poweroff

C.
Comment 301 Bjorn Helgaas 2016-10-13 21:40:44 UTC
Thanks for the nice summary!  I was trying to put all that together.

Test 1 moved the prefetchable window, and poweroff failed (freeze).

Test 2 moved the non-prefetchable window, and test 3 moved both non-prefetchable and prefetchable windows.  Same results for both, which suggests that the non-prefetchable window is the key.

Test 4 moved both windows.  Based on test 2, I suspect it's unnecessary to move the prefetchable window.

Can you try the following ("test 5") to verify that only the non-prefetchable window is important?  Do this with a vanilla kernel:

  setpci -s1c.0 0x20.l=0xf011f001       # sec 19.1.16  MBL (non-prefetchable)
  lspci -vvs1c.0
  ./rdwrmem -m --no-mmap -b4 -l0x100000 -s 0x7fa00000  # non-prefetch window
  ./rdwrmem -m --no-mmap -b4 -l0x100000 -s 0x7fb00000  # non-prefetch window
  poweroff

This will generate a lot of output, so please compress it and attach it.  I'm looking for something other than 0xff in the rdwrmem output -- that would suggest there's some device or ROM in that area.
Comment 302 Cédric Le Goater 2016-10-13 22:42:26 UTC
Created attachment 241681 [details]
test5 results
Comment 303 Cédric Le Goater 2016-10-13 22:44:13 UTC
The dump is full of -1. Do you want a dump for a larger window ?
Comment 304 Bjorn Helgaas 2016-10-13 22:56:58 UTC
Nope, that dump should already cover the entire area of the non-prefetchable window.  I don't see anything other than 0xff either.  I would think if that area were important to BIOS for powering down, there would be some register or ROM in the area that would read as non-0xff.

I assume the box did power off correctly, though?
Comment 305 Cédric Le Goater 2016-10-13 23:02:00 UTC
yes it did.
Comment 306 Bjorn Helgaas 2016-10-14 17:27:45 UTC
Created attachment 241721 [details]
VGA relocation test patch

This is a test patch against v4.8 that does two things:
  1) Adds the quirk from comment #267
  2) Adds a boot option "reloc=" to move a VGA BAR

Linux does a soft power-off (transition to S5) by writing to PM1_CNT at
[io 0x1804].  The theory about why this doesn't work is:

  - The PM1_CNT write causes an SMI.
  - The BIOS SMI handler depends on something in [mem 0x7fa00000-0x7fbfffff].
  - When Linux assigns [mem 0x7fa00000-0x7fbfffff] to the 00:1c.0 bridge, it
    covers up whatever the SMI handler uses, so the SMI handler no longer
    works correctly.

The quirk keeps us from assigning that space to the bridge.  The problem is that the current quirk does nothing to keep us from assigning that space to another device, so we could trip over this again in the future.

Test6: Boot a kernel with this patch and try "poweroff".  It should work correctly.

Test7: Boot with "reloc=0x7fa00000" and try "poweroff".  This should fail.

Test8: Boot with "reloc=0x7fc00000" and try "poweroff".  This should work.
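
The expected outcomes of Test7 and Test8 follow from a simple range check, sketched below in Python. The suspect range is the one from the theory above; the BAR size passed in is a hypothetical placeholder, since the real size of the relocated BAR is not stated here.

```python
# Inclusive range the SMI handler is suspected to depend on.
SUSPECT = (0x7FA00000, 0x7FBFFFFF)


def overlaps(start, size, region=SUSPECT):
    """True if [start, start + size - 1] intersects the inclusive region."""
    end = start + size - 1
    return start <= region[1] and end >= region[0]


if __name__ == "__main__":
    hypothetical_bar_size = 0x1000  # placeholder, not the real BAR size
    for reloc in (0x7FA00000, 0x7FC00000):
        hit = overlaps(reloc, hypothetical_bar_size)
        print(hex(reloc), "covers suspect range" if hit else "clear of suspect range")
```

If the theory holds, any address that makes `overlaps` return True should reproduce the poweroff failure, and any address clear of the range should not.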
Comment 307 Cédric Le Goater 2016-10-15 07:13:25 UTC
Hello,

I have used on my system (see attachment 240771 [details] for the lspci) :

+	if (vga_addr && dev->bus->number == 1 &&
+	    dev->devfn == PCI_DEVFN(0, 1) && pos == 0x10) {

to match :

01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Venus XT [Radeon HD 8870M / R9 M270X/M370X] (rev 83)
01:00.1 0403: 1002:aab0

Is this correct ? If so, the results are : 

Thunderbolt    With               Without

    test 6     soft poweroff      soft poweroff
    test 7     soft poweroff      soft poweroff
    test 8     soft poweroff      soft poweroff

No issues noticed when powering off.

C.
Comment 308 Tom B 2016-11-28 15:32:57 UTC
Is it possible that the changes made to fix this have introduced this bug: https://bugzilla.kernel.org/show_bug.cgi?id=189231 ? There are mentions of the GPU here, and it seems to be a GPU/power issue.
Comment 309 Cédric Le Goater 2016-11-29 07:11:52 UTC
Hello, I don't think so. I am seeing the same issue with the F25 kernel which does not have the fix AFAICT.
Comment 310 Greg Oliver 2016-12-04 09:11:57 UTC
I hate to say it, but Tom B's comments are definitely related.   I have been running the Fedora kernels, and just use the same version I have been using with this patch, and see the tearing.

MacPro(11,5) with AMD gpu.

--- a/drivers/pci/quirks.c	
+++ a/drivers/pci/quirks.c	
@@ -2775,6 +2775,15 @@ static void quirk_hotplug_bridge(struct pci_dev *dev)
 
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_HINT, 0x0020, quirk_hotplug_bridge);
 
+static void quirk_apple_mbp_poweroff(struct pci_dev *dev)
+{
+	if (dmi_match(DMI_BOARD_VENDOR, "Apple Inc.") &&
+	    (dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,4") ||
+	     dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,5")))
+		dev->is_hotplug_bridge = 0;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x8c10, quirk_apple_mbp_poweroff);
+
 /*
  * This is a quirk for the Ricoh MMC controller found as a part of
  * some mulifunction chips.
    relocate
--- a/drivers/pci/probe.c	
+++ a/drivers/pci/probe.c	
@@ -163,6 +163,14 @@ static inline unsigned long decode_bar(struct pci_dev *dev, u32 bar)
 
 #define PCI_COMMAND_DECODE_ENABLE	(PCI_COMMAND_MEMORY | PCI_COMMAND_IO)
 
+static u32 vga_addr;
+static int reloc(char *str)
+{
+	get_option(&str, &vga_addr);
+	return 0;
+}
+early_param("reloc", reloc);
+
 /**
  * pci_read_base - read a PCI BAR
  * @dev: the PCI device
@@ -180,6 +188,13 @@ int __pci_read_base(struct pci_dev *dev, enum pci_bar_type type,
 	u16 orig_cmd;
 	struct pci_bus_region region, inverted_region;
 
+	if (vga_addr && dev->bus->number == 0 &&
+	    dev->devfn == PCI_DEVFN(2, 0) && pos == 0x10) {
+		dev_info(&dev->dev, "** relocating BAR 0 to %#010x\n",
+			 vga_addr);
+		pci_write_config_dword(dev, pos, vga_addr);
+	}
+
 	mask = type ? PCI_ROM_ADDRESS_MASK : ~0;
 
 	/* No printks while decoding is disabled! */



### is the *only* difference between kernels.  Just booting X (wayland on login, and xorg on desktop)..

regardless, it is apparent the kernel is to blame.  I can boot the same kernel without the above patch applied and it does not happen.
Comment 311 berglh 2016-12-04 22:22:21 UTC
In addition to Tom B and Greg Oliver, I am also seeing the screen tearing/flickering with Linux 4.9.0-rc6 on Ubuntu Gnome with gnome-shell 3.18.5 with comment 306 patch. I have not tried the kernel option reloc=0x7fc00000, not sure if this is causing the issue.

Also note: This only occurs on the MacBook retina display. When I'm using DisplayPort and Cinema Thunderbolt display, the screen tearing is not present on this kernel - maybe that will provide some clues.

Also note, I have additionally applied the gmux backlight control fix here: https://bugzilla.kernel.org/show_bug.cgi?id=105051. I have not tried the kernel without this fix.
Comment 312 Matt Pelland 2016-12-05 13:54:45 UTC
I'm running 4.8.10 without this patch and have the screen flickering issue outlined in Tom B's bug report. So, I don't think they're related.
Comment 313 Greg Oliver 2016-12-05 23:55:53 UTC
Yeah - sorry - I was wrong.  It is driver related somehow.  If you don't mind using a HD resolution (which I usually do to line up my external monitor anyway), then moving -> 1920x1200 fixes it for me.
Comment 314 pkozlov 2017-01-19 11:52:13 UTC
Guys, do you have any updates on that? Do you need any additional info from mac book pro holders?

Do you have enough information to fix it?

The discussion became too long to try to find something useful here - a lot of information can be out of date for that moment (19.01.2017).
Comment 315 Luis de Bethencourt 2017-01-19 21:17:03 UTC
For what it's worth, I am running 4.10-rc4 in my Macbook and sleep doesn't work. 

The rest works well though.
Comment 316 mttdbrd 2017-01-20 04:33:59 UTC
(In reply to pkozlov from comment #314)
> Guys, do you have any updates on that? Do you need any additional info from
> mac book pro holders?
> 
> Do you have enough information to fix it?
> 
> The discussion became too long to try to find something useful here - a lot
> of information can be out of date for that moment (19.01.2017).

Comment #267 (https://bugzilla.kernel.org/show_bug.cgi?id=103211#c267) contains a patch you can apply. You'll have to build your own kernel, but there are tons of guides on the internet showing you how to do it.

I agree that progress on this is slow and intermittent, but if Bjorn Helgaas is correct, it's a problem caused by Apple's own developers.
Comment 317 Cédric Le Goater 2017-01-20 09:29:56 UTC
(In reply to Luis de Bethencourt from comment #315)
> For what it's worth, I am running 4.10-rc4 in my Macbook and sleep doesn't
> work. 
> 
> The rest works well though.

I still use 3 patches to fix : 

  - poweroff/suspend 
  - screen flickering
  - brightness control 

The kernel is a 4.8.x stable, tree is here :

  https://github.com/legoater/linux/commits/macbook-4.8

and with it, I have kept the macbook running a couple of
weeks doing suspend/resume. I do not use thunderbolt adapters 
or external displays though.
Comment 318 Luis de Bethencourt 2017-01-20 10:22:19 UTC
(In reply to Cédric Le Goater from comment #317)
> (In reply to Luis de Bethencourt from comment #315)
> > For what it's worth, I am running 4.10-rc4 in my Macbook and sleep doesn't
> > work. 
> > 
> > The rest works well though.
> 
> I still use 3 patches to fix : 
> 
>   - poweroff/suspend 
>   - screen flickering
>   - brightness control 
> 
> The kernel is a 4.8.x stable, tree is here :
> 
>   https://github.com/legoater/linux/commits/macbook-4.8
> 
> and with it, I have kept the macbook running a couple of
> weeks doing suspend/resume. I do not use thunderbolt adapters 
> or external displays though.

I will apply https://github.com/legoater/linux/commit/6ac6e05036e9dd572eb7e7d7b79c34a8611c87e3
to my 4.10-rc4 kernel. I am sure it will solve the issue. Thanks :)
Comment 319 Cédric Le Goater 2017-01-20 17:39:20 UTC
That repo might interest Fedora users : 

https://copr.fedorainfracloud.org/coprs/pgier/macbook-kernel/

I haven't tried it yet but I will soon. That's a great initiative.
Comment 320 Dj 2017-02-23 01:08:08 UTC
I've seen a few reports that standby/shutdown is no longer an issue on the latest 4.4 kernel build (specifically, 4.4.0-63-generic as stated as a comment on http://askubuntu.com/posts/comments/1383001?noredirect=1). Can anyone confirm this? Do we know if Chen Yu's patch was ever worked into a release candidate?
Comment 321 Lee Lian Hoy 2017-02-23 02:45:41 UTC
(In reply to Dj from comment #320)
> I've seen a few reports that standby/shutdown is no longer an issue on the
> latest 4.4 kernel build (specifically, 4.4.0-63-generic as stated as a
> comment on http://askubuntu.com/posts/comments/1383001?noredirect=1). Can
> anyone confirm this? Do we know if Chen Yu's patch was ever worked into a
> release candidate?

The patch was not pushed to mainline because the underlying issue was not discovered (comment 279). The patch was applied to the Ubuntu version of the kernel only.
Comment 322 Hugo 2017-02-23 02:58:55 UTC
It's still an issue in 4.9.11, so I'm guessing all previous official releases still have it too.

Or did you mean the patch worked on that version?
Comment 323 Dj 2017-02-23 05:14:36 UTC
(In reply to Hugo Osvaldo Barrera from comment #322)
> It's still an issue in 4.9.11, so I'm guessing all previous official
> releases still have it too.
> 
> Or did you mean the patch worked on that version?

No, I meant that people were reporting that there was no longer a standby/shutdown issue on Ubuntu's 4.4.0-63-generic kernel. As Lee Lian Hoy mentioned (comment 321), someone pushed Chen Yu's patch into the Ubuntu version of the kernel. The issue is still unresolved in the mainline.
Comment 324 Tom B 2017-02-23 18:20:11 UTC
I'm on Arch Linux with the 4.9.11 kernel. Shutdown/suspend/wakeup works perfectly, which it didn't use to.

As far as I'm aware, Arch doesn't use this patch when compiling the kernel.
Comment 325 Hugo 2017-02-23 18:25:34 UTC
I'm also on ArchLinux, 4.9.11, with the stock kernel. Are you sure you've the same MacBook model? No special kernel params?
Comment 326 Luis de Bethencourt 2017-03-08 23:00:55 UTC
I am still having problems with the laptop waking itself up from sleep.

I've applied Chen Yu's patch (from comment 267) on top of 4.11-rc1, and it wakes up a few seconds after putting the laptop to sleep via gnome-shell's sleep button.

Am I doing something wrong?

How can I check the correct DMI Board Vendor and Product Name are being used to activate that fix in drivers/pci/quirks.c?
Comment 327 Zach Norman 2017-03-08 23:05:33 UTC
Created attachment 255145 [details]
attachment-28579-0.html

Did you also do the acpi wakeup changes in proc?

sudo sh -c 'echo LID0 > /proc/acpi/wakeup'
sudo sh -c 'echo XHC1 > /proc/acpi/wakeup'

sudo sh -c 'echo ARPT > /proc/acpi/wakeup'

You may not need all of these. If this works, the best way to make it permanent
is with a udev rule.
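
Because writing a device name to /proc/acpi/wakeup toggles its state rather than setting it (as far as I know), it helps to check the current state before echoing. A minimal Python sketch that parses the table format shown earlier in this thread; the function name is made up for illustration.

```python
def parse_wakeup(text):
    """Map device name -> True/False (wakeup enabled) from /proc/acpi/wakeup text."""
    states = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like: NAME  S3  *disabled  pci:0000:00:14.0
        if len(fields) >= 3 and fields[2] in ("*enabled", "*disabled"):
            states[fields[0]] = fields[2] == "*enabled"
    return states


if __name__ == "__main__":
    try:
        with open("/proc/acpi/wakeup") as f:
            states = parse_wakeup(f.read())
    except OSError:
        states = {}
    for name in ("LID0", "XHC1", "ARPT"):
        print(name, states.get(name))
```

Only devices reported as enabled would need the echo if the goal is to stop them from waking the machine.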

Comment 328 Luis de Bethencourt 2017-03-09 12:16:48 UTC
(In reply to Zach Norman from comment #327)
> Created attachment 255145 [details]
> attachment-28579-0.html
> 
> Did you also do the acpi wakeup changes in proc?
> 
> sudo sh -c 'echo LID0 > /proc/acpi/wakeup'
> sudo sh -c 'echo XHC1 > /proc/acpi/wakeup'
> 
> sudo sh -c 'echo ARPT > /proc/acpi/wakeup'
> 
> You may not need all of these. If this works the best way to make permanent
> is with a udev rule.
> 

That did the trick! Combined with adding
|| dmi_match(DMI_PRODUCT_NAME, "MacBookPro11,1")
to the patch by Chen Yu.

I discovered the DMI product name of my machine using the dmidecode tool. Mentioning this for future people arriving at this thread.
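
The same strings are also exposed through sysfs, so a quick check without dmidecode is possible. This is a sketch under assumptions: the product list mirrors the quirk quoted earlier in this thread, the exact-match comparison reflects my understanding of dmi_match(), and the helper names are hypothetical.

```python
from pathlib import Path

DMI_DIR = Path("/sys/class/dmi/id")
# Product names taken from the quirk quoted in this thread.
QUIRK_PRODUCTS = {"MacBookPro11,4", "MacBookPro11,5"}


def read_dmi(field, dmi_dir=DMI_DIR):
    """Return a DMI string from sysfs, or None if it cannot be read."""
    try:
        return (dmi_dir / field).read_text().strip()
    except OSError:
        return None


def quirk_would_match(board_vendor, product_name, products=QUIRK_PRODUCTS):
    """Mimic the quirk's DMI test: exact vendor match plus a listed product."""
    return board_vendor == "Apple Inc." and product_name in products


if __name__ == "__main__":
    vendor = read_dmi("board_vendor")
    product = read_dmi("product_name")
    print(vendor, product, quirk_would_match(vendor, product))
```

A MacBookPro11,1 would report False here, which matches the need to add that product name to the patch by hand.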

Do you happen to know what XHC1 and ARPT represent in the devices in /proc/acpi/wakeup?
LID0 is self-explanatory :P
Comment 329 thejoe 2017-05-05 20:01:46 UTC
@Bjorn Helgaas

I've tested all 3 cases of reloc from comment #306.

In all 3 cases (with no reloc, with reloc=0x7fa00000, and with reloc=0x7fc00000) the system is able to poweroff successfully.  I did check dmesg to see that the "relocating BAR 0 to" message was displaying in both reloc cases, so I believe that I was actually testing the desired paths.
Comment 330 Luis de Bethencourt 2017-05-05 23:25:33 UTC
(In reply to thejoe from comment #329)
> @Bjorn Helgaas
> 
> I've tested all 3 cases of reloc from comment #306.
> 
> In all 3 cases (with no reloc, with reloc=0x7fa00000, and with
> reloc=0x7fc00000) the system is able to poweroff successfully.  I did check
> dmesg to see that the "relocating BAR 0 to" message was displaying in both
> reloc cases, so I believe that I was actually testing the desired paths.

Hi! This testing is great.

Could you test pm-suspend as well?

It is the main reason I run a custom kernel, because of the off-tree patches listed in this thread.
Comment 331 thejoe 2017-05-08 20:41:41 UTC
confirmed pm-suspend works with all 3 cases of reloc as well
Comment 332 pkozlov 2017-06-28 08:07:02 UTC
I have noticed that this patch is included in gentoo-sources since 4.9.10 https://gitweb.gentoo.org/proj/linux-patches.git/commit/?h=4.9&id=a33c122c9c5757113124bcd857704ec49864687a
Comment 333 Zhang Rui 2017-06-28 08:33:17 UTC
Hi, Bjorn, what's the status of the patch in comment #306? Is it targeted for upstream?
Comment 334 Bjorn Helgaas 2017-06-29 19:23:47 UTC
Created attachment 257231 [details]
proposed v4.13 patch

I think the previous patches are a little too general -- they clear dev->is_hotplug_bridge for *all* bridges in the system.  There are other bridges, including a Thunderbolt controller, where we probably do want to support hotplug.

Please test this patch and report results.
Comment 335 thejoe 2017-07-01 04:16:53 UTC
Confirmed poweroff & suspend both work with the patch from Comment 334.
Comment 336 Bjorn Helgaas 2017-07-01 12:57:42 UTC
Created attachment 257281 [details]
proposed v4.13 patch, v2

Thanks, thejoe, for all your help with previous debugging and testing this patch.

I hate to iterate yet again, but I think this is a better patch because it focuses directly on the address space, not the indirect connection of hotplug-capable -> assign window -> choose the 0x7fa00000-0x7fbfffff region.  Any of those things could easily change as Linux evolves, which would break the quirk.

So here's another patch.  I'd be grateful for any testing.  You should see this in dmesg:

  pci 0000:00:1c.0: claimed MacBook Pro poweroff workaround [mem 0x7fa00000-0x7fbfffff]

and this in /proc/iomem:

    7fa00000-7fbfffff : MacBook Pro poweroff workaround
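
For testers who want to script the /proc/iomem check, here is a minimal Python sketch. It assumes the "start-end : name" line format shown above; the function name is made up, and reading real addresses from /proc/iomem may require root.

```python
WORKAROUND_LABEL = "MacBook Pro poweroff workaround"


def find_iomem_entries(text, label=WORKAROUND_LABEL):
    """Return (start, end) tuples for /proc/iomem lines whose name equals label."""
    entries = []
    for line in text.splitlines():
        span, sep, name = line.strip().partition(" : ")
        if not sep or name != label:
            continue
        start, _, end = span.partition("-")
        entries.append((int(start, 16), int(end, 16)))
    return entries


if __name__ == "__main__":
    try:
        with open("/proc/iomem") as f:
            found = find_iomem_entries(f.read())
    except OSError:
        found = []
    print([(hex(s), hex(e)) for s, e in found] or "workaround region not found")
```

An empty result on a kernel that is supposed to carry the patch would suggest the quirk did not run, for example because the DMI strings did not match.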
Comment 337 thejoe 2017-07-02 04:31:15 UTC
Patch from Comment 336 works, and I see the dmesg and /proc/iomem as indicated.  I realized I tested the wrong kernel build for the comment 334 patch, let me know if you'd like that retested, but sounds like the 336 patch is preferred, and I actually tested that one correctly.
Comment 338 Pablo Catalina 2017-07-08 09:43:18 UTC
Patch from Comment 336 works fine on MacBookPro 11,4
Comment 339 Bastian Triller 2017-07-08 10:06:28 UTC
Patch attachment 257281 [details] from comment 336 works for me too
Comment 341 Arnaud Astruc 2017-07-17 10:00:02 UTC
So patch attachment 257281 [details] from comment 336 will be on 4.13 ?
Comment 343 Hugo 2017-07-17 13:11:03 UTC
Thanks!
Comment 344 Leo Ufimtsev 2017-07-31 15:09:26 UTC
Does anyone know which version of Fedora/fedora kernel will contain this patch?
Comment 345 Paolo Bonzini 2017-08-02 11:13:15 UTC
It's going to be in 4.12.4.
Comment 346 Leo Ufimtsev 2017-08-02 15:07:43 UTC
(In reply to Paolo Bonzini from comment #345)
> It's going to be in 4.12.4.

Verified on Fedora 26 w/ rawhide kernel repo:
4.13.0-0.rc3.git1.2.fc27.x86_64

(Mac Pro 15', 2016, non-touchbar).

Shutdown/reboot/standby work.

As added bonus, with "kernel-devel" package, also got the webcam to work:
https://github.com/patjak/bcwc_pcie/wiki/Get-Started
Comment 347 Simon 2017-09-25 08:31:22 UTC
(In reply to Leo Ufimtsev from comment #346)
> (In reply to Paolo Bonzini from comment #345)
> > It's going to be in 4.12.4.
> 
> Verified on Fedora 26 w/ rawhide kernel repo:
> 4.13.0-0.rc3.git1.2.fc27.x86_64
> 
> (Mac Pro 15', 2016, non-touchbar).
> 
> Shutdown/reboot/standby work.
> 
> As added bonus, with "kernel-devel" package, also got the webcam to work:
> https://github.com/patjak/bcwc_pcie/wiki/Get-Started

Did you have to execute these echo statements as well or did it work without them?

> sudo sh -c 'echo LID0 > /proc/acpi/wakeup'
> sudo sh -c 'echo XHC1 > /proc/acpi/wakeup'
Comment 348 Bastian Triller 2017-09-25 10:29:41 UTC
I only echo XHC1 to /proc/acpi/wakeup, to prevent spurious wakeups from
suspend.
(More precisely, a script in /lib/systemd/system-sleep does this.)

Comment 349 Bastian Triller 2017-09-25 10:31:31 UTC
Created attachment 258579 [details]
disable XHC1 on systemd suspend
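
For readers without access to the attachment, a minimal sketch of what such a system-sleep hook might look like. This is not the content of the attachment: the function name disable_wakeup, the WAKEUP_FILE override, and the assumed "*enabled" marker in /proc/acpi/wakeup are illustrative. Writing a device name to /proc/acpi/wakeup toggles its state, so the sketch only writes when the device is currently enabled.

```shell
#!/bin/sh
# Hypothetical hook, e.g. installed as an executable file under
# /lib/systemd/system-sleep/. systemd calls such hooks with "pre" or
# "post" as the first argument.

# WAKEUP_FILE is an assumption added so the logic can be tried on a copy.
WAKEUP_FILE="${WAKEUP_FILE:-/proc/acpi/wakeup}"

# disable_wakeup DEVICE: write DEVICE to the wakeup file only if it is
# currently listed as enabled, since a write toggles the state.
disable_wakeup() {
    dev="$1"
    if grep -q "^${dev}[[:space:]].*\*enabled" "$WAKEUP_FILE"; then
        echo "$dev" > "$WAKEUP_FILE"
    fi
}

case "$1" in
    pre) disable_wakeup XHC1 ;;
esac
```

Checking the current state first keeps repeated suspends from re-enabling XHC1 by accident.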
Comment 350 Bjorn Helgaas 2017-09-25 16:03:14 UTC
This issue is closed because we believe it is fixed by 13cfc732160f ("PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11") [1], which appeared in v4.13.

If there are other issues, e.g., the spurious wakeups mentioned in comment 347, comment 348, comment 349, please open a new bugzilla report for them.  This report is long enough :)


[1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=13cfc732160f