Bug 61721

Summary: Sony Vaio Z (VPCZ1, 2010) power off / reboot is very slow unless I pass 'reboot=pci'
Product: ACPI
Reporter: Adam Williamson (adamw)
Component: Power-Off
Assignee: Lan Tianyu (tianyu.lan)
Status: CLOSED WILL_FIX_LATER
Severity: normal
CC: tianyu.lan
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.10
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmidecode from 3.10.12, debug.patch, de

Description Adam Williamson 2013-09-20 05:52:40 UTC
One of my systems is a Sony Vaio Z, the 2010 model often referred to as VPC-Z1 or VPCZ1. I don't know precisely when this started - some months ago - but at some point, system shutdown / reboot became very slow.

It's not a systemd-level issue, the shutdown process gets beyond the systemd level. I see:

Rebooting.
[timestamp] mei_me 0000:00:16.0: stop
[timestamp] Restarting system.

(or equivalent messages for a shut down), and *then* it sticks, for about one or two minutes before finally actually shutting down/rebooting. This is very late in the shutdown process AIUI.

Thanks to a tip from Dave Airlie, I tried 'reboot=bios', 'reboot=acpi' and 'reboot=pci' parameters. With both 'bios' and 'acpi' the shutdown remains slow, but 'pci' seems to fix the problem: I get fast shutdowns again!

I'm assuming this 'ought to' work without the user having to pass a magic kernel parameter; it used to. Please let me know what further info is needed, I really don't know what to include. I'm fairly sure I've been seeing this since at least 3.8, but it was probably earlier than that. Current kernel on the affected box is 3.10.12; I can quite easily try a 3.11 kernel if necessary, trying 3.12 may be a bit harder.
Comment 1 Lan Tianyu 2013-09-22 01:47:13 UTC
Please provide the output of dmidecode. Could you try an older kernel and check whether this is a regression?
Comment 2 Adam Williamson 2013-09-23 23:05:50 UTC
Just installed the oldest F18 kernel I can find, 3.6.something, and it happens with that too. Which isn't how I remembered it, but...that's how it appears to be. Will attach dmidecode from 3.10 in a minute.

The system's disk activity light seems to be on a lot while it's in the 'stuck' state, but I'm not sure what that could be (and I can't tell if the 'disk' is actually doing something by the time-honoured 'hold your ear to the case' technique, as it's an SSD).
Comment 3 Adam Williamson 2013-09-23 23:16:30 UTC
Created attachment 109341
dmidecode from 3.10.12
Comment 4 Lan Tianyu 2013-09-24 02:26:11 UTC
Created attachment 109351
debug.patch

Please try this patch. Thanks.
Comment 5 Adam Williamson 2013-09-25 08:29:13 UTC
Will do, but note that VPCZ112GD is an *extremely* specific model number. There are many 'variants' of the VPCZ11 line which are pretty much identical - just different loadouts of RAM and CPU spec sold in different markets, I think, and I'd be very surprised if they didn't suffer the same issue. The Z12 and Z13 models were minor spec bumps over the Z11, sold in the same chassis, and may well have it too.

There is actually a Linux-on-Vaio Z mailing list which is still somewhat active, and there's a thread there which rather looks like this issue (I rather suspect they were too impatient to discover it actually *does* reboot if you wait long enough):

https://lists.launchpad.net/sony-vaio-z-series/msg02964.html

The OP there has a different Z11 sub-model, and another poster mentions a Z13 which 'hangs' on reboot:

https://lists.launchpad.net/sony-vaio-z-series/msg02968.html

This older thread has several similar reports from Z11 and Z13 owners:

https://lists.launchpad.net/sony-vaio-z-series/msg02868.html

so you may want to go with a more general match than my specific submodel of the Z11 :)
Comment 6 Adam Williamson 2013-09-25 08:30:30 UTC
VPCZ14 and Z15 seem to exist but by the Google results may be Japan-only, I don't know anything about them.
Comment 7 Adam Williamson 2013-09-25 08:35:53 UTC
also - I'm not familiar with the code in question and the name reboot.c may be misleading, I suppose, but note this bug affects poweroff too: would a similar patch be needed to a 'poweroff.c' file or something? :)
Comment 8 Adam Williamson 2013-09-25 09:13:22 UTC
Passing on a report from list member Brett Howard:

"Thanks much for the work kind sir.  I've not got a Bugzilla account so I can't post this to the bug but I will state that my VPCZ12CGX also has this issue and it is resolved by passing the kernel parameter.  

Here is my output from dmidecode:

...
System Information
	Manufacturer: Sony Corporation
	Product Name: VPCZ12CGX
	Version: J004A1P1
	Serial Number: 54024526-0001410
	UUID: B04FA3FA-E388-DF11-9C53-0024BED6DF9D
	Wake-up Type: Power Switch
	SKU Number: N/A
	Family: VAIO
...
"

I can attach the full dmidecode output he sent me if it's needed for anything.
Comment 9 Adam Williamson 2013-09-25 23:01:04 UTC
Another report:

"Thank you! My VPC-Z11C5E now reboots normally."
Comment 10 Adam Williamson 2013-09-27 17:12:08 UTC
And one more:

"The parameter fixed the reboot hang/looonnng-reboot. The shutdown worked well. The details on my model:"

...
System Information
        Manufacturer: Sony Corporation
        Product Name: VPCZ13M9E
        Version: J004CYHD
        Serial Number: XXXX
        UUID: XXXX
        Wake-up Type: Power Switch
        SKU Number: N/A
        Family: VAIO
...

I don't know what pattern matching style is used for these things, but I think what we want is roughly:

DMI_MATCH(DMI_PRODUCT_NAME, "VPCZ1[1-5]???"),

I think that should be safe and match most/all affected systems (I'm not 100% sure of whether the VPCZ16 and VPCZ17 actually exist or not; there's some results indicating they exist in Australia/New Zealand, but nothing definitive).
Comment 11 Lan Tianyu 2013-10-08 06:51:19 UTC
The code uses strstr(...) to match the machine, so the string is treated as a plain substring rather than a pattern. Please see dmi_matches() in drivers/firmware/dmi_scan.c.

We can use 'DMI_MATCH(DMI_PRODUCT_NAME, "VPCZ1")' to match all these machines.
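
(Illustration only, not taken from the attached patch.) A minimal userspace sketch of what substring matching implies for the product names reported in this bug. It assumes only that the comparison behaves like strstr(), as stated above; product_matches() and count_matching() are hypothetical helpers, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `pattern` occurs anywhere in `product`.
 * This mimics a strstr()-style check; it is not the kernel's dmi_matches(). */
static bool product_matches(const char *product, const char *pattern)
{
    return strstr(product, pattern) != NULL;
}

/* Product names taken from the dmidecode output quoted in this bug. */
static const char *const reported_models[] = {
    "VPCZ112GD",
    "VPCZ12CGX",
    "VPCZ13M9E",
};

/* Count how many of the reported models a given match string would cover. */
static size_t count_matching(const char *pattern)
{
    size_t n = 0;
    size_t total = sizeof(reported_models) / sizeof(reported_models[0]);

    for (size_t i = 0; i < total; i++) {
        if (product_matches(reported_models[i], pattern))
            n++;
    }
    return n;
}
```

Under this sketch, count_matching("VPCZ1") covers all three reported models, while a glob-style string such as "VPCZ1[1-5]???" covers none, because the brackets and question marks are compared literally. A full model string such as "VPCZ112GD" covers only that one sub-model.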
Comment 12 Lan Tianyu 2013-10-08 07:57:48 UTC
Created attachment 110441
de
Comment 13 Lan Tianyu 2013-10-08 07:58:25 UTC
Please try the patch. Thanks.
Comment 14 Adam Williamson 2013-10-08 19:02:35 UTC
Just 'VPCZ1' may match some older models that don't need the patch. I'll have to look it up. I will test the patch when I can - sorry I didn't yet, I've been crazy busy lately :(
Comment 15 Lan Tianyu 2013-10-11 08:48:16 UTC
(In reply to Adam Williamson from comment #14)
> Just 'VPCZ1' may match some older models that don't need the patch. I'll
> have to look it up. I will test the patch when I can - sorry I didn't yet,

I think one entry per machine is comparatively safe. Could you collect the dmidecode output from these machines? Thanks.

> I've been crazy busy lately :(
Comment 16 Lan Tianyu 2013-10-23 02:13:20 UTC
ping ...
Comment 17 Adam Williamson 2013-10-23 19:07:01 UTC
Sorry for the delay, I am building a test kernel for this fix right now.

I've also been looking at model numbers. It looks like, after all, just 'VPCZ1' should be OK. Looking at https://en.wikipedia.org/wiki/Sony_Vaio_Z_series and other sources, I think all other series of 'Z' have different model names:

late 2011 'VPCZ2.....'
2012 'SVZ1....'
2008-9 'VGNZ....'

So I believe 'VPCZ1' should be a unique identifier for the 2010 series, and the feedback I gathered seems to indicate pretty strongly that all VPCZ1 models are affected by the bug.
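
(Illustration only.) A small userspace sketch of the uniqueness argument above, assuming the match is a plain substring test. The name stems are the ones listed in this comment; real DMI product strings for those series may differ, and matches_any_other_series() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Name stems of the other Vaio Z series, as listed in this comment.
 * Actual DMI product strings may be formatted differently. */
static const char *const other_series_stems[] = {
    "VPCZ2", /* late 2011 */
    "SVZ1",  /* 2012 */
    "VGNZ",  /* 2008-9 */
};

/* Hypothetical helper: true if `needle` occurs in any other-series stem,
 * i.e. if a substring match on `needle` could also hit those machines. */
static bool matches_any_other_series(const char *needle)
{
    size_t total = sizeof(other_series_stems) / sizeof(other_series_stems[0]);

    for (size_t i = 0; i < total; i++) {
        if (strstr(other_series_stems[i], needle) != NULL)
            return true;
    }
    return false;
}
```

With these stems, "VPCZ1" does not occur in any other series name, whereas shorter strings such as "VPCZ" or "Z1" would also hit the late 2011 and 2012 series respectively.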

Will update shortly with test results.
Comment 18 Adam Williamson 2013-10-23 19:46:33 UTC
Patch doesn't seem to work :/ after dropping 'reboot=pci' from cmdline, I get a slow reboot.
Comment 19 Adam Williamson 2013-10-23 19:57:28 UTC
Huh - my second reboot with the new kernel was fast...odd.
Comment 20 Adam Williamson 2013-10-23 20:42:43 UTC
Next four reboots have all been good too, so I probably just hit something else on my first try. Fix looks good.
Comment 21 Lan Tianyu 2013-10-24 07:24:07 UTC
OK. Thanks for testing.
The fix has been sent to the x86 and ACPI mailing lists, so I'll mark this bug as code fix.
https://patchwork.kernel.org/patch/3090271/
Comment 22 Lan Tianyu 2013-10-26 13:03:50 UTC
Reopening this bug since the patch is not acceptable.

Hi Adam:
        Could you provide the output of acpidump?
Comment 23 Lan Tianyu 2013-11-21 05:38:58 UTC
Hi Adam:
        I found another power-off issue on the Acer Aspire V5-573G; the cause is that the SATA host controller's PCI master bit is cleared during shutdown. Could you please try the patch in comment 11 of bug 63861 to check whether this is the same bug?
        https://bugzilla.kernel.org/show_bug.cgi?id=63861
Comment 24 Adam Williamson 2013-11-21 05:59:29 UTC
thanks lan - i'll throw it on the list of Things I Really Ought To Be Doing But Which Aren't Fedora 20 Validation So I Never Get Around To Doing Them :)
Comment 25 Lan Tianyu 2013-11-21 11:18:17 UTC
OK, I get it. I'll mark this bug as WILL_FIX_LATER; feel free to reopen it when you have time.