Bug 36302 - SandyBridge can't do s2disk with linux 3.0-rc1
Status: CLOSED CODE_FIX
Product: Drivers
Classification: Unclassified
Component: Video(DRI - Intel)
Hardware: x86-64 Linux
Importance: P1 high
Assigned To: drivers_video-dri-intel@kernel-bugs.osdl.org
Depends on:
Blocks: 7216 36912
Reported: 2011-05-30 23:10 UTC by Alex Zhavnerchik
Modified: 2011-06-29 19:22 UTC (History)

See Also:
Kernel Version: 3.0-rc5
Tree: Mainline
Regression: Yes


Attachments
kernel.log_3.0.0_rc2 (419.09 KB, text/plain)
2011-06-07 20:23 UTC, Alex Zhavnerchik
kernel.log_3.0.0_rc3 (326.72 KB, text/plain)
2011-06-14 07:19 UTC, Alex Zhavnerchik
kernel.log_2.6.39 (221.73 KB, text/plain)
2011-06-16 13:12 UTC, Alex Zhavnerchik
kernel.log_2.6.39_vanila (178.21 KB, text/plain)
2011-06-19 22:37 UTC, Alex Zhavnerchik
kernel.log_3.0.0_rc5 (363.36 KB, text/plain)
2011-06-28 09:41 UTC, Alex Zhavnerchik

Description Alex Zhavnerchik 2011-05-30 23:10:20 UTC
The laptop can't suspend to disk with 3.0-rc1. Here is the trace from dmesg.

[ 1279.227422] ------------[ cut here ]------------
[ 1279.227425] WARNING: at drivers/gpu/drm/i915/i915_drv.c:340 gen6_gt_force_wake_put+0x1f/0x45 [i915]()
[ 1279.227426] Hardware name: 4170CTO
[ 1279.227427] Modules linked in: aesni_intel cryptd aes_x86_64 aes_generic parport_pc ppdev lp parport acpi_cpufreq cpufreq_stats cpufreq_userspace mperf rfcomm bnep cpufreq_conservative cpufreq_powersave uinput fuse loop snd_hda_codec_hdmi snd_hda_codec_conexant uvcvideo btusb videodev joydev bluetooth arc4 media v4l2_compat_ioctl32 i915 snd_hda_intel snd_hda_codec drm_kms_helper drm snd_hwdep snd_pcm i2c_algo_bit tpm_tis snd_seq i2c_i801 thinkpad_acpi snd_timer i2c_core snd_seq_device tpm iwlagn mac80211 snd mei(C) soundcore nvram tpm_bios battery ac video wmi evdev snd_page_alloc psmouse serio_raw cfg80211 power_supply rfkill button processor pcspkr ext4 mbcache jbd2 crc16 dm_mod sd_mod crc_t10dif xhci_hcd thermal thermal_sys ahci libahci sdhci_pci sdhci ehci_hcd libata mmc_core e1000e usbcore scsi_mod [last unloaded: scsi_wait_scan]
[ 1279.227448] Pid: 4893, comm: kworker/u:25 Tainted: G        WC  3.0.0-rc1 #3
[ 1279.227448] Call Trace:
[ 1279.227450]  [<ffffffff810407ac>] ? warn_slowpath_common+0x78/0x8c
[ 1279.227454]  [<ffffffffa03374c2>] ? gen6_gt_force_wake_put+0x1f/0x45 [i915]
[ 1279.227458]  [<ffffffffa034112c>] ? i915_read8+0x38/0x5b [i915]
[ 1279.227462]  [<ffffffffa0343107>] ? i915_restore_display+0xfe3/0x1029 [i915]
[ 1279.227466]  [<ffffffffa0343371>] ? i915_restore_state+0x4c/0x1e3 [i915]
[ 1279.227469]  [<ffffffffa03370db>] ? i915_drm_thaw+0x47/0xcd [i915]
[ 1279.227472]  [<ffffffffa03372fc>] ? i915_resume+0x3a/0x4e [i915]
[ 1279.227474]  [<ffffffff8123347e>] ? pm_op+0x118/0x140
[ 1279.227475]  [<ffffffff8105e72f>] ? async_schedule+0xc/0xc
[ 1279.227477]  [<ffffffff812336ff>] ? device_resume+0x89/0xc1
[ 1279.227479]  [<ffffffff8123374b>] ? async_resume+0x14/0x38
[ 1279.227480]  [<ffffffff8105e7c5>] ? async_run_entry_fn+0x96/0x141
[ 1279.227482]  [<ffffffff810551c1>] ? process_one_work+0x163/0x284
[ 1279.227483]  [<ffffffff810560f4>] ? worker_thread+0xc2/0x145
[ 1279.227485]  [<ffffffff81056032>] ? manage_workers.isra.24+0x15b/0x15b
[ 1279.227487]  [<ffffffff81058fc9>] ? kthread+0x76/0x7e
[ 1279.227488]  [<ffffffff813218a4>] ? kernel_thread_helper+0x4/0x10
[ 1279.227490]  [<ffffffff81058f53>] ? kthread_worker_fn+0x139/0x139
[ 1279.227492]  [<ffffffff813218a0>] ? gs_change+0x13/0x13
[ 1279.227492] ---[ end trace a7c30e4bceb9b906 ]---
Comment 1 Alex Zhavnerchik 2011-06-07 20:23:44 UTC
Created attachment 61152 [details]
kernel.log_3.0.0_rc2

I've attached a new kernel log for Linux 3.0.0-rc2 with s2disk and s2ram attempts.
Comment 2 m.b.lankhorst@gmail.com 2011-06-12 21:07:58 UTC
Can you paste the 50 lines above it?
Comment 3 Alex Zhavnerchik 2011-06-12 21:12:47 UTC
I've attached the full kernel log; you can find all the information there.
Comment 4 m.b.lankhorst@gmail.com 2011-06-12 21:32:34 UTC
nm, didn't see the rc2 log :)

does reverting fcca7926299944841569515da321bef9655b7703 help?
Comment 5 Alex Zhavnerchik 2011-06-13 08:56:43 UTC
The kernel log is attached to the Bugzilla report. And no, I don't think commit fcca7926299944841569515da321bef9655b7703 has any effect here, because 2.6.39 can do s2disk and s2ram without any problem. Actually, that is why this issue is marked as a regression ;)
Comment 6 m.b.lankhorst@gmail.com 2011-06-13 09:17:36 UTC
woops my bad, was using 'git log', seems I need to specify a time frame too.

What exactly is the issue? Does it not finish resume, or is it just the output being disabled? Seems like the latter.

If you feel like isolating the offending commit, you could try git bisect like this:

git bisect start v3.0-rc1 v2.6.39 -- drivers/gpu/drm/i915/
Comment 7 Alex Zhavnerchik 2011-06-13 15:50:12 UTC
Unfortunately I can't bisect, because the kernel crashes with a kernel panic on some commits during bisection, and I can't guarantee that these builds can do s2disk or s2ram correctly.
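For bisections like this one, where some intermediate commits panic, git bisect has a `skip` subcommand that marks a commit as untestable rather than good or bad. Below is a minimal sketch of how one manual test result could be mapped to the `git bisect` subcommand to run next. The function `bisect_verdict` and its parameters are hypothetical names for illustration; they are not part of git.

```python
def bisect_verdict(boots, suspend_ok=None):
    """Map one manual test result to a git bisect subcommand name.

    boots      -- True if the kernel built from this commit booted at all
    suspend_ok -- True/False result of the s2disk/s2ram test; only
                  meaningful when the kernel booted
    """
    if not boots:
        # The commit cannot be judged (e.g. it panics), so tell git to
        # pick a nearby commit instead: `git bisect skip`.
        return "skip"
    if suspend_ok is None:
        raise ValueError("suspend_ok is required when the kernel boots")
    return "good" if suspend_ok else "bad"


# Example: a commit that panics on boot is skipped, not marked bad.
print("git bisect", bisect_verdict(boots=False))
```

Skipped commits may widen the final range git reports, so the result can be a small set of candidate commits rather than a single one.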
Comment 8 Alex Zhavnerchik 2011-06-13 15:53:06 UTC
And one more thing: the issue appears while freezing the system, i.e. it can't freeze the current state and return back.
Comment 9 m.b.lankhorst@gmail.com 2011-06-13 23:02:08 UTC
Can you modprobe drm with debug=6? (Or append drm.debug=6 to the kernel command line if drm is built in.)
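The drm debug parameter is a bitmask, so debug=6 enables two categories at once. Here is a small sketch that decodes such a mask. The bit names are an assumption based on the DRM_UT_* debug categories in the DRM core headers of that era (0x1 core, 0x2 driver, 0x4 KMS); check them against your own kernel tree. `decode_drm_debug` is a hypothetical helper name.

```python
from typing import List

# Assumed mapping of drm.debug bits to message categories (DRM_UT_*).
# Only the three low bits are listed; verify against your kernel source.
DRM_DEBUG_BITS = {
    0x1: "core",
    0x2: "driver",
    0x4: "kms",
}


def decode_drm_debug(mask: int) -> List[str]:
    """Return the names of the debug categories enabled by `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if mask & bit]


print(decode_drm_debug(6))
```

Under that assumption, 6 selects driver and KMS messages but not the core messages.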
Comment 10 Alex Zhavnerchik 2011-06-14 07:19:08 UTC
Created attachment 61942 [details]
kernel.log_3.0.0_rc3

I've attached the kernel log for 3.0-rc3 with drm.debug=6.
Comment 11 m.b.lankhorst@gmail.com 2011-06-16 08:43:18 UTC
Can you add a log for v2.6.39 to compare with?
Comment 12 Alex Zhavnerchik 2011-06-16 13:12:24 UTC
Created attachment 62252 [details]
kernel.log_2.6.39

I've attached the kernel log for 2.6.39 with drm debugging enabled.
Comment 13 m.b.lankhorst@gmail.com 2011-06-16 13:59:24 UTC
Only 1 thing seemed to really differ between 2.6.39 and 3.0 if I purely look at the resume part:

failing:
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x100
[drm:gen6_fdi_link_train], FDI train 1 done.
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x600
[drm:gen6_fdi_link_train], FDI train 2 done.
[drm:gen6_fdi_link_train], FDI train done.

working:
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x700
[drm:gen6_fdi_link_train], FDI train 1 done.
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x600
[drm:gen6_fdi_link_train], FDI train 2 done.
[drm:gen6_fdi_link_train], FDI train done.

Earlier in the log, from boot, FDI_RX_IIR reports 0x700/0x600.

Rest seems to be the same.
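A comparison like the one above can be repeated on the full attached logs with a short script. The sketch below extracts the FDI_RX_IIR values from two log excerpts and reports the first one that differs. The sample text is copied from this comment; the helper names `fdi_rx_iir_values` and `first_difference` are hypothetical.

```python
import re

# Matches the FDI_RX_IIR debug lines quoted above and captures the hex value.
FDI_RX_IIR = re.compile(
    r"\[drm:gen6_fdi_link_train\], FDI_RX_IIR (0x[0-9a-fA-F]+)"
)


def fdi_rx_iir_values(log_text):
    """Return every FDI_RX_IIR value in the log, in order, as integers."""
    return [int(m.group(1), 16) for m in FDI_RX_IIR.finditer(log_text)]


def first_difference(a, b):
    """Return (index, a_value, b_value) for the first mismatch, or None."""
    for index, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return index, x, y
    return None


FAILING = """\
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x100
[drm:gen6_fdi_link_train], FDI train 1 done.
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x600
[drm:gen6_fdi_link_train], FDI train 2 done.
"""

WORKING = """\
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x700
[drm:gen6_fdi_link_train], FDI train 1 done.
[drm:gen6_fdi_link_train], FDI_RX_IIR 0x600
[drm:gen6_fdi_link_train], FDI train 2 done.
"""

diff = first_difference(fdi_rx_iir_values(FAILING), fdi_rx_iir_values(WORKING))
if diff is not None:
    index, bad, good = diff
    # XOR shows which register bits differ between the two runs.
    print(f"read #{index}: failing={bad:#x} working={good:#x} "
          f"differing bits={bad ^ good:#x}")
```

To use it on the real logs, read each attachment into a string and pass it in place of the samples. Note that zip stops at the shorter list, so logs with different numbers of link-training attempts are only compared over their common prefix.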
Comment 14 m.b.lankhorst@gmail.com 2011-06-17 14:07:25 UTC
You're using a Debian kernel. Is it possible to test against v2.6.39.0, just to be sure they have no patches for it?
Comment 15 Alex Zhavnerchik 2011-06-19 22:37:12 UTC
Created attachment 62922 [details]
kernel.log_2.6.39_vanila

I've attached the kernel log for vanilla 2.6.39.
Comment 16 Alex Zhavnerchik 2011-06-28 09:41:07 UTC
Created attachment 63682 [details]
kernel.log_3.0.0_rc5

The warning is still in place with kernel 3.0.0-rc5.

Jun 28 12:25:45 alex kernel: [  144.676598] ------------[ cut here ]------------                                                                 
Jun 28 12:25:45 alex kernel: [  144.676609] WARNING: at drivers/gpu/drm/i915/i915_drv.c:322 gen6_gt_force_wake_get+0x21/0x95 [i915]()            
Jun 28 12:25:45 alex kernel: [  144.676610] Hardware name: 4170CTO                                                                               
Jun 28 12:25:45 alex kernel: [  144.676611] Modules linked in: acpi_cpufreq cpufreq_stats parport_pc mperf ppdev lp parport speedstep_lib rfcomm bnep bluetooth uinput fuse loop snd_hda_codec_hdmi snd_hda_codec_conexant thinkpad_acpi arc4 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq joydev snd_timer iwlagn uvcvideo videodev mac80211 i2c_i801 media snd_seq_device v4l2_compat_ioctl32 cfg80211 rfkill i915 mei(C) drm_kms_helper tpm_tis tpm tpm_bios drm snd nvram pcspkr wmi soundcore snd_page_alloc i2c_algo_bit i2c_core evdev battery button processor video ac power_supply psmouse serio_raw ext4 mbcache jbd2 crc16 dm_mod sd_mod crc_t10dif xhci_hcd ahci libahci libata scsi_mod ehci_hcd thermal thermal_sys usbcore sdhci_pci sdhci mmc_core e1000e [last unloaded: scsi_wait_scan]
Jun 28 12:25:45 alex kernel: [  144.676641] Pid: 3863, comm: kworker/u:21 Tainted: G        WC  3.0.0-rc5 #1                                     
Jun 28 12:25:45 alex kernel: [  144.676642] Call Trace:                                                                                          
Jun 28 12:25:45 alex kernel: [  144.676647]  [<ffffffff8103e001>] ? warn_slowpath_common+0x78/0x8c                                               
Jun 28 12:25:45 alex kernel: [  144.676651]  [<ffffffffa0256464>] ? gen6_gt_force_wake_get+0x21/0x95 [i915]                                      
Jun 28 12:25:45 alex kernel: [  144.676656]  [<ffffffffa025ffb7>] ? i915_read32+0x27/0x5a [i915]                                                 
Jun 28 12:25:45 alex kernel: [  144.676661]  [<ffffffffa02622d9>] ? i915_save_state+0xea/0x1d8 [i915]                                            
Jun 28 12:25:45 alex kernel: [  144.676664]  [<ffffffffa025606d>] ? i915_drm_freeze+0x6d/0x85 [i915]                                             
Jun 28 12:25:45 alex kernel: [  144.676668]  [<ffffffffa025620f>] ? i915_pm_suspend+0x4c/0x6d [i915]                                             
Jun 28 12:25:45 alex kernel: [  144.676671]  [<ffffffff811a713f>] ? pci_pm_suspend+0x73/0xf5                                                     
Jun 28 12:25:45 alex kernel: [  144.676674]  [<ffffffff81232349>] ? pm_op+0x83/0x140                                                             
Jun 28 12:25:45 alex kernel: [  144.676676]  [<ffffffff81232511>] ? __device_suspend+0xb6/0x116                                                  
Jun 28 12:25:45 alex kernel: [  144.676679]  [<ffffffff8105c3d3>] ? async_schedule+0xc/0xc                                                       
Jun 28 12:25:45 alex kernel: [  144.676680]  [<ffffffff81232585>] ? async_suspend+0x14/0x38                                                      
Jun 28 12:25:45 alex kernel: [  144.676682]  [<ffffffff8105c469>] ? async_run_entry_fn+0x96/0x141                                                
Jun 28 12:25:45 alex kernel: [  144.676685]  [<ffffffff81052c2c>] ? process_one_work+0x16d/0x298                                                 
Jun 28 12:25:45 alex kernel: [  144.676687]  [<ffffffff81053b73>] ? worker_thread+0xc2/0x145                                                     
Jun 28 12:25:45 alex kernel: [  144.676688]  [<ffffffff81053ab1>] ? manage_workers.isra.25+0x15b/0x15b                                           
Jun 28 12:25:45 alex kernel: [  144.676691]  [<ffffffff81056aae>] ? kthread+0x76/0x7e                                                            
Jun 28 12:25:45 alex kernel: [  144.676694]  [<ffffffff813224e4>] ? kernel_thread_helper+0x4/0x10                                                
Jun 28 12:25:45 alex kernel: [  144.676696]  [<ffffffff81056a38>] ? kthread_worker_fn+0x139/0x139                                                
Jun 28 12:25:45 alex kernel: [  144.676698]  [<ffffffff813224e0>] ? gs_change+0x13/0x13                                                          
Jun 28 12:25:45 alex kernel: [  144.676700] ---[ end trace af99b0efc7bd6887 ]---

But the following error appears during the freeze process, before this warning:
Jun 28 12:24:49 alex kernel: [   98.408433] pci_pm_suspend(): mei_pci_suspend+0x0/0x81 [mei] returns 2500                                        
Jun 28 12:24:49 alex kernel: [   98.408449] pm_op(): pci_pm_suspend+0x0/0xf5 returns 2500                                                        
Jun 28 12:24:49 alex kernel: [   98.408457] PM: Device 0000:00:16.0 failed to suspend async: error 2500

I've attached the full kernel log for further investigation.
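Suspend failures of this kind can be picked out of a long kernel log mechanically. The sketch below scans log text for the "failed to suspend" message and for the "returns N" lines that name the failing callback. The sample lines are the ones quoted above; the regular expressions only cover the message shapes seen in this log, and the helper names are hypothetical.

```python
import re

# "PM: Device <id> failed to suspend[ async]: error <code>"
PM_FAIL = re.compile(
    r"PM: Device (\S+) failed to suspend(?: async)?: error (-?\d+)"
)
# "<caller>(): <callback>+0x.../0x... [<module>] returns <code>"
RETURNS = re.compile(
    r"(\w+)\(\): (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])? returns (-?\d+)"
)


def failed_suspends(log_text):
    """Return (device, error_code) for each device that failed to suspend."""
    return [(m.group(1), int(m.group(2))) for m in PM_FAIL.finditer(log_text)]


def failing_callbacks(log_text):
    """Return (callback, module_or_None, code) for each 'returns' line."""
    return [
        (m.group(2), m.group(3), int(m.group(4)))
        for m in RETURNS.finditer(log_text)
    ]


SAMPLE = """\
Jun 28 12:24:49 alex kernel: [   98.408433] pci_pm_suspend(): mei_pci_suspend+0x0/0x81 [mei] returns 2500
Jun 28 12:24:49 alex kernel: [   98.408449] pm_op(): pci_pm_suspend+0x0/0xf5 returns 2500
Jun 28 12:24:49 alex kernel: [   98.408457] PM: Device 0000:00:16.0 failed to suspend async: error 2500
"""

for device, code in failed_suspends(SAMPLE):
    print(f"device {device} failed to suspend with error {code}")
for callback, module, code in failing_callbacks(SAMPLE):
    print(f"{callback} (module: {module}) returned {code}")
```

In the sample, the first "returns" line is the innermost callback, and its bracketed module name points at mei rather than i915.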
Comment 17 Alex Zhavnerchik 2011-06-28 12:06:16 UTC
Suspend to RAM and to disk seem fixed (at least they worked in the last several attempts) with the patch from https://lkml.org/lkml/2011/6/13/148.

WARNING: at drivers/gpu/drm/i915/i915_drv.c:322 gen6_gt_force_wake_get+0x21/0x95 [i915]() and the other stuff are still here, but they don't break anything for now. Anyway, I think it would be good to fix this.
Comment 18 m.b.lankhorst@gmail.com 2011-06-28 12:34:03 UTC
So i915 was not the problem. Can you confirm that rc1 works without the mei driver?
Comment 19 Alex Zhavnerchik 2011-06-28 12:37:26 UTC
I'll compile rc1 and let you know, but a little later today.
Comment 20 Alex Zhavnerchik 2011-06-29 16:59:28 UTC
Yeah, rc1 can do s2ram and s2disk without mei. Sorry for the panic, guys :)
Comment 21 m.b.lankhorst@gmail.com 2011-06-29 17:01:34 UTC
Can someone close the bug as fixed, please?
Comment 22 Florian Mickler 2011-06-29 19:19:50 UTC
Fix merged for v3.0-rc6:
commit a534bb6eea72c0d082dd2faab85450e5554ba1c8
Author: Tomas Winkler <tomas.winkler@intel.com>
Date:   Mon Jun 13 16:39:31 2011 +0300

    Staging: mei: fix suspend failure
