Kernel Bug Tracker – Bug 10620
VT switching broken - X does not resume (intel chipset)
Last modified: 2008-05-20 03:05:08 UTC
Latest working kernel version:220.127.116.11
Earliest failing kernel version:2.6.26-rc1
Distribution:Ubuntu 8.04 Hardy
Hardware Environment:see attached lshw.txt
Software Environment:xserver-xorg-video-intel 2:2.2.1-1ubuntu12
Suspend to ram, using the distribution scripts or just echo mem > /sys/power/state.
On resume, the screen stays blank (most of the time), with just the cursor shown (sometimes you can move it, sometimes not). The system is nevertheless working. Sometimes you can recover by hitting ctrl-alt-backspace and re-entering X, sometimes not.
I can't test on the console because after X starts the console is busted for me (that's not a regression: it has been like that since at least 2.6.23, and it was decided it was an X driver problem, but it's still not fixed for me;
see https://bugs.launchpad.net/bugs/182865 ).
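For anyone scripting the manual reproduction step, here is a minimal Python sketch of the same write that `echo mem > /sys/power/state` performs. The function name `suspend_to_ram` and its `state_file` parameter are made up for illustration. The sketch assumes it runs as root and that reading the file lists the supported sleep states; on a real system the write only returns after the machine has resumed.

```python
from pathlib import Path


def suspend_to_ram(state_file="/sys/power/state"):
    """Ask the kernel to suspend to RAM, like `echo mem > /sys/power/state`.

    `state_file` is a parameter only so the function can be tried against
    an ordinary file instead of the real sysfs node.
    """
    path = Path(state_file)
    # Reading the file yields the space-separated states the kernel supports.
    supported = path.read_text().split()
    if "mem" not in supported:
        raise RuntimeError(f"'mem' not offered by {state_file}: {supported}")
    # echo appends a newline, so mirror that exactly.
    path.write_text("mem\n")
```

Checking the supported states first gives a clear error instead of a failed write on machines that do not offer suspend to RAM.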
I attach .config, lshw output, and two syslogs. syslog_2.6.25.txt shows a successful suspend/resume cycle in 18.104.22.168.
syslog_new.txt shows a boot of the new kernel, then a suspend/resume that ended with a blank screen and was recovered with ctrl-alt-backspace, and finally a second suspend/resume that ended with a blank screen that could not be recovered with ctrl-alt-backspace, so sysrq-b was needed.
Strange things I see:
There is a WARN_ON at line 1485 of the log (sysdev_driver_register()), but it seems unrelated.
A lot of "ACPI handle has no context!" messages during suspend/resume.
A very scary "trying to get vblank count for disabled pipe 0", which smells like a possible culprit...
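To compare the good and bad syslogs for the messages listed above, a small counting script can help. This is only a sketch: the names `SUSPECT_PATTERNS` and `count_suspects` are hypothetical, and the patterns are just the two message strings quoted in this report.

```python
from collections import Counter

# Message fragments quoted in the report; extend as needed.
SUSPECT_PATTERNS = (
    "ACPI handle has no context!",
    "trying to get vblank count for disabled pipe",
)


def count_suspects(lines, patterns=SUSPECT_PATTERNS):
    """Count how many lines contain each pattern (plain substring match)."""
    counts = Counter({p: 0 for p in patterns})
    for line in lines:
        for p in patterns:
            if p in line:
                counts[p] += 1
    return counts


# Usage against one of the attached logs (file name taken from the report):
#   with open("syslog_new.txt", errors="replace") as fh:
#       print(count_suspects(fh))
```

Running it on both syslog_2.6.25.txt and syslog_new.txt would show whether a message is new in the failing kernel or was already present in the working one.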
Created attachment 16066
Created attachment 16067
Created attachment 16068
good suspend/resume log with 22.214.171.124
Created attachment 16069
Log of two failed resumes with the new kernel
There are comments in the log, search for "*****"
Len, Rafael, Dave: I'm not even sure which subsystem might have caused
this. Perhaps acpi? Can you take a look, please?
Maybe someone on the intel group could be Cc:ed... Jesse for example. He was quite helpful when trying to solve bug#10319 (which I feel could be related, but that's just a feeling...)
Can you try reproducing this with Dave's latest DRM patch applied? It reverts the new vblank code, which may be the culprit...
On Thu, 2008-05-08 at 12:04 -0700, email@example.com
> Can you try reproducing this with Dave's latest DRM patch applied? It reverts
> the new vblank code, which may be the culprit...
Is it in current -git? Otherwise, can you point me to the patch?
v2.6.26-rc1-279-g28a4acb: nothing has changed. Still buggy.
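The version strings in this thread look like `git describe` output, which has the form tag, number of commits since that tag, then `g` plus an abbreviated commit hash. A minimal Python sketch for splitting them follows; `Described` and `parse_describe` are illustrative names, and the sketch assumes the string really has that three-part form.

```python
import re
from typing import NamedTuple


class Described(NamedTuple):
    tag: str                # nearest tag, e.g. "v2.6.26-rc1"
    commits_since_tag: int  # commits on top of that tag
    abbrev_hash: str        # abbreviated commit id, without the "g" prefix


_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)$")


def parse_describe(version: str) -> Described:
    """Split a `<tag>-<n>-g<hash>` string into its three parts."""
    m = _DESCRIBE_RE.match(version)
    if m is None:
        raise ValueError(f"not in <tag>-<n>-g<hash> form: {version!r}")
    return Described(m["tag"], int(m["count"]), m["hash"])
```

So `v2.6.26-rc1-279-g28a4acb` reads as commit 28a4acb, 279 commits after the v2.6.26-rc1 tag.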
Regressions list annotation:
References : http://lkml.org/lkml/2008/5/8/378
Note that my problem was solved with Hugh's patch here: http://lkml.org/lkml/2008/5/13/188
My symptoms were quite different from what he experienced (and from what is described in this bug report). But I booted one kernel that didn't have his patch and could reproduce the problem, then another kernel where the only difference was the application of his patch, and the problem went away. My problem was the X server ignoring keyboard/button input shortly after a suspend/resume, with subsequent restarts of the X server totally malfunctioning.
So you might want to try to see if this patch solves your problem too. (My system was an X61s with an Intel video chipset).
Romano, can you test 2.6.26-rc2-git5 when it's out, please?
> Romano, can you test 2.6.26-rc2-git5 when it's out, please?
As soon as I can. I'm in the middle of a flight, sitting on the floor in
Vancouver airport... :-)
Tested with today's git, v2.6.26-rc2-433-gf26a398, which as far as I know is
-git5 or later.
No joy: things are maybe worse. After resume I have the same symptom as before (black screen with the pointer in the upper left corner). If I can recover the system with ctrl-alt-backspace, X now no longer recognizes the card (I get a dialog saying that the card is not recognized and asking whether I want to run
the laptop in 640x480 mode).
As a side note, I run i915resolution at boot. Should I try leaving it out? Last time I tried, it was needed to get the 1280-wide display.
Nice. I discovered that you do not need a suspend/resume: just switching VC can cause the exact same symptoms.
To reproduce, I simply switch to VC-1 (it is busted with this X driver; I have to use "setupcon" to make it work), and then switch back to the X VC. Black screen. If I then switch to VC 1 again, there is a 10-second delay, then the X screen flashes on for a moment, and I have the console back. Killing X makes it restart OK, but then the same pattern repeats.
So: the problem is VC switching.
Time to start a bisect run? I will try to downgrade the X driver as per
https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/182865/comments/19 , then I'll start bisect.
1888 revs to go... I noticed that I have PAT=Y in my .config. Could be a hint? Should I try with PAT=n?
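As a rough guide to how long a bisect like this takes: a binary search over N candidate revisions needs about log2(N) build-and-test rounds. The Python sketch below only illustrates that arithmetic; `estimated_bisect_steps` is a made-up name. git's own estimate can differ slightly, and skipped or unbuildable revisions add rounds.

```python
import math


def estimated_bisect_steps(revisions_left: int) -> int:
    """Approximate remaining test rounds for a binary search over revisions."""
    if revisions_left < 1:
        return 0
    # Each round roughly halves the candidate set.
    return math.ceil(math.log2(revisions_left))
```

For the two counts quoted in this thread, 1888 and 648, it gives 11 and 10 rounds respectively.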
Uf, this seems more complex than ever. I have hit a build error twice...
ERROR: "__locks_copy_lock" [fs/lockd/lockd.ko] undefined!
WARNING: modpost: Found 14 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
make: *** [__modpost] Error 1
make: *** [modules] Error 2
I'm trying to jump around to see if I can get past it, but it seems quite nasty.
Hmmm. Additional thing: the black screen, switching VT or doing s2ram, happens only if gnome is started. If the X screen displayed is the gdm one, no problem.
Uff. Now I know that a workaround for the lockd bug is to compile in lockd, will do. 648 revs to go.
Is there anybody out there :-)?
...maybe it's the most painful bisect ever...
It failed to compile at two other points...
Now at 10c993a6b5418cb1026775765ba4c70ffb70853d (which is good), the kernel
build started to corrupt files, leaving a 0xf0 in them... I rebooted into a
known-good kernel, did a git reset --hard, and started compiling again. We'll see.
OK, I do not think I can do much more than this... not tonight at least. I have narrowed it down like this:
but then 2c14f28be2a3f2a2e9861b156d64fbe2bc7000c3 makes my laptop oops on boot. I have to sleep now... the log is promising (all video related things):
d9c04d678418fe42646de641f499209ca00fd94f Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6
4d9c55e44336602f8b2880b972fb55f67bc51dd0 Merge branch 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
09aa356b5584090aab6810ec8002936d710cd4ac agp: convert drivers/char/agp/frontend.c to use unlocked_ioctl
4ab92bcf773e7b9e1367897047d5fa4d151d9e90 agp: fix shadowed variable warning in amd-k7-agp.c
b74e2082f8e7b8f37af3fc39e8ee0dd0d218c589 drm: _end is shadowing real _end, just rename it.
ac741ab71bb39e6977694ac0cc26678d8673cda4 drm/vbl rework: rework how the drm deals with vblank.
2c14f28be2a3f2a2e9861b156d64fbe2bc7000c3 drm: reorganise minor number handling using backported modesetting code.
7b832b56bd971348329c3f4c753ca0abfdf3a3d1 drm/i915: Handle tiled buffers in vblank tasklet
a36b7dcc05bc4c4580f11cf78e95edfefa86b8a6 drm/i965: On I965, use correct 3DSTATE_DRAWING_RECTANGLE command in vblank
f1c3e67eb73a4a1db31e235883156ac098e29ff6 drm: Remove unneeded dma sync in ATI pcigart alloc
5ff64611333fd282793ff8997e02138aa2f6aab9 drm: Fix mismerge of non-coherent DMA patch
I hope this helps.
See you tomorrow...
Note that git bisect accepts a path, so that it only bisects over the commits that touch that path. If you are sure that it's a drm-related thing, you can try bisecting on drivers/char/drm
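The path-limited bisect suggested above uses the form `git bisect start <bad> <good> -- <paths>`. Here is a minimal Python sketch that builds that command line; `bisect_start_argv` is a hypothetical helper. The bad revision is the earliest failing version from the report, while the good tag `v2.6.25` in the usage comment is an assumption, since the exact working version is not legible in the report.

```python
def bisect_start_argv(bad, good, paths=()):
    """Build the argv for `git bisect start <bad> <good> [-- <paths>...]`."""
    argv = ["git", "bisect", "start", bad, good]
    if paths:
        # Everything after "--" restricts bisection to commits touching
        # those paths.
        argv += ["--", *paths]
    return argv


# Usage inside a kernel tree ("v2.6.25" is an assumed good tag):
#   import subprocess
#   subprocess.run(
#       bisect_start_argv("v2.6.26-rc1", "v2.6.25", ["drivers/char/drm"]),
#       check=True,
#   )
```

Limiting the paths shrinks the candidate set, but it can miss the culprit if the breakage came from a commit outside drivers/char/drm.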
On Mon, 2008-05-19 at 14:44 -0700, firstname.lastname@example.org
> Note that git bisect accepts a path, so that it only bisects over the
> commits that touch that path. If you are sure that it's a drm-related
> thing, you can try bisecting on drivers/char/drm
I know. The problem was that I didn't know if this was due to the drm,
video, acpi or x86... but it seems to be narrowed down quite a bit now.
To summarize (see http://bugzilla.kernel.org/show_bug.cgi?id=10620 )
- VT switching is broken after gnome has started (acceleration?)
- the symptom is that the X VT stays black, with just the cursor (and
sometimes some residual bit of the panel) shown.
- it happens almost all the time with suspend/resume
- sometimes you can kill the X server with ctrl-alt-backspace, sometimes
you have to reboot.
My (painful) bisect stopped here:
but then 2c14f28be2a3f2a2e9861b156d64fbe2bc7000c3 makes my laptop oops on boot.
Fixed in v2.6.26-rc3-119-g8033c6e. I'll close this bug.
Thanks to all!