Bug 15969

Summary: radeon regression couldn't schedule IB on resume with 2.6.34-rc7
Product: Drivers
Reporter: cedric (cedric)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED CODE_FIX
Severity: normal
CC: alexdeucher, bugz.kernel.tormod, glisse, maciej.rutecki, Martin, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.34-rc7
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216, 15310
Attachments:
kernel config
lspci -(n)vv of the vga adapter
dmesg after bad resume
release agp bridge at suspend

Description cedric 2010-05-13 19:06:34 UTC
Created attachment 26367: kernel config

Hello,

When resuming my laptop under 2.6.34-rc7, my X session behaved erratically.
The dmesg log showed:

May 12 21:58:03 enea kernel: [  122.627332] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(0).
May 12 21:58:03 enea kernel: [  122.627342] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 12 21:58:03 enea kernel: [  123.251198] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(1).
May 12 21:58:03 enea kernel: [  123.251207] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 12 21:58:04 enea kernel: [  123.626341] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(2).
May 12 21:58:04 enea kernel: [  123.626351] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 12 21:58:04 enea kernel: [  124.256505] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(3).
May 12 21:58:04 enea kernel: [  124.256514] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 12 21:58:05 enea kernel: [  124.626191] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(4).
May 12 21:58:05 enea kernel: [  124.626200] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 12 21:58:05 enea kernel: [  125.059095] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).

And so on until IB(15)
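
The repeating pattern above (one radeon_ib_schedule error per IB index) is easy to tabulate from a saved log. The following is only a minimal sketch, assuming the kernel log has been saved as text; the name failed_ib_indices and the sample string are illustrative and not part of the original report.

```python
import re

# Matches the radeon_ib_schedule error line quoted above and captures the IB index.
IB_ERROR = re.compile(
    r"\[drm:radeon_ib_schedule\] \*ERROR\* radeon: couldn't schedule IB\((\d+)\)"
)

def failed_ib_indices(log_text):
    """Return the IB indices reported as unschedulable, in log order."""
    return [int(m.group(1)) for m in IB_ERROR.finditer(log_text)]

# Three lines copied from the log excerpt above.
sample = """\
May 12 21:58:03 enea kernel: [  122.627332] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(0).
May 12 21:58:03 enea kernel: [  122.627342] [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 12 21:58:03 enea kernel: [  123.251198] [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(1).
"""

print(failed_ib_indices(sample))  # [0, 1]
```

Run against the full log, this should make it easy to confirm that every index from 0 to 15 failed once.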

This did not happen with an -rc5 kernel, nor with a kernel running without KMS (radeon.modeset=0).

Bisecting (restricted to drivers/gpu) gives:

# bad: [b57f95a38233a2e73b679bea4a5453a1cc2a1cc9] Linux 2.6.34-rc7
# good: [01bf0b64579ead8a82e7cfc32ae44bc667e7ad0f] Linux 2.6.34-rc5
git bisect start 'v2.6.34-rc7' 'v2.6.34-rc5' '--' 'drivers/gpu/'
# bad: [3515387ba90ef2c38602f4d52c4d5ec5fc95ae5c] drm/radeon/kms: fix panel scaling adjusted mode setup
git bisect bad 3515387ba90ef2c38602f4d52c4d5ec5fc95ae5c
# good: [88b045077a1462a47503137fd4ca0c31772819ca] drm/radeon: Fix sparc regression in r300_scratch()
git bisect good 88b045077a1462a47503137fd4ca0c31772819ca
# bad: [404b017d00a9f472bdf725a06892d42f1cba5ed8] drivers/gpu/drm/drm_memory.c: fix check for end of loop
git bisect bad 404b017d00a9f472bdf725a06892d42f1cba5ed8
# bad: [ccb2ad579f910e6146adf4eb3aa50325253ee8c9] drm/radeon/kms/agp The wrong AGP chipset can cause a NULL pointer dereference
git bisect bad ccb2ad579f910e6146adf4eb3aa50325253ee8c9

with 797fd5b9dad12a100c81b5782573a41259728cb1 reported as the culprit:

commit 797fd5b9dad12a100c81b5782573a41259728cb1
Author: Marek Olšák <maraeo@gmail.com>
Date:   Tue Apr 13 02:33:36 2010 +0200

    drm/radeon/kms: r300 fix CS checker to allow zbuffer-only fastfill

    Signed-off-by: Marek Olšák <maraeo@gmail.com>
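
For anyone re-checking a bisection like this one, the verdicts recorded in the text printed by `git bisect log` can be listed with a few lines of Python. This is a rough sketch only; bisect_verdicts is a hypothetical helper, and the sample is copied from the log above.

```python
import re

# A verdict line as written by `git bisect log`, e.g. "git bisect bad <sha>".
VERDICT = re.compile(r"^git bisect (good|bad) ([0-9a-f]{7,40})$")

def bisect_verdicts(log_text):
    """Return (verdict, commit) pairs in the order they were recorded."""
    verdicts = []
    for line in log_text.splitlines():
        m = VERDICT.match(line.strip())
        if m:  # comment lines and "git bisect start" are skipped
            verdicts.append((m.group(1), m.group(2)))
    return verdicts

sample = """\
git bisect start 'v2.6.34-rc7' 'v2.6.34-rc5' '--' 'drivers/gpu/'
# bad: [3515387ba90ef2c38602f4d52c4d5ec5fc95ae5c] drm/radeon/kms: fix panel scaling adjusted mode setup
git bisect bad 3515387ba90ef2c38602f4d52c4d5ec5fc95ae5c
# good: [88b045077a1462a47503137fd4ca0c31772819ca] drm/radeon: Fix sparc regression in r300_scratch()
git bisect good 88b045077a1462a47503137fd4ca0c31772819ca
"""

for verdict, commit in bisect_verdicts(sample):
    print(verdict, commit)
```

Listing the verdicts this way makes it simpler to spot a step worth re-testing when the result looks implausible.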
Comment 1 cedric 2010-05-13 19:08:17 UTC
Created attachment 26368: lspci -(n)vv of the vga adapter
Comment 2 Rafael J. Wysocki 2010-05-13 20:32:53 UTC
*** Bug 15971 has been marked as a duplicate of this bug. ***
Comment 3 cedric 2010-05-13 21:01:43 UTC
Hm,

I tried a kernel with this commit reverted, but the problem persists. So I did a full bisect and still arrived at the same commit.

I'm really puzzled.
Comment 4 Jérôme Glisse 2010-05-14 08:58:52 UTC
I don't think commit
797fd5b9dad12a100c81b5782573a41259728cb1
drm/radeon/kms: r300 fix CS checker to allow zbuffer-only fastfill

is the culprit. I will try to see if I can reproduce this issue.
Comment 5 Jérôme Glisse 2010-05-14 09:25:33 UTC
Can you test whether rc6 works?
Comment 6 cedric 2010-05-14 16:46:48 UTC
Tested, and it is not working.
Comment 7 Jérôme Glisse 2010-05-16 10:44:48 UTC
So to sum up, rc5 is working and rc6 isn't?
Comment 8 cedric 2010-05-16 12:10:15 UTC
Yes, that's it.
Comment 9 cedric 2010-05-17 09:53:08 UTC
Oh my... I'm very sorry, I must not have tested the right kernel :-s rc6 is OK.
(By the way, I also tested 2.6.34 final, and it still fails.)
Comment 10 Jérôme Glisse 2010-05-19 12:20:57 UTC
Does suspend work with the radeon.agpmode=-1 boot parameter?
Comment 11 Jérôme Glisse 2010-05-19 12:24:08 UTC
Also, please attach the full kernel log after resume.
Comment 12 cedric 2010-05-19 13:55:41 UTC
Created attachment 26436: dmesg after bad resume

The problem appears as soon as Xorg is started.
Comment 13 cedric 2010-05-19 13:56:11 UTC
With the boot parameter, resume works.
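
When testing a workaround like this, it helps to confirm that the parameter actually reached the running kernel. A minimal sketch, assuming the command line is read from /proc/cmdline; kernel_param is a hypothetical helper and the example command line is made up for illustration.

```python
def kernel_param(cmdline, name):
    """Return the value of name=value from a kernel command line, or None if absent."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None

# Hypothetical command line; on a real system, read it with
# open("/proc/cmdline").read() instead.
example = "root=/dev/sda1 ro quiet radeon.agpmode=-1"

print(kernel_param(example, "radeon.agpmode"))  # -1
```

A result of None means the parameter was not passed at boot, so a good resume in that case would say nothing about the workaround.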
Comment 14 Jérôme Glisse 2010-05-21 12:21:30 UTC
Created attachment 26484: release agp bridge at suspend

Please test whether the attached patch fixes the issue.
Comment 15 cedric 2010-05-21 14:03:41 UTC
Compiled, tested and working. Thanks!
Comment 16 Tormod Volden 2010-05-27 22:08:28 UTC
Thanks so much, this seems to have fixed the same issue on a Mobility X700. I also had the same symptoms on 2.6.33; would the patch be applicable there as well? 2.6.34 up to rc6 was working though, as reported here.
Comment 17 Alex Deucher 2010-05-27 22:16:17 UTC
The Mobility X700 is PCIe, so I suspect something different was the cause there.
Comment 18 Tormod Volden 2010-05-28 20:07:33 UTC
After looking at the code I think I can agree on that :) It got broken between 2.6.34 rc6 and rc7, and got fixed between master 20100523 and drm-next 20100524, but now that it works anyway I won't spend time on pinpointing it further. Thanks to whoever had their magic hands in there!
Comment 19 Martin Steigerwald 2010-05-31 13:42:17 UTC
I see this as well on my ThinkPad T42 with 2.6.34, while 2.6.33 works okay. I will try the boot parameter from comment #10 and, if that works, the patch from comment #14.

May 31 14:46:53 shambhala kernel: [drm] Initialized radeon 2.3.0 20080528 for 0000:01:00.0 on minor 0
May 31 14:52:42 shambhala kernel: [drm:radeon_agp_init] *ERROR* Unable to acquire AGP: -16
May 31 14:52:42 shambhala kernel: [drm] GPU reset succeed (RBBM_STATUS=0x00000140)
May 31 14:52:42 shambhala kernel: [drm] radeon: 1 quad pipes, 1 Z pipes initialized.
May 31 14:52:42 shambhala kernel: [drm] radeon: cp idle (0x10000C03)
May 31 14:52:42 shambhala kernel: [drm] radeon: ring at 0x00000000D0000000
May 31 14:52:42 shambhala kernel: [drm:r100_ring_test] *ERROR* radeon: ring test failed (sracth(0x15E4)=0xCAFEDEAD)
May 31 14:52:42 shambhala kernel: [drm:r100_cp_init] *ERROR* radeon: cp isn't working (-22).
May 31 14:52:42 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(4).
May 31 14:52:42 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:42 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(5).
May 31 14:52:42 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:42 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(6).
May 31 14:52:42 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:43 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(7).
May 31 14:52:43 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:43 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(8).
May 31 14:52:43 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:44 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(9).
May 31 14:52:44 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:44 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(10).
May 31 14:52:44 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:44 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(11).
May 31 14:52:44 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:44 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(12).
May 31 14:52:44 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:44 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(13).
May 31 14:52:44 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:44 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(14).
May 31 14:52:44 shambhala kernel: [drm:radeon_cs_ioctl] *ERROR* Faild to schedule IB !
May 31 14:52:45 shambhala kernel: [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
May 31 14:52:45 shambhala kernel: [drm:radeon_ib_schedule] *ERROR* radeon: couldn't schedule IB(15).

Hardware is:

shambhala:~> lspci -nn | grep VGA
01:00.0 VGA compatible controller [0300]: ATI Technologies Inc RV350 [Mobility Radeon 9600 M10] [1002:4e50]

Driver is:

martin@shambhala:~> apt-show-versions | grep xserver-xorg-video-radeon/
xserver-xorg-video-radeon/sid uptodate 1:6.13.0-2
Comment 20 Martin Steigerwald 2010-05-31 13:53:57 UTC
Kernel boot parameter radeon.agpmode=-1 seems to work okay. Now compiling a kernel with the patch from comment #14 applied. Thanks.
Comment 21 Martin Steigerwald 2010-05-31 14:43:29 UTC
Patch from comment #14 appears to work. I suggested it to the stable kernel team.

Thanks.
Comment 22 Rafael J. Wysocki 2010-06-13 11:54:08 UTC
Fixed by commit 10b06122afcc78468bd1d009633cb71e528acdc5.