Bug 58901

Summary: "trying to bind memory to uninitialized GART" error at resume from suspend to memory
Product: Drivers
Reporter: Christian Casteyde (casteyde.christian)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED UNREPRODUCIBLE    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.9.3
Subsystem:
Regression: Yes
Bisected commit-id:

Description Christian Casteyde 2013-05-28 11:49:25 UTC
Acer Aspire 7750G
Core i7-2630QM, 6 GB RAM
AMD Radeon HD6650M, no Intel graphics
Slackware64-current

Since kernel 3.9.x, my laptop fails to resume from suspend to memory, with X completely frozen and no way to switch off or restart the machine.

My rc scripts save dmesg at shutdown so I managed to get the following kernel logs:

usb 1-1.4: reset high-speed USB device number 4 using ehci-pci
PM: resume of devices complete after 1013.038 msecs
Restarting tasks ... done.
video LNXVIDEO:01: Restoring backlight state
ata1.00: configured for UDMA/133
ata1: EH complete
EXT4-fs (sda2): re-mounted. Opts: discard,commit=0
EXT4-fs (sda3): re-mounted. Opts: discard,commit=0
eth0: deauthenticated from XXXX (Reason: 6)
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
eth0: authenticate with XXXX
eth0: send auth to XXXX
eth0: authenticated
ath9k 0000:03:00.0 eth0: disabling HT as WMM/QoS is not supported by the AP
ath9k 0000:03:00.0 eth0: disabling VHT as WMM/QoS is not supported by the AP
eth0: associate with XXXX (try 1/3)
eth0: RX AssocResp from XXXX (capab=0x411 status=0 aid=2)
eth0: associated
------------[ cut here ]------------
WARNING: at drivers/gpu/drm/radeon/radeon_gart.c:280 radeon_gart_bind+0xe1/0xf0()
Hardware name: Aspire 7750G
trying to bind memory to uninitialized GART !
Modules linked in:
Pid: 2305, comm: X Not tainted 3.9.3 #6
Call Trace:
 [<ffffffff813a05a1>] ? radeon_gart_bind+0xe1/0xf0
 [<ffffffff8106b70b>] warn_slowpath_common+0x6b/0xa0
 [<ffffffff8106b787>] warn_slowpath_fmt+0x47/0x50
 [<ffffffff813a05a1>] radeon_gart_bind+0xe1/0xf0
 [<ffffffff8139df82>] radeon_ttm_backend_bind+0x32/0x90
 [<ffffffff8137ce47>] ttm_tt_bind+0x47/0x60
 [<ffffffff8137f06f>] ttm_bo_handle_move_mem+0x54f/0x5e0
 [<ffffffff8137f7a1>] ? ttm_bo_mem_space+0x161/0x340
 [<ffffffff8137fe9f>] ttm_bo_move_buffer+0x11f/0x140
 [<ffffffff8137ff52>] ttm_bo_validate+0x92/0x110
 [<ffffffff81380279>] ttm_bo_init+0x2a9/0x3c0
 [<ffffffff8139f2f6>] radeon_bo_create+0x176/0x1d0
 [<ffffffff8139efe0>] ? radeon_bo_clear_va+0x50/0x50
 [<ffffffff813b0a2b>] radeon_gem_object_create+0x9b/0x160
 [<ffffffff813b0e0b>] radeon_gem_create_ioctl+0x5b/0x130
 [<ffffffff81364ac1>] drm_ioctl+0x4d1/0x580
 [<ffffffff813b0db0>] ? radeon_gem_pwrite_ioctl+0x30/0x30
 [<ffffffff811288a5>] do_vfs_ioctl+0x2e5/0x4d0
 [<ffffffff81128ad0>] sys_ioctl+0x40/0x80
 [<ffffffff811186cc>] ? sys_read+0x6c/0x90
 [<ffffffff81763612>] system_call_fastpath+0x16/0x1b
---[ end trace 44b14b5d0d1cf7ab ]---
[drm:radeon_ttm_backend_bind] *ERROR* failed to bind 1175 pages at 0x0240D000
[drm:radeon_gem_object_create] *ERROR* Failed to allocate GEM object (4812800, 2, 4096, -22)
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff813860a3>] ttm_dma_populate+0x6a3/0x960
PGD 1c61c0067 PUD 1c4df8067 PMD 0 
Oops: 0002 [#1] PREEMPT SMP 
Modules linked in:
CPU 6 
Pid: 2305, comm: X Tainted: G        W    3.9.3 #6 Acer Aspire 7750G/JE70_HR
RIP: 0010:[<ffffffff813860a3>]  [<ffffffff813860a3>] ttm_dma_populate+0x6a3/0x960
RSP: 0018:ffff8801c4c6b9c0  EFLAGS: 00010093
RAX: ffff88019b74c100 RBX: 0000000000000202 RCX: ffff88019b74c180
RDX: 0000000000000000 RSI: ffff8801c7202928 RDI: ffff8801c7202914
RBP: ffff8801c4c6ba88 R08: 00000000000146c0 R09: ffff8801cf5946c0
R10: ffffea0007190a80 R11: ffff8801c8802600 R12: ffff88019b74c100
R13: ffff8801c642a320 R14: ffff8801c7202900 R15: 0000000000000004
FS:  00007fe093e7c8c0(0000) GS:ffff8801cf580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 00000001c619f000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process X (pid: 2305, threadinfo ffff8801c4c6a000, task ffff8801c712da20)
Stack:
 ffff8801c7202964 ffff8801c762c098 ffff8801c7202928 ffff8801c7202971
 ffff88019fb67558 ffff8801c699bc00 0000000000000000 00000000ffc0192a
 ffff88019fb67500 ffff8801c7202914 4000000000000004 ffff880100000004
Call Trace:
 [<ffffffff81129c5a>] ? do_select+0x5fa/0x670
 [<ffffffff8111269a>] ? kmem_cache_alloc+0x9a/0xa0
 [<ffffffff81112520>] ? __kmalloc+0xd0/0xe0
 [<ffffffff8139e917>] radeon_ttm_tt_populate+0x1c7/0x220
 [<ffffffff8139e04a>] ? radeon_ttm_tt_create+0x6a/0xb0
 [<ffffffff8137ce36>] ttm_tt_bind+0x36/0x60
 [<ffffffff8137f06f>] ttm_bo_handle_move_mem+0x54f/0x5e0
 [<ffffffff8137f7a1>] ? ttm_bo_mem_space+0x161/0x340
 [<ffffffff8137fe9f>] ttm_bo_move_buffer+0x11f/0x140
 [<ffffffff8137ff52>] ttm_bo_validate+0x92/0x110
 [<ffffffff81380279>] ttm_bo_init+0x2a9/0x3c0
 [<ffffffff8139f2f6>] radeon_bo_create+0x176/0x1d0
 [<ffffffff8139efe0>] ? radeon_bo_clear_va+0x50/0x50
 [<ffffffff813b0a2b>] radeon_gem_object_create+0x9b/0x160
 [<ffffffff813b0e0b>] radeon_gem_create_ioctl+0x5b/0x130
 [<ffffffff81364ac1>] drm_ioctl+0x4d1/0x580
 [<ffffffff813b0db0>] ? radeon_gem_pwrite_ioctl+0x30/0x30
 [<ffffffff811288a5>] do_vfs_ioctl+0x2e5/0x4d0
 [<ffffffff81128ad0>] sys_ioctl+0x40/0x80
 [<ffffffff811186cc>] ? sys_read+0x6c/0x90
 [<ffffffff81763612>] system_call_fastpath+0x16/0x1b
Code: 00 48 8d 75 b0 48 89 c3 48 8b 45 b0 48 39 f0 74 1e 49 8b 56 28 48 8b 4d b8 48 8b b5 48 ff ff ff 48 89 70 08 49 89 46 28 48 89 11 <48> 89 4a 08 8b 45 90 49 83 46 58 01 41 01 46 44 85 c0 0f 85 5d
RIP  [<ffffffff813860a3>] ttm_dma_populate+0x6a3/0x960
 RSP <ffff8801c4c6b9c0>
CR2: 0000000000000008
---[ end trace 44b14b5d0d1cf7ac ]---
note: X[2305] exited with preempt_count 1
ata1.00: configured for UDMA/133
ata1: EH complete
sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
EXT4-fs (sda2): re-mounted. Opts: discard,commit=600
EXT4-fs (sda3): re-mounted. Opts: discard,commit=600
...

I also got a kernel crash once, but did not manage to capture the dmesg output.
3.8.x and earlier kernels were perfectly stable. 3.9 crashes in roughly 1 out of 10 resumes.
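
For anyone triaging saved dmesg files for this failure, a minimal sketch of a scanner is given here. It only looks for three message strings copied from the log in this report; the function name `scan_dmesg` and the pattern labels are illustrative assumptions, not part of any kernel or radeon tooling.

```python
import re

# Message strings copied from the log in this report.
# The dictionary keys are hypothetical labels chosen for this sketch.
PATTERNS = {
    "gart_warning": re.compile(r"trying to bind memory to uninitialized GART"),
    "bind_failure": re.compile(
        r"\[drm:radeon_ttm_backend_bind\] \*ERROR\* failed to bind \d+ pages"
    ),
    "null_deref": re.compile(
        r"BUG: unable to handle kernel NULL pointer dereference"
    ),
}


def scan_dmesg(text):
    """Return {label: [1-based line numbers]} for every pattern above."""
    hits = {label: [] for label in PATTERNS}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[label].append(lineno)
    return hits


# Small excerpt of the log above, used only to exercise the function.
sample = """eth0: associated
trying to bind memory to uninitialized GART !
[drm:radeon_ttm_backend_bind] *ERROR* failed to bind 1175 pages at 0x0240D000
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
"""

for label, lines in scan_dmesg(sample).items():
    print(label, lines)
```

Running it over each dmesg file saved at shutdown would show which resumes hit the GART warning alone and which also went on to the NULL pointer dereference.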
Comment 1 Michel Dänzer 2013-05-28 16:31:21 UTC
Can you bisect between 3.8.x and 3.9.x?
Comment 2 Christian Casteyde 2013-05-30 20:47:54 UTC
No, I don't think so, because the problem is too difficult to reproduce at the moment.
I wonder whether suspending while playing a video helps trigger it.
It hasn't appeared at all this week, for instance (the warning I reported was from last week).
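
To put a rough number on why bisecting an intermittent failure like this is expensive, the sketch below estimates how many clean resumes are needed before a kernel build can be called "good" with a given confidence. It assumes each resume fails independently with the same probability, taking the roughly 1-in-10 rate from the description; that independence is an assumption, and the function name `resumes_needed` is hypothetical.

```python
import math


def resumes_needed(failure_rate, confidence=0.95):
    """Smallest n such that (1 - failure_rate) ** n <= 1 - confidence.

    Assumes every resume fails independently with probability failure_rate,
    which is a simplification and may not hold for this bug.
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


# Failure rate of about 1 in 10 resumes, as reported in the description.
print(resumes_needed(0.1))        # 95% confidence
print(resumes_needed(0.1, 0.99))  # 99% confidence
```

Under these assumptions, every bisect step that lands on a good kernel needs a few dozen suspend/resume cycles before it can be marked good, and a lower real failure rate raises that count further.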
Comment 3 Christian Casteyde 2013-07-23 14:11:03 UTC
I didn't manage to reproduce the problem, closing.