Bug 12441

Summary: Xorg can't use dri on radeon X1950 AGP
Product: Drivers
Reporter: Daniel Vetter (daniel)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED CODE_FIX
Severity: normal
CC: 1i5t5.duncan, mmvinni, rjw, venki, wrar
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.29-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 12398
Attachments: dmesg of 2.6.29-rc1
Xorg.log when run on 2.6.29-rc1
two approach to fix
dmesg after a fix by Dave Airlie
Exact patch I created for other testers

Description Daniel Vetter 2009-01-13 05:20:30 UTC
Latest working kernel version: 2.6.28-rc7
Earliest failing kernel version: 2.6.29-rc1
Distribution: Debian unstable
Hardware Environment: amd64, radeon x1950 in an amd81xx AGP port
Software Environment: Xorg 1.5 + latest radeon driver from debian experimental
Problem Description:
Xorg can no longer enable drm with the new kernel, which disables nearly all hardware-accelerated rendering (kde4 is dog slow). There are some messages in dmesg and Xorg.log; I'll attach them.
Comment 1 Daniel Vetter 2009-01-13 05:21:50 UTC
Created attachment 19766 [details]
dmesg of 2.6.29-rc1
Comment 2 Daniel Vetter 2009-01-13 05:22:17 UTC
Created attachment 19767 [details]
Xorg.log when run on 2.6.29-rc1
Comment 3 Daniel Vetter 2009-01-14 06:48:34 UTC
Tested with v2.6.29-rc1-227-gdf0c6c3 (because of the PAT related merge).
The problem persisted, but the dmesg changed. radeon related parts:

[  104.396902] pci 0000:05:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[  104.540411] [drm] Initialized drm 1.1.0 20060810
[  104.622604] [drm] Initialized radeon 1.29.0 20080528 on minor 0
[  105.215427] agpgart-amd64 0000:04:00.0: AGP 3.0 bridge
[  105.221140] agpgart-amd64 0000:04:00.0: putting AGP V3 device into 4x mode
[  105.228423] pci 0000:05:00.0: putting AGP V3 device into 4x mode
[  105.297109] [drm:radeon_do_init_cp] *ERROR* could not find ioremap agp regions!
[  164.170472] warning: `pulseaudio' uses 32-bit capabilities (legacy support in use)
[  313.021867] agpgart-amd64 0000:04:00.0: AGP 3.0 bridge
[  313.027334] agpgart-amd64 0000:04:00.0: putting AGP V3 device into 4x mode
[  313.034560] pci 0000:05:00.0: putting AGP V3 device into 4x mode
[  313.061972] [drm:radeon_do_init_cp] *ERROR* could not find ioremap agp regions!
[  332.323294] __ratelimit: 4 callbacks suppressed

-Daniel
Comment 4 Daniel Vetter 2009-01-15 05:59:35 UTC
It looks like

http://bugzilla.kernel.org/show_bug.cgi?id=12417

is related. The dmesg output there is similar to what I get with -rc1, but with the intel drm. Venki, should I give your patch at http://marc.info/?l=linux-kernel&m=123129207628297&w=4 a spin?
Comment 5 Daniel Vetter 2009-01-15 08:40:58 UTC
On Thu, Jan 15, 2009 at 05:59:36AM -0800, bugme-daemon@bugzilla.kernel.org wrote:
> It looks like
> 
> http://bugzilla.kernel.org/show_bug.cgi?id=12417
> 
> is related. Similar dmesg outputs like I have in rc-1, but with the intel
> drm.
> Venki, should I give your patch at
> http://marc.info/?l=linux-kernel&m=123129207628297&w=4 a spin?

I've just tested your patch on top of 2.6.29-rc1. It doesn't fix the problem. Tail of
my dmesg:

[   81.859519] pci 0000:05:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[   82.027573] [drm] Initialized drm 1.1.0 20060810
[   82.082027] [drm] Initialized radeon 1.29.0 20080528 on minor 0
[   82.656113] agpgart-amd64 0000:04:00.0: AGP 3.0 bridge
[   82.661827] agpgart-amd64 0000:04:00.0: putting AGP V3 device into 4x mode
[   82.669188] pci 0000:05:00.0: putting AGP V3 device into 4x mode
[   82.686784] Xorg:3993 map pfn expected mapping type write-back for c0000000-c0101000, got write-combining
[   82.697614] Xorg:3993 freeing invalid memtype c0000000-c0101000

I then realized that I already tested your merged fix (see comment #3 in
the bugzilla), so I seem to be hitting something slightly different.

-Daniel
Comment 6 Rafael J. Wysocki 2009-01-25 10:51:44 UTC
On Sunday 25 January 2009, Daniel Vetter wrote:
> On Mon, Jan 19, 2009 at 10:32:13PM +0100, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.28.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=12441
> > Subject             : Xorg can't use dri on radeon X1950 AGP
> > Submitter   : Daniel Vetter <daniel@ffwll.ch>
> > Date                : 2009-01-13 05:20 (7 days old)
> 
> Just tested with v2.6.29-rc2-13-gf3b8436 and the problem still exists.
Comment 7 Duncan 2009-01-27 21:47:15 UTC
I'm seeing this too, after updating to git commit 5ee810072175042775e39bdd3eaaa68884c27805 today (my first 2.6.29 trial).

Gentoo ~amd64 (OP was Debian unstable x86_64), AMD 8xxx chipset (same as OP), Radeon 9200SE AGP card (older than OP's X1950), radeonfb (OP unknown), xorg-server 1.5.3 (gentoo's -r1), xf86-video-ati 6.10.0, radeon driver, kernel radeon DRM.  OP mentioned KDE4; I'm still running KDE 3.5.10 (with 4.1.x merged but not really usable; I will be upgrading to 4.2.0 after I reboot back to a 2.6.28.x kernel).  Of course KDE 3.5 is slow too (80-ish FPS in glxgears, normally 500-ish; 2D window operations were slow too).

I'm running reiserfs so have to be careful bisecting .29, which I haven't tried yet.  (That's also why I don't try until after -rc2 or so.)

From dmesg:

mtrr: no more MTRRs available
mtrr: no more MTRRs available
mtrr: no more MTRRs available
mtrr: no more MTRRs available
mtrr: no more MTRRs available
agpgart-amd64 0000:04:00.0: AGP 3.0 bridge
agpgart-amd64 0000:04:00.0: putting AGP V3 device into 4x mode
radeonfb 0000:05:00.0: putting AGP V3 device into 4x mode
[drm:radeon_do_init_cp] *ERROR* could not find ioremap agp regions!
X:2384 conflicting memory types a0000000-a8000000 write-combining<->uncached-minus
reserve_memtype failed 0xa0000000-0xa8000000, track write-combining, req write-combining
X:2384 conflicting memory types a0000000-a8000000 write-combining<->uncached-minus
reserve_memtype failed 0xa0000000-0xa8000000, track write-combining, req write-combining
Machine check events logged
Machine check events logged

I don't know whether the MCEs refer to the above or not, but I wasn't seeing them on previous kernels.

From Xorg.0.log:

(II) AIGLX: Screen 0 is not DRI capable
(II) AIGLX: Loaded and initialized /usr/lib64/dri/swrast_dri.so
(II) GLX: Initialized DRISWRAST GL provider for screen 0

If you need the full logs and/or kernel config posted, ask.  I can try to (carefully, reiserfs) do a bisect, but may decide to merge and play around a bit with KDE 4.2 first.  Again, ask if you find yourself needing it to better isolate the problem and I've not gotten to it yet.

Duncan
Comment 8 Shunichi Fuji 2009-01-28 02:39:31 UTC
Created attachment 20028 [details]
two approach to fix

>[  313.061972] [drm:radeon_do_init_cp] *ERROR* could not find ioremap agp
This failure happens in ioremap() because of a memory type mismatch.
On x86/x86-64, ioremap() maps to ioremap_nocache() by default.

>Xorg:3993 map pfn expected mapping type write-back for
>c0000000-c0101000, got write-combining
This indicates an attempt to map the region as write-back (=nocache),
while the region is already write-combining, as explained above.

When mmap() reaches io_remap_pfn_range() through drm_mmap_locked() in drm_vm.c,
vma->vm_page_prot=0, which means the mapping uses the region's original protection.
Thus, on most systems with an AGP MTRR, it ends up set to write-combining.

There are two approaches:
In drm_mmap_locked(), use a nocache mapping for AGP too,
or
in drm_core_ioremap(), try to use ioremap_wc().
(I think ioremap_default() behaviour would be best, but it is not exported yet and it exists only on x86.)


BTW, I still get
>[10503.406254] glxinfo:5854 freeing invalid memtype e0102000-e0112000
when some apps that use drm exit.

With my debug prints, strangely, only the first mapping (cp_ring) succeeds in ioremap().
>[drm:radeon_do_init_cp] *ERROR* cp_ring:f8900000, ring_rptr:(null), agp_buffer_map:(null)
This is because PAT already has the UC- type for a few GART regions immediately after boot.

There may be something wrong in the PAT and MTRR interactions, plus AGP madness.
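
The kind of conflict described in this comment can be illustrated with a small userspace sketch. This is not kernel code and not how the real PAT tracking works internally: it is a toy table that rejects a request whose range overlaps an already tracked range with a different cache type. All names here (toy_reserve_memtype, toy_reset, the TOY_* types) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cache attributes, loosely named after the types in the dmesg output. */
enum toy_memtype { TOY_WB, TOY_WC, TOY_UC_MINUS };

struct toy_region {
    uint64_t start, end;        /* half-open range [start, end) */
    enum toy_memtype type;
};

#define TOY_MAX_REGIONS 16
static struct toy_region toy_table[TOY_MAX_REGIONS];
static size_t toy_count;

/* Forget all tracked ranges (useful between experiments). */
void toy_reset(void)
{
    toy_count = 0;
}

/*
 * Record [start, end) with the requested type.
 * Returns 0 on success, or -1 if the range overlaps an already tracked
 * range of a different type, or if the toy table is full.
 * Simplification: the real kernel logic is more nuanced than this.
 */
int toy_reserve_memtype(uint64_t start, uint64_t end, enum toy_memtype req)
{
    assert(start < end);
    for (size_t i = 0; i < toy_count; i++) {
        const struct toy_region *r = &toy_table[i];
        int overlaps = start < r->end && r->start < end;
        if (overlaps && r->type != req)
            return -1;          /* conflicting memory types */
    }
    if (toy_count == TOY_MAX_REGIONS)
        return -1;
    toy_table[toy_count++] = (struct toy_region){ start, end, req };
    return 0;
}
```

In this toy model, tracking a0000000-a8000000 as TOY_UC_MINUS and then requesting the same range as TOY_WC fails, which is roughly the shape of the "conflicting memory types ... write-combining<->uncached-minus" messages seen in this report.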
Comment 9 Mikko Vinni 2009-01-28 04:34:58 UTC
Created attachment 20029 [details]
dmesg after a fix by Dave Airlie

I had a similar problem ([drm:radeon_do_init_cp] *ERROR* could not find ioremap agp regions!) with slightly different hardware (ATI IGP320/M). The following fix by Dave Airlie made 3D acceleration work again on this laptop; perhaps it will help on your computers as well:

diff --git a/drivers/gpu/drm/drm_memory.c b/drivers/gpu/drm/drm_memory.c
index 803bc9e..bcc869b 100644
--- a/drivers/gpu/drm/drm_memory.c
+++ b/drivers/gpu/drm/drm_memory.c
@@ -171,9 +171,14 @@ EXPORT_SYMBOL(drm_core_ioremap);

void drm_core_ioremap_wc(struct drm_map *map, struct drm_device *dev)
{
-    map->handle = ioremap_wc(map->offset, map->size);
+    if (drm_core_has_AGP(dev) &&
+        dev->agp && dev->agp->cant_use_aperture && map->type == _DRM_AGP)
+        map->handle = agp_remap(map->offset, map->size, dev);
+    else
+        map->handle = ioremap_wc(map->offset, map->size);
}
EXPORT_SYMBOL(drm_core_ioremap_wc);
+
void drm_core_ioremapfree(struct drm_map *map, struct drm_device *dev)
{
    if (!map->handle || !map->size)
diff --git a/drivers/gpu/drm/radeon/radeon_cp.c b/drivers/gpu/drm/radeon/radeon_cp.c
index 63212d7..df4cf97 100644
--- a/drivers/gpu/drm/radeon/radeon_cp.c
+++ b/drivers/gpu/drm/radeon/radeon_cp.c
@@ -1039,9 +1039,9 @@ static int radeon_do_init_cp(struct drm_device *dev, drm_radeon_init_t *init,

#if __OS_HAS_AGP
    if (dev_priv->flags & RADEON_IS_AGP) {
-        drm_core_ioremap(dev_priv->cp_ring, dev);
-        drm_core_ioremap(dev_priv->ring_rptr, dev);
-        drm_core_ioremap(dev->agp_buffer_map, dev);
+        drm_core_ioremap_wc(dev_priv->cp_ring, dev);
+        drm_core_ioremap_wc(dev_priv->ring_rptr, dev);
+        drm_core_ioremap_wc(dev->agp_buffer_map, dev);
        if (!dev_priv->cp_ring->handle ||
            !dev_priv->ring_rptr->handle ||
            !dev->agp_buffer_map->handle) {

---

More information here: http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg37874.html
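
For readers skimming the patch, the branch it adds to drm_core_ioremap_wc() can be restated as a tiny standalone sketch. The types and names below (sketch_dev, sketch_pick_remap, the SKETCH_* enums) are hypothetical stand-ins, not kernel definitions; the sketch only reports which remap path would be chosen and does not map anything.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures touched by the patch. */
enum sketch_map_type { SKETCH_MAP_REGISTERS, SKETCH_MAP_AGP };
enum sketch_remap_path { SKETCH_USE_AGP_REMAP, SKETCH_USE_IOREMAP_WC };

struct sketch_dev {
    bool has_agp;            /* models drm_core_has_AGP(dev) && dev->agp */
    bool cant_use_aperture;  /* models dev->agp->cant_use_aperture */
};

/* Mirrors the if/else that the patch adds to drm_core_ioremap_wc(). */
enum sketch_remap_path sketch_pick_remap(const struct sketch_dev *dev,
                                         enum sketch_map_type type)
{
    assert(dev);
    if (dev->has_agp && dev->cant_use_aperture && type == SKETCH_MAP_AGP)
        return SKETCH_USE_AGP_REMAP;
    /* Every other case takes the write-combining ioremap path. */
    return SKETCH_USE_IOREMAP_WC;
}
```

Read this way, the second hunk of the patch simply routes the three radeon AGP mappings (cp_ring, ring_rptr, agp_buffer_map) through this write-combining variant instead of drm_core_ioremap().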
Comment 10 Daniel Vetter 2009-01-28 11:01:20 UTC
On Wed, Jan 28, 2009 at 04:34:59AM -0800, bugme-daemon@bugzilla.kernel.org wrote:
> I had a similar problem ([drm:radeon_do_init_cp] *ERROR* could not find
> ioremap agp regions!) with a bit different hardware (Ati IGP320/M). The
> following fix by Dave Airlie made 3D acceleration work again on this laptop,
> perhaps it might have an effect on your computers as well:

I can confirm that these changes indeed fix the problem (I recreated the
patch 'cause it was whitespace-mangled).

Comment 11 Daniel Vetter 2009-01-28 11:02:43 UTC
Created attachment 20032 [details]
Exact patch I created for other testers
Comment 12 Mikko Vinni 2009-01-28 12:24:59 UTC

> I can confirm that these changes indeed fix the problem (I recreated the
> patch 'cause it was whitespace-mangled).

Yeah, sorry about that. Apparently Yahoo thinks people aren't satisfied with a plain '\n', and I didn't think to copy the corrected version.
Comment 13 Dmitry Kazakov 2009-01-30 10:27:53 UTC
I had this problem with a Radeon 8500 (AGP). This patch really solves it.

Tested on the latest commit:
f2257b70b0f9b2fe8f2afd83fc6798dca75930b8
Comment 14 Rafael J. Wysocki 2009-02-07 06:38:25 UTC
Patch : http://bugzilla.kernel.org/attachment.cgi?id=20032&action=view
Handled-By : Dave Airlie <airlied@linux.ie>
Comment 15 Duncan 2009-02-10 06:04:52 UTC
As of git describe 2.6.29-rc4-58-g4c098bc, it seems to work again, here anyway.  I still have the following in syslog, but it still works. =:^)

Feb 10 06:46:30 h2 mtrr: no more MTRRs available
Feb 10 06:46:30 h2 mtrr: no more MTRRs available
Feb 10 06:46:30 h2 mtrr: no more MTRRs available
Feb 10 06:46:30 h2 mtrr: no more MTRRs available
Feb 10 06:46:30 h2 mtrr: no more MTRRs available
Feb 10 06:46:30 h2 agpgart-amd64 0000:04:00.0: AGP 3.0 bridge
Feb 10 06:46:30 h2 agpgart-amd64 0000:04:00.0: putting AGP V3 device into 4x mode
Feb 10 06:46:30 h2 radeonfb 0000:05:00.0: putting AGP V3 device into 4x mode
Feb 10 06:46:30 h2 [drm] Setting GART location based on new memory map
Feb 10 06:46:30 h2 [drm] Loading R200 Microcode
Feb 10 06:46:30 h2 [drm] writeback test succeeded in 1 usecs
Feb 10 06:46:31 h2 mtrr: no more MTRRs available
Feb 10 06:46:31 h2 mtrr: no more MTRRs available
Feb 10 06:46:32 h2 X:2393 conflicting memory types a0000000-a8000000 write-combining<->uncached-minus
Feb 10 06:46:32 h2 reserve_memtype failed 0xa0000000-0xa8000000, track write-combining, req write-combining
Feb 10 06:46:32 h2 X:2393 conflicting memory types a0000000-a8000000 write-combining<->uncached-minus
Feb 10 06:46:32 h2 reserve_memtype failed 0xa0000000-0xa8000000, track write-combining, req write-combining
Feb 10 06:50:01 h2 Machine check events logged
Comment 16 Daniel Vetter 2009-02-10 09:04:16 UTC
> ------- Comment #15 from 1i5t5.duncan@cox.net  2009-02-10 06:04 -------
> As of git describe 2.6.29-rc4-58-g4c098bc, it seems to work again, here
> anyway.
>  I still have the following in syslog, but it still works. =:^)

I just tested v2.6.29-rc4-58-g4c098bc, and it works flawlessly here, without
any strange output in dmesg. I'm going to close this bz entry now, since I
suspect your problem is some other PAT/caching-type problem (the kernel
seems to have gotten way more picky recently wrt stuff like that).

-Daniel

Comment 17 Duncan 2009-02-10 21:32:58 UTC
(In reply to comment #16)
> > Comment #15
> > As of git describe 2.6.29-rc4-58-g4c098bc,
> > it seems to work again, here anyway.
> > I still have the following in syslog,
> > but it still works. =:^)
> 
> I just tested v2.6.29-rc4-58-g4c098bc,
> and it works flawless here, without any
> strange output in dmesg. I'm gonna close
> this bz entry now for I suspect your problem
> is some other PAT/caching-type problem
> (the kernel seems to have gotten way more
> picky recently wrt stuff like that).

Agreed.  The logged stuff must be a different bug.  

Thanks! =:^)