Description Sebastian Volke 2006-04-07 06:56:37 UTC
Most recent kernel where this bug did not occur: ?
Distribution: Gentoo Linux
Hardware Environment: AMD Sempron64 3000+, NForce3 250 Mainboard (ASROCK K8UpgradeNF3), ATI Radeon 9550 (256MB)
Problem Description: After boot, I don't have hardware acceleration, because my graphics driver can't acquire AGP, since the aperture size is unknown. Relevant dmesg output:
...
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
agpgart: aperture base > 4G
...
At this point, kernel 2.6.15 says:
...
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
...
I think that my BIOS causes this problem by not setting up the north bridge, but I can't really tell. :-( But the fact is, when I boot Windows and then warm boot into Linux, the aperture size is recognized and DRI works. dmesg then says the following:
...
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
agpgart: AGP aperture is 256M @ 0xe0000000
...
This output is the same with every kernel version from 2.6.12 to 2.6.16, and seems to be the "normal" output, i.e. what you get when everything works fine. ;-)
I noticed the problem because ATI's fglrx was not loading with kernels 2.6.12 and 2.6.15 (also tried 2.6.14). --> tainted kernel (sorry)
But then I was told to test it with 2.6.16 (see http://bugs.gentoo.org/show_bug.cgi?id=126571) and I tested it with the pure kernel (only a few Gentoo-specific patches), without loading a single binary module into the kernel (that would be nvsound and fglrx in my case). So an untainted 2.6.16 kernel gives the error, too.
Steps to reproduce: None; I hope it's the same with every system that matches my hardware environment. ;-)
Comment 1 Sebastian Volke 2006-04-07 06:57:50 UTC
Created attachment 7802
This is the whole dmesg output of a cold boot with linux --> agp failure
Comment 2 Sebastian Volke 2006-04-07 06:58:23 UTC
Created attachment 7803
This is the dmesg output of a warm boot from windows --> working agp
Comment 3 Sebastian Volke 2006-04-07 06:59:04 UTC
Created attachment 7804
The kernel config I tested with
Comment 4 Daniel Drake 2006-04-10 07:51:43 UTC
This is the error that is appearing on cold boot:
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
agpgart: aperture base > 4G
(just pasted from an above attachment to aid others who may be searching)
Comment 5 Sebastian Volke 2006-04-14 00:16:07 UTC
After reading the different kernel source files (especially agpgart.c, amd64-agp.c, etc.) I tried a kernel with IOMMU support, which includes an agpgart. After trying this, I'm sure that it is a BIOS problem. Relevant dmesg:
...
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
Checking aperture...
CPU 0: aperture @ d0e0000000 size 256 MB
Aperture from northbridge cpu 0 beyond 4GB. Ignoring.
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 128 MB of RAM
Mapping aperture over 131072 KB of RAM @ 8000000
Built 1 zonelists
Kernel command line: root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/hda1 udev vga=795 iommu=memaper=2
Initializing CPU#0
...
agpgart: Detected AGP bridge 0
agpgart: Aperture conflicts with PCI mapping.
agpgart: Aperture from AGP @ e0000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
agpgart: Consider rebooting with iommu=memaper=2 to get a good aperture.
PCI-DMA: Disabling IOMMU.
...
So obviously my BIOS won't work with the kernel (or the other way round). But since Windows has no problems, I would still treat it as a bug.
Comment 6 wicker 2006-04-19 06:25:53 UTC
I observe identical behaviour on ASRock K8Upgrade-NF3 mainboard with latest BIOS (P1.60) and ATI Radeon 9000 PRO, with two additional comments:
* The above behaviour appears at the first boot after turning the computer on - after reboot, the symptoms may be different.
* The aperture address may be different on different boots, e.g. 38f8000000, or 30f8000000.
I reported the bug to ASRock, and received the following (not particularly helpful) answer: "Dear Sir, I am very sorry. We don
Comment 7 Sebastian Volke 2006-04-19 08:20:25 UTC
Yeah, asrock seems to be predestined for _very_ useful answers (I have read a few similar answers). Thank you for your research. Did you experience the problem with agpgart enabled or with IOMMU? I currently run kernel 2.6.15 with IOMMU enabled and get the same output as I posted above at every boot, regardless of whether it is a cold or warm boot. So it is, in my opinion, a better solution, because it tries to set up my northbridge on every boot. Pity that it fails at it, so I have to do without any hardware acceleration. I'll try the newest BIOS from Asrock, which is currently v1.7 (I found it on the global, not the US-only site). Hope that it works a little better ;-)
Comment 8 Sebastian Volke 2006-04-19 10:44:32 UTC
At last I found a workaround. Adding iommu=force to the kernel parameters sets up the northbridge correctly on cold boot, so that after a reboot I am able to use my AGP aperture and hence also hardware acceleration. I'll post a few dmesg outputs later on. (Btw. I failed to update my BIOS, since I can't activate Windows and the provided program won't run in SAFE MODE --> sucking closed source ;-))
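To check which iommu= options a given boot actually used, the "Kernel command line:" entry in dmesg can be split up. A minimal Python sketch; the helper name iommu_options is made up for illustration, and the sample string is the command line quoted in comment 5. It assumes that several iommu options are given as a comma-separated list (e.g. iommu=force,memaper=2).

```python
def iommu_options(cmdline):
    """Return the values of every iommu= parameter on a kernel command line."""
    options = []
    for word in cmdline.split():
        if word.startswith("iommu="):
            # Assumption: multiple options are comma-separated,
            # e.g. iommu=force,memaper=2
            options.extend(word[len("iommu="):].split(","))
    return options


# Command line as quoted in the dmesg excerpt of comment 5.
cmdline = ("root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/hda1 "
           "udev vga=795 iommu=memaper=2")
print(iommu_options(cmdline))
```

An empty list means the boot ran without any iommu= option, so the workaround was not active.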
Comment 9 Sebastian Volke 2006-04-20 00:29:51 UTC
Now the promised dmesg outputs. I booted with kernel parameter iommu=force. On cold boot I got the following:
...
Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
Checking aperture...
CPU 0: aperture @ d4e0000000 size 256 MB
Aperture from northbridge cpu 0 beyond 4GB. Ignoring.
AGP bridge at 00:00:00
Aperture from AGP @ e0000000 size 4096 MB (APSIZE 0)
Aperture from AGP bridge too small (0 MB)
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
Built 1 zonelists
...
agpgart: Detected AGP bridge 0
agpgart: Aperture conflicts with PCI mapping.
agpgart: Aperture from AGP @ e0000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
agpgart: Consider rebooting with iommu=memaper=2 to get a good aperture.
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 4000000 size 65536 KB
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
...
which results in ENODEV. When I rebooted (without having touched my Windows installation, every boot Linux only), dmesg said the following:
...
Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
Checking aperture...
CPU 0: aperture @ e0000000 size 256 MB
Built 1 zonelists
...
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
agpgart: AGP aperture is 256M @ 0xe0000000
PCI-DMA: Reserving 128MB of IOMMU area in the AGP aperture
...
(I'll attach the complete dmesg outputs)
My thought is that on first boot, IOMMU sets up my bridges so that on the next boot everything works fine. Of course it would be better to have it working on first boot, but getting hardware acceleration without booting Windows is a very good thing ;-)
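In the cold-boot log of this comment the AGP bridge aperture is reported as 4096 MB and then rejected as 0 MB, which looks contradictory. One possible explanation, which is an assumption and not confirmed anywhere in this report, is that an APSIZE of 0 is decoded as 32 MB shifted left by an order of 7, and that the resulting byte count is then kept in a 32-bit variable, where 4096 MB (2^32 bytes) wraps around to 0. The arithmetic in Python, with aperture_bytes as a made-up helper name:

```python
MB = 1 << 20


def aperture_bytes(order, width=32):
    """Aperture size in bytes for a given order, truncated to `width` bits.

    Assumption: size = 32 MB << order, stored in a `width`-bit integer.
    """
    size = (32 * MB) << order
    return size & ((1 << width) - 1)


print(((32 * MB) << 7) // MB)    # size in MB for order 7, no truncation
print(aperture_bytes(7) // MB)   # the same size after truncation to 32 bits
print(aperture_bytes(3) // MB)   # order 3 for comparison
```

Under that assumption, order 3 corresponds to the 256 MB aperture seen on the good boot, while order 7 overflows and ends up looking like an empty aperture.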
Comment 10 Sebastian Volke 2006-04-20 00:30:44 UTC
Created attachment 7908
dmesg on cold boot (IOMMU)
Comment 11 Sebastian Volke 2006-04-20 00:31:11 UTC
Created attachment 7909
dmesg on warm boot (IOMMU)
Comment 12 Sebastian Volke 2006-04-20 00:31:55 UTC
Created attachment 7910
kernel config I used to test the IOMMU thing (2.6.15)
Comment 13 Peter Dahlberg 2006-05-07 04:39:32 UTC
I have exactly the same problem. My hardware:
AMD Athlon 64 3000+
ATI Technologies Inc M9+ 5C63 [Radeon Mobility 9200 (AGP)] (rev 01)
Asus K8N (newest BIOS)
I hope that the workaround helps. I don't want to boot Windows to get direct rendering working.
Comment 14 Sebastian Volke 2006-07-11 08:59:55 UTC
Just to bring some fresh air in here ;-) Has anyone made progress on this topic? What are your iommu options on the kernel command line? As said before, using iommu=force gives some warning messages on the first boot with Linux and fails to set the aperture (but writes correct values into the northbridge, if I understood that right ??? I'm a noob in hardware topics). At the next boot with Linux the aperture is set up correctly and DRI works fine. Interestingly, it doesn't matter whether I boot Windows first or in between. I had the idea (careful, only a VERY dirty hack) to add a kernel option, iommu=twice, which forces the iommu-module to load twice (or at least loads the relevant parts twice), so that the aperture works on the very first boot. Unfortunately I have NO idea how to code this. Please help.
Comment 15 Mattias Holmlund 2006-07-11 11:33:53 UTC
I am running an AMD64 on an Asus K8N-E Deluxe motherboard. I have been suffering from what I think is this bug for several kernel versions. However, I read a tip on another site that claimed that the problem could be solved by downgrading the BIOS to version 1006. With BIOS version 1011, kernel 22.214.171.124 for i386, and dri from CVS 2006-07-03, my dmesg looked like this:
Linux agpgart interface v0.101 (c) Dave Jones
...
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
agpgart: aperture base > 4G
and then
Jul 10 11:36:42 localhost kernel: [drm] Initialized radeon 1.25.0 20060524 on minor 0:
Jul 10 11:37:01 localhost kernel: [drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held, held 0 owner 00000000 ec287ec0
Jul 10 11:37:01 localhost kernel: [drm:drm_unlock] *ERROR* Process 24247 using kernel context 0
After a downgrade to BIOS 1006 with the same kernel, I now get
Linux agpgart interface v0.101 (c) Dave Jones
...
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
...
agpgart: AGP aperture is 256M @ 0xe0000000
...
[drm] Initialized drm 1.0.1 20051102
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 16
ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKE] -> GSI 16 (level, low) -> IRQ 225
[drm] Initialized radeon 1.25.0 20060524 on minor 0:
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[drm] Setting GART location based on new memory map
[drm] Loading R200 Microcode
[drm] writeback test succeeded in 1 usecs
So downgrading the BIOS solved it for me. I'll be happy to run more tests with another BIOS if you want me to.
Comment 16 Sebastian Volke 2006-07-12 07:13:53 UTC
Well, I would consider downgrading the BIOS a workaround. I really have NO clue how Redmond succeeds in getting a good aperture, but Windows has no problems. It is certainly a missing feature if Linux can't cope with these newer BIOSes too. For ASROCK: upgrading the BIOS to the newest version (now 2.0) doesn't help. But I'm still sure that somehow it has to be possible to get it working completely. Btw: I tested kernel-2.6.18-rc1 but it didn't work, though many changes have been applied to the IOMMU-module.
Comment 17 j. ortega 2006-10-01 11:50:37 UTC
As stated by an Nvidia employee at: http://www.nvnews.net/vbulletin/showthread.php?t=56714 Windows doesn't rely on the BIOS to get correct information for the board. Probably the BIOS was purposely designed to be Windows compliant only, but not Linux compliant, as all of my attempts to get support from ASUS ended in a "closed" or "locked" bug report. The only known workaround for this behavior is to downgrade the BIOS, where possible, to a version that has Linux-friendly behavior: for the K8N-E Deluxe that is BIOS 1005, for the K8N it is BIOS 1006, and there's no BIOS replacement for the K8N-E (this board can be considered "for Windows only"). I think I will change the motherboard of my PC :). Regards J.
Comment 18 Sebastian Volke 2006-10-02 06:54:58 UTC
Thank you for this information. I've tried almost every combination of kernels (2.6.15 to 2.6.18) with differing BIOS versions from 1.7 to 2.1. It's always the same. As you stated, the easiest solution would be to buy a new board. (Btw: do you have a suggestion, J.?) But the easiest is not always the best solution. Why can
Comment 19 j. ortega 2006-10-02 18:05:18 UTC
Suggestions??? I bought a VIA chipset based board and I'm quite happy with it. The board is an MSI K8T Neo and seems to be fully Linux compliant, as everything is recognized when it should be (I use archlinux, so I must do some things manually). About the K8*** problem: probably finding the right values for the BIOS, and patching the agpgart or amd64-agp modules with these hardcoded values, can be a solution for the problem.... Hacky and ugly, sure, but for this case, reliable. At least IMHO. Good Luck. J. A good read: http://www.kroah.com/log/linux/ols_2006_keynote.html
Comment 20 Sebastian Volke 2006-10-27 00:32:17 UTC
Hello, it's me again, and I've got some news on this issue. ;-) First, I stumbled over a curious value concerning the aperture configuration in the northbridge. Generally, my BIOS writes the correct aperture base (e0000000) into the AGP bridge, but not the correct size. The northbridge contains the correct size, but not the correct base. Almost: it contains d4e0000000. If you leave out the first two digits, it matches the correct aperture base. After rebooting, the first two digits are gone and the mapping works. Just to compare, here is the dmesg output of a cold boot and a warm boot again:

Cold boot:
Checking aperture...
CPU 0: aperture @ d4e0000000 size 256 MB
Aperture from northbridge cpu 0 beyond 4GB. Ignoring.
AGP bridge at 00:00:00
Aperture from AGP @ e0000000 size 4096 MB (APSIZE 0)
Aperture from AGP bridge too small (0 MB)
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 4000000
[...]
agpgart: Detected AGP bridge 0
agpgart: Aperture conflicts with PCI mapping.
agpgart: Aperture from AGP @ e0000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
agpgart: Consider rebooting with iommu=memaper=2 to get a good aperture.
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 4000000 size 65536 KB
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture

Warm boot:
Checking aperture...
CPU 0: aperture @ e0000000 size 256 MB
[...]
agpgart: Detected AGP bridge 0
agpgart: Setting up Nforce3 AGP.
agpgart: AGP aperture is 256M @ 0xe0000000
PCI-DMA: Reserving 128MB of IOMMU area in the AGP aperture

One more thing I noticed: if allocating the aperture fails because of bad values in the northbridge and the agp-bridge, the IOMMU maps the aperture with a certain size (64MB, depending on the boot options) at a certain base (4000000). See the dmesg of a cold boot for this. But it doesn't write these values to the agp-bridge, WHICH IS A BUG.
The agpgart will ignore the values of the northbridge, since they conflict with PCI mapping, which is OK, IMHO. Then it checks the agp-bridge and triggers an AGP_ENODEV, because it contains rubbish:
agpgart: Detected AGP bridge 0
agpgart: Aperture conflicts with PCI mapping.
agpgart: Aperture from AGP @ e0000000 size 4096 MB
agpgart: Aperture too small (0 MB)
agpgart: No usable aperture found.
If we solved that bug and the aperture values were written into the agp-bridge when mapping it over the RAM, agpgart would have a working aperture and everything would be fine, even with a screwed-up BIOS. I'm sorry for my ignorance, but I can't fix this myself. So please, whoever reads this, create a patch and post it here. The related files are: /path/to/kernel/source/arch/x86_64/kernel/pci-gart.c and /path/to/kernel/source/arch/x86_64/kernel/aperture.c. I hope these two are all that needs to be edited. Thanks for your help, if you fix this. ;-) Sebastian Volke
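The observation that a bad base matches a good one once the leading digits are dropped can be checked against the cold-boot addresses quoted in this report. A small Python sketch; the list only collects values from comments 5, 6, 9 and 20, and low32 is a helper name chosen for illustration:

```python
FOUR_GB = 1 << 32


def low32(addr):
    """Keep only the low 32 bits of an address."""
    return addr & (FOUR_GB - 1)


# Cold-boot aperture bases quoted in comments 5, 6, 9 and 20.
cold_boot_bases = [0xd0e0000000, 0xd4e0000000, 0x38f8000000, 0x30f8000000]

for base in cold_boot_bases:
    print(f"{base:#x}: beyond 4G = {base >= FOUR_GB}, "
          f"low 32 bits = {low32(base):#x}")
```

Every quoted value lies beyond 4G, and each one reduces to a plausible base below 4G (e0000000 or f8000000) when only the low 32 bits are kept.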
Comment 21 j. ortega 2006-11-05 08:45:08 UTC
I have set up another machine with the old K8N-E mobo and put Ubuntu on it. The 32-bit version ***really*** needs Windows to allow DRI acceleration with ATI cards (this time an old Radeon 9000 Pro), but the 64-bit version just requires the system to be booted twice (warm reboot) to get the correct aperture required by AGP. For the 32-bit version the solution could be to check whether there is a 64-bit processor on the buggy motherboard and apply the needed corrections... for the 64-bit kernel, the aperture.c algorithm would need to be modified to allow double initialization of the board to get correct values. From my viewpoint that should solve the problem (well... one can also wait for ASUS/Gigabyte/AMI to fix their buggy/M$-only BIOS, but I think that hell will freeze over before that happens). Is there any developer here who can help us??? (or just post a patch to allow us to work around the problem???) Regards. J.
Comment 22 Sebastian Volke 2006-11-05 09:00:52 UTC
I fully agree. I already sent a mail to ASROCK, but who would expect an answer? And of course, there was none. So, there are two possible solutions to the problem for amd64 systems:
1. Modify aperture.c, so that it initializes the board twice.
2. Modify aperture.c, so that it puts the forced aperture size and base into the AGP bridge, but leaves the northbridge alone.
An explanation for the latter can be found in Comment 20 (http://bugzilla.kernel.org/show_bug.cgi?id=6350#c20). Please help in fixing this. I don't have a clue about kernel hacking at the moment, but I'll need to learn some if none of the devs helps us here. (That shouldn't encourage the devs to not help us just so that I learn kernel hacking. ;-))
Comment 23 Sebastian Volke 2006-11-22 06:59:35 UTC
Created attachment 9591
Patch, that fixes the aperture issue on some 64-bit enabled K8 motherboards

Finally I did it. At least for my own box ;-) This patch has a simple task: it deletes some crap from the north bridge. I detected that my BIOS writes 0x4878 into the position of the aperture base in the northbridge, which results in an attempt to map the aperture at 0x90f0000000, which is far beyond 4G. After reboot, this specific value is 0x0078, so the aperture gets mapped at 0x00f0000000, which is all right. The patch now detects whether these unnecessary bits are in place and then clears them out by ANDing the value with 0x7F (bitwise). This clears every bit that would produce an aperture beyond 4G. 0x7F in this specific position in the northbridge is the limit, btw. It would result in an aperture mapped right at the 4G borderline. Please try this patch with your BIOSes and let me know if something breaks or if it doesn't work. Thank you.
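The numbers in this comment fit a register that stores the aperture base in units of 32 MB, i.e. base = value << 25. That layout is inferred here from the quoted values (0x4878 gives 0x90f0000000, 0x0078 gives 0xf0000000) and is not taken from the attached patch, so the following Python sketch only illustrates the masking idea; the function names are hypothetical.

```python
APERTURE_SHIFT = 25   # assumed: the register counts in 32 MB units
LOW_MASK = 0x7F       # keeps only the bits that stay below 4G


def aperture_base(reg):
    """Physical aperture base for a raw base-register value."""
    return reg << APERTURE_SHIFT


def fixed_reg(reg):
    """Clear every bit that would push the aperture base to 4G or beyond."""
    return reg & LOW_MASK


for reg in (0x4878, 0x0078):
    print(f"reg {reg:#06x}: base {aperture_base(reg):#x} -> "
          f"fixed base {aperture_base(fixed_reg(reg)):#x}")
```

With this layout, a register value of 0x7F maps to 0xfe000000, the last 32 MB slot below 4G, and 0x80 would already be exactly 4G, which matches the description of 0x7F as the limit.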
Comment 24 j. ortega 2006-11-22 19:10:51 UTC
The patch works in Ubuntu 6.10 and archlinux 0.7.2:
Ubuntu 6.10, kernel 126.96.36.199, patch needs to be applied manually and configured manually, but works (Radeon 9000 Pro, open-source driver works well and proprietary driver works... well, at least to the semifunctional extent this driver is capable of :).
Archlinux 0.7.2, custom vanilla kernel 188.8.131.52, patch also needed to be applied manually... (are you building this patch against a gentoo kernel???), anyway no big deal here, agpgart is happy with this patch, and fglrx loads ok.
Hardware tested:
- Asus K8N-E, BIOS 0411.
- Asus K8N with the latest BIOS (I'm not in front of the machine to write the number, but it was flashed today).
For the 32-bit compiled kernels, since aperture.c isn't compiled on these machines, the hardware is still uninitialized, but at least we have a solution now for 64-bit kernels... Yeeeaaahhhh!! Kudos and many thanks for the patch!!!! J.
Comment 25 Sebastian Volke 2006-11-23 06:17:50 UTC
Hi again ;-) I'm glad to hear it works not only on my machine. YEAHHH ;-) I copied /path/to/kernel/src/arch/x86_64/kernel/aperture.c into a working dir and did all the work and patching there. I'll do another one this weekend and patch against the kernel dir, so that it applies a little better. I haven't had a look at arch/i386/kernel/aperture.c, so it isn't very surprising that the patch doesn't work there. Maybe I'll go through it at the weekend, too.
Comment 26 Sebastian Volke 2006-11-25 04:59:33 UTC
Created attachment 9622
next version of the patch with better compatibility

Now this patch should work better. You can apply it by cd'ing into your kernel source dir and then running "patch -p1 < aperture_fix.patch". It only includes the fix for 64-bit kernels, since I wasn't able to track the 32-bit problem down to a kernel source file. I myself haven't used a 32-bit kernel on this computer, since I decided to always run in 64-bit mode. I searched the arch/i386 directory a little and came to the conclusion that the aperture base and so on are acquired using ACPI, and I have NO CLUE how to approach this. Maybe later...
Comment 27 Daniel Drake 2006-12-08 17:28:18 UTC
Please attach "lspci -xxx" output for your northbridge on both a bad and good boot (i.e. revert this patch first). If you can't identify the northbridge, just attach the entire output of "lspci -xxx"
Comment 28 Sebastian Volke 2006-12-11 09:50:59 UTC
Created attachment 9784
output of dmesg and lspci -xxx for cold/warm boots with/without the fix

OK. Here is the requested data. Unfortunately, I'm not able to evaluate it myself. I'm sorry. Though I see that there are differences between the logs, I don't know what they mean. Help's appreciated ;-)
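To see which config-space bytes differ between a bad and a good boot, the two "lspci -xxx" hex dumps can be compared offset by offset. A rough Python sketch; the dump rows below are invented sample data and NOT taken from the attachments (only the 78 48 / 78 00 pair mirrors the 0x4878 and 0x0078 values mentioned in comment 23, and placing them at offset 0x94 is an assumption), so substitute the real rows for the northbridge device. The function names are made up for illustration.

```python
def parse_hexdump(text):
    """Map config-space offset -> byte value from 'lspci -x' style rows."""
    data = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep:
            continue
        try:
            offset = int(head.strip(), 16)
            values = [int(tok, 16) for tok in rest.split()]
        except ValueError:
            continue  # not a hex row, e.g. the device description line
        for i, value in enumerate(values):
            data[offset + i] = value
    return data


def diff_dumps(bad, good):
    """Return (offset, bad_byte, good_byte) for every byte that differs."""
    a, b = parse_hexdump(bad), parse_hexdump(good)
    common = sorted(a.keys() & b.keys())
    return [(off, a[off], b[off]) for off in common if a[off] != b[off]]


# Invented sample rows for illustration only.
cold = "90: 03 00 00 00 78 48 00 00 00 00 00 00 00 00 00 00"
warm = "90: 03 00 00 00 78 00 00 00 00 00 00 00 00 00 00 00"

for off, bad_byte, good_byte in diff_dumps(cold, warm):
    print(f"offset {off:#04x}: {bad_byte:02x} -> {good_byte:02x}")
```

Multi-byte registers are stored low byte first in these dumps, so a pair such as 78 48 at offsets 0x94/0x95 reads as the 16-bit value 0x4878.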
Comment 29 Sebastian Volke 2006-12-11 09:54:06 UTC
Created attachment 9785
Again, an improvement with the patch

Here, again, is an improved version of the patch. Now it adds a new option to the kernel command line. iommu=biosfix enables the fix. Otherwise the related piece of code isn't executed.
Comment 30 Daniel Drake 2006-12-11 17:41:35 UTC
Created attachment 9793
pci quirks patch

I think this would be best done as a PCI quirk, so here is a preliminary patch. Right now this will run on too many devices - please provide output of "lspci -vn" so that I can restrict it to just the buggy devices. Same applies to everyone with this bug - you might have slightly different hardware. Also, I think the IOMMU mangling code will run before this, which isn't desired. For now, please remove this from your kernel - disable CONFIG_IOMMU. If you have a lot of memory this might result in some of it not being available - but this is only temporary. This should also work on x86 32-bit. Please upload dmesg after trying this regardless of whether it succeeds or fails.
Comment 31 Sebastian Volke 2006-12-12 07:17:44 UTC
Created attachment 9796
output of lspci -vn

OK. Here is the desired output of lspci -vn. That thingy with pci-quirks looks good and reminds me that I should understand the kernel a little more deeply than I do at the moment ;-) I have NO clue about any of the conventions, so thanks a lot for your help with this patch.
Comment 32 Luis Parrilla 2006-12-12 13:46:27 UTC
The pci quirk patch proposed by Daniel works ok for an ASUS K8V-X-SE motherboard with AMD Sempron (32 bits), with the VIA K8T800 chipset and ATI Radeon 9250 graphics. Tell me if you need the lspci output for refining the patch, but probably it is ok.
Comment 33 Daniel Drake 2006-12-12 14:09:12 UTC
Yes, please post lspci output as instructed as your hardware may be slightly different
Comment 34 Daniel Drake 2006-12-12 14:09:49 UTC
and dmesg (from a boot where the patch has made a difference) too
Comment 35 Daniel Drake 2006-12-12 15:53:20 UTC
Created attachment 9799
new pci quirk

Here's a nicer patch which restricts itself to certain hardware. It might not work on Luis' system as we haven't seen the lspci output, the hardware might be different. This should work on Sebastian's system but he has not indicated success or failure yet. After establishing that this is working on x86_64 without CONFIG_IOMMU we then need to see how enabling that option affects it. I guess the IOMMU code will bail out, but the quirk will be applied before agpgart initializes, so things might work and this might be acceptable.
Comment 36 Sebastian Volke 2006-12-13 09:30:21 UTC
Hm. It completely broke for me. OK, it's true that I'm running it with IOMMU enabled. But I also tried it with iommu=off, with the same result. I have to add that the patch doesn't work out of the box. The compiler complains about "too few arguments for function pci_get_subsystem". I followed the common usage of pci_get_device, where the corresponding argument is filled with the variable you want to fill with pci_get_device. So I added sb to the arguments. Maybe this causes some kernel oops or whatever, but because this code runs so early, I don't even see it on my screen. It just stays blank. I have no clue about it. Btw: why should this patch make the IOMMU "bail out"? I don't understand this.
Comment 37 Luis Parrilla 2006-12-13 13:01:36 UTC
Created attachment 9804
lspci with and without the first quirk patch

Sorry, but I didn't have time yesterday to post the lspci outputs. Here they are. The hardware is quite different, and it's probably better to patch all K8 northbridges as in the first patch (the second one will not take effect with my hardware configuration, for example). I will try to collect more information about different hardware configurations suffering from this bug. Another question is the aperture: it seems to be incorrect even with the patch (there is no difference in dmesg output).
Comment 38 Daniel Drake 2006-12-13 15:36:40 UTC
Luis, Patching all K8 northbridges is not an option. Even as-is this patch may be declared too dangerous for inclusion, opening it up further will not help. I'm also confused by your comments, in comment #32 you said it worked but now in comment #37 you are saying it doesn't? Which patch were you trying in comment #37? I note that you still haven't uploaded dmesg as has been requested. Sebastian, Looks like I forgot to refresh before uploading that patch, will post another shortly. Sounds like your modification was wrong, and this may or may not be what is causing the new crash. The patch will also not modify the IOMMU behaviour because I think the IOMMU code runs before that. It is already bailing out due to aperture base beyond 4G, but I'm not focusing on this issue right now...
Comment 39 Daniel Drake 2006-12-13 15:40:37 UTC
Created attachment 9806
new pci quirk

will also quirk Luis' system
Comment 40 Mattias Holmlund 2006-12-13 18:55:01 UTC
I have an Asus K8N-E Deluxe with an AMD 64 that I run in 32-bit mode. I have run tests with kernel 2.6.19 without any patches and with patch 9793. With bios version 1006, the aperture is set up correctly both with and without the patch. With bios 1011, the aperture is above 4G without the patch and correct with the patch. lspci and dmesg coming up.
Comment 41 Mattias Holmlund 2006-12-13 18:56:06 UTC
Created attachment 9807
lspci -nv for Asus K8N-E Deluxe
Comment 42 Mattias Holmlund 2006-12-13 18:57:27 UTC
Created attachment 9808
dmesg for Asus K8N-E Deluxe with vanilla 2.6.19
Comment 43 Mattias Holmlund 2006-12-13 18:59:35 UTC
Created attachment 9809
dmesg for Asus K8N-E Deluxe kernel 2.6.19 with patch 9793.
Comment 44 Daniel Drake 2006-12-13 19:29:56 UTC
Created attachment 9810
new pci quirk

Thanks, added yours to the list, although this is getting silly. Time to find someone with agpgart knowledge and see what they think...
Comment 45 Luis Parrilla 2006-12-14 03:08:32 UTC
Created attachment 9816
dmesg without patch on ASUS K8V-X SE with AMD Sempron, ATI Radeon 9250 pro

The aperture size is 64MB in BIOS, agpgart reports 32MB (through the amd64_agp module). The system hangs when starting X11. The workaround is to disable the amd64_agp module and work without 3D acceleration.
Comment 46 Luis Parrilla 2006-12-14 03:13:53 UTC
Created attachment 9817
dmesg with patch #9793 on ASUS K8V-X SE with AMD Sempron, ATI Radeon 9250 pro

The aperture size is 64MB in BIOS, agpgart reports 32MB, but amd64_agp does not cause problems, X11 starts ok, and 3D acceleration works fine...
Comment 47 Sebastian Volke 2006-12-14 23:33:25 UTC
On my box, the patch doesn't work. I'm just running into agp_ENODEV, just as it was without the patch (I'm talking about the latest pci quirk). I think the cause of all this is that the code runs too late. I mean: aperture.c is processed _before_ the pci subsystem is initialized, therefore it needs its own routine to check the pci devices for the north bridge. Since the patch corrects the north bridge when the pci subsystem is loaded, _after_ aperture.c has run, nothing happens. Now: all this applies to an x86_64 kernel, because the x86 code doesn't have an aperture.c. So it may be that the quirk thingy works for 32-bit, but for 64-bit it doesn't.
Comment 48 Daniel Drake 2006-12-15 05:06:41 UTC
Luis, Please increase CONFIG_LOG_BUF_SHIFT so that we can see all of your kernel logs. Sebastian, Please start with the patch in comment #30 and post dmesg when CONFIG_IOMMU is disabled. This way arch/x86_64/kernel/aperture.c is not even compiled into your kernel so ordering doesn't matter. The PCI quirk will definitely run before drivers/char/agp/amd64-agp.c is initialized, which is the important part.
Comment 49 Sebastian Volke 2006-12-18 08:54:04 UTC
Just a stupid question, but ... I just can't figure out how to leave out the IOMMU option. Yes, I deselected the IBM Calgary IOMMU option in menuconfig and so on (--> CONFIG_CALGARY_IOMMU=n in the .config), but CONFIG_IOMMU is still turned on. Writing CONFIG_IOMMU=n in the .config doesn't help. Also: I can't select the amd64-agp thingy in the configuration section for the aperture module (using menuconfig), it simply doesn't show up. Furthermore: I can only select 3 kernel architectures: x86_64, AMD Athlon64/Opteron and EMT64. Is there something wrong with the make system, or the menuconfig, is it me who's just stupid, or whatever? I'm a little confuzzled about this, as I can't find ANY information about it. Please help
Comment 50 Luis Parrilla 2006-12-18 12:28:33 UTC
Created attachment 9868
full dmesg without patch on ASUS K8V-X SE with AMD Sempron, ATI Radeon 9250 pro
Comment 51 Luis Parrilla 2006-12-18 12:30:18 UTC
Created attachment 9869
full dmesg with patch #9793 on ASUS K8V-X SE with AMD Sempron, ATI Radeon 9250 pro
Comment 52 Daniel Drake 2006-12-18 14:54:12 UTC
Luis: thanks, patch seems to be working for you. Sebastian: Oops, IOMMU is unconditionally enabled. amd64-agp is almost certainly already enabled for you, but you can check by using the search feature: type /AGP_AMD64<enter> in menuconfig. Please post full dmesg from using the suggested patch, even if it doesn't work.
Comment 53 Sebastian Volke 2006-12-25 09:40:05 UTC
Created attachment 9949
full dmesg output of 64-bit kernel with quirks patch applied

Now, after exactly one week of silence (sorry), here is my dmesg output. The kernel I used was a 64-bit one, without CALGARY iommu support, but still with iommu enabled (see previous comment). amd64_agp is also built in. With the quirks patch applied, I get a bad aperture. I think that is because the pci initialization code, which runs that quirk, runs too late, after the aperture.c code.
Comment 54 Daniel Drake 2006-12-25 15:05:06 UTC
Sebastian, You don't appear to have used the suggested patch, that is the one in comment #30. Regardless of whether any fixes are made or not, and regardless of the order in which things happen, this patch will add 2 messages to your kernel log starting with the word QUIRK.
Comment 55 Sebastian Volke 2006-12-29 01:37:43 UTC
Created attachment 9960
Full dmesg of a 64-bit kernel with patch of comment 30

You were right. I had applied the most recent patch that was available. Now, this is the full dmesg with the patch of comment 30. Sad but true: it shows that the quirk runs after the IOMMU code (which was already known), but also after the agpgart module.
Comment 56 Tuxedo 2007-01-01 07:03:56 UTC
I just stumbled across this conversation. I have exactly the same problem using an ASRock (AM2NF3-VSTA) board in combination with an ATI 9250 (AGP) and an AMD 3200 (AM2): 3D only functions when first booting XP and thereafter rebooting into Linux. I'm no expert and I do not know how to apply a kernel patch; however, I can find various "Chipset Settings" options in the BIOS settings of my board, e.g.:
AGP Data Rate: 8x or 4x
AGP Aperture Size: 32, 64, 128, 256 or 512MB
AGP Fast Write: Enabled or Disabled
AGP SideBand address: Enabled or Disabled
Could modifying any of these in a particular combination solve the problem? Alternatively, did anyone here succeed in getting 3D working on first boot on a 32-bit Linux? If so, with which patch, and exactly how would that patch be applied? I use kernel version 184.108.40.206 with a 32-bit Kanotix 2005-04 install.
Comment 57 j. ortega 2007-01-08 15:49:51 UTC
AFAIK, with any of the patches, no 3D acceleration can be had on first boot with 32-bit kernels, at least on my machine. To get proper AGP functionality and proper 3D acceleration in 32-bit kernels you really need to boot Windows first, so that the northbridge gets set up to work... I'm still using the first patch posted by Sebastian Volke with no problems at all on a 64-bit kernel. Since I work mostly on a VIA chipset, I don't have any problem there, but I still have the ASUS K8N-E in a Linux box, mostly as a home router and for testing purposes. Regards J.
Comment 58 Mattias Holmlund 2007-01-08 16:43:35 UTC
As I reported in comment #40, patch 9793 sets up the aperture correctly for me in 32-bit linux, without any use of Windows (I don't have Windows installed on that machine).
Comment 59 j. ortega 2007-01-10 19:03:18 UTC
Yes, you have the Deluxe model of the mobo; I have the K8N-E, not Deluxe, BIOS 0411 (I sold the other K8N). There are probably some differences between the two. Once I'm back home this weekend I will post the lspci output or add myself to the PCI quirks table to see what happens with this mobo anyway. Best Regards. J.
Comment 60 Tobe Deprez 2007-01-27 03:32:30 UTC
On my machine it seems to work (with the patch and without starting Windows). I have an AMD Sempron (32-bit), ATI Radeon 9200SE and ASUS K8N-E (nForce3 chipset).
Comment 61 M. B. 2007-01-28 16:17:49 UTC
Linux 2.6.18-gentoo-r6, x86_64, AMD Athlon, Asus K8N patched successfully. Is inclusion into the kernel sources in sight?
Comment 62 M. B. 2007-01-28 16:21:15 UTC
mhmmm forgot to mention i've got an ATI Radeon 9600 (RV350 AQ chipset). The Mobo's chipset is an nForce3 250Gb.
Comment 63 Andrew Morton 2007-01-30 23:48:03 UTC
Is this bug still live? Daniel, do you have a patch which helps? Thanks
Comment 64 Sebastian Volke 2007-01-31 04:39:13 UTC
Comment #63:
> Is this bug still live?
Well, I think it is, since no solution working on both x86 and x86_64 has been found. But since my patch (patch #9785) works on x86_64 boards and Daniel's quirks patch works on x86, progress has been slow.
Summary: the quirks modification works on the x86 architecture, because the pci module and thus the quirk is loaded _before_ the amd64_gart. On x86_64, the gart is loaded before the pci system, so the quirks patch doesn't work. If someone has a solution to this problem, other than modifying aperture.c on x86_64, please speak up.
PLEASE NOTICE: if you post your dmesg and what you experienced with either of the patches, please take your time and point out which patch was used and which architecture you are running on. The main board isn't important, as far as I can see, since this problem applies to almost every K8 mainboard and to nothing else.
Comment 65 Daniel Drake 2007-01-31 06:33:46 UTC
It certainly doesn't apply to most K8 boards, only a handful. It also appears to be dependent on BIOS version.
Dave Jones would like to see the 'fix' verified. On a boot where the PCI quirk (or whatever incarnation of the fix) has modified the base value, someone should run testgart (google for it) and verify that things aren't broken.
Eric W. Biederman says:
-----------
So I do agree that it appears that the BIOS is letting the upper address bits float, and giving you a 32bit value. Fixing this with a board specific pci quirk is questionable but it may be ok.
A reliable fix is probably if the address is sufficiently questionable to allocate a new aperture ourselves, and scream that the BIOS messed up. arch/x86_64/kernel/aperture.c appears to do that when we use the agp aperture for an iommu.
I don't think a agp aperture above 64bits is actually very interesting, in practice as most agp cards are only 32bits so won't be able to use it. And we are talking bus addresses here.
-----------
This is probably more correct than mangling the address, but I haven't had a chance to look into implementing it.
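To make the "allocate a new aperture ourselves" idea concrete, here is a rough sketch in Python of what such a decision could look like. It is not taken from aperture.c: the function names, the 32 MB minimum and the allocate callback are assumptions for illustration, and the only property it relies on from this thread is that an aperture reaching above 4G is unusable for the AGP card.

```python
FOUR_GB = 1 << 32
MIN_APERTURE = 32 << 20  # assumed lower bound: 32 MB


def aperture_is_sane(base, size):
    """Return True if the BIOS-provided aperture looks usable."""
    if base == 0:
        return False          # never programmed
    if size < MIN_APERTURE:
        return False          # too small to be useful
    if base + size > FOUR_GB:
        return False          # reaches above 4G
    return True


def choose_aperture(base, size, allocate):
    """Keep a sane BIOS aperture, otherwise complain and allocate one.

    `allocate` is a hypothetical callback that takes a size and returns
    the base of a freshly allocated aperture.
    """
    if aperture_is_sane(base, size):
        return base, size
    print("BIOS set up a questionable aperture at 0x%x, allocating a new one" % base)
    return allocate(size), size


if __name__ == "__main__":
    # Register value 0xf78 from comment 66 decodes to an address above 4G.
    bad_base = 0xF78 << 25
    print(choose_aperture(bad_base, 128 << 20, lambda size: 0xF0000000))
```

Unlike the quirk, this approach does not need to guess which upper bits are garbage; it simply refuses the value and starts over.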
Comment 66 Stefan Lucke 2007-04-13 00:59:25 UTC
I suffer from the same problem on an ASRock AM2NF3-VSTA board (AMD X2 3800+). Most times I got this message:
Apr 8 00:44:07 jarada [ 11.132000] agpgart: aperture base > 4G
After installing both patches #9785 (modified for 220.127.116.11) and #9793:
Apr 13 08:43:25 jarada [ 10.912000] agpgart: aperture base is (0x00000078)
Apr 13 08:43:25 jarada [ 10.916000] agpgart: AGP aperture is 128M @ 0xf0000000
Apr 13 09:07:52 jarada [ 11.172000] agpgart: AGP aperture is 128M @ 0xf0000000
Apr 13 09:10:03 jarada [ 11.036000] agpgart: AGP aperture is 128M @ 0xf0000000
Apr 13 09:17:53 jarada [ 11.152000] agpgart: aperture base is (0x00000078)
Apr 13 09:17:53 jarada [ 11.156000] agpgart: AGP aperture is 128M @ 0xf0000000
and:
Apr 12 15:29:49 jarada [ 0.508000] QUIRK aper_base = f78
Apr 12 15:29:49 jarada [ 0.508000] QUIRK aper_base changed to 78
Apr 13 08:43:25 jarada [ 0.508000] QUIRK aper_base = 2f78
Apr 13 08:43:25 jarada [ 0.508000] QUIRK aper_base changed to 78
Apr 13 09:17:53 jarada [ 0.516000] QUIRK aper_base = f78
Apr 13 09:17:53 jarada [ 0.516000] QUIRK aper_base changed to 78
The Radeon issue seems to be solved. I ran testgart several times. The first time, after terminating my X session, I got a reboot :-( . All subsequent runs had no problems (warm boot to an unpatched kernel + cold and warm boot to a patched kernel). Hope that helps to resolve the issue.
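The QUIRK values above can be checked by hand. On K8 the aperture base register holds address bits 39:25, so the bus address is the register value shifted left by 25. The sketch below, in Python, works through the numbers from this comment; the 0x7f mask is an assumption about what the quirk does (it is the mask that keeps only address bits 31:25 and it reproduces the logged "changed to 78"), not a quote from the patch.

```python
APER_BASE_SHIFT = 25      # register bit 0 corresponds to address bit 25
APER_BASE_FIELD = 0x7FFF  # register bits 14:0 hold address bits 39:25
BELOW_4G_MASK = 0x7F      # assumed quirk mask: keep address bits 31:25 only
FOUR_GB = 1 << 32


def aperture_address(reg):
    """Decode an aperture base register value into a bus address."""
    return (reg & APER_BASE_FIELD) << APER_BASE_SHIFT


def quirk_fix(reg):
    """Drop the floating upper bits so the aperture lands below 4G."""
    return reg & BELOW_4G_MASK


if __name__ == "__main__":
    for reg in (0xF78, 0x2F78):
        fixed = quirk_fix(reg)
        print("aper_base = %x -> address %#x (above 4G: %s)"
              % (reg, aperture_address(reg), aperture_address(reg) >= FOUR_GB))
        print("aper_base changed to %x -> address %#x"
              % (fixed, aperture_address(fixed)))
```

Both logged register values collapse to 0x78, which decodes to 0xf0000000 and matches the "AGP aperture is 128M @ 0xf0000000" lines.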
Comment 67 Sebastian Volke 2007-04-29 08:58:46 UTC
Hello guys ;-) I'm very sorry, but I discovered a problem with the patch of attachment 9785, which I've been using for several months now. I'm not sure if this happens with the quirks patch, though, so please try it out. Thanks. Now, to the problem: as far as I can tell, the USB system, more specifically ehci_hcd, suffers from the patch. I've been experiencing problems with USB devices for a long time now; that is, ehci can't initialize the gadget and I can't use it until the kernel tries the ohci system. I once booted a kernel without the patch I proposed and there was no problem with ehci anymore. I then found DRI not working, recompiled the kernel with the patch, and noticed that the USB problem had returned. Please test how USB gadgets work with and without the patch. Thank you.
Comment 68 Sebastian Volke 2007-06-05 00:30:10 UTC
It's me again. My warning from the last post seems to have been a false positive. I haven't included any patches in my current kernel (2.6.20) and ehci works, _until I reboot_. The moment my DRI is working, ehci passes out. So it seems to be a general problem related to DRI or maybe something with the fglrx driver. I can't say at the moment. So don't worry about the patches ;-)
Comment 69 Stefan Lucke 2007-07-01 00:20:05 UTC
I switched to kernel 18.104.22.168. It is still the same issue. I still have to use a modified version of patch #9810. What else could be done to finally resolve this issue?
Comment 70 Stefan Lucke 2007-07-01 00:24:00 UTC
Created attachment 11908 [details] k8 agp aperture quirk
The difference from #9810 is that the return taken when !pci_dev_present(quirk_k8_agp_aperture_sbs) holds TRUE is commented out. From syslog:
Jul 1 09:01:29 jarada [ 0.484000] PCI: QUIRKed sbs for buggy board not found
Jul 1 09:01:29 jarada [ 0.484000] PCI: QUIRKed buggY ASRock aperture base from f7c to 7c
Comment 71 Natalie Protasevich 2008-03-20 00:52:34 UTC
Is this still the same with recent kernel? Thanks.
Comment 72 Natalie Protasevich 2008-03-20 01:05:59 UTC
What is the status on this bug, should the patch in #70 be submitted for inclusion?
Comment 73 Sebastian Volke 2008-03-20 02:28:31 UTC
Still occurs on Kubuntu 7.10 with a 2.6.22 kernel. Sorry for being a bit lazy with this bug, but I recently switched to a different computer, so I'm not affected anymore ^^. As to the patch: I noticed that setting up the aperture area using these patches kills USB 2.0. There is no problem using ehci without the patches, but using them results in timeouts on connecting the USB device. I don't know how far these two things are related; it's just a short observation.
Comment 74 Ermenegido Fiorito 2008-07-05 04:11:54 UTC
What is the status on this bug? Any possible solution?
Comment 75 Sebastian Volke 2008-07-07 11:22:55 UTC
(In reply to comment #74)
> What is the status on this bug?
>
> Any possible solution?
>
I don't know... I don't use the hardware that is hit by this bug, anymore, so I can't work this out. Maybe someone with affected hardware and some better knowledge of all this can fix the problem later. So, hoping the best, I'm closing this bug for now.
Comment 76 Albert Gall 2008-07-30 22:17:00 UTC
Hello: I have the motherboard http://www.asrock.com/mb/overview.asp?Model=K8Upgrade-NF3&s = I upgraded to the latest BIOS version and the problem persists. Is it possible to solve this problem with one of the older BIOS versions? Thanks.
Comment 77 Stefan Lucke 2008-08-04 15:22:04 UTC
(In reply to comment #76)
> Hello:
>
> I have the motherboard
> http://www.asrock.com/mb/overview.asp?Model=K8Upgrade-NF3&s =
>
> Upgrade to the latest version of bios and the problem persists.
Did you test any of the patches? If not yet, can you test 11908 from my comment #70? I still have to use this with the latest kernel 22.214.171.124. As for the USB issues: I noticed none, neither with my USB 2.0 DVB-T receiver (cinergy-t2) nor with my attached USB CD-ROM. I don't know if they are affected by ehci.
Comment 78 Albert Gall 2008-08-05 14:46:59 UTC
Hello Stefan. I tried the patches and they work. But I would like to use a kernel or a BIOS that solves the problem. I'm wary of potential instabilities from using fixes in the form of patches that are not even in a stable kernel. For now I will use the patches until ASRock publishes a BIOS version that resolves this problem, or wait for a kernel version that solves it. Thanks.
Comment 79 Albert Gall 2008-09-09 16:04:21 UTC
Hello:
With this version of the kernel
http://www.kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.27-rc5.bz2
and the following options enabled
Symbol: ACPI_PCI_SLOT [=]
Symbol: AMD_IOMMU [=]
it seems that the problem is solved.
I would like someone else to test this and comment on their results.
/usr/src/linux-2.6.27-rc5# uname --all
Linux localhost 2.6.27-rc5 #2 PREEMPT Tue Sep 9 23:19:27 WEST 2008 x86_64 AMD Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux
/usr/src/linux-2.6.27-rc5# glxinfo |grep render
libGL warning: 3D driver claims to not support visual 0x4b
direct rendering: Yes
OpenGL renderer string: Mesa DRI R200 20060602 AGP 1x TCL
/usr/src/linux-2.6.27-rc5# dmesg |grep -i agp
AGP bridge at 00:00:00
Aperture from AGP @ f0000000 old size 32 MB
Aperture from AGP @ f0000000 size 32 MB (APSIZE 0)
agpgart-amd64 0000:00:00.0: AGP bridge [10de/00e1]
agpgart-amd64 0000:00:00.0: setting up Nforce3 AGP
agpgart-amd64 0000:00:00.0: AGP aperture is 32M @ 0xf0000000
agpgart-amd64 0000:00:00.0: calling nv_msi_ht_cap_quirk+0x0/0x106
agpgart-amd64 0000:00:00.0: calling quirk_cardbus_legacy+0x0/0x1d
agpgart-amd64 0000:00:00.0: calling quirk_usb_early_handoff+0x0/0x449
agpgart-amd64 0000:00:00.0: calling pci_fixup_video+0x0/0xaf
Linux agpgart interface v0.103
agpgart-amd64 0000:00:00.0: AGP 3.0 bridge
agpgart-amd64 0000:00:00.0: putting AGP V3 device into 4x mode
pci 0000:01:00.0: putting AGP V3 device into 4x mode
/usr/src/linux-2.6.27-rc5#
Greetings
Comment 80 Ermenegido Fiorito 2008-09-10 03:42:56 UTC
(In reply to comment #79)
> agpgart-amd64 0000:00:00.0: AGP aperture is 32M @ 0xf0000000
The aperture is still 32 MB.
Comment 81 Albert Gall 2008-09-10 08:11:46 UTC
True, the aperture is 32 MB, but 3D works on the first boot and programs seem to respond well. The aperture setting in the BIOS seems to be ignored: in the BIOS I have it set to 256 MB.
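To compare the aperture the kernel actually reports with the BIOS setting across several boots, the relevant dmesg line can be pulled out mechanically. This is just a small Python sketch; the function name find_aperture is made up, and the pattern only covers the "AGP aperture is ...M @ 0x..." wording seen in the logs in this report.

```python
import re

# Matches e.g. "agpgart-amd64 0000:00:00.0: AGP aperture is 32M @ 0xf0000000"
APERTURE_RE = re.compile(r"AGP aperture is (\d+)M @ (0x[0-9a-fA-F]+)")


def find_aperture(dmesg_text):
    """Return (size_in_mb, base_address), or None if no aperture was set up."""
    match = APERTURE_RE.search(dmesg_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2), 16)


if __name__ == "__main__":
    sample = (
        "agpgart-amd64 0000:00:00.0: setting up Nforce3 AGP\n"
        "agpgart-amd64 0000:00:00.0: AGP aperture is 32M @ 0xf0000000\n"
    )
    bios_setting_mb = 256  # value configured in the BIOS, per this comment
    size_mb, base = find_aperture(sample)
    print("kernel reports %d MB at %#x, BIOS setting is %d MB"
          % (size_mb, base, bios_setting_mb))
```

A failed boot, where dmesg only contains "agpgart: aperture base > 4G", yields None.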
Comment 82 Ermenegido Fiorito 2008-09-11 02:33:14 UTC
sorry for the english: can you paste me your: cat /var/log/Xorg.0.log | grep "(EE)" ? Thanks
Comment 83 Albert Gall 2008-09-11 06:06:09 UTC
Yes
# grep EE /var/log/Xorg.0.log
Current Operating System: Linux localhost 2.6.27-rc6 #1 PREEMPT Wed Sep 10 17:26:38 WEST 2008 x86_64
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(II) Loading extension MIT-SCREEN-SAVER
#
Greetings
Comment 84 Ermenegido Fiorito 2008-09-11 06:16:38 UTC