Bug 34772
Summary: | [radeon] [R300] GPU lockups when KMS is enabled | | |
---|---|---|---|
Product: | Drivers | Reporter: | Rogério Brito (rbrito) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | REOPENED | | |
Severity: | normal | CC: | alan, alexdeucher, rbrito, schwab, szg00000, xerofoify |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 2.6.38 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments:
dmesg output right after the lock up, obtained via the network
A dmesg log from 2.6.39-rc7 showing problems.
The log of X with the 2.6.39-rc7 kernel
A dmesg log with 2.6.38 kernel
Log from X with the kernel 2.6.38
dmesg log with 2.6.39-rc7 with KMS + agpmode=-1 + no_wb=1
X log with 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1
Allow forcing on all GPU clocks
Description
Rogério Brito
2011-05-09 23:26:44 UTC
Just for the record, I can provide further messages of these: this is as reproducible as I like.

In fact, I am now able to reproduce it with kernel 2.6.38 if I boot the iBook G4 with the options:

"video=radeonfb:off radeon.agpmode=-1 radeon.modeset=1"

and play a video with mplayer. If, OTOH, I leave off the KMS, then I don't get the GPU lockups that I reported.

Anyway, things are *way* better with 2.6.38 than with 2.6.39, as with 2.6.39 the kernel doesn't even get the colors correctly---everything that should be red becomes blue and so forth (any kind of endianness problem?).

I am attaching here another stacktrace, in case it helps.

Regards, Rogério Brito.

Created attachment 58602 [details]
A dmesg log from 2.6.39-rc7 showing problems.
Created attachment 58612 [details]
The log of X with the 2.6.39-rc7 kernel
Created attachment 58622 [details]
A dmesg log with 2.6.38 kernel
Please, notice the GPU hang with kernel 2.6.38.
Created attachment 58632 [details]
Log from X with the kernel 2.6.38
(In reply to comment #1)
> Anyway, things are *way* better with 2.6.38 than with 2.6.39, as with 2.6.39
> the kernel doesn't even get the colors correctly---everything that should be
> red becomes blue and so forth (any kind of endianness problem?).

That's probably nothing to do with the kernel directly but endianness bugs in the X driver when acceleration is not available.

It would be interesting if you could bisect what broke acceleration with radeon.agpmode=-1. Note that you should boot with radeon.no_wb=1 as well for this, as CP writeback was only fixed during the 2.6.39 cycle (in commit dc66b325f161bb651493c7d96ad44876b629cf6a).

I was able to reproduce the acceleration initialization failure with the Debian 2.6.39-rc7-powerpc kernel, but not with a self-built 2.6.39 kernel.

So this was probably just an intermittent problem during the 2.6.39 cycle, e.g. due to the intermittent broken usage of the DMA API by TTM.

As for the GPU lockups, does radeon.dynclks=1 help for those?

radeon.dynclks=1 causes the wrong resolution to be selected. It thinks something is connected to the S-video port with a max resolution of 800x600, so it selects this instead of the native resolution (1024x768).

-<6>Console: switching to colour frame buffer device 128x48
+<6>[drm] crtc 1 is connected to a TV
+<6>Console: switching to colour frame buffer device 100x37

+(II) RADEON(0): Printing probed modes for output S-video
+(II) RADEON(0): Modeline "800x600"x59.9 38.25 800 832 912 1024 600 603 607 624 -hsync +vsync (37.4 kHz)
+(II) RADEON(0): Modeline "640x480"x59.9 25.18 640 656 752 800 480 490 492 525 -hsync -vsync (31.5 kHz)
+(II) RADEON(0): Modeline "320x240"x60.1 12.59 320 328 376 400 240 245 246 262 doublescan -hsync -vsync (31.5 kHz)
 (II) RADEON(0): Output LVDS connected
 (II) RADEON(0): Output VGA-0 disconnected
-(II) RADEON(0): Output S-video disconnected
+(II) RADEON(0): Output S-video connected
 (II) RADEON(0): Using exact sizes for initial modes
-(II) RADEON(0): Output LVDS using initial mode 1024x768
+(II) RADEON(0): Output LVDS using initial mode 800x600
+(II) RADEON(0): Output S-video using initial mode 800x600

Hi, Michel.

On Fri, May 20, 2011 at 12:11, <bugzilla-daemon@bugzilla.kernel.org> wrote:
> --- Comment #6 from Michel Dänzer <michel@daenzer.net> 2011-05-20 12:11:38 ---
> (In reply to comment #1)
>> Anyway, things are *way* better with 2.6.38 than with 2.6.39, as with 2.6.39
>> the kernel doesn't even get the colors correctly---everything that should be
>> red becomes blue and so forth (any kind of endianness problem?).
>
> That's probably nothing to do with the kernel directly but endianness bugs in
> the X driver when acceleration is not available.

OK, then that's a separate issue. Good to know.

> It would be interesting if you could bisect what broke acceleration with
> radeon.agpmode=-1.

Oooh, I guess that I caused some confusion here, given our other messages. To clear things up:

When I use 2.6.38, it works mostly OK if I use radeon.agpmode=-1. It is sufficiently stable to the point that I told you that this setting was OK.
But, in fact, if I play a video with mplayer, then it always (so far, 100% reproducible) causes those GPU lockups, but the computer is still accessible via the network, so that I can take the logs etc.

If, instead, I use agpmode=1 rather than -1, then, even with kernel 2.6.38, I get those lysergide-like :-) pictures that I put on my homepage (but, for documentation purposes, I am thinking of uploading them here as attachments, as I am quite short of space there).

With kernel 2.6.39, I have not been able to get anything working, whether or not I pass any option to the kernel.

Summary:

* 2.6.38 with KMS and agpmode=-1: OK, up to me trying to play some video, then GPU lockups.
* 2.6.38 with KMS and agpmode=1: GPU lockups a few seconds after X loads (it *does* show up, but locks up a few seconds later).
* 2.6.39 with KMS and agpmode=-1: Not OK, even if I don't use anything accelerated (problems with colors and software rendering).

So, I am not quite sure if it would be the case of bisecting or, at least, what would be a good starting point. I can, though, try to boot with many other kernels to see if I can (provided that udev doesn't stop me).

> Note that you should boot with radeon.no_wb=1 as well for

OK. I can try no_wb=1 with agpmode=-1 and report back in a few moments, to see if the lockups are still there or not.

> this, as CP writeback was only fixed during the 2.6.39 cycle (in commit
> dc66b325f161bb651493c7d96ad44876b629cf6a).

Right. Thanks for that fix of yours (just read the commit).

Regards,

Hi there.

On Sat, May 21, 2011 at 09:16, <bugzilla-daemon@bugzilla.kernel.org> wrote:
> OK. I can try no_wb=1 with agpmode=-1 and report back in a few
> moments, to see if the lockups are still there or not.

Just for the record, 2.6.38 with KMS + agpmode=-1 + no_wb=1 still locks up the GPU when I play a video with mplayer.

I will try with 2.6.39 with the same settings.

Thanks,

Another test.
On Sat, May 21, 2011 at 09:23, <bugzilla-daemon@bugzilla.kernel.org> wrote:
> On Sat, May 21, 2011 at 09:16, <bugzilla-daemon@bugzilla.kernel.org> wrote:
>> OK. I can try no_wb=1 with agpmode=-1 and report back in a few
>> moments, to see if the lockups are still there or not.
>
> Just for the record, 2.6.38 with KMS + agpmode=-1 + no_wb=1 still
> locks up the GPU when I play a video with mplayer.

Just for the record #2, 2.6.38 with KMS + agpmode=-1 + no_wb=1 + dynclks=1 still locks up the GPU when I play a video with mplayer.

Besides that, like Andreas, with dynclks=1 the resolution is reduced to be 800x600. I didn't have the opportunity to read the X logs regarding the S-Video port, but, at least for the user, iBooks (differently from PowerBooks) don't have user-accessible S-Video ports (but this doesn't prevent Apple from having inutilized them somehow).

Thanks,

On Sat, May 21, 2011 at 09:34, <bugzilla-daemon@bugzilla.kernel.org> wrote:
> On Sat, May 21, 2011 at 09:23, <bugzilla-daemon@bugzilla.kernel.org> wrote:
>> On Sat, May 21, 2011 at 09:16, <bugzilla-daemon@bugzilla.kernel.org> wrote:
>>> OK. I can try no_wb=1 with agpmode=-1 and report back in a few
>>> moments, to see if the lockups are still there or not.
>>
>> Just for the record, 2.6.38 with KMS + agpmode=-1 + no_wb=1 still
>> locks up the GPU when I play a video with mplayer.

Wooow! Oopsen galore with 2.6.39 with KMS + agpmode=-1 + no_wb=1... Five in a row. OK, probably only the first one matters. Then, it stays there and doesn't load the system...

Actually, as I am writing this thing, after about 180 seconds, the boot process is continuing and X is being loaded, but with the wrong colors (the "endianness issue").

I will try to see if the network is available and attach here what I get from dmesg.

BTW, I hope that you don't mind me providing copious amounts of testing here (and their results) in the hope to get this fixed... :-)

Created attachment 58892 [details]
dmesg log with 2.6.39-rc7 with KMS + agpmode=-1 + no_wb=1
Created attachment 58902 [details]
X log with 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1
With 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1 + dynclks=1:

* I don't get the Oopsen.
* the resolution is restricted to 800x600.
* XV is not available to mplayer or other applications.

I think the XV extension not working is something that has always happened with 2.6.39 kernels.

Thanks, Rogério Brito.

(In reply to comment #15)
> With 2.6.39-rc7 + KMS + agpmode=-1 + no_wb=1 + dynclks=1:
>
> * XV is not available to mplayer or other applications.

When the kernel radeon driver fails to initialize acceleration, there's no point in trying any functionality that needs acceleration, such as XVideo.

I don't think there's any point doing any more tests with 2.6.39-rc7, as it's obviously suffering from additional issues which only occurred intermittently during the 2.6.39 cycle.

(In reply to comment #12)
> BTW, I hope that you don't mind me providing copious amounts of
> testing here (and their results) in the hope to get this fixed... :-)

Well, I'm afraid less quantity but more quality would be better... It's becoming rather difficult and time-consuming to find the relevant pieces of information in this mass.

(In reply to comment #11)
> Just for the record #2, 2.6.38 with KMS + agpmode=-1 + no_wb=1 +
> dynclks=1 still locks up the GPU when I play a video with mplayer.

Has either of you tried agpmode=1 dynclks=1? Does that increase stability at all?

> Besides that, like Andreas, with dynclks=1 the resolution is reduced
> to be 800x600. I didn't have the opportunity to read the X logs
> regarding the S-Video port, but, at least for the user, iBooks
> (differently from PowerBooks) don't have user-accessible S-Video ports
> (but this doesn't prevent Apple from having inutilized them somehow).

I thought there was some kind of multimedia adapter for the external output.
Anyway, it should be possible to override the incorrect output detection, either on the kernel command line with something like video=S-video-1:d or later in xorg.conf or during X runtime with something like xrandr.

But really, we need to focus on one problem per bug report as much as possible, or things are getting out of hand.

(In reply to comment #9)
> So, I am not quite sure if it would be the case of bisecting or, at
> least, what would be a good starting point.

No, there's no point in bisecting, as that problem should be gone with 2.6.39 final.

> > Note that you should boot with radeon.no_wb=1 as well for
>
> OK. I can try no_wb=1 with agpmode=-1 and report back in a few
> moments, to see if the lockups are still there or not.

no_wb=1 would only have been important for bisecting, to avoid the writeback endianness bug interfering.

P.S. beware of Debian package udev version 169-1: IME an initrd generated with that installed prevents the radeon module from being loaded automatically, and when trying to load it manually, it fails to load the CP microcode and consequently fails to initialize acceleration.

Hi, Michel.

Thank you very much for the attention.

(In reply to comment #16)
> When the kernel radeon driver fails to initialize acceleration, there's no
> point in trying any functionality that needs acceleration, such as XVideo.

OK.

> I don't think there's any point doing any more tests with 2.6.39-rc7, as it's
> obviously suffering from additional issues which only occurred intermittently
> during the 2.6.39 cycle.

Right.

> Well, I'm afraid less quantity but more quality would be better... It's
> becoming rather difficult and time-consuming to find the relevant pieces of
> information in this mass.

Indeed, it is getting out of hand pretty quickly. Do you want me to give you some SSH access to this notebook? Or, if that's not feasible/useful, what would you like me to test as the next step, so that I avoid flooding you with so much data?

> Has either of you tried agpmode=1 dynclks=1? Does that increase stability at
> all?

I will try those. But with which kernel?

I have been avoiding compiling a kernel nowadays, since they take ages on this notebook, but I can set up a cross-compilation environment, if necessary. BTW, would you mind sharing your .config?

> I thought there was some kind of multimedia adapter for the external output.

The only external adapter is one to a VGA port. No traces of S-video here.

> But really, we need to focus on one problem per bug report as much as possible,
> or things are getting out of hand.

OK, I can file a separate bug for this S-Video issue, then.

Thank you so much for your patience, Rogério Brito.

Apple sells VGA to S-video adapters, so we list both connectors in the driver.

(In reply to comment #18)
> Apple sells VGA to S-video adapters, so we list both connectors in the
> driver.

Oh, sorry for the ignorance.

Created attachment 58922 [details]
Allow forcing on all GPU clocks

(In reply to comment #17)
> > Has either of you tried agpmode=1 dynclks=1? Does that increase stability at all?
>
> I will try those. But with which kernel?

2.6.38 should be fine for this test. But at some point it'll probably be useful for you to be able to try kernel patches. Once you've built a kernel, building the radeon module with a patch shouldn't take long.

E.g., you guys could try this patch, and booting with radeon.dynclks=0, which should force on all GPU clocks. Does that increase stability with agpmode=1 or agpmode=-1?

> BTW, would you mind sharing your .config?

My .config still takes 1-2 hours to build on this 1.6 GHz PowerBook. If that could help you, please ask for it on the debian-powerpc list.

Would also be interesting if one of you guys could attach dmesg with agpmode=1.

Just for the record, I can still provide the information, as I am going to reinstall Linux on the iBook.

Thanks in advance, Rogério Brito.
This bug needs to be tested against a newer kernel to see if it's fixed.

Cheers Nick

Hi, Nick.

On Jun 25 2014, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=34772
> --- Comment #23 from xerofoify@gmail.com ---
> This bug needs to be tested against a newer kernel to see if it's fixed.
> Cheers Nick

OK, I think that this may be easier to test than the previous issue, but, if I recall correctly, this issue was so fragile that almost anything crashed it.

Again, as in my other e-mail, please ping me if I don't respond, as I am swamped with work.

Thanks,
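
The test runs reported in this thread differ mainly in which radeon.* options were passed on the kernel command line (agpmode, modeset, no_wb, dynclks). As a minimal sketch, and not something posted by any participant, the hypothetical helper `radeon_params` below pulls those options out of a command-line string so that each dmesg or X log can be labelled with the settings it was captured under. The function name and the dictionary output are assumptions made for illustration only.

```python
# Hypothetical helper (not part of the original report): collect the
# radeon.<option>=<value> tokens from a kernel command line string.

def radeon_params(cmdline: str) -> dict:
    """Return {option: value} for every radeon.<option>=<value> token."""
    prefix = "radeon."
    params = {}
    for token in cmdline.split():
        # Only module options of the form radeon.name=value are of interest;
        # tokens such as video=radeonfb:off are skipped.
        if token.startswith(prefix) and "=" in token:
            key, value = token[len(prefix):].split("=", 1)
            params[key] = value
    return params


# The option string quoted in the bug description.
example = "video=radeonfb:off radeon.agpmode=-1 radeon.modeset=1"
print(radeon_params(example))
```

On a running Linux system the input string could be read from /proc/cmdline. Options set elsewhere, for example in modprobe configuration, would not appear there, so this only covers settings given at boot.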