Bug 75241
| Summary: | radeon_compute_pll_avivo broken in 3.15-rc3 | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Clemens Ladisch (clemens) |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | NEW --- | | |
| Severity: | high | CC: | alexdeucher, benh, bugzilla, deathsimple, szg00000, tasev.stefanoska |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.15-rc3 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |

Attachments:
Possible fix.
dmesg working 3.15-rc5
dmesg broken 3.15-rc5
dmesg after boot working with max divider 32
dmesg after suspend resume broken
Possible fix v2.
Possible fix v3.
Description
Clemens Ladisch
2014-05-01 13:51:26 UTC
Thanks for the info, could you provide the debug output of 3.14 as well? I especially need the line with radeon_compute_pll_avivo.

The problem isn't really triggered by the patch you bisected; it is more an issue of the new PLL code.

Thanks in advance,
Christian.

Created attachment 134611 [details]
Possible fix.
Please try the attached patch, it might fix the issue.
3.14:
16205, pll dividers - fb: 135.8 ref: 2, post 6

With the patch:
162000 - 161990, pll dividers - fb: 271.5 ref: 4, post 6

And the patch indeed fixes this.

(In reply to Clemens Ladisch from comment #3)
> 3.14:
> 16205, pll dividers - fb: 135.8 ref: 2, post 6
>
> With the patch:
> 162000 - 161990, pll dividers - fb: 271.5 ref: 4, post 6
>
> And the patch indeed fixes this.

Thanks a lot for the info. Going to push the patch with the next bugfix release.

Could you try higher values for the limit as well and try to figure out what's the maximum your monitor can still handle?

It might also make sense to temporarily comment out the following line and see what you get for the parameters and whether those still work fine:

avivo_reduce_ratio(&fb_div, &ref_div, fb_div_min, ref_div_min);

And by the way: What monitor is this?

It's an Eizo S2100, but this should not matter because the clocks seen by the monitor are always about the same (162MHz/75kHz/60Hz). If some were out of range, the monitor would show an error message, but with the PLL problem, the monitor does not appear to detect even an out-of-range signal. I'd guess the PLL itself cannot handle the parameters.

The largest working ref_div_max limit is 131.

with 131:
162000 - 161990, pll dividers - fb: 1425.4 ref: 21, post 6
with 132:
162000 - 162000, pll dividers - fb: 1493.3 ref: 22, post 6

avivo_reduce_ratio does not change these values.

(In reply to Clemens Ladisch from comment #5)
> It's an Eizo S2100, but this should not matter because the clocks seen by the
> monitor are always about the same (162MHz/75kHz/60Hz). If some were out of
> range, the monitor would show an error message, but with the PLL problem, the
> monitor does not appear to detect even an out-of-range signal. I'd guess the
> PLL itself cannot handle the parameters.

The PLL should be able to handle this quite fine.
It's just that when you increase the reference and post divider you can match the wanted frequency better, at the cost of increased jitter and worse general signal stability.

I have one monitor here that practically works with everything I give it, another one can't handle it when the frequency doesn't precisely match, and a third one doesn't like it when we have high jitter in the signal.

The trick is to find the right sweet spot where you can make everybody happy.

> The largest working ref_div_max limit is 131.

Thanks a lot, going to use 128 then (just because it's a nice round number) until somebody else starts to complain that his monitor doesn't like the signal.

Christian.

Hi

I'm still randomly having the frequency out of range problem with kernel 3.15-rc5: 2 times today when booting, once yesterday after suspend/resume. My screen is a Belinea 2080S2 1600x1200. I first reported this as bug 75471, where the dmesg logs are.

Nikola

Does this patch help:
http://lists.freedesktop.org/archives/dri-devel/2014-May/059469.html

(In reply to Alex Deucher from comment #8)
> Does this patch help:
> http://lists.freedesktop.org/archives/dri-devel/2014-May/059469.html

Unlikely. Tasev has an RS780, on those the feedback divider is usually in the ~1000 area. This patch only moves the feedback divider limit for .5 from 14 down to 13.

Does it help if you suspend/resume again after this issue? Might be that we are seeing a crash somewhere else?

I'm compiling a patched kernel now. I will test it, but to be sure I will probably need 4-5 days, because I have been using 3.15-rc5 since Sunday and the problem appeared only now.

Hi

I just noticed that 3.15-rc5 boots successfully only once in 5-6 attempts. When I tried it the first time on Sunday I was just lucky that it booted the first time. I suspend/resume this computer rather than shutting it down, and I did not shut down the computer until yesterday, after the failure on suspend/resume.
Now, with or without the patch, it boots without the frequency out of range problem only once in 5-6 attempts. Attached are the dmesg logs when working and not working, without the patch.

Created attachment 136241 [details]
dmesg working 3.15-rc5
Created attachment 136251 [details]
dmesg broken 3.15-rc5
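The divider triples quoted in the comments above can be cross-checked against the second number on each `radeon_compute_pll_avivo` line. The following is a minimal userspace sketch, not the driver code: the helper name `dot_clock_khz` is hypothetical, and the reference frequency of 1432 (in 10 kHz units, about 14.32 MHz) is an assumption inferred from the logged numbers; it is not stated anywhere in the thread.

```c
#include <assert.h>

/*
 * Recompute the dot clock in kHz from the dividers printed on a
 * "pll dividers" debug line.
 *
 * fb_div_tenths is the feedback divider times ten, so "fb: 271.5"
 * becomes 2715. ref_freq_10khz is the PLL reference frequency in
 * units of 10 kHz (ASSUMPTION: 1432 for the boards in these comments).
 */
unsigned dot_clock_khz(unsigned ref_freq_10khz, unsigned fb_div_tenths,
                       unsigned ref_div, unsigned post_div)
{
    assert(ref_div > 0 && post_div > 0);

    /* Integer division truncates to 10 kHz steps before scaling back to kHz. */
    return (ref_freq_10khz * fb_div_tenths) / (ref_div * post_div * 10) * 10;
}
```

Under that assumption, `dot_clock_khz(1432, 2715, 4, 6)` and `dot_clock_khz(1432, 14254, 21, 6)` both give 161990, and `dot_clock_khz(1432, 14933, 22, 6)` gives 162000, matching the logged values. This agrees with the observation that very different divider sets produce almost the same output clock, so the clock value alone does not explain which set a given monitor accepts.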
Hi

I tried today with a Medion 1280x1024 monitor and everything works without problems. It seems that only the combination RS880 + Belinea 2080S2 has a problem with the new PLL code.

I tried different values from 128 down to 90 for ref_div_max, but none work with my Belinea 1600x1200 screen.

(In reply to Tasev Nikola from comment #14)
> I tried different values from 128 down to 90 for ref_div_max, but none work
> with my Belinea 1600x1200 screen.

Try going down to at least 32, this would match the behaviour on 3.14.

The problem is that in both the working and broken case the calculated parameters are the same.

Broken:
[ 23.511041] [drm:radeon_compute_pll_avivo] 162000 - 161990, pll dividers - fb: 1425.4 ref: 21, post 6

Working:
[ 23.560826] [drm:radeon_compute_pll_avivo] 162000 - 161990, pll dividers - fb: 1425.4 ref: 21, post 6

So I'm not really sure what else could go wrong here.

Hi

I tried 64, 48 and 32 for ref_div_max. The only one working at boot is 32, but after the first suspend/resume the frequency out of range problem appears again. I tried a second suspend/resume with the same result.
I also tried the patch from comment 8 with the same result: boot succeeds and it fails after resume.
And you're right, the calculated parameters are again the same in both the working and the broken case.
The dmesg logs after boot and after suspend/resume are attached.

Created attachment 136831 [details]
dmesg after boot working with max divider 32
Created attachment 136841 [details]
dmesg after suspend resume broken
From the logs you are always getting the same set of parameters, even when you change the maximum used in the fix:

[drm:radeon_compute_pll_avivo] 162000 - 161990, pll dividers - fb: 1425.4 ref: 21, post 6

With a maximum of 32 and a post divider of 6 the ref divider shouldn't be more than 5, but it still stays at 21.

This means there is something wrong with the way you install the kernel module (or the modification you make). Please double check that you got the right kernel module loaded.

You're right again. It seems that just building the module doesn't work for me.
I built a new kernel from sources with ref_div_max 124 and it seems to work for now.

[drm:radeon_compute_pll_avivo] 162000 - 161990, pll dividers - fb: 1346.2 ref: 17, post 7

I rebooted 3 times and it always booted fine. I will test it for some days and report whether everything works fine.
Sorry for my previous post.

Hi

With ref_div_max 124 everything works fine.
If I should try another value, just let me know.

(In reply to Tasev Nikola from comment #21)
> Hi
>
> With ref_div_max 124 everything works fine.
> If I should try another value, just let me know.

I'm going to submit a patch with value 114, just to have some more room for errors.

I know that values below 100 cause problems for another user, so when 114 works for you we probably found the sweet spot.

With ref_div_max 114 everything works fine for me.

The new
ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
works fine with my Belinea 1600x1200 screen.

Unfortunately, I had to set this down to 32 to work on my system.

Radeon HD 3200 (onboard, RS780)
Monitor Viewsonic G225f
Kernel 3.16-rc3

Nonworking:
[drm:radeon_compute_pll_avivo] 229500 - 229500, pll dividers - fb: 1602.7 ref: 25, post 4

Working:
[drm:radeon_compute_pll_avivo] 229500 - 229500, pll dividers - fb: 240.4 ref: 3, post 5

CRTs are getting increasingly rare - perhaps a tunable for this so us fogies with 100 pound monitors can set it where it works on our system?
For me, it's a trivial patch to carry forward but setting something like drm.ref_div_tweak=32 in my grub config would be easier.

I haven't been able to use a kernel since commit 3216701 drm/radeon: rework finding display PLL numbers v2.

Created attachment 142051 [details]
Possible fix v2.
Does this patch fix the issue for you?
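The limit discussed in the comments above caps the reference divider as a function of the post divider. As a hedged illustration, the one-line formula quoted in the thread can be wrapped in a small standalone function; the name `clamp_ref_div_max` and the `limit` parameter are hypothetical, and the hardware maximum of 1023 used in the examples is only a placeholder assumption.

```c
#include <assert.h>

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/*
 * Userspace version of the expression quoted in the thread:
 *     ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
 * with the constant (32, 100, 114, 128, ...) turned into a parameter.
 * hw_ref_div_max stands for the limit the hardware itself imposes.
 */
unsigned clamp_ref_div_max(unsigned limit, unsigned post_div,
                           unsigned hw_ref_div_max)
{
    assert(post_div > 0);
    return max_u(min_u(limit / post_div, hw_ref_div_max), 1u);
}
```

With a post divider of 6, a limit of 32 yields at most 5, which is the bound mentioned for the mis-installed module, while limits of 131 and 132 yield 21 and 22, the reference dividers seen in the earlier logs. A limit of 124 with a post divider of 7 yields 17.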
No, I reverted to a clean 3.16-rc3 (changed the 32 back to 100) and applied the patch:

[drm:radeon_compute_pll_avivo] 229500 - 229500, pll dividers - fb: 1602.7 ref: 20, post 5

fb: is the same, ref and post are different. Same results as without the patch - the monitor wakes up out of sleep, but doesn't display anything. I can't get the OSD to display, so I don't know what it thinks the sync rates are.

Created attachment 142281 [details]
Possible fix v3.
How about this one? Does it fix the issue as well?
(In reply to Christian König from comment #28)
> Created attachment 142281 [details]
> Possible fix v3.
>
> How about this one? Does it fix the issue as well?

Sorry for the long delay in getting back to you.

3.16 stock does not work on my monitor, this patch (alone) fixes it.

I don't have a scope at my house, but at the office when this happens all signal lines on the VGA are idle.

Your latest change broke it for me, sorry for the delay in noticing, that combination of machine & monitor was stuck in the dark ages for a while...

The combo is Radeon R9 290 (from Sapphire) and good old Apple Cinema Display 23" (1920x1200x60 fixed resolution display) on DVI.

I get a black screen with radeon. It works with Alex's amdgpu.

The one liner that fixes it is in the PLL calculation:

-ref_div_max = max(min(100 / post_div, ref_div_max), 1u);
+ref_div_max = max(min(128 / post_div, ref_div_max), 1u);

I noticed other differences though, the max fb div is 2047 with radeon and 4095 with amdgpu but the above is the key.

This is a trace of amdgpu calculation (which works) after I sprinkled printk's around:

[ 3.471131] fb_div_min/max=4/4095 pll_flags=400
[ 3.471132] by 10 ! fb_div_min/max=40/40950
[ 3.471133] ref_div_min=2 (from 0/2)
[ 3.471133] ref_div_max=1023 (from 0/1023)
[ 3.471134] vco_min/max=600000/1200000
[ 3.471134] post_div_min/max=4/7
[ 3.471135] initial nom=153970, den=2700
[ 3.471136] reduced nom=15397, den=270
[ 3.471136] - trying post_div 4, ref_div_max=32
[ 3.471137] tentative ref_div=32m, fb_div=7299
[ 3.471137] adjusted ref_div=32m, fb_div=7299
[ 3.471138] diff=7, diff_best=-1
[ 3.471138] - trying post_div 5, ref_div_max=25
[ 3.471139] tentative ref_div=25m, fb_div=7128
[ 3.471139] adjusted ref_div=25m, fb_div=7128
[ 3.471139] diff=6, diff_best=7
[ 3.471140] - trying post_div 6, ref_div_max=21
[ 3.471140] tentative ref_div=21m, fb_div=7185
[ 3.471141] adjusted ref_div=21m, fb_div=7185
[ 3.471141] diff=6, diff_best=6
[ 3.471141] - trying post_div 7, ref_div_max=18
[ 3.471142] tentative ref_div=18m, fb_div=7185
[ 3.471142] adjusted ref_div=18m, fb_div=7185
[ 3.471150] diff=6, diff_best=6
[ 3.471150] post_div_best=7
[ 3.471151] - trying post_div 7, ref_div_max=18
[ 3.471151] tentative ref_div=18m, fb_div=7185
[ 3.471152] adjusted ref_div=18m, fb_div=7185
[ 3.471153] [drm:amdgpu_pll_compute] 153970 - 153960, pll dividers - fb: 239.5 ref: 6, post 7

Now this is with radeon
*NOTE: I have bumped the max fb div to the same as AMD GPU when taking that trace but that had no effect:

[ 4.718126] fb_div_min/max=4/4095 pll_flags=410
[ 4.718126] by 10 ! fb_div_min/max=40/40950
[ 4.718127] ref_div_min=2 (from 0/2)
[ 4.718128] ref_div_max=1023 (from 0/1023)
[ 4.718128] vco_min/max=600000/1200000
[ 4.718129] post_div_min/max=4/7
[ 4.718129] initial nom=153970, den=2700
[ 4.718130] reduced nom=15397, den=270
[ 4.718130] - trying post_div 4, ref_div_max=25
[ 4.718131] tentative ref_div=25m, fb_div=5703
[ 4.718131] adjusted ref_div=25m, fb_div=5703
[ 4.718132] diff=11, diff_best=-1
[ 4.718133] - trying post_div 5, ref_div_max=20
[ 4.718133] tentative ref_div=20m, fb_div=5703
[ 4.718133] adjusted ref_div=20m, fb_div=5703
[ 4.718134] diff=11, diff_best=11
[ 4.718134] - trying post_div 6, ref_div_max=16
[ 4.718135] tentative ref_div=16m, fb_div=5474
[ 4.718135] adjusted ref_div=16m, fb_div=5474
[ 4.718136] diff=14, diff_best=11
[ 4.718136] - trying post_div 7, ref_div_max=14
[ 4.718136] tentative ref_div=14m, fb_div=5589
[ 4.718137] adjusted ref_div=14m, fb_div=5589
[ 4.718137] diff=12, diff_best=11
[ 4.718138] post_div_best=5
[ 4.718138] - trying post_div 5, ref_div_max=20
[ 4.718139] tentative ref_div=20m, fb_div=5703
[ 4.718139] adjusted ref_div=20m, fb_div=5703
[ 4.718141] [drm:radeon_compute_pll_avivo] 153970 - 153980, pll dividers - fb: 570.3 ref: 20, post 5

The modeline is:

Modeline 55:"1920x1200" 60 153970 1920 1968 2000 2080 1200 1203 1209 1235 0x48 0x9

And is consistent between the 2 drivers.

Note: It's an LCD :-) It's one of those fixed-mode panels Apple has always been fond of, one of the very first 1920x1200 out there.

Note 2: Catalyst and the Windows driver both work fine. Any way to know what formula these 2 use (I assume it's the same code) ?

(In reply to Benjamin Herrenschmidt from comment #30)
> The combo is Radeon R9 290 (from Sapphire) and good old Apple Cinema Display
> 23" (1920x1200x60 fixed resolution display) on DVI.

Well this bug report is about nearly ten year old hardware and was fixed almost two years ago (we just forgot to close it).

So please open a separate bug report, preferably in the FDO bugzilla.

Well, the Apple Cinema Display is nearly 10 years old too :-) But at least it's an LCD...

I will open a new bug on FDO.
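The two traces from comment #30 differ only in the constant 100 versus 128, and the search they show can be replayed outside the kernel. What follows is a simplified sketch reconstructed from the printed values, not the driver source: `search_dividers`, `struct pll_dividers` and the `limit` parameter are hypothetical names, the divider ranges are copied from the trace, the tie-break in favour of the larger post divider is inferred from the trace output, and the clamps for the feedback divider maximum and the divider minimums are omitted because neither is hit here.

```c
#include <assert.h>

static unsigned min_u(unsigned a, unsigned b) { return a < b ? a : b; }
static unsigned max_u(unsigned a, unsigned b) { return a > b ? a : b; }

/* Integer division rounded to the nearest whole number. */
static unsigned div_round_closest(unsigned n, unsigned d) { return (n + d / 2) / d; }

static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* fb_div is in tenths, so 2395 corresponds to "fb: 239.5" on the log line. */
struct pll_dividers {
    unsigned fb_div;
    unsigned ref_div;
    unsigned post_div;
};

struct pll_dividers search_dividers(unsigned target_khz, unsigned ref_freq_10khz,
                                    unsigned limit)
{
    /* Ranges as printed in the trace: post_div 4..7, ref_div_max 1023. */
    const unsigned post_div_min = 4, post_div_max = 7, hw_ref_div_max = 1023;
    struct pll_dividers best = { 0, 1, post_div_min };
    unsigned best_diff = ~0u;
    unsigned g, nom, den, post_div;

    assert(target_khz > 0 && ref_freq_10khz > 0 && limit > 0);

    /* "initial nom/den" to "reduced nom/den" in the trace. */
    g = gcd_u(target_khz, ref_freq_10khz);
    nom = target_khz / g;
    den = ref_freq_10khz / g;

    for (post_div = post_div_min; post_div <= post_div_max; ++post_div) {
        unsigned ref_div_max = max_u(min_u(limit / post_div, hw_ref_div_max), 1u);
        unsigned ref_div = min_u(max_u(div_round_closest(den, post_div), 1u),
                                 ref_div_max);
        unsigned fb_div = div_round_closest(nom * ref_div * post_div, den);
        unsigned clock = ref_freq_10khz * fb_div / (ref_div * post_div);
        unsigned diff = clock > target_khz ? clock - target_khz
                                           : target_khz - clock;

        /* "<=" lets a later (larger) post divider win ties, as the trace shows. */
        if (diff <= best_diff) {
            best.fb_div = fb_div;
            best.ref_div = ref_div;
            best.post_div = post_div;
            best_diff = diff;
        }
    }

    /* Final reduction of fb_div/ref_div to the simplest ratio. */
    g = gcd_u(best.fb_div, best.ref_div);
    best.fb_div /= g;
    best.ref_div /= g;
    return best;
}
```

For the modeline clock of 153970 kHz and the reference value 2700 from the trace, a limit of 128 gives fb 239.5, ref 6, post 7, and a limit of 100 gives fb 570.3, ref 20, post 5, the same triples as the `amdgpu_pll_compute` and `radeon_compute_pll_avivo` lines. The sketch only reproduces the arithmetic; it says nothing about why the display rejects the second set.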