Bug 100871
Summary: | radeon fails to initialize one DisplayPort monitor | ||
---|---|---|---|
Product: | Drivers | Reporter: | Charles R. Anderson (cra) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | NEW --- | ||
Severity: | normal | CC: | alexdeucher, reg, szg00000, tiwai, vedran |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugzilla.redhat.com/show_bug.cgi?id=1232562 | ||
Kernel Version: | v3.19-7478-g796e1c55717e | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
lspci-nn.txt
Xorg journal messages from bad kernel
Xorg journal messages from good 3.19.8-200.fc21 kernel
kernel journal messages from bad kernel
kernel journal messages from good 3.19.8-200.fc21 kernel
dmsg - All 6 screens good kernel-3.16.7-35-default
Xorg.0.log - All 6 screens good kernel-3.16.7-35-default
Logs to compare all screens good on boot to some bad on boot |
Description
Charles R. Anderson
2015-07-03 16:00:17 UTC
Does it work correctly if you set radeon.audio=0 on the kernel command line in grub? Also does the problematic monitor support audio? Please attach your xorg log and dmesg output.
I'll try radeon.audio=0. I never tried DisplayPort audio on the Dell U2410, but it apparently does support it: http://en.community.dell.com/support-forums/peripherals/f/3529/t/19311234
This problem happens on the text VC even before X is started. You can find the logs in the Red Hat bugzilla linked above.
radeon.audio=0 indeed works around the problem.
(In reply to Charles R. Anderson from comment #3)
> radeon.audio=0 indeed works around the problem.
This problem still exists in 4.2.3 as released in Fedora 22 (kernel-4.2.3-200.fc22). I still need the workaround.
Created attachment 190671 [details]
lspci-nn.txt
Created attachment 190681 [details]
Xorg journal messages from bad kernel
Created attachment 190691 [details]
Xorg journal messages from good 3.19.8-200.fc21 kernel
Created attachment 190701 [details]
kernel journal messages from bad kernel
Created attachment 190711 [details]
kernel journal messages from good 3.19.8-200.fc21 kernel
Problem still exists in Linux 4.6.4 in Fedora 24 (kernel-4.6.4-301.fc24.x86_64).
I think I have the same problem, but I have a more complicated setup, and because of that I have been able to identify more symptoms which may help. Here's everything I have been able to determine, but first the hardware setup:
My graphics card is an "HD 5870 Eyefinity 6", which has 6 DisplayPorts. I have them set up in a grid of 3 across by 2 down. Each display is at a resolution of 2560x1440, creating a total work area of 7680x2880 in a Xinerama setup running on the KDE4 desktop.
I currently have 3 kernels in my grub list:
kernel-3.16.7
kernel-4.7.0
kernel-4.7.2
Of these, 3.16.7 came with openSUSE 13.2 and the other two arrived when I switched over to Tumbleweed, SUSE's rolling distribution. With kernel 3.16.7 I had no problem: all DisplayPorts turned on as they should, all the time. When I changed over to Tumbleweed it still worked fine. However, the other two kernels would only turn on the first two displays. That happens during boot, long before Xorg gets loaded.
In Xorg the behavior is a little strange when it inherits DisplayPorts that the kernel left off. Xorg will acknowledge all 6 displays, but it is not able to turn on any that were off while the kernel was handling them, e.g. the last 4 monitors in the case of the 4.x kernels. The upshot is that when I go to the multi-display setup part of KDE, all 6 displays show as active even though only the first two are actually turned on. If I disable and re-enable the displays that are off, they don't turn on. If I use xrandr to turn them on, no dice. That is, if they are off when the kernel was handling them, they are off for good; nothing in Xorg or KDE that I have found can change it.
There were a bunch of updates for Tumbleweed a few days ago. With this update, kernel 4.7.2 was added and 3.16.7 started to not always boot with all the displays on. In fact, it was consistently leaving out displays 0 and 5 (first and last on the graphics card). However, after much playing with the "radeon." kernel boot parameters, I found that setting radeon.agpmode=-1 seemed to make it consistently leave only 1 monitor off, monitor 0. No other "radeon." setting seemed to help. However, on several boots I could get variations... I must have rebooted 50+ times last night. Occasionally I would get only 4 of the 6 on, and even more occasionally I would get all 6 on like it should be.
Trying all the "radeon." settings seemed to have no effect on the 4.x kernels, and they still only booted with 2 of the 6 displays on... as if someone hard coded a 2 output limit in the kernel code for testing and forgot to remove the test code.
I also found two other curious symptoms on the 3.16.7 kernel:
- If, while booting, I turn a monitor that the kernel left off off and back on again, I can sometimes get the kernel to recognise that display and leave it on during boot. If it gets to the GUI before I can turn the monitor off and on, then it's too late. Again, this is very iffy: sometimes it works and sometimes it doesn't.
- Since the latest updates, if I let KDE turn off all the monitors, say I walk away for a while so that power saving kicks in, then all the monitors that were on will come back on. However, if I leave it too long, like overnight, then some displays may not come back on, and once they are off when they should be on again, there is no turning them back on without clearing the KDE cache and rebooting before that cache gets refreshed. This usually means logging out of my profile, logging in as root, clearing my profile's KDE/plasma cache, rebooting, making sure I get a boot where the kernel turns on all the displays, and then logging in to my profile again... not exactly a workable long-term arrangement.
I have attached my dmesg with the 3.16.7 kernel working correctly (I just got very lucky, so I preserved the logs). Tomorrow I can get you the logs of 3.16.7 not coming up correctly and of the other two kernels coming up with only 2 displays on out of the 6 there should be.
That's all that I have figured out so far. As you can guess, with my setup it's rather important that I get this fixed, or I'll have to revert to an older release, which I don't want to do for several reasons. The upshot: I am at your disposal to figure this out, just tell me what you need me to do.
Created attachment 231661 [details]
dmsg - All 6 screens good kernel-3.16.7-35-default
Created attachment 231671 [details]
Xorg.0.log - All 6 screens good kernel-3.16.7-35-default
Created attachment 232241 [details]
Logs to compare all screens good on boot to some bad on boot
First I have to take one thing back: radeon.audio=0 definitely makes a difference, and I am no longer sure that radeon.agpmode=-1 helps. That said, things still go wrong. The biggest issue here seems to be a lack of reproducibility, which makes the problem almost impossible to track down, so I went to the trouble of writing a script to gather information.
Using the files in the tar archive, I found that all you have to do to see what's different between a good and a bad boot is run a diff on the following files:
./Logs/timing-stripped/filtered-drm/
screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
screens-0-5-good_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
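The comparison described above could be run along these lines. This is only a minimal sketch: the `compare_boots` function name is made up for illustration and is not part of the attached archive.

```shell
#!/bin/sh
# compare_boots FIRST SECOND   (hypothetical helper, not from the archive)
# Prints a unified diff of two filtered log files.
# Exit status: 0 = identical, 1 = files differ, 2 = a file could not be read.
compare_boots() {
    first=${1:-}
    second=${2:-}
    for f in "$first" "$second"; do
        if [ ! -r "$f" ]; then
            echo "cannot read: $f" >&2
            return 2
        fi
    done
    # diff itself returns 0 when the files match and 1 when they differ
    diff -u "$first" "$second"
}
```

Called with the good-boot file first and the bad-boot file second, lines prefixed with `-` appear only in the good boot and lines prefixed with `+` only in the bad one.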
Anybody who wants to gather comprehensive information for the developers can take the file ./gather-info-for-diagnostics.sh from the tar archive and modify it as needed for their own system.
The rest of this comment explains in detail what's in the compressed tar file.
Directory structure
===================
.
└── logs
    ├── filtered-drm
    └── timing-stripped
        └── filtered-drm
This structure is as follows:
.
=
Contains the script that creates the log files and the script that turns on any screens that are off during boot (more on this one later).
./Logs
======
The raw log files the script gathered which include:
dmsg.txt - from dmesg
proc-cmdline.txt - from /proc/cmdline
module-kernel-parameters.txt - from /module/kernel/parameters/*
module-processor-parameters.txt - from /module/processor/parameters/*
sys-module-radeon-parameters.txt - from /module/radeon/parameters/*
Xorg.0.log.txt - from /var/log/Xorg.0.log
./Logs/filtered-drm
===================
Some of the above raw Log files with lines that do not contain radeon information removed. Makes it easier to see what's relevant. If you want to know exactly how the lines were filtered you can look at the script ./gather-info-for-diagnostics.sh.
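For readers without the archive, a filter of this kind might look like the sketch below. The pattern `radeon|drm` is an assumption based on the directory name; the real filter is whatever ./gather-info-for-diagnostics.sh does, which is not reproduced here.

```shell
#!/bin/sh
# filter_radeon   (hypothetical helper; the pattern is an assumption)
# Reads log lines on stdin and keeps only those mentioning radeon or drm.
filter_radeon() {
    # grep exits with 1 when nothing matches; treat that as success
    grep -i -E 'radeon|drm' || true
}
```

It would be used as a pipe stage, for example `dmesg | filter_radeon > dmsg-filtered.txt`, where the output file name is only an example.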
./Logs/timing-stripped
======================
The above raw Log files with the timing at the beginning of each line removed. This makes using diff programs easier. If you want to know exactly how this was done you can look at the script ./gather-info-for-diagnostics.sh.
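As an illustration, the timing prefix can be removed with a single sed expression. This sketch assumes the usual dmesg prefix of the form `[    1.234567] `; the attached script may do it differently.

```shell
#!/bin/sh
# strip_timing   (hypothetical helper, not from the archive)
# Reads log lines on stdin and removes a leading "[seconds.micros] " prefix.
# Lines without such a prefix pass through unchanged.
strip_timing() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] ?//'
}
```

With the timestamps gone, two boots that log the same messages at slightly different times no longer show up as differing on every line.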
./Logs/timing-stripped/filtered-drm
===================================
Some of the above raw Log files with the timing at the beginning of each line removed and lines that do not contain radeon information removed. Again, makes it easier to see what's relevant. If you want to know exactly how this was done you can look at the script ./gather-info-for-diagnostics.sh.
Scripts
=======
./gather-info-for-diagnostics.sh
--------------------------------
Does all the heavy lifting in gathering the info.
./display-on.sh
---------------
This was a curious discovery and may make fixing the issue easier. When the script looked like this:
xrandr --output DisplayPort-${1} --mode 1920x1080
xrandr --output DisplayPort-${1} --mode 2560x1440
I found that sometimes it would turn the display on, but other times it would turn it off. To consistently turn the display on, I had to change it to this:
xrandr --output DisplayPort-${1} --mode 1920x1080
sleep
xrandr --output DisplayPort-${1} --mode 2560x1440
suggesting there might be a timing problem that needs to be addressed.
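The same sequence can be written as a function with an explicit, adjustable delay, which makes it easier to experiment with the timing. This is a sketch only: the function name, the `XRANDR_CMD` variable and the argument check are additions for illustration, and the 5-second default follows the correction posted later in this thread.

```shell
#!/bin/sh
# XRANDR_CMD is a hypothetical override so the commands can be dry-run
# (for example with XRANDR_CMD=echo) without touching the displays.
XRANDR_CMD=${XRANDR_CMD:-xrandr}

# display_on INDEX [DELAY_SECONDS]
# Switches DisplayPort-INDEX to 1920x1080, waits, then switches to 2560x1440.
display_on() {
    index=${1:-}
    delay=${2:-5}
    if [ -z "$index" ]; then
        echo "usage: display_on INDEX [DELAY_SECONDS]" >&2
        return 2
    fi
    "$XRANDR_CMD" --output "DisplayPort-${index}" --mode 1920x1080 || return 1
    # The pause between the two mode sets is what made the difference above
    sleep "$delay"
    "$XRANDR_CMD" --output "DisplayPort-${index}" --mode 2560x1440
}
```

Running `display_on 5` would target the sixth output; passing a second argument such as `display_on 5 1` allows testing how short the delay can get before the behaviour becomes unreliable again.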
File Names
==========
File names take the form of:
<what happened to the screens at boot>_<partial command line when booting the kernel>_<the file name>.txt
E.g. The file:
screens-0-4-good-5-bad_kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects_dmsg.txt
can be broken down to:
screens-0-4-good-5-bad = The first 5 of the 6 screens came on as they should during boot but the 6th one (number 5) did not.
kernel-4.7.2-1-default_logo.nologo-radeon.audio=0-debug-debug_objects
= shows most of the boot command line
dmsg = A key indicating the file contents, from dmesg in this case
.txt = That it is a text file
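The breakdown above can be done mechanically with shell parameter expansion. The sketch below uses a made-up function name and assumes the first field contains no underscore, so names with a prefix such as screens-0-5-good-after-5-fixed-with_display-on.sh would need extra handling.

```shell
#!/bin/sh
# parse_log_name FILENAME   (hypothetical helper, not from the archive)
# Prints three lines: screen status, partial boot command line, content key.
parse_log_name() {
    name=${1%.txt}               # drop the .txt extension
    status=${name%%_*}           # everything before the first underscore
    key=${name##*_}              # everything after the last underscore
    cmdline=${name#"$status"_}   # remove the leading "status_"
    cmdline=${cmdline%_"$key"}   # remove the trailing "_key"
    printf '%s\n%s\n%s\n' "$status" "$cmdline" "$key"
}
```

The middle field is taken as whatever lies between the first and the last underscore, because the boot command line part itself contains underscores (debug_objects).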
If the file starts off with something like this: screens-0-5-good-after-5-fixed-with_display-on.sh it means after booting and logging in I ran the script ./display-on.sh to turn on the display and then gathered all the log information. I will have gathered the log information prior to running the script as well so you will also see files prefixed with just screens-0-5-good in such a case.
Hm, I can't edit that last message. Here are a couple of corrections:
Where I showed:
xrandr --output DisplayPort-${1} --mode 1920x1080
sleep
xrandr --output DisplayPort-${1} --mode 2560x1440
It should have been:
xrandr --output DisplayPort-${1} --mode 1920x1080
sleep 5
xrandr --output DisplayPort-${1} --mode 2560x1440
Where I showed: from /module/...
It should have been: from /sys/module/...
Update: Regarding c14 (comment 14) about ./display-on.sh: even though running this script can turn on a display that was erroneously off during boot, the display turns itself back off after a few seconds or so, so it's not a usable workaround. I guess there is some status flag set in the kernel during boot that ultimately can't be changed or overridden and that eventually reasserts itself.
Still a problem on Fedora 36 / Linux kernel 5.18.16-200.fc36.x86_64. I'm now using two newer monitors (DELL U3219Q) connected via DP instead of the previous four monitors (2 DP, 2 DVI), and neither monitor turns on unless radeon.audio=0 is passed. Maybe there needs to be a quirk added to the driver to keep audio turned off for this card?
Advanced Micro Devices, Inc. [AMD/ATI] Cedar GL [FirePro 2460] (prog-if 00 [VGA controller])
I believe the PCI ID is 1002:68f1:
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Cedar GL [FirePro 2460] [1002:68f1] (prog-if 00 [VGA controller]) |
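When comparing reports, it helps to confirm that the workaround was actually active for a given boot. The sketch below checks a kernel command line for radeon.audio=0; the function name is made up for illustration, and it reads /proc/cmdline when no argument is given.

```shell
#!/bin/sh
# radeon_audio_disabled [CMDLINE]   (hypothetical helper)
# Succeeds (exit 0) if the kernel command line contains radeon.audio=0.
# Without an argument, the running kernel's /proc/cmdline is inspected.
radeon_audio_disabled() {
    cmdline=${1:-$(cat /proc/cmdline)}
    # Pad with spaces so only a whole-word match counts
    case " $cmdline " in
        *" radeon.audio=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A call such as `radeon_audio_disabled && echo "audio workaround active"` reports whether the current boot used the parameter.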