Bug 26812

Summary: [RADEON:KMS:BARTS:HD6850:FIRMWARE] radeon module causes hard reset on modprobe
Product: Drivers
Reporter: Michael Evans (mjevans1983)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED OBSOLETE
Severity: normal
CC: alan, alexdeucher, glisse
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.37-git9
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
* lspci, dmidecode, config and netconsole logs
* Don't load if MC ucode is not available

Description Michael Evans 2011-01-15 21:40:09 UTC
I'm using the git snapshots from between when Linus merged drm-core-next and when the network drivers were merged (which breaks the build).

If there is a better patch relative to a released source I'll gladly test it too; however the other sources I tried also had similar issues.

When radeon.ko loads (or the built-in driver initializes), the screens go blank and a few seconds later the system completely reboots.  This is 100% repeatable.

I have firmware from Arch's (AUR) linux-firmware-git package installed, with /lib/firmware set as the firmware source directory, and it includes the relevant blobs for the recent ATI/AMD GPUs.

When built as a module I can modprobe radeon to trigger the exact same behavior.

Problems debugging:
* It resets so hard there isn't a peep from netconsole.
* It resets so hard that my disks lose md raid sync and need to completely rebuild (yeah, I know, I'll add a write-intent log before I test this again; it wasn't an issue before I got the brainwave of making it a module to try to give netconsole/syslog time to get set up).

How can I possibly approach isolating this?  Even a link to the correct documentation would be helpful.
Comment 1 Alex Deucher 2011-01-15 22:11:00 UTC
What radeon chip do you have?  If it's an r6xx, r7xx, or evergreen asic, this is possibly a duplicate of this bug:
https://bugs.freedesktop.org/show_bug.cgi?id=33027
If so, does the patch there help?
Comment 2 Michael Evans 2011-01-16 01:41:59 UTC
It's a 6850 with two DVI outputs in use.  I think that's a "Northern Islands" chip?

The system is an Athlon II x4, 00:00.0 Host bridge: Advanced Micro Devices [AMD] RS880 Host Bridge ( M4A785TD-V EVO vendor: ASUSTeK Computer INC. )

I'm going to try booting a copy of drm-next's tree that finished compiling while I was running errands; it should have everything possible already.

drm-2.6-drm-next-56bec7c

-

Nope, that totally failed too.  I'm going to collect some system information and what logs I can.
Comment 3 Michael Evans 2011-01-16 02:09:58 UTC
Created attachment 43702
lspci, dmidecode, config and netconsole logs
Comment 4 Michael Evans 2011-01-16 05:57:33 UTC
I noticed something while checking the files in /lib/firmware/radeon to make sure I had /all/ the correct ones in place.  The BARTS* files weren't in my config (I copy/pasted an example for that section without paying it much attention, and wouldn't have remembered which chip codename I had anyway).  Adding them allows at /least/ drm-next to work correctly.
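
A check like the one described above can be scripted. The following is only a minimal sketch in Python, not anything from the kernel tree: the function name `missing_firmware` is hypothetical, and it assumes the firmware names are given relative to the firmware directory (for example as printed by `modinfo -F firmware radeon`, assuming the module declares its blobs).

```python
import sys
from pathlib import Path


def missing_firmware(required, firmware_dir="/lib/firmware"):
    """Return the names from `required` that are not regular files under firmware_dir."""
    root = Path(firmware_dir)
    # Names are expected to be relative paths such as "radeon/<blob>.bin".
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # Firmware names are taken from the command line (hypothetical usage).
    for name in missing_firmware(sys.argv[1:]):
        print(f"missing: {name}")
```

Saved under a hypothetical name such as check_fw.py, it could be run as `python3 check_fw.py $(modinfo -F firmware radeon)` to list any blob the module asks for that is not installed.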

There are two outstanding issues I noticed however.
1) When the firmware isn't found the module very un-gracefully reboots without //any// explanation (no information provided to lead to enlightenment).  Correct behavior should be to fail gracefully and provide a clean message in the kernel log without taking the system fully offline.

2) The EXTRA firmware line has a size limit.  I didn't look at what it was exactly, but it can't contain the expansion of cd /lib/firmware; ls radeon/* ; a preferable solution might be to allow the use of shell wildcards (or document that they are evaluated).
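
As a workaround for the size limit in 2), the list can be narrowed to only the blobs for the installed chip. This is a rough sketch in Python under some assumptions: the helper name `extra_firmware_value` is made up, the blobs are assumed to live under /lib/firmware/radeon, and the "EXTRA firmware line" is assumed to be the CONFIG_EXTRA_FIRMWARE option (paired with CONFIG_EXTRA_FIRMWARE_DIR).

```python
from pathlib import Path


def extra_firmware_value(firmware_dir, pattern):
    """Return a space-separated list of matching firmware paths, relative to firmware_dir."""
    root = Path(firmware_dir)
    # Sort so the generated value is stable between runs.
    matches = sorted(p for p in root.glob(pattern) if p.is_file())
    return " ".join(p.relative_to(root).as_posix() for p in matches)


if __name__ == "__main__":
    # Assumed layout and pattern: only the BARTS* blobs under /lib/firmware/radeon.
    value = extra_firmware_value("/lib/firmware", "radeon/BARTS*")
    print(f'CONFIG_EXTRA_FIRMWARE="{value}"')
    print('CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"')
```

The pattern may need widening if the chip also needs blobs that do not share the BARTS prefix; the output is meant to be pasted into the kernel config by hand.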


I am --not-- marking this as resolved since the first issue is still a major problem (it makes this error non-trivial to diagnose for no good reason).
Comment 5 Alex Deucher 2011-02-02 17:52:27 UTC
Created attachment 46072
Don't load if MC ucode is not available

The default boot state only allows limited operation.
Comment 6 Jérôme Glisse 2011-03-08 18:06:22 UTC
Michael, please confirm that Alex's patch fixed the issue (it should be upstream already).