Bug 40952

Summary: R100 firmware no longer loads
Product: Drivers Reporter: James Cloos (cloos)
Component: Video(DRI - non Intel)Assignee: drivers_video-dri
Status: CLOSED CODE_FIX    
Severity: normal CC: alexdeucher, florian, maciej.rutecki, riesebie, rjw, torvalds
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.1.0-rc1-00035-ge6a99d3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 40982    

Description James Cloos 2011-08-11 17:07:27 UTC
With kernel 3.1.0-rc1-00035-ge6a99d3 I now get:

 [drm] Loading R100 Microcode
 ------------[ cut here ]------------
 WARNING: at drivers/base/firmware_class.c:524 _request_firmware+0x37d/0x3a0()
 Hardware name: Inspiron 8100                   
 Modules linked in:
 Pid: 1, comm: swapper Not tainted 3.1.0-rc1-00035-ge6a99d3 #32
 Call Trace:
[    1.036839] Call Trace:
[    1.036883]  [<c153ba2a>] ? printk+0x1d/0x1f
[    1.036933]  [<c1031252>] warn_slowpath_common+0x72/0xa0
[    1.036978]  [<c1372a9d>] ? _request_firmware+0x37d/0x3a0
[    1.037064]  [<c1372a9d>] ? _request_firmware+0x37d/0x3a0
[    1.037111]  [<c10312a0>] warn_slowpath_null+0x20/0x30
[    1.037156]  [<c1372a9d>] _request_firmware+0x37d/0x3a0
[    1.037202]  [<c136a017>] ? platform_device_add+0xf7/0x1e0
[    1.037248]  [<c1372b5f>] request_firmware+0x1f/0x30
[    1.037296]  [<c130bc68>] r100_cp_init+0x298/0x880
[    1.037341]  [<c1310ac4>] r100_startup+0x224/0x320
[    1.037385]  [<c131101a>] r100_init+0x12a/0x290
[    1.037431]  [<c12d32a5>] radeon_device_init+0x345/0x410
[    1.037476]  [<c12d1b10>] ? cail_reg_write+0x40/0x40
[    1.037522]  [<c12d484f>] radeon_driver_load_kms+0x7f/0x110
[    1.037569]  [<c12a5893>] drm_get_pci_dev+0x143/0x260
[    1.037615]  [<c1537122>] radeon_pci_probe+0x82/0x87
[    1.037661]  [<c12127bc>] pci_device_probe+0x8c/0x120
[    1.037707]  [<c1149ed7>] ? sysfs_create_link+0x17/0x20
[    1.037756]  [<c1368516>] driver_probe_device+0x66/0x170
[    1.037800]  [<c12126f9>] ? pci_match_device+0xc9/0xe0
[    1.037845]  [<c13686b1>] __driver_attach+0x91/0xa0
[    1.037889]  [<c1368620>] ? driver_probe_device+0x170/0x170
[    1.037935]  [<c1367769>] bus_for_each_dev+0x49/0x70
[    1.037979]  [<c12116c0>] ? pci_dev_put+0x20/0x20
[    1.038041]  [<c1368221>] driver_attach+0x21/0x30
[    1.038086]  [<c1368620>] ? driver_probe_device+0x170/0x170
[    1.038132]  [<c1367f07>] bus_add_driver+0x187/0x250
[    1.038176]  [<c12116c0>] ? pci_dev_put+0x20/0x20
[    1.038220]  [<c1368bba>] driver_register+0x6a/0x120
[    1.038264]  [<c12121fc>] __pci_register_driver+0x3c/0xb0
[    1.038309]  [<c12a5acc>] drm_pci_init+0x11c/0x130
[    1.038356]  [<c177d6e9>] ? start_kernel+0x2d0/0x2d0
[    1.038402]  [<c179b98d>] radeon_init+0xd4/0xd6
[    1.038446]  [<c1001124>] do_one_initcall+0x34/0x170
[    1.038490]  [<c179b8b9>] ? ttm_init+0x64/0x64
[    1.038533]  [<c177d6e9>] ? start_kernel+0x2d0/0x2d0
[    1.038576]  [<c177d757>] kernel_init+0x6e/0xfa
[    1.038620]  [<c15486b6>] kernel_thread_helper+0x6/0x10
 ---[ end trace 6a17ba782679faba ]---
 platform radeon_cp.0: firmware: radeon/R100_cp.bin will not be loaded
 radeon_cp: Failed to load firmware "radeon/R100_cp.bin"
 [drm:r100_cp_init] *ERROR* Failed to load firmware!
 radeon 0000:01:00.0: failed initializing CP (-16).
 radeon 0000:01:00.0: Disabling GPU acceleration

It worked fine with 3.0.0.

The firmware is compiled in to the kernel.

Doing bisects on this box is painful; it may be some time before I can do so.
Comment 1 Alex Deucher 2011-08-17 23:11:47 UTC
Do you have the firmware in your initrd (if radeon is modular) or compiled into your kernel (if radeon is compiled in)?  There haven't been any changes to the r100 code in ages, so I suspect this is either missing firmware or a change in the generic firmware code.
Comment 2 James Cloos 2011-08-18 00:08:09 UTC
Sorry. I thought that I had written that the firmare is compiled in.

Or at least it is supposed to be:

    :; zgrep FIRMWARE /proc/config.gz 
    # CONFIG_PREVENT_FIRMWARE_BUILD is not set
➤   CONFIG_FIRMWARE_IN_KERNEL=y
    CONFIG_EXTRA_FIRMWARE=""
    CONFIG_FIRMWARE_EDID=y
    CONFIG_FIRMWARE_MEMMAP=y
    # CONFIG_GOOGLE_FIRMWARE is not set
Comment 3 Michel Dänzer 2011-08-24 21:03:48 UTC
From Elimar Riesebieter (riesebie at lxtec.de):

bisecting brought me to commit
288d5abec8314ae50fe6692f324b0444acae8486. Reverting seems to work as
microcode is loaded with compiled in firmware and radeon kms driver.


That's

commit 288d5abec8314ae50fe6692f324b0444acae8486
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Aug 3 22:03:29 2011 -1000

    Boot up with usermodehelper disabled
Comment 4 Linus Torvalds 2011-08-24 21:57:45 UTC
On Wed, Aug 24, 2011 at 2:04 PM,  <bugzilla-daemon@bugzilla.kernel.org> wrote:
>
> --- Comment #3 from Michel Dänzer <michel@daenzer.net>  2011-08-24 21:03:48
> ---
> From Elimar Riesebieter (riesebie at lxtec.de):
>
> bisecting brought me to commit
> 288d5abec8314ae50fe6692f324b0444acae8486. Reverting seems to work as
> microcode is loaded with compiled in firmware and radeon kms driver.

Grr.

So _request_firmware() does this:

        if (WARN_ON(usermodehelper_is_disabled())) {
                dev_err(device, "firmware: %s will not be loaded\n", name);
                return -EBUSY;
        }

which is reasonable, but it does mean that it will warn even if the
firmware is built into the kernel.

On the one hand, that's really nice, because it implies a driver does
a firmware load too early, at a point where it cannot do the generic
firmware load.

On the other hand, it sucks, because it does disallow this situation
that used to work, now that we actually do the sane thing and don't
allow usermode helpers before init has been set up.

So I bet the attached patch fixes the R100 problem, but I'm not 100%
happy with it.

Comments?

                               Linus
Comment 5 Linus Torvalds 2011-08-24 23:05:19 UTC
Ok, that fix is committed as caca9510ff4e ("firmware loader: allow builtin firmware load even if usermodehelper is disabled"). 

Please verify that it does fix it, and close this bug if so.
Comment 6 James Cloos 2011-08-24 23:54:29 UTC
> Ok, that fix is committed as caca9510ff4e ("firmware loader: allow
> builtin firmware load even if usermodehelper is disabled").

> Please verify that it does fix it, and close this bug if so.

Will do, just as soon as it hits hera.  (Not there yet.)
Comment 7 Linus Torvalds 2011-08-25 00:00:21 UTC
James Cloos <cloos@jhcloos.com>  2011-08-24 23:54:29:
>
> Will do, just as soon as it hits hera.  (Not there yet.)

Duh. I forgot to push.

Done,

                  Linus
Comment 8 Anonymous Emailer 2011-08-25 01:07:40 UTC
Reply-To: gregkh@suse.de

On Wed, Aug 24, 2011 at 02:22:27PM -0700, Linus Torvalds wrote:
> On Wed, Aug 24, 2011 at 2:04 PM,  <bugzilla-daemon@bugzilla.kernel.org>
> wrote:
> >
> > --- Comment #3 from Michel Dänzer <michel@daenzer.net>  2011-08-24 21:03:48
> ---
> > From Elimar Riesebieter (riesebie at lxtec.de):
> >
> > bisecting brought me to commit
> > 288d5abec8314ae50fe6692f324b0444acae8486. Reverting seems to work as
> > microcode is loaded with compiled in firmware and radeon kms driver.
> 
> Grr.
> 
> So _request_firmware() does this:
> 
>         if (WARN_ON(usermodehelper_is_disabled())) {
>                 dev_err(device, "firmware: %s will not be loaded\n", name);
>                 return -EBUSY;
>         }
> 
> which is reasonable, but it does mean that it will warn even if the
> firmware is built into the kernel.
> 
> On the one hand, that's really nice, because it implies a driver does
> a firmware load too early, at a point where it cannot do the generic
> firmware load.
> 
> On the other hand, it sucks, because it does disallow this situation
> that used to work, now that we actually do the sane thing and don't
> allow usermode helpers before init has been set up.
> 
> So I bet the attached patch fixes the R100 problem, but I'm not 100%
> happy with it.
> 
> Comments?

That patch looks good to me.

Any ideas on ways that this all could be rewritten to be "saner"?

greg k-h
Comment 9 James Cloos 2011-08-25 21:25:02 UTC
That worked.  3.1.0-rc3-00096-gcaca951 loads the r100 firmware.

Interestingly, the RS780 managed to load its firmware (also all compiled into the kernel) even w/o this patch.  But everything I can test again works fine with master.
Comment 10 Linus Torvalds 2011-08-25 22:43:53 UTC
On Thu, Aug 25, 2011 at 2:25 PM,  <bugzilla-daemon@bugzilla.kernel.org> wrote:
>
> Interestingly, the RS780 managed to load its firmware (also all compiled into
> the kernel) even w/o this patch.

Does the R600+ driver perhaps load firmware only at open time?

The problem with the r100 driver is that it loads firmware very early
on, at driver init time. Generally it's *much* better if you request
the firmware when the device is actually opened, rather than when the
driver loads.

                         Linus
Comment 11 James Cloos 2011-08-26 08:34:13 UTC
> Does the R600+ driver perhaps load firmware only at open time?

They look the same in dmesg:

[    1.648745] [drm] Initialized drm 1.1.0 20060810
[    1.648824] [drm] radeon kernel modesetting enabled.
...
[    1.658077] [drm] Loading RS780 Microcode
...
[    1.733643] fbcon: radeondrmfb (fb0) is primary device
...
[    1.754301] [drm] Initialized radeon 2.11.0 20080528 for 0000:01:05.0 on minor 0

vs

[    1.034278] [drm] Initialized drm 1.1.0 20060810
[    1.034359] [drm] radeon kernel modesetting enabled.
...
[    1.038968] [drm] Loading R100 Microcode
...
[    1.238337] fbcon: radeondrmfb (fb0) is primary device
...
[    1.694374] [drm] Initialized radeon 2.11.0 20080528 for 0000:01:00.0 on minor 0


my syslog for the earlier boot has already rotated out, but from memory
it looks the same as the previous boot.
Comment 12 Anonymous Emailer 2011-08-26 08:40:42 UTC
Reply-To: airlied@gmail.com

> --- Comment #10 from Linus Torvalds <torvalds@linux-foundation.org> 
> 2011-08-25 22:43:53 ---
> On Thu, Aug 25, 2011 at 2:25 PM,  <bugzilla-daemon@bugzilla.kernel.org>
> wrote:
>>
>> Interestingly, the RS780 managed to load its firmware (also all compiled
>> into
>> the kernel) even w/o this patch.
>
> Does the R600+ driver perhaps load firmware only at open time?
>
> The problem with the r100 driver is that it loads firmware very early
> on, at driver init time. Generally it's *much* better if you request
> the firmware when the device is actually opened, rather than when the
> driver loads.

That doesn't make any sense for radeon graphics drivers, since the
firmware is required to initialise the engines on the GPU, which you
really want to do so you can turn on the screen and stuff.

Dave.
Comment 13 Alex Deucher 2011-08-26 12:33:10 UTC
(In reply to comment #10)
> On Thu, Aug 25, 2011 at 2:25 PM,  <bugzilla-daemon@bugzilla.kernel.org>
> wrote:
> >
> > Interestingly, the RS780 managed to load its firmware (also all compiled
> into
> > the kernel) even w/o this patch.
> 
> Does the R600+ driver perhaps load firmware only at open time?

They both load the ucode at the same time.
Comment 14 Florian Mickler 2011-08-29 18:15:36 UTC
A patch referencing this bug report has been merged in Linux v3.1-rc4:

commit caca9510ff4e5d842c0589110243d60927836222
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Aug 24 15:55:30 2011 -0700

    firmware loader: allow builtin firmware load even if usermodehelper is disabled