Bug 109481

Summary: Radeon Module crashing/freezing on ATI/AMD Evergreen (Radeon HD6250, Wrestler)
Product: Drivers
Reporter: Steffen Schmid (elbuffo166)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: NEW
Severity: normal
CC: alexdeucher, szg00000
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.3.3
Subsystem:
Regression: No
Bisected commit-id:
Attachments: possible fix

Description Steffen Schmid 2015-12-16 17:12:55 UTC
We have several hundred thin clients (Fujitsu Futro S900) based on the AMD G-T44R with AMD Radeon HD 6250 graphics.

I have tested several kernel versions, e.g. the newest releases of 4.1, 4.2, 4.3 and 4.4-rc5.
On all of these kernels, the system freezes with a black screen when the radeon module is loaded with modesetting enabled.
With the kernel parameter nomodeset the system does not freeze, but the Xorg ati driver needs KMS.

I also booted an Ubuntu live system to verify that this is not a problem with my thin-client distribution. Ubuntu 15.10 fails with a black screen as well.

I captured the following trace:

radeon: Unknown symbol i2c_bit_add_bus (err 0)
radeon: Unknown symbol i2c_bit_add_bus (err 0)
[drm] radeon kernel modesetting enabled.
[drm] initializing kernel modesetting (PALM 0x1002:0x9805 0x1734:0x11BD).
[drm] register mmio base: 0xFEB00000
[drm] register mmio size: 262144
ATOM BIOS: AMD
radeon 0000:00:01.0: VRAM: 384M 0x0000000000000000 - 0x0000000017FFFFFF (384M used)
radeon 0000:00:01.0: GTT: 1024M 0x0000000018000000 - 0x0000000057FFFFFF
[drm] Detected VRAM RAM=384M, BAR=256M
[drm] RAM width 32bits DDR
[TTM] Zone  kernel: Available graphics memory: 443856 kiB
[TTM] Zone highmem: Available graphics memory: 824158 kiB
[TTM] Initializing pool allocator
[drm] radeon: 384M of VRAM memory ready
[drm] radeon: 1024M of GTT memory ready.
[drm] Loading PALM Microcode
[drm] Internal thermal controller without fan control
[drm] Found smc ucode version: 0x00010601
[drm] radeon: dpm initialized
[drm] GART: num cpu pages 262144, num gpu pages 262144
[drm] PCIE GART of 1024M enabled (table at 0x0000000000274000).
radeon 0000:00:01.0: WB enabled
radeon 0000:00:01.0: fence driver on ring 0 use gpu addr 0x0000000018000c00 and cpu addr 0xf1788c00
radeon 0000:00:01.0: fence driver on ring 3 use gpu addr 0x0000000018000c0c and cpu addr 0xf1788c0c
radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xf9432118
[drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[drm] Driver supports precise vblank timestamp query.
radeon 0000:00:01.0: radeon: MSI limited to 32-bit
[drm] radeon: irq initialized.
[drm] ring test on 0 succeeded in 1 usecs
[drm] ring test on 3 succeeded in 3 usecs
[drm] ring test on 5 succeeded in 1 usecs
[drm] UVD initialized successfully.
[drm] ib test on ring 0 succeeded in 0 usecs
[drm] ib test on ring 3 succeeded in 0 usecs
[drm] ib test on ring 5 succeeded
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DP-1
[drm]   HPD1
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm]   DVI-D-1
[drm]   HPD2
[drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
BUG: unable to handle kernel NULL pointer dereference at 00000024
IP: [<f90b7e91>] atombios_get_encoder_mode+0x87/0x178 [radeon]
*pde = 00000000 
Oops: 0000 [#1] SMP 
Modules linked in: radeon(+) i2c_algo_bit ipv6 ati_agp snd_hda_intel k10temp hwmon drm_kms_helper snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi ttm snd_hda_codec snd_hda_core fujitsu_laptop video drm snd_hwdep snd_pcm snd_timer snd agpgart soundcore syscopyarea sysfillrect sysimgblt fb_sys_fops button acpi_cpufreq squashfs [last unloaded: snd_hda_intel]
CPU: 0 PID: 2457 Comm: modprobe Not tainted 4.3.0-slitaz #2
Hardware name: FUJITSU FUTRO S900/D3003-A1, BIOS V4.6.4.1 R1.14.0 for D3003-A1x 01/27/2012
task: f284a080 ti: d3430000 task.ti: d3430000
EIP: 0060:[<f90b7e91>] EFLAGS: 00010246 CPU: 0
EIP is at atombios_get_encoder_mode+0x87/0x178 [radeon]
EAX: 00000000 EBX: 00000000 ECX: f167c9ec EDX: f167c9e0
ESI: f6699600 EDI: d3410000 EBP: 00000020 ESP: d3431d2c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: 00000024 CR3: 3418d000 CR4: 000006d0
Stack:
 f6699600 00000003 d3410000 f90b9524 f6699600 f911ed18 f167c9f8 00000020
 f8302270 f6699600 f167c800 f83022ac f167c800 d3410000 00000000 f8302320
 f6699200 f90797f8 d3410000 00000002 00000002 f167c800 f9075554 00000000
Call Trace:
 [<f90b9524>] ? radeon_atom_encoder_dpms+0x14/0xeb [radeon]
 [<f8302270>] ? drm_encoder_disable+0x25/0x2f [drm_kms_helper]
 [<f83022ac>] ? __drm_helper_disable_unused_functions+0x32/0x97 [drm_kms_helper]
 [<f8302320>] ? drm_helper_disable_unused_functions+0xf/0x17 [drm_kms_helper]
 [<f90797f8>] ? radeon_fbdev_init+0xa2/0xc8 [radeon]
 [<f9075554>] ? radeon_modeset_init+0x6b4/0x820 [radeon]
 [<f9059427>] ? radeon_driver_load_kms+0xce/0x16d [radeon]
 [<f81565f8>] ? drm_dev_register+0x53/0x8a [drm]
 [<f815822e>] ? drm_get_pci_dev+0xcf/0x15a [drm]
 [<c126df13>] ? pci_device_probe+0x5b/0xa8
 [<c12def83>] ? driver_probe_device+0xbd/0x1e4
 [<c12df0ee>] ? __driver_attach+0x44/0x5f
 [<c12ddd0c>] ? bus_for_each_dev+0x47/0x63
 [<c12dec54>] ? driver_attach+0x11/0x13
 [<c12df0aa>] ? driver_probe_device+0x1e4/0x1e4
 [<c12de93f>] ? bus_add_driver+0xaf/0x180
 [<f9168000>] ? 0xf9168000
 [<c12df68c>] ? driver_register+0x6c/0x9d
 [<f9168000>] ? 0xf9168000
 [<c100043d>] ? do_one_initcall+0xcd/0x144
 [<c10946f3>] ? free_pcppages_bulk+0xe7/0x260
 [<c1095d6a>] ? free_hot_cold_page+0x38/0xc4
 [<c1090cd0>] ? do_init_module+0x43/0x19d
 [<c107caac>] ? load_module+0x1342/0x16c8
 [<c102b4a3>] ? vmalloc_sync_all+0xcf/0xcf
 [<c107ceca>] ? SyS_init_module+0x98/0xb0
 [<c1498df2>] ? syscall_call+0x7/0x7
Code: 8b 46 3c 83 f8 14 0f 84 00 01 00 00 83 f8 0b 0f 84 f7 00 00 00 89 f0 e8 2f ad fb ff 85 c0 89 c3 75 09 89 f0 e8 7d ad fb ff 89 c3 <8b> 43 24 48 83 f8 0d 77 68 ff 24 85 f4 e9 11 f9 83 3d 40 cb 14
EIP: [<f90b7e91>] atombios_get_encoder_mode+0x87/0x178 [radeon] SS:ESP 0068:d3431d2c
CR2: 0000000000000024
---[ end trace 29ec372e8e4d666d ]---
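
As a reading aid for the trace above: the opcode bytes marked `<8b> 43 24` in the Code line decode to the x86 instruction `mov 0x24(%ebx),%eax`, and EBX is 0, so the faulting address should be 0 + 0x24, which is what CR2 reports. In other words, a NULL structure pointer was dereferenced at field offset 0x24 inside atombios_get_encoder_mode. The minimal Python sketch below only re-checks that arithmetic from the logged lines; the helper name `parse_oops` and its regular expressions are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Lines copied from the oops in this report.
OOPS = """\
IP: [<f90b7e91>] atombios_get_encoder_mode+0x87/0x178 [radeon]
EAX: 00000000 EBX: 00000000 ECX: f167c9ec EDX: f167c9e0
CR0: 8005003b CR2: 00000024 CR3: 3418d000 CR4: 000006d0
"""

# Displacement byte of the faulting instruction `8b 43 24`
# (mov 0x24(%ebx),%eax).
DISPLACEMENT = 0x24


def parse_oops(text):
    """Hypothetical helper: extract the faulting symbol and some registers."""
    ip = re.search(
        r"IP: \[<[0-9a-f]+>\] (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+) \[(\w+)\]",
        text,
    )
    regs = {
        name: int(value, 16)
        for name, value in re.findall(
            r"\b(E[A-D]X|CR\d): ([0-9a-f]{8})\b", text
        )
    }
    return {
        "function": ip.group(1),
        "offset": int(ip.group(2), 16),
        "module": ip.group(4),
        "regs": regs,
    }


info = parse_oops(OOPS)
fault_address = info["regs"]["EBX"] + DISPLACEMENT
print(info["function"], hex(info["offset"]), info["module"])
print("computed address matches CR2:", fault_address == info["regs"]["CR2"])
```

The check passes for the values logged here, which is consistent with a missing NULL check on a pointer held in EBX rather than with memory corruption.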
Comment 1 Alex Deucher 2015-12-16 18:18:03 UTC
Is this a regression?  If so, when was it working last?
Comment 2 Steffen Schmid 2015-12-17 08:41:42 UTC
Going to find out which is the last working kernel...
Comment 3 Steffen Schmid 2015-12-17 12:39:37 UTC
The following kernel versions work:
3.4.110
3.10.94
3.12.51
3.14.58
3.18.25

Kernel 4.1.15 does not work.
Testing the 4.1 tree...
Comment 4 Steffen Schmid 2015-12-17 14:56:41 UTC
Final result:
3.18.25 is the last working kernel.
Versions >= 4.0.1 do not work.

Other clients with AMD graphics (Futro S920 with Kabini, Futro S550 with Radeon X1250) work without problems on the newest kernel.
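
The test results from comments 3 and 4 can be reduced to a regression window mechanically. The sketch below is a minimal illustration using only the versions reported in this thread; the function names `version_key` and `regression_window` are hypothetical.

```python
# Kernel versions tested on the Futro S900 in this thread:
# True = boots with KMS, False = freezes with a black screen.
RESULTS = {
    "3.4.110": True,
    "3.10.94": True,
    "3.12.51": True,
    "3.14.58": True,
    "3.18.25": True,
    "4.0.1": False,
    "4.1.15": False,
}


def version_key(version):
    """Turn '3.18.25' into (3, 18, 25) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(results):
    """Return (last working version, first failing version)."""
    good = [v for v, ok in results.items() if ok]
    bad = [v for v, ok in results.items() if not ok]
    last_good = max(good, key=version_key)
    first_bad = min(bad, key=version_key)
    if version_key(last_good) > version_key(first_bad):
        raise ValueError("results are not monotonic; no single window")
    return last_good, first_bad


print(regression_window(RESULTS))
```

Sorting numerically matters here: a plain string sort would place 3.10.94 before 3.4.110.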
Comment 5 Steffen Schmid 2015-12-17 16:41:16 UTC
Another finding that may be related to this problem:

On the old kernel we are currently running in production (3.2.53), radeon finds 3 display connectors.

With kernel 3.18.25, VGA is missing, and because of this, displays connected via VGA (all of our systems) won't display anything.

Old kernel (3.2.53):
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DisplayPort
[drm]   HPD1
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm]   DVI-D
[drm]   HPD2
[drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
[drm] Connector 2:
[drm]   VGA
[drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1

With kernel 3.18.25, VGA is missing:
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DP-1
[drm]   HPD1
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm]   DVI-D-1
[drm]   HPD2
[drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
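
The two connector dumps above can also be compared mechanically. Because the connector names differ between kernels (DisplayPort vs. DP-1, DVI-D vs. DVI-D-1), the minimal sketch below compares encoder names instead. The parser `parse_connectors` is a hypothetical helper written for this log layout only, and the DDC lines are omitted from the sample input for brevity.

```python
import re

OLD_KERNEL = """\
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DisplayPort
[drm]   HPD1
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm]   DVI-D
[drm]   HPD2
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
[drm] Connector 2:
[drm]   VGA
[drm]   Encoders:
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
"""

NEW_KERNEL = """\
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DP-1
[drm]   HPD1
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm]   DVI-D-1
[drm]   HPD2
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
"""


def parse_connectors(log):
    """Map connector index -> {'name': str, 'encoders': [str, ...]}."""
    connectors = {}
    current = None
    in_encoders = False
    for raw in log.splitlines():
        line = raw.replace("[drm]", "", 1).strip()
        header = re.fullmatch(r"Connector (\d+):", line)
        if header:
            current = {"name": None, "encoders": []}
            connectors[int(header.group(1))] = current
            in_encoders = False
        elif current is None or not line:
            continue  # banner line or blank line before the first connector
        elif line == "Encoders:":
            in_encoders = True
        elif in_encoders:
            current["encoders"].append(line.split(":")[0])
        elif current["name"] is None:
            current["name"] = line  # first line after the header is the name


def encoder_set(connectors):
    """Collect every encoder name across all connectors."""
    return {enc for c in connectors.values() for enc in c["encoders"]}


def _parse(log):
    parse_connectors.__wrapped__ = None  # no-op marker; keeps helper explicit
    return log


def parse_connectors(log):  # final definition used below
    connectors, current, in_encoders = {}, None, False
    for raw in log.splitlines():
        line = raw.replace("[drm]", "", 1).strip()
        header = re.fullmatch(r"Connector (\d+):", line)
        if header:
            current = connectors.setdefault(
                int(header.group(1)), {"name": None, "encoders": []}
            )
            in_encoders = False
        elif current is None or not line:
            continue
        elif line == "Encoders:":
            in_encoders = True
        elif in_encoders:
            current["encoders"].append(line.split(":")[0])
        elif current["name"] is None:
            current["name"] = line
    return connectors


old = parse_connectors(OLD_KERNEL)
new = parse_connectors(NEW_KERNEL)
missing = encoder_set(old) - encoder_set(new)
print(sorted(missing))
```

For the two dumps in this comment, the only encoder that disappears is CRT1, the analog output behind the VGA connector.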
Comment 6 Alex Deucher 2015-12-17 18:01:44 UTC
Created attachment 197651: possible fix

Does this patch help?  It looks like you are getting hit by a quirk for another Fujitsu board with the same IDs.
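
To illustrate why a quirk keyed only on PCI IDs can hit the wrong board, here is a minimal Python model. It is not the radeon driver's C code and not the attached patch: the quirk actions `drop_vga` and `merge_vga_into_dvi`, the function `apply_quirks`, and the connector dictionaries are hypothetical. Only the ID tuple comes from the "initializing kernel modesetting" line in the description (PALM 0x1002:0x9805 0x1734:0x11BD), and the connector list follows what kernel 3.2.53 reported in comment 5.

```python
# (vendor, device, subsystem vendor, subsystem device) from the dmesg output.
FUTRO_S900_ID = (0x1002, 0x9805, 0x1734, 0x11BD)

# Connectors as listed by kernel 3.2.53 in comment 5.
BIOS_CONNECTORS = [
    {"type": "DP", "encoders": ["DFP1"]},
    {"type": "DVI-D", "encoders": ["DFP2"]},
    {"type": "VGA", "encoders": ["CRT1"]},
]


def drop_vga(connectors):
    """Hypothetical quirk for some other board: hide the VGA connector."""
    return [c for c in connectors if c["type"] != "VGA"]


def merge_vga_into_dvi(connectors):
    """Hypothetical quirk: expose DVI-D plus VGA as one DVI-I connector."""
    vga = next((c for c in connectors if c["type"] == "VGA"), None)
    dvi = next((c for c in connectors if c["type"] == "DVI-D"), None)
    if vga is None or dvi is None:
        return list(connectors)
    merged = {"type": "DVI-I", "encoders": dvi["encoders"] + vga["encoders"]}
    return [merged if c is dvi else c for c in connectors if c is not vga]


def apply_quirks(pci_id, connectors, quirk_table):
    """Apply the quirk registered for pci_id, if any.

    The lookup sees only the IDs, so two different boards that share
    the same IDs necessarily get the same quirk.
    """
    quirk = quirk_table.get(pci_id)
    return quirk(connectors) if quirk else list(connectors)


dropped = apply_quirks(FUTRO_S900_ID, BIOS_CONNECTORS, {FUTRO_S900_ID: drop_vga})
merged = apply_quirks(
    FUTRO_S900_ID, BIOS_CONNECTORS, {FUTRO_S900_ID: merge_vga_into_dvi}
)
print([c["type"] for c in dropped])
print([(c["type"], c["encoders"]) for c in merged])
```

In this model, a VGA-hiding rule reproduces the connector list seen on 3.18.25, while a rule that folds the analog encoder into a DVI-I connector keeps CRT1 reachable. Whether the real quirks behave this way is an assumption.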
Comment 7 Steffen Schmid 2015-12-18 10:22:51 UTC
Great! It works! :-)
Many thanks for the fast fix.
Tested on a Fujitsu Futro S900 with kernel 4.3.3 (also tested on Futro S550 and S920, which work too).

Is there anything I need to do now? More testing, closing the ticket...?

Relevant dmesg output:
[drm] Radeon Display Connectors
[drm] Connector 0:
[drm]   DP-1
[drm]   HPD1
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c
[drm]   Encoders:
[drm]     DFP1: INTERNAL_UNIPHY
[drm] Connector 1:
[drm]   DVI-I-1
[drm]   HPD2
[drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c
[drm]   Encoders:
[drm]     DFP2: INTERNAL_UNIPHY
[drm]     CRT1: INTERNAL_KLDSCP_DAC1
[drm] fb mappable at 0xE0478000
[drm] vram apper at 0xE0000000
[drm] size 5242880
[drm] fb depth is 24
[drm]    pitch is 5120
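
The framebuffer figures at the end of this log can be cross-checked with a little arithmetic. The sketch below assumes 4 bytes per pixel (a 24-bit depth stored in 32-bit pixels), which is not stated in the log; the function name `describe_framebuffer` is hypothetical.

```python
def describe_framebuffer(fb_base, aperture_base, size, pitch, bytes_per_pixel=4):
    """Derive layout figures from the values radeon prints at fbdev setup.

    bytes_per_pixel=4 is an assumption, not a value taken from the log.
    """
    if size % pitch or pitch % bytes_per_pixel:
        raise ValueError("size and pitch are not whole multiples")
    return {
        "offset_in_vram": fb_base - aperture_base,  # where the fb starts
        "rows": size // pitch,                      # scanlines that fit
        "max_width_px": pitch // bytes_per_pixel,   # pixels per scanline
    }


layout = describe_framebuffer(
    fb_base=0xE0478000,        # "fb mappable at"
    aperture_base=0xE0000000,  # "vram apper at"
    size=5242880,              # "size"
    pitch=5120,                # "pitch is"
)
print(hex(layout["offset_in_vram"]), layout["max_width_px"], layout["rows"])
```

Under that assumption the buffer holds 1024 scanlines of up to 1280 pixels, starting 0x478000 bytes into the VRAM aperture. The pitch may include alignment padding, so the visible width could be smaller.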
Comment 8 Alex Deucher 2015-12-21 21:32:52 UTC
(In reply to Steffen Schmid from comment #7)
> Great! It works! :-)
> Many thanks for the fast fix.
> Tested on a Fujitsu Futro S900 with kernel 4.3.3 (also tested on Futro
> S550 and S920, which work too).
> 
> Is there anything I need to do now? More testing, closing the ticket...?

I'll make sure the patch gets upstream.  Please close the ticket once it's committed upstream.