Bug 73041 - radeon: not responding, "atombios stuck in loop"
Summary: radeon: not responding, "atombios stuck in loop"
Status: RESOLVED OBSOLETE
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_video-dri
URL: http://lkml.kernel.org/r/CALCETrVG6uR...
Keywords:
Depends on:
Blocks:
 
Reported: 2014-03-27 17:21 UTC by Bjorn Helgaas
Modified: 2017-03-02 22:51 UTC

See Also:
Kernel Version: 3.14-rc7
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
v3.13 dmesg (83.88 KB, text/plain)
2014-03-27 17:22 UTC, Bjorn Helgaas
v3.14-rc7 dmesg (82.11 KB, text/plain)
2014-03-27 17:22 UTC, Bjorn Helgaas
v3.13 lspci (51.45 KB, text/plain)
2014-03-27 17:23 UTC, Bjorn Helgaas
v3.14-rc7 lspci (50.45 KB, text/plain)
2014-03-27 17:23 UTC, Bjorn Helgaas
v3.14-rc7 Xorg.0.log (46.65 KB, application/octet-stream)
2014-03-27 17:23 UTC, Bjorn Helgaas
dmesg, 3.14-rc7, NR_CPUS=12 (103.10 KB, text/plain)
2014-03-27 20:17 UTC, Andy Lutomirski
lspci, 3.14, NR_CPUS=12, as root (140.42 KB, text/plain)
2014-03-27 20:17 UTC, Andy Lutomirski
Bad config (137.93 KB, application/octet-stream)
2014-03-27 21:26 UTC, Andy Lutomirski
Good config (138.23 KB, application/octet-stream)
2014-03-27 21:26 UTC, Andy Lutomirski

Description Bjorn Helgaas 2014-03-27 17:21:53 UTC
Andy Lutomirski reported (see URL above):

My system works on a 3.13 Fedora kernel.  It does not work on a
more-or-less identically configured 3.14-rc7+ kernel.  The symptom is
that the Plymouth password prompt flashes and then the screen goes
blank.  Hitting escape brings back the text console, and all is well
until X tries to start.  Then I get a blank screen.  killall -9 Xorg
from ssh causes these errors to be logged:


[  226.239747] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[  226.239751] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CD34 (len 55, WS 0, PS 0) @ 0xCD57
[  231.241492] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[  231.241496] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CD6C (len 62, WS 0, PS 0) @ 0xCD88
[  236.243111] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[  236.243115] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CD6C (len 62, WS 0, PS 0) @ 0xCD88
[  241.244625] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[  241.244628] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CD6C (len 62, WS 0, PS 0) @ 0xCD88
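
For anyone triaging a similar log, here is a minimal Python sketch (not part of the original report) that re-joins mail-wrapped dmesg lines and pulls the table details out of the "atombios stuck executing" messages. The function names are hypothetical, and reading the first hex number as the table start and the value after "@" as the stuck position is an assumption based only on the message layout shown above.

```python
import re

# A dmesg line starts with a bracketed timestamp such as "[  226.239751]".
TIMESTAMP = re.compile(r"\[\s*\d+\.\d+\]")

# Layout taken from the messages quoted in this report.
STUCK = re.compile(
    r"^\[\s*(\d+\.\d+)\].*atombios stuck executing ([0-9A-Fa-f]+) "
    r"\(len (\d+), WS (\d+), PS (\d+)\) @ 0x([0-9A-Fa-f]+)"
)


def join_wrapped(lines):
    """Glue continuation lines (no leading timestamp) onto the previous line."""
    joined = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line
    return joined


def stuck_tables(lines):
    """Return (timestamp, table_start, table_len, stuck_at) per stuck message."""
    found = []
    for line in join_wrapped(lines):
        m = STUCK.match(line)
        if m:
            found.append(
                (float(m.group(1)), int(m.group(2), 16),
                 int(m.group(3)), int(m.group(6), 16))
            )
    return found
```

Under that assumption, the positions in the log above sit 35 bytes (0xCD57 - 0xCD34) and 28 bytes (0xCD88 - 0xCD6C) past the start of tables that are 55 and 62 bytes long.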

lspci -vvvxxxnn on 3.14-rc7+ says:

09:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Caicos [Radeon HD 6450/7450/8450 / R5 230 OEM] [1002:6779] (rev ff) (prog-if ff)
    !!! Unknown header type 7f
    Kernel driver in use: radeon
00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
10: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
20: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
30: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
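
A config header that reads back as all 0xff, as in the dump above, usually means the device is not answering configuration reads at all (a vendor ID of 0xffff is not valid). The following Python sketch, with hypothetical helper names and not taken from any tool mentioned in this report, shows one way to flag that condition from `lspci -x` style output.

```python
def parse_hexdump(lines):
    """Collect bytes from rows shaped like "00: ff ff ..."; skip other lines."""
    data = bytearray()
    for line in lines:
        offset, sep, rest = line.strip().partition(": ")
        if not sep:
            continue
        try:
            int(offset, 16)  # the row label must be a hex offset
            row = [int(tok, 16) for tok in rest.split()]
        except ValueError:
            continue  # descriptive text such as "Kernel driver in use: ..."
        data.extend(row)
    return bytes(data)


def looks_unresponsive(config):
    """True if the 64-byte standard header is present and entirely 0xff."""
    header = config[:64]
    return len(header) == 64 and all(b == 0xFF for b in header)
```

Feeding the four hex rows above through `parse_hexdump` and then `looks_unresponsive` returns True. A healthy device would show its vendor and device IDs in the first four bytes instead.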
Comment 1 Bjorn Helgaas 2014-03-27 17:22:30 UTC
Created attachment 130811
v3.13 dmesg
Comment 2 Bjorn Helgaas 2014-03-27 17:22:52 UTC
Created attachment 130821
v3.14-rc7 dmesg
Comment 3 Bjorn Helgaas 2014-03-27 17:23:13 UTC
Created attachment 130831
v3.13 lspci
Comment 4 Bjorn Helgaas 2014-03-27 17:23:31 UTC
Created attachment 130841
v3.14-rc7 lspci
Comment 5 Bjorn Helgaas 2014-03-27 17:23:57 UTC
Created attachment 130851
v3.14-rc7 Xorg.0.log
Comment 6 Alex Deucher 2014-03-27 17:44:13 UTC
Can you bisect?
Comment 7 Andy Lutomirski 2014-03-27 20:16:32 UTC
I apologize for the bad bug report.  There is, indeed, a change in 3.14 that sort of caused this, but it's not a real regression.  Somehow make oldconfig on Fedora's 3.13 config results in NR_CPUS=8, and NR_CPUS=8 seems to break radeon.  ISTR that there was at least one PCI-related issue when NR_CPUS was too low -- am I hitting that?

On my current boot, I have X working, although I still had an issue with Plymouth flashing a graphical prompt and then going blank.

I can try to do some explicit tests with NR_CPUS and/or maxcpus later today.
Comment 8 Andy Lutomirski 2014-03-27 20:17:30 UTC
Created attachment 130871
dmesg, 3.14-rc7, NR_CPUS=12
Comment 9 Andy Lutomirski 2014-03-27 20:17:55 UTC
Created attachment 130881
lspci, 3.14, NR_CPUS=12, as root
Comment 10 Bjorn Helgaas 2014-03-27 20:33:29 UTC
Hm, I don't remember a PCI issue related to NR_CPUS; do you remember any more details about that?

If you can narrow it down, e.g., NR_CPUS=8 fails and NR_CPUS=12 works on the same kernel, that might give a place to start, although I still don't know where I would look unless there was some hint in dmesg.  I'm trying to avoid the hassle of you bisecting it, but I'm afraid I don't have any better ideas.

As for the lspci output, I was just grasping at straws and comparing the v3.13 and v3.14 output because I didn't have any better ideas.
Comment 11 Andy Lutomirski 2014-03-27 21:26:27 UTC
Created attachment 130891
Bad config
Comment 12 Andy Lutomirski 2014-03-27 21:26:45 UTC
Created attachment 130901
Good config
Comment 13 Andy Lutomirski 2014-03-27 21:29:51 UTC
It's a config issue.  The differences are (- = bad, + = good):

+CONFIG_USER_NS=y
+CONFIG_X86_UV=y
-CONFIG_GART_IOMMU=y
+CONFIG_MEMORY_HOTPLUG=y
+CONFIG_MEMORY_HOTPLUG_SPARSE=y
-CONFIG_DEFAULT_MMAP_MIN_ADDR=4096
+CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
+CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE=y
+CONFIG_ACPI_HOTPLUG_MEMORY=y
-CONFIG_ACPI_EXTLOG=m
+CONFIG_IPV6_VTI=m
-CONFIG_NF_TABLES_INET=m
-CONFIG_NFT_QUEUE=m
-CONFIG_NFT_REJECT=m
-CONFIG_NFT_REJECT_INET=m
+CONFIG_IP_SET_HASH_NETPORTNET=m
+CONFIG_IP_SET_HASH_NETNET=m
-CONFIG_NFT_REJECT_IPV4=m
-CONFIG_NFT_REJECT_IPV6=m
-CONFIG_NET_SCH_HHF=m
-CONFIG_NET_SCH_PIE=m
+CONFIG_NFC_DIGITAL=m
+CONFIG_NFC_PORT100=m
+CONFIG_BLK_DEV_NULL_BLK=m
+CONFIG_BLK_DEV_SKD=m
-CONFIG_VIRTIO_BLK=m
+CONFIG_VIRTIO_BLK=y
+CONFIG_SGI_XP=m
+CONFIG_SGI_GRU=m
+CONFIG_INTEL_MIC_HOST=m
+CONFIG_INTEL_MIC_CARD=m
-CONFIG_VIRTIO_NET=m
+CONFIG_VIRTIO_NET=y
+CONFIG_USB_NET_HUAWEI_CDC_NCM=m
-CONFIG_USB_NET_SR9800=m
+CONFIG_WCN36XX=m
+CONFIG_TOUCHSCREEN_ZFORCE=m
+CONFIG_UV_MMTIMER=m
-CONFIG_TCG_TIS_I2C_ATMEL=m
-CONFIG_TCG_TIS_I2C_NUVOTON=m
+CONFIG_DRM_BOCHS=m
+CONFIG_SND_DICE=m
+CONFIG_SONY_FF=y
-CONFIG_VIRT_DRIVERS=y
-CONFIG_VIRTIO_BALLOON=m
-CONFIG_VIRTIO_MMIO=m
+CONFIG_VIRTIO_BALLOON=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_CHROME_PLATFORMS=y
+CONFIG_CHROMEOS_LAPTOP=m
-CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
+CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x0
+CONFIG_EARLY_PRINTK_EFI=y

I'm running ea1cd65a648bd98ff9d028a647462d28313aadfd.  Does anything stand out?  If not, I can try to narrow it down.  CONFIG_EARLY_PRINTK_EFI and CONFIG_GART_IOMMU sound like the most relevant ones.
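
A comparison in the same -/+ style as the list in this comment can be produced with a short script. This is a minimal Python sketch with hypothetical function names, not the method actually used here. It assumes options written as "# CONFIG_FOO is not set" should be treated as absent. The kernel tree also ships scripts/diffconfig for the same job.

```python
import re

# Matches set options such as "CONFIG_GART_IOMMU=y"; commented-out
# "# CONFIG_FOO is not set" lines do not match and count as absent.
OPTION = re.compile(r"^(CONFIG_\w+)=(.*)$")


def load_config(text):
    """Map option name -> value for every set option in a .config file."""
    options = {}
    for line in text.splitlines():
        m = OPTION.match(line.strip())
        if m:
            options[m.group(1)] = m.group(2)
    return options


def diff_configs(bad, good):
    """List differing options, "-" for the bad config and "+" for the good."""
    lines = []
    for name in sorted(bad.keys() | good.keys()):
        b, g = bad.get(name), good.get(name)
        if b == g:
            continue
        if b is not None:
            lines.append(f"-{name}={b}")
        if g is not None:
            lines.append(f"+{name}={g}")
    return lines
```

Each config file is read into a string, passed through `load_config`, and the two results go to `diff_configs`. With only a handful of differing options left, they can be toggled one at a time to find which one matters.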
Comment 14 Bjorn Helgaas 2017-03-02 22:51:26 UTC
I'm closing this as obsolete.  If it still happens, please reopen with any additional information you have.
