Bug 94551
Summary: System reboots on graphical load with Intel HD 4000 iGPU since Linux 3.10.
Product: Drivers
Component: Video(DRI - Intel)
Reporter: rasmus (kernel-bugzilla)
Assignee: intel-gfx-bugs (intel-gfx-bugs)
Status: RESOLVED UNREPRODUCIBLE
Severity: normal
Priority: P3
CC: intel-gfx-bugs, kernel-bugzilla, rui.zhang
Hardware: Intel
OS: Linux
Kernel Version: 3.18
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
bios_settings.txt: my bios settings
dmesg_fedora_19.txt: dmesg without reboot
dmesg_fedora_21.txt: dmesg when crashing (via netconsole)
iomem_fedora_21.txt
iomem_fedora_19.txt
ioports_fedora_19.txt
ioports_fedora_21.txt
lspci_fedora_19.txt
lspci_fedora_21.txt
modules_fedora_19.txt
modules_fedora_21.txt
ver_linux_fedora_19.txt
ver_linux_fedora_21.txt
xorg_crash_fedora_21.txt: An example of an Xorg log when crashing.
Description
rasmus
2015-03-08 17:02:15 UTC
Created attachment 169741
dmesg_fedora_19.txt: dmesg without reboot
Created attachment 169751
dmesg_fedora_21.txt: dmesg when crashing (via netconsole)
Created attachment 169761
iomem_fedora_21.txt
Created attachment 169771
iomem_fedora_19.txt
Created attachment 169781
ioports_fedora_19.txt
Created attachment 169791
ioports_fedora_21.txt
Created attachment 169801
lspci_fedora_19.txt
Created attachment 169811
lspci_fedora_21.txt
Created attachment 169821
modules_fedora_19.txt
Created attachment 169831
modules_fedora_21.txt
Created attachment 169841
ver_linux_fedora_19.txt
Created attachment 169851
ver_linux_fedora_21.txt
Created attachment 169861
xorg_crash_fedora_21.txt: An example of an Xorg log when crashing.
BTW: I don't know if this bug is filed wrongly; I'm only a user of the Linux kernel. Please let me know if there is any more information I can add to improve the bug report.

Can you please check if the problem still exists if you
1. add "nomodeset" kernel option or
2. disable the graphics driver by setting CONFIG_DRM_I915=n?

> 1. add "nomodeset" kernel option
It seems completely stable. I've run with nomodeset for six to seven hours without any problems.

> I've run with nomodeset for six to seven hours without any problems.
^^^
it
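
For anyone repeating the nomodeset test, here is a minimal Python sketch for confirming that the option actually reached the running kernel. It assumes a Linux system that exposes /proc/cmdline. The function names and the sample command line in the usage note are illustrative and do not come from this report.

```python
from pathlib import Path


def has_kernel_param(cmdline: str, name: str) -> bool:
    """Return True if `name` is present as a bare flag or as `name=value`."""
    for token in cmdline.split():
        # Split off an optional "=value" part and compare the key exactly,
        # so that e.g. "nomodesetx" does not count as "nomodeset".
        key = token.split("=", 1)[0]
        if key == name:
            return True
    return False


def running_with_nomodeset(path: str = "/proc/cmdline") -> bool:
    """Read the live kernel command line (Linux only) and look for nomodeset."""
    return has_kernel_param(Path(path).read_text(), "nomodeset")
```

With a made-up command line such as "ro quiet nomodeset", has_kernel_param(..., "nomodeset") is true. On a real machine, running_with_nomodeset() reads the booted kernel's own command line.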

This seems to be an i915 issue to me. Re-assigning to the graphics experts.

Sounds fair. But: I initially reported the bug on Freedesktop/Intel DRM. I don't know if it's the same people (same general mailing list, though), but Chris Wilson (of Intel) said:
> Your dmesg does not show a controlled shutdown. A GPU hang, even a low-level
> hardware hang, should not result in the machine rebooting. Your dmesg does
> show that the kernel disagrees with the ACPI firmware implementation and that
> it's actively managing the thermal throttling. At this point, your best bet is
> to bisect the kernel and see where that leads.

Yes, I saw that. There are indeed some ACPI related warning messages.

[ 2355.912119] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140926/nsarguments-95)
[ 2355.912786] ACPI Warning: \_SB_.PCI0.PEG_.VID_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20140926/nsarguments-95)
[ 2357.181414] thinkpad_acpi: EC reports that Thermal Table has changed

But the ACPI warning is generated when the graphics _DSM is invoked, and the thinkpad_acpi one sounds like a platform driver issue. Thus I don't see what we Linux/ACPI can do for this issue.

IMO, the fact that nomodeset makes the problem disappear is sufficient to show that this is a graphics issue, but to double check if it is a thermal issue, you can build your kernel with CONFIG_THERMAL=n and boot without nomodeset parameter and check if the problem still exists.

[Sorry about the malformed quotes above].

> thinkpad_acpi one sounds like a platform driver issue.
I tried to install tp_smapi. No change. I think I tried to blacklist thinkpad_acpi with no change, but I'm not 100% sure on that one.

> Thus I don't see what we Linux/ACPI can do for this issue.
OK. Fair enough.

> to double check if it is a thermal issue
I can do that. It's currently bisecting between 3.9 and 3.10. From eyeballing, the temperature was exactly the same with nomodeset and without it, but I will test this better.

(In reply to rasmus from comment #21)
> I tried to install tp_smapi. No change. I think I tried to blacklist
> thinkpad_acpi with no change, but I'm not 100% sure on that one.
Please do try to double check the thinkpad_acpi (and possibly other thinkpad_*) platform driver.

> Please do try to double check the thinkpad_acpi
Is it sufficient to blacklist thinkpad_acpi (i.e. after init)? Or do I need to somehow remove it at compile time?

> (and possibly other thinkpad_*) platform driver.
Are there any others in the kernel? On my X200s (which is what I have here) lsmod | grep -i think only gives me thinkpad_acpi.

Thanks,
Rasmus

(In reply to rasmus from comment #23)
> Is it sufficient to blacklist thinkpad_acpi (i.e. after init)? Or do I need
> to somehow remove it at compile time?
That should be enough, unless it is built-in.

> Are there any others in the kernel? On my X200s (which is what I have here)
> lsmod | grep -i think only gives me thinkpad_acpi.
Probably not, I was thinking of some other laptop.

> Please do try to double check the thinkpad_acpi platform driver.
The bug happens irrespective of whether thinkpad_acpi is enabled or not on 3.18 (the default Fedora 21 kernel).
My bisect has not revealed anything yet.
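
As a cross-check for the "is thinkpad_acpi really not loaded" question, the lsmod | grep -i think test can be sketched in Python. This is a rough illustration that assumes the usual /proc/modules layout, where the module name is the first whitespace-separated field. The function names are made up for this example. Drivers built into the kernel do not appear in /proc/modules, so an empty result only rules out the loadable-module case.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set:
    """Collect module names (first field of each line) from /proc/modules text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def matching_modules(proc_modules_text: str, needle: str) -> list:
    """Case-insensitive substring match, similar in spirit to `lsmod | grep -i`."""
    needle = needle.lower()
    return sorted(
        name for name in loaded_modules(proc_modules_text) if needle in name.lower()
    )


def thinkpad_modules_loaded(path: str = "/proc/modules") -> list:
    """Read the live module list (Linux only) and return any thinkpad* modules."""
    return matching_modules(Path(path).read_text(), "thinkpad")
```

If thinkpad_modules_loaded() returns an empty list on the affected kernel, the blacklist took effect for the loadable module.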

(In reply to Zhang Rui from comment #20)
> but to double check if it is a thermal issue, you can
> build your kernel with CONFIG_THERMAL=n and boot without nomodeset parameter
> and check if the problem still exists.
The computer still crashes. I don't know what that means though. I guess with CONFIG_THERMAL the BIOS 'heuristics' for temperature are followed.

In this case I don't think it's conclusive that nomodeset working makes it an i915 bug. If it's a thermal shutdown, merely using vs. not using the GPU might make the difference. That said, is this still an issue with current upstream kernels?

Hi,

> In this case I don't think it's conclusive that nomodeset working makes it an
> i915 bug. If it's a thermal shutdown, merely using vs. not using the GPU might
> make the difference.
But then why doesn't it happen on Windows?

> That said, is this still an issue with current upstream kernels?
It's a lot better now. I did experience one crash like this recently, though, but it seems to be a lot harder to trigger. But I think some of this bug remains.

Rasmus

(In reply to rasmus from comment #28)
> It's a lot better now. I did experience one crash like this recently,
> though, but it seems to be a lot harder to trigger.
That's good...

> But I think some of this bug remains.
...while that's not so good. However, we don't see this in our testing, and I'm not sure what we could do about this. I'm closing this as unreproducible. However, if the problem persists and keeps annoying you, please file a new bug at [1] where we're migrating all of the graphics bugs.

Thanks.

[1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel
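
Two of the tests suggested in this thread depend on kernel build options (CONFIG_DRM_I915=n and CONFIG_THERMAL=n). Here is a small Python sketch for checking what a given kernel was actually built with. It assumes the standard kernel .config text format ("CONFIG_X=y" or "# CONFIG_X is not set"). The function name is made up for this example, and where the config file lives (for instance a config file under /boot) depends on the distribution.

```python
from typing import Optional


def config_value(config_text: str, option: str) -> Optional[str]:
    """Look up one option in kernel .config text.

    Returns the value after "=" (for example "y" or "m"), "n" when the option
    is explicitly marked as not set, or None when it is not mentioned at all.
    """
    unset_marker = f"# {option} is not set"
    prefix = option + "="
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == unset_marker:
            return "n"
        # Matching on "OPTION=" avoids confusing CONFIG_THERMAL with
        # longer names such as CONFIG_THERMAL_HWMON.
        if line.startswith(prefix):
            return line[len(prefix):]
    return None
```

Reading the config file of the kernel under test and calling config_value(text, "CONFIG_THERMAL") shows whether the thermal subsystem was really compiled out before drawing conclusions from a crash.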