Bug 217776
Summary: | System (Xeon Nvidia) hangs at boot terminal after kernel 6.4.7 | ||
---|---|---|---|
Product: | Other | Reporter: | Peter Bottomley (peebee) |
Component: | Other | Assignee: | other_other |
Status: | RESOLVED ANSWERED | ||
Severity: | normal | CC: | bagasdotme, peebee, regressions |
Priority: | P3 | ||
Hardware: | Intel | ||
OS: | Linux | ||
URL: | https://gitlab.freedesktop.org/drm/nouveau/-/issues/255 | ||
Kernel Version: | 6.4.7 | Subsystem: | drm/nouveau/disp |
Regression: | Yes | Bisected commit-id: | f01f7f27ca06411fed0dc5b118c767591f9436ae |
Attachments: |
attachment-8334-0.html
attachment-4825-0.html |
Description
Peter Bottomley
2023-08-08 16:16:34 UTC
Your best chance of getting it fixed is performing regression testing: https://docs.kernel.org/admin-guide/bug-bisect.html

(In reply to Peter Bottomley from comment #0)
> Kernel 6.4.6 compiled from source worked AOK on my desktop with Intel Xeon
> cpu and Nvidia graphics - see below for system specs.
>
> Kernels 6.4.7 & 6.4.8 also compiled from source with identical configs hang
> with a frozen boot terminal screen after a significant way through the boot
> sequence (e.g. whilst running /etc/profile). The system may still be running
> as a sound is emitted when the power button is pressed (only way to escape
> from the system hang).
>
> The issue seems to be specific to the hardware of this desktop as the
> problem kernels do boot through to completion on other machines.
>
> A test was done with a different build (from Porteus) of kernel 6.5-RC4 and
> that did not hang - but kernel 6.4.7 from the same builder hung just like my
> build.
>
> I apologise that I cannot provide any detailed diagnostics - but I can put
> diagnostics into /etc/profile and provide screenshots if requested.
>
> Forum thread with more details and screenshots:
> https://forum.puppylinux.com/viewtopic.php?p=95733#p95733
>
> Computer Profile:
> Machine Dell Inc. Precision WorkStation T5400 (version: Not Specified)
> Mainboard Dell Inc. 0RW203 (version: NA)
> • BIOS Dell Inc. A11 | Date: 04/30/2012 | Type: Legacy
> • CPU Intel(R) Xeon(R) CPU E5450 @ 3.00GHz (4 cores)
> • RAM Total: 7955 MB | Used: 1555 MB (19.5%) | Actual Used: 775 MB (9.7%)
> Graphics Resolution: 1366x768 pixels | Display Server: X.Org 21.1.8
> • device-0 NVIDIA Corporation GT218 [NVS 300] [10de:10d8] (rev a2)
> Audio ALSA
> • device-0 Intel Corporation 631xESB/632xESB High Definition Audio Controller [8086:269a] (rev 09)
> • device-1 NVIDIA Corporation High Definition Audio Controller [10de:0be3] (rev a1)
> Network wlan1
> • device-0 Ethernet: Broadcom Inc. and subsidiaries NetXtreme BCM5754 Gigabit Ethernet PCI Express [14e4:167a] (rev 02)

Do you use Nouveau or NVIDIA driver?

Also, can you attach dmesg and system log output (like journalctl)?

Created attachment 304803 [details]
attachment-8334-0.html

On 09/08/2023 09:38, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217776
>
> Bagas Sanjaya (bagasdotme@gmail.com) changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |bagasdotme@gmail.com
>
> --- Comment #2 from Bagas Sanjaya (bagasdotme@gmail.com) ---
> Do you use Nouveau or NVIDIA driver?
>
> Also, can you attach dmesg and system log output (like journalctl)?

Nouveau driver ....

Sadly, the system never gets to the point where dmesg can be run. I'll see if I can capture it before the system freezes.

6.4.9 has the same problem as 6.4.7 and 6.4.8. I'm pretty sure it is a graphics problem of some sort.

Given that each kernel build takes c. 30 mins - the bug-bisect regression testing suggestion is challenging to say the least!

Boot without any options that hide kernel output, including "quiet". If you see some kernel messages, try to capture them, e.g. take a photo and upload it here.

(In reply to peter from comment #3)
> Nouveau driver ....

Can you also open an issue at the gitlab.freedesktop.org tracker [1]?

[1]: https://gitlab.freedesktop.org/drm/nouveau/-/issues

On 10/08/2023 02:21, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217776
>
> --- Comment #5 from Bagas Sanjaya (bagasdotme@gmail.com) ---
> Can you also open issue at gitlab.freedesktop.org tracker [1]?
>
> [1]: https://gitlab.freedesktop.org/drm/nouveau/-/issues

https://lore.kernel.org/all/20230806213107.GFZNARG6moWpFuSJ9W@fat_crate.local/ identifies the cause of the issue, which apparently comes from:

drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts
https://cgit.freedesktop.org/drm-misc/commit/?h=drm-misc-fixes&id=2b5d1c29f6c4cb19369ef92881465e5ede75f4ef

which is a patch to:
..../drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c

Does this need to be reported further?

Created attachment 304813 [details]
attachment-4825-0.html
6.4.9 built with uconn.c from 6.4.6 builds, boots and runs fine.
Thanks everybody.
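
For anyone wanting to repeat the workaround described above, here is a minimal sketch of restoring a single file from the last good release. It assumes a git checkout of the stable kernel tree in which the tag `v6.4.6` exists (the reporter may well have worked from tarballs instead); the helper name `restore_file_commands` is hypothetical and only builds the commands, it does not run them.

```python
import shlex

# Path taken from the thread; the tag name assumes a linux-stable git checkout.
UCONN_PATH = "drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c"


def restore_file_commands(good_tag: str, path: str) -> list[list[str]]:
    """Build git commands that restore `path` as it was at `good_tag`.

    `git checkout <tree-ish> -- <path>` overwrites the file in the index and
    working tree; `git diff --cached --stat` then shows what was staged.
    """
    return [
        ["git", "checkout", good_tag, "--", path],
        ["git", "diff", "--cached", "--stat"],
    ]


if __name__ == "__main__":
    # Print the commands instead of executing them, so they can be reviewed
    # and run by hand inside the kernel source tree before rebuilding.
    for cmd in restore_file_commands("v6.4.6", UCONN_PATH):
        print(shlex.join(cmd))
```

After running the printed commands in the source tree of the failing release, the kernel is rebuilt with the usual configuration.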
Commit 1b254b791d7b7dea6e8adc887fbbd51746d8bb27 should fix this.

Let's hope so....

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpu/drm/nouveau?h=next-20230818&id=1b254b791d7b7dea6e8adc887fbbd51746d8bb27 says:

> It might not fix all regressions from commit 2b5d1c29f6c4 ("drm/nouveau/disp:
> PIOR DP uses GPIO for HPD, not PMGR AUX interrupts"), but at least it fixes a
> memory corruption in error handling related to that commit.

Sadly..... built 6.4.11 with the patched drivers/gpu/drm/nouveau/nouveau_connector.c and it still hangs on boot....

To communicate with nouveau developers please use their bug tracker instead. Your comments here are not sent to anyone. Thank you.

FWIW, does latest 6.4.y now work for you? It might be due to this change https://lore.kernel.org/all/20230821194136.393887865@linuxfoundation.org/

Sadly no - neither 6.4.12 nor 6.5 boots.

If I revert to version 6.4.6 of /drivers/gpu/drm/nouveau/nvkm/engine/disp/uconn.c and build the kernel with that, then that boots.

https://gitlab.freedesktop.org/drm/nouveau/-/issues/255 is where the issue is being tracked.
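
On the earlier concern that bisection is impractical at about 30 minutes per build: a bisection needs roughly log2(N) builds for N candidate commits, so the cost can be estimated up front. The sketch below is only an estimate; the commit count used in the example (256) is a hypothetical placeholder, not the real number of commits between 6.4.6 and 6.4.7, and `bisect_estimate` is an illustrative helper name.

```python
import math


def bisect_estimate(num_commits: int, minutes_per_build: float) -> tuple[int, float]:
    """Estimate the builds and total minutes needed to bisect `num_commits`.

    Bisection halves the candidate range on every step, so about
    ceil(log2(num_commits)) build-and-boot cycles are needed.
    """
    if num_commits < 1:
        raise ValueError("num_commits must be at least 1")
    # A single candidate commit needs no further builds to identify.
    steps = math.ceil(math.log2(num_commits)) if num_commits > 1 else 0
    return steps, steps * minutes_per_build


if __name__ == "__main__":
    # 256 is a hypothetical commit count; 30 minutes is the build time
    # quoted in the thread.
    steps, minutes = bisect_estimate(256, 30)
    print(f"about {steps} builds, roughly {minutes / 60:.1f} hours")
```

The real commit count for a given range can be read from the tree being bisected and passed in place of the placeholder.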