Bug 218296 - Kernel 6.6.8 locks up shortly after booting.
Status: NEW
Alias: None
Product: Linux
Classification: Unclassified
Component: Kernel
Hardware: i386 Linux
Importance: P3 normal
Assignee: Virtual assignee for kernel bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-21 00:12 UTC by Chris Rankin
Modified: 2023-12-23 11:33 UTC
CC List: 1 user

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
6.5.13 config (100.57 KB, text/plain)
2023-12-21 00:16 UTC, Chris Rankin
6.6.8 config (100.94 KB, text/plain)
2023-12-21 00:17 UTC, Chris Rankin
dmesg log for Linux 6.5.0 (13.95 KB, text/plain)
2023-12-21 22:20 UTC, Chris Rankin
6.4.16 config (100.17 KB, text/plain)
2023-12-22 09:20 UTC, Chris Rankin
dmesg log for 6.4.16 (15.29 KB, text/plain)
2023-12-22 09:21 UTC, Chris Rankin
Crash dump - 1 (1016.19 KB, image/jpeg)
2023-12-22 15:56 UTC, Chris Rankin
Crash dump - 2 (962.60 KB, image/jpeg)
2023-12-22 15:57 UTC, Chris Rankin
Crash dump - 3 (1.05 MB, image/jpeg)
2023-12-22 15:58 UTC, Chris Rankin
Crash dump - 4 (1001.57 KB, image/jpeg)
2023-12-22 15:59 UTC, Chris Rankin

Description Chris Rankin 2023-12-21 00:12:41 UTC
I have an ancient UP i586 machine which successfully runs 6.4.16 but which crashes without logging an oops shortly after booting either 6.5.13 or 6.6.8.

This bug *might* be network-related, but is not fixed by:
```
--- linux-6.5/include/net/neighbour.h.orig	2023-12-10 22:11:54.079741645 +0000
+++ linux-6.5/include/net/neighbour.h	2023-12-10 22:12:24.920364781 +0000
@@ -162,7 +162,7 @@
 	struct rcu_head		rcu;
 	struct net_device	*dev;
 	netdevice_tracker	dev_tracker;
-	u8			primary_key[0];
+	u8			primary_key[];
 } __randomize_layout;
 
 struct neigh_ops {
```
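For context, the patch above swaps a GNU zero-length array (`primary_key[0]`) for a C99 flexible array member (`primary_key[]`). Both declare a trailing buffer whose real size is chosen at allocation time; as far as I understand, the difference matters mainly to compiler bounds checking. Below is a minimal user-space sketch of the pattern. The `demo_neigh` struct and its helpers are hypothetical stand-ins for illustration, not the kernel's `struct neighbour` or its API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a struct with a variable-length trailing key. */
struct demo_neigh {
	int     key_len;
	uint8_t primary_key[];	/* C99 flexible array member; must be last */
};

/* Allocate the fixed header plus key_len trailing bytes and copy the key in. */
static struct demo_neigh *demo_neigh_alloc(const void *key, int key_len)
{
	struct demo_neigh *n;

	assert(key != NULL && key_len > 0);
	n = malloc(sizeof(*n) + (size_t)key_len);
	if (!n)
		return NULL;
	n->key_len = key_len;
	memcpy(n->primary_key, key, (size_t)key_len);
	return n;
}

/* Return 1 if the stored key matches the candidate key, 0 otherwise. */
static int demo_neigh_key_eq(const struct demo_neigh *n,
			     const void *key, int key_len)
{
	return n->key_len == key_len &&
	       memcmp(n->primary_key, key, (size_t)key_len) == 0;
}
```

A caller would pass, say, a 4-byte IPv4 address to `demo_neigh_alloc()`, compare with `demo_neigh_key_eq()`, and release the object with `free()`.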
The dmesg log seems to get this far before stopping:
```
NET: Registered PF_INET6 protocol family
Segment Routing with IPv6
In-situ OAM (IOAM) with IPv6
bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
br0: port 1(eth1) entered blocking state
br0: port 1(eth1) entered disabled state
e100 0000:01:04.0 eth1: entered allmulticast mode
e100 0000:01:04.0 eth1: entered promiscuous mode
e100 0000:01:04.0 eth1: NIC Link is Up 100 Mbps Full Duplex
br0: port 2(eth2) entered blocking state
br0: port 2(eth2) entered disabled state
e100 0000:01:05.0 eth2: entered allmulticast mode
e100 0000:01:05.0 eth2: entered promiscuous mode
e100 0000:01:05.0 eth2: NIC Link is Up 100 Mbps Full Duplex
br0: port 2(eth2) entered blocking state
br0: port 2(eth2) entered forwarding state
br0: port 1(eth1) entered blocking state
br0: port 1(eth1) entered forwarding state
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp-with-tls transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
e100 0000:00:0f.0 eth0: NIC Link is Up 100 Mbps Full Duplex
```
Comment 1 Chris Rankin 2023-12-21 00:16:13 UTC
Created attachment 305637
6.5.13 config
Comment 2 Chris Rankin 2023-12-21 00:17:38 UTC
Created attachment 305638
6.6.8 config
Comment 3 Chris Rankin 2023-12-21 22:20:09 UTC
Created attachment 305642
dmesg log for Linux 6.5.0

Linux 6.5.0 also fails on this UP machine.
Comment 4 Randy Dunlap 2023-12-22 04:22:53 UTC
Chris, does a 6.4 kernel run successfully?
Please post a successful boot log and kernel .config file.
Comment 5 Chris Rankin 2023-12-22 09:20:30 UTC
Created attachment 305643
6.4.16 config
Comment 6 Chris Rankin 2023-12-22 09:21:32 UTC
Created attachment 305644
dmesg log for 6.4.16

Dmesg log for successful boot with 6.4.16.
Comment 7 Chris Rankin 2023-12-22 15:56:20 UTC
Created attachment 305645
Crash dump - 1

6.6.8 again, except recompiled with an updated toolchain. Once the kernel had locked up, I managed to trigger an oops via SysRq-"kill all tasks".

I was obviously only able to capture what would fit on my screen at the time.
Comment 8 Chris Rankin 2023-12-22 15:57:39 UTC
Created attachment 305646
Crash dump - 2

Same crash, screenshot 2.
Comment 9 Chris Rankin 2023-12-22 15:58:21 UTC
Created attachment 305647
Crash dump - 3

Same crash, screenshot 3.
Comment 10 Chris Rankin 2023-12-22 15:59:22 UTC
Created attachment 305648
Crash dump - 4

Same crash, screenshot 4.

I gave up after this and rebooted the box back to 6.4.16.
Comment 11 Chris Rankin 2023-12-23 11:33:30 UTC
One obvious theory is that the e100 driver could be broken as of 6.5.0.
