Bug 209911 - Intermittent panics/oopses during boot - usually when mounting drives or loading AppArmor profiles
Status: RESOLVED UNREPRODUCIBLE
Alias: None
Product: Other
Classification: Unclassified
Component: Other
Hardware: x86-64 Linux
Importance: P1 high
Assignee: other_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-27 19:16 UTC by Radosław Wyrzykowski
Modified: 2020-11-24 08:44 UTC

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Kernel config (239.67 KB, text/plain) - 2020-10-27 19:16 UTC, Radosław Wyrzykowski
ver_linux output (2.24 KB, text/plain) - 2020-10-27 19:18 UTC, Radosław Wyrzykowski
cpuinfo (5.70 KB, text/plain) - 2020-10-27 19:18 UTC, Radosław Wyrzykowski
/proc/modules (8.23 KB, text/plain) - 2020-10-27 19:19 UTC, Radosław Wyrzykowski
/proc/ioports (1.16 KB, text/plain) - 2020-10-27 19:19 UTC, Radosław Wyrzykowski
/proc/iomem (3.44 KB, text/plain) - 2020-10-27 19:19 UTC, Radosław Wyrzykowski
sudo lspci -vvv output (64.63 KB, text/plain) - 2020-10-27 19:20 UTC, Radosław Wyrzykowski
/proc/scsi/scsi (336 bytes, text/plain) - 2020-10-27 19:21 UTC, Radosław Wyrzykowski
Example panic (4.01 MB, image/jpeg) - 2020-10-27 19:23 UTC, Radosław Wyrzykowski
Another example panic (4.00 MB, image/jpeg) - 2020-10-27 19:27 UTC, Radosław Wyrzykowski

Description Radosław Wyrzykowski 2020-10-27 19:16:31 UTC
Created attachment 293249
Kernel config

[2.] Since at least kernel 5.8.7, I've been experiencing intermittent panics and oopses during boot, with no apparent pattern other than: 1) if the system manages to boot without any errors, everything works fine, and 2) they seem to occur mostly when mounting drives (thankfully, I haven't seen any corruption) and when loading AppArmor profiles. After an oops, some services fail to start, requiring a reboot.

[4.1] Linux version 5.9.1-rtest (REDACTED@REDACTED) (gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.34.0.20200325-1) #23 SMP Fri Oct 23 15:56:43 CEST 2020


[5.] The most recent kernel without the bug appears to be 5.8.5, if my bisect was correct. I have verified that 5.8.4 is unaffected, and I continue to stick to it for the time being. Bisect log follows:
git bisect start
# bad: [66534fe2b9400003b0f49cc94686a162132b64e7] Linux 5.8.6
git bisect bad 66534fe2b9400003b0f49cc94686a162132b64e7
# good: [9ece50d8a470ca7235ffd6ac0f9c5f0f201fe2c8] Linux 5.8.5
git bisect good 9ece50d8a470ca7235ffd6ac0f9c5f0f201fe2c8
# good: [e3a5fa63a2e5b9a785d5ef1d4bdfc48965d3027e] netfilter: avoid ipv6 -> nf_defrag_ipv6 module dependency
git bisect good e3a5fa63a2e5b9a785d5ef1d4bdfc48965d3027e
# good: [68adec4646bf187d76d8d07dc3ee3b0fbd5e0150] usb: renesas-xhci: remove version check
git bisect good 68adec4646bf187d76d8d07dc3ee3b0fbd5e0150
# good: [2b9be3af1037c4b470d9ae3f5a200472ed44ba09] usb: uas: Add quirk for PNY Pro Elite
git bisect good 2b9be3af1037c4b470d9ae3f5a200472ed44ba09
# bad: [84e29c7cf5913311b72b663d93c3fac03626efd0] usb: typec: ucsi: Fix 2 unlocked ucsi_run_command calls
git bisect bad 84e29c7cf5913311b72b663d93c3fac03626efd0
# good: [d884a90cec5ace7ca767788ca574e68f9af2998a] usb: dwc3: gadget: Don't setup more than requested
git bisect good d884a90cec5ace7ca767788ca574e68f9af2998a
# good: [0ca26ffe3c1f381f95a0c0bbb5e8241d8c7183cc] usb: storage: Add unusual_uas entry for Sony PSZ drives
git bisect good 0ca26ffe3c1f381f95a0c0bbb5e8241d8c7183cc
# bad: [53965c79c2dbdc898ffc7fdbab37e920594f5e14] USB: Fix device driver race
git bisect bad 53965c79c2dbdc898ffc7fdbab37e920594f5e14
# good: [a18d5d456c009d69bd032fc96f4751d58e6ccc3a] USB: Also match device drivers using the ->match vfunc
git bisect good a18d5d456c009d69bd032fc96f4751d58e6ccc3a
# first bad commit: [53965c79c2dbdc898ffc7fdbab37e920594f5e14] USB: Fix device driver race
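
For anyone who wants to check the log above mechanically, here is a minimal sketch (not part of any git tooling) that extracts the verdicts and the first bad commit from `git bisect log` output. The function name `parse_bisect_log` and the embedded excerpt are illustrative assumptions; only the last steps of the log are included for brevity.

```python
import re

# Final steps of the bisect log above, copied verbatim.
BISECT_LOG = """\
# bad: [53965c79c2dbdc898ffc7fdbab37e920594f5e14] USB: Fix device driver race
git bisect bad 53965c79c2dbdc898ffc7fdbab37e920594f5e14
# good: [a18d5d456c009d69bd032fc96f4751d58e6ccc3a] USB: Also match device drivers using the ->match vfunc
git bisect good a18d5d456c009d69bd032fc96f4751d58e6ccc3a
# first bad commit: [53965c79c2dbdc898ffc7fdbab37e920594f5e14] USB: Fix device driver race
"""

# Matches the comment lines that git writes into the bisect log.
COMMENT_RE = re.compile(
    r"^# (good|bad|first bad commit): \[([0-9a-f]{40})\] (.*)$"
)


def parse_bisect_log(text):
    """Return (steps, first_bad) parsed from bisect log comment lines.

    steps is a list of (verdict, sha, subject) tuples; first_bad is a
    (sha, subject) tuple, or None if the bisect did not finish.
    """
    steps = []
    first_bad = None
    for line in text.splitlines():
        match = COMMENT_RE.match(line)
        if not match:
            continue  # skip the "git bisect ..." command lines
        verdict, sha, subject = match.groups()
        if verdict == "first bad commit":
            first_bad = (sha, subject)
        else:
            steps.append((verdict, sha, subject))
    return steps, first_bad


steps, first_bad = parse_bisect_log(BISECT_LOG)
print(len(steps), "steps; first bad:", first_bad)
```

Because the panics are intermittent, a "good" verdict from a single clean boot may be unreliable, so this result is best treated as a lead rather than a confirmed culprit.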


[6.] There are a lot of them, so I'll provide some examples. (Panics are attached as pictures. I also managed to get a deadlock while shutting down.)

general protection fault, probably for non-canonical address 0x940070756f726763: 0000 [#1] SMP PTI
CPU: 3 PID: 1469 Comm: pidof Tainted: G            E     5.9.1-rtest #23
Hardware name: Gigabyte Technology Co., Ltd. Z370 HD3/Z370 HD3-CF, BIOS F5 10/30/2017
RIP: 0010:aa_get_task_label+0x22/0xb0
Code: cc cc cc cc cc cc cc cc 0f 1f 44 00 00 41 54 48 8b 97 f8 06 00 00 48 63 05 6b 8b 45 01 48 8b 52 78 4c 8b 24 02 4d 85 e4 74 1e <41> f6 44 24 41 08 75 1c b8 01 >
RSP: 0018:ffffa5f701627cc0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff8d6956dcbb40
RDX: ffff8d6c2cdeb100 RSI: 000000000000000d RDI: ffff8d6c639a5e80
RBP: ffff8d6947de1eb0 R08: 0000000000000001 R09: 0000000000000000
R10: 00000000000158a0 R11: 0000000000000000 R12: 940070756f726763
R13: 0000000000000001 R14: ffffffff89062020 R15: ffff8d69579fe4e8
FS:  00007fb98ac147c0(0000) GS:ffff8d6c6f580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555643ed90c8 CR3: 00000003e51c8005 CR4: 00000000003706e0
Call Trace:
 apparmor_ptrace_access_check+0x4d/0x180
 security_ptrace_access_check+0x28/0x40
 ptrace_may_access+0x2a/0x50
 do_task_stat+0x81/0xdb0
 ? __mod_memcg_lruvec_state+0x21/0xe0
 proc_single_show+0x4d/0xb0
 seq_read+0xb4/0x460
 vfs_read+0x9c/0x180
 ksys_read+0x5f/0xe0
 do_syscall_64+0x33/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fb98af9389e
Code: c0 e9 b6 fe ff ff 50 48 8d 3d 3e 2d 0a 00 e8 19 e9 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 >
RSP: 002b:00007ffc455b0758 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fb98af9389e
RDX: 0000000000000400 RSI: 0000555643ed8cc0 RDI: 0000000000000004
RBP: 0000000000000004 R08: 00007fb98b029040 R09: 00007ffc455b05e0
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fb98b093e20 R14: 0000000000000000 R15: 00007ffc455b199e
Modules linked in: ip6t_REJECT(E) nf_reject_ipv6(E) ip6t_rpfilter(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_conntrack(E) nf_conntrack_netbios_ns(E) nf_conntrack_broadca>
 efi_pstore(E) snd_hda_intel(E) i2c_smbus(E) e1000e(E) snd_intel_dspcfg(E) snd_hda_codec(E) mei_me(E) snd_hda_core(E) mei(E) snd_hwdep(E) snd_pcm(E) apple_mfi_fastc>
---[ end trace b63e959ae8e7df89 ]---
RIP: 0010:aa_get_task_label+0x22/0xb0
Code: cc cc cc cc cc cc cc cc 0f 1f 44 00 00 41 54 48 8b 97 f8 06 00 00 48 63 05 6b 8b 45 01 48 8b 52 78 4c 8b 24 02 4d 85 e4 74 1e <41> f6 44 24 41 08 75 1c b8 01 >
RSP: 0018:ffffa5f701627cc0 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff8d6956dcbb40
RDX: ffff8d6c2cdeb100 RSI: 000000000000000d RDI: ffff8d6c639a5e80
RBP: ffff8d6947de1eb0 R08: 0000000000000001 R09: 0000000000000000
R10: 00000000000158a0 R11: 0000000000000000 R12: 940070756f726763
R13: 0000000000000001 R14: ffffffff89062020 R15: ffff8d69579fe4e8
FS:  00007fb98ac147c0(0000) GS:ffff8d6c6f580000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555643ed90c8 CR3: 00000003e51c8005 CR4: 00000000003706e0
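
A possible way to inspect the faulting value in R12 from the oops above (0x940070756f726763) is to read it as little-endian bytes and look for printable ASCII. The sketch below does just that; the helper name `register_as_ascii` is an illustrative assumption. Text showing up in a register that should hold a pointer may suggest the pointer was overwritten with string data, but that is only a hint, not a diagnosis.

```python
def register_as_ascii(value, width=8):
    """Render a register value as little-endian bytes.

    Printable ASCII bytes are shown as characters; everything else
    is shown as '.'.
    """
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# R12 as reported in the general protection fault above.
R12 = 0x940070756F726763
print(register_as_ascii(R12))  # cgroup..
```

The six low bytes spell "cgroup", followed by a NUL byte and 0x94.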

~~~~~~~~~~~~~~~~~~~~~~~~

stack segment: 0000 [#1] SMP PTI
CPU: 2 PID: 614 Comm: systemd-udevd Tainted: G            E     5.9.1-rtest #23
Hardware name: Gigabyte Technology Co., Ltd. Z370 HD3/Z370 HD3-CF, BIOS F5 10/30/2017
RIP: 0010:apparmor_file_alloc_security+0x39/0x1e0
Code: 53 48 8b 90 00 07 00 00 48 63 05 d2 85 44 01 48 63 1d cf 85 44 01 48 8b 52 78 48 03 9f c0 00 00 00 48 8b 2c 02 e8 07 98 57 00 <f6> 45 41 08 75 7c c7 03 00 00 >
RSP: 0018:ffffb66f009bbd00 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8bb8ba7a5d68 RCX: 0000000000000001
RDX: ffff8bb86c248310 RSI: 0000000000000000 RDI: ffff8bb86f293300
RBP: e6e86a07bc458355 R08: 0000000000000018 R09: ffff8bb8ba7a5d68
R10: ffffffffffffdf90 R11: 0000000000000000 R12: ffff8bb86f293300
R13: ffffb66f009bbee4 R14: ffffb66f009bbdd0 R15: ffff8bb8a5bfde80
FS:  00007f1ace64f940(0000) GS:ffff8bb8bf500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd290e4a9e0 CR3: 00000003efd08005 CR4: 00000000003706e0
Call Trace:
 security_file_alloc+0x48/0x90
 __alloc_file+0x52/0x110
 ? try_to_wake_up+0x20a/0x550
 alloc_empty_file+0x41/0xb0
 path_openat+0x43/0x1d0
 ? autoremove_wake_function+0xe/0x30
 do_filp_open+0x88/0x130
 ? __wake_up_common_lock+0x8a/0xc0
 ? _cond_resched+0x16/0x40
 ? slab_pre_alloc_hook.constprop.0+0xd0/0x110
 ? getname_flags.part.0+0x29/0x1a0
 ? __check_object_size.part.0+0x11f/0x140
 do_sys_openat2+0x97/0x150
 __x64_sys_openat+0x54/0x90
 do_syscall_64+0x33/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f1acf1d95f7
Code: 25 00 00 41 00 3d 00 00 41 00 74 47 64 8b 04 25 18 00 00 00 85 c0 75 6b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 95 00 >
RSP: 002b:00007fff2061adf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 000055d0673f28c0 RCX: 00007f1acf1d95f7
RDX: 0000000000080000 RSI: 00007fff2061af00 RDI: 00000000ffffff9c
RBP: 00007fff2061af00 R08: 00007f1acf26f040 R09: 00007fff2061ad00
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080000
R13: 0000000000000026 R14: 0000000000000007 R15: 000055d0673f28c0
Modules linked in: pcc_cpufreq(E-) fjes(E-) acpi_cpufreq(E-) snd_hda_codec(E) apple_mfi_fastcharge(E) snd_hda_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) joydev(E)>
---[ end trace abe0218078b7cebb ]---
RIP: 0010:apparmor_file_alloc_security+0x39/0x1e0
Code: 53 48 8b 90 00 07 00 00 48 63 05 d2 85 44 01 48 63 1d cf 85 44 01 48 8b 52 78 48 03 9f c0 00 00 00 48 8b 2c 02 e8 07 98 57 00 <f6> 45 41 08 75 7c c7 03 00 00 >
RSP: 0018:ffffb66f009bbd00 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8bb8ba7a5d68 RCX: 0000000000000001
RDX: ffff8bb86c248310 RSI: 0000000000000000 RDI: ffff8bb86f293300
RBP: e6e86a07bc458355 R08: 0000000000000018 R09: ffff8bb8ba7a5d68
R10: ffffffffffffdf90 R11: 0000000000000000 R12: ffff8bb86f293300
R13: ffffb66f009bbee4 R14: ffffb66f009bbdd0 R15: ffff8bb8a5bfde80
FS:  00007f1ace64f940(0000) GS:ffff8bb8bf500000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd290e4a9e0 CR3: 00000003efd08005 CR4: 00000000003706e0
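
Both oopses above appear to dereference a register holding a non-canonical address: R12 (0x940070756f726763) in the first and RBP (0xe6e86a07bc458355) in the second. On x86-64, a non-canonical access through an RBP-based operand is reported as a stack segment fault rather than a general protection fault, which would explain the different headline of the second oops. The sketch below checks canonicality, assuming 48-bit virtual addresses (4-level paging); the function name `is_canonical` is an illustrative assumption.

```python
def is_canonical(addr, va_bits=48):
    """Return True if a 64-bit address is canonical.

    An address is canonical when bits 63 down to (va_bits - 1) are all
    equal, i.e. the address is a sign extension of its low va_bits bits.
    va_bits=48 is an assumption matching 4-level paging.
    """
    top = addr >> (va_bits - 1)               # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


# Register values copied from the two oopses above.
registers = {
    "R12 (first oops)": 0x940070756F726763,
    "RBP (second oops)": 0xE6E86A07BC458355,
    "RBX (second oops)": 0xFFFF8BB8BA7A5D68,
}
for name, value in registers.items():
    verdict = "canonical" if is_canonical(value) else "non-canonical"
    print(f"{name}: {value:#018x} is {verdict}")
```

RBX is included only as a contrast: it looks like an ordinary kernel pointer, while the two faulting registers do not.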

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I'll add more data…
Comment 1 Radosław Wyrzykowski 2020-10-27 19:18:05 UTC
Created attachment 293251
ver_linux output
Comment 2 Radosław Wyrzykowski 2020-10-27 19:18:32 UTC
Created attachment 293253
cpuinfo
Comment 3 Radosław Wyrzykowski 2020-10-27 19:19:12 UTC
Created attachment 293255
/proc/modules
Comment 4 Radosław Wyrzykowski 2020-10-27 19:19:35 UTC
Created attachment 293257
/proc/ioports
Comment 5 Radosław Wyrzykowski 2020-10-27 19:19:59 UTC
Created attachment 293259
/proc/iomem
Comment 6 Radosław Wyrzykowski 2020-10-27 19:20:41 UTC
Created attachment 293261
sudo lspci -vvv output
Comment 7 Radosław Wyrzykowski 2020-10-27 19:21:22 UTC
Created attachment 293263
/proc/scsi/scsi
Comment 8 Radosław Wyrzykowski 2020-10-27 19:23:29 UTC
Created attachment 293265
Example panic
Comment 9 Radosław Wyrzykowski 2020-10-27 19:27:06 UTC
Created attachment 293267
Another example panic
Comment 10 Radosław Wyrzykowski 2020-10-27 19:28:38 UTC
Please excuse the info dump; it's my first time reporting a kernel bug and I'm trying to follow the documentation.
Comment 11 Radosław Wyrzykowski 2020-10-27 20:22:16 UTC
Posted to the LKML as r.wyrz@protonmail.com due to server restrictions - I confirm (as much as I can this way, at least) that I'm the same person.
Comment 12 Radosław Wyrzykowski 2020-11-24 08:44:17 UTC
It seems this problem is fixed as of kernel 5.9.8.
