Bug 219920 - Page fault crash on RT 6.12.13
Summary: Page fault crash on RT 6.12.13
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: Intel Linux
Importance: P3 normal
Assignee: Andrew Morton
Reported: 2025-03-24 12:20 UTC by Ismo Puustinen
Modified: 2025-03-24 12:21 UTC

Regression: No

Description Ismo Puustinen 2025-03-24 12:20:46 UTC
Running Talos Linux with a PREEMPT_RT 6.12.13 kernel. This is the same system that sees the issue in https://bugzilla.kernel.org/show_bug.cgi?id=219919 . Every so often (roughly once every two days) the system crashes with a kernel NULL pointer dereference, reported as a page fault on a not-present page.

The faulting address always seems to be 0x51 (0000000000000051), and the fault is always in pick_task_fair. The system is more or less idle when the crash happens.

[45215.650366] BUG: kernel NULL pointer dereference, address: 0000000000000051
[45215.650370] #PF: supervisor read access in kernel mode
[45215.650372] #PF: error_code(0x0000) - not-present page
[45215.650374] PGD 80000001245f8067 P4D 80000001245f8067 PUD 1245fa067 PMD 0 
[45215.650378] Oops: Oops: 0000 [#1] PREEMPT_RT SMP PTI
[45215.650382] CPU: 2 UID: 0 PID: 217267 Comm: init Not tainted 6.12.13-talos #1
[45215.650385] Hardware name: HPE ProLiant DL110 Gen11/ProLiant DL110 Gen11, BIOS 2.44 01/17/2025
[45215.650386] RIP: 0010:pick_task_fair+0x51/0x100
[45215.650392] Code: 00 00 4c 89 ed eb 3d eb 54 48 8b 5d 68 48 85 db 74 10 48 8b 73 70 48 89 ef e8 1b 7b ff ff 85 c0 75 16 48 89 ef e8 8f ab ff ff <80> 78 51 00 48 89 c3 75 51 48 85 c0 74 ba 48 8b ab a8 00 00 00 48
[45215.650395] RSP: 0018:ffffbb0aeee1b910 EFLAGS: 00010086
[45215.650397] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000003169
[45215.650398] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98046373e400
[45215.650400] RBP: ffff98046373e400 R08: 0000000000000329 R09: 0000000000000002
[45215.650401] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9840be131900
[45215.650403] R13: ffff9840be131980 R14: 0000000000000002 R15: 0000000000000000
[45215.650404] FS:  000000c000783198(0000) GS:ffff9840be100000(0000) knlGS:0000000000000000
[45215.650406] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[45215.650407] CR2: 0000000000000051 CR3: 0000000135076003 CR4: 0000000000f72ef0
[45215.650409] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[45215.650410] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[45215.650411] PKRU: 55555554
[45215.650412] Call Trace:
[45215.650413]  <TASK>
[45215.650416]  ? __die_body.cold+0x19/0x26
[45215.650422]  ? page_fault_oops+0x15a/0x2d0
[45215.650429]  ? exc_page_fault+0x70/0x150
[45215.650433]  ? asm_exc_page_fault+0x26/0x30
[45215.650437]  ? pick_task_fair+0x51/0x100
[45215.650440]  pick_next_task_fair+0x21/0x390
[45215.650443]  __pick_next_task+0x3e/0x1a0
[45215.650447]  __schedule+0x1da/0x14b0
[45215.650452]  ? rt_spin_lock+0x28/0x60
[45215.650455]  ? plist_add+0xdd/0x140
[45215.650458]  schedule+0x27/0xd0
[45215.650461]  futex_wait_queue+0x65/0x90
[45215.650464]  __futex_wait+0xa5/0x110
[45215.650468]  ? __pfx_futex_wake_mark+0x10/0x10
[45215.650471]  futex_wait+0x79/0x120
[45215.650475]  do_futex+0xcb/0x190
[45215.650478]  __x64_sys_futex+0x129/0x1e0
[45215.650481]  do_syscall_64+0x82/0x160
[45215.650491]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[45215.650495] RIP: 0033:0x47d5c3
[45215.650497] Code: 24 20 c3 cc cc cc cc 48 8b 7c 24 08 8b 74 24 10 8b 54 24 14 4c 8b 54 24 18 4c 8b 44 24 20 44 8b 4c 24 28 b8 ca 00 00 00 0f 05 <89> 44 24 30 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[45215.650499] RSP: 002b:000000c001a2bcf8 EFLAGS: 00000286 ORIG_RAX: 00000000000000ca
[45215.650501] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000047d5c3
[45215.650502] RDX: 0000000000000000 RSI: 0000000000000080 RDI: 000000c000783248
[45215.650504] RBP: 000000c001a2bd40 R08: 0000000000000000 R09: 0000000000000000
[45215.650505] R10: 0000000000000000 R11: 0000000000000286 R12: 000000000001d283
[45215.650506] R13: 0000000000000001 R14: 000000c0012bc8c0 R15: 0000000000000003
[45215.650509]  </TASK>
[45215.650509] Modules linked in: igbvf iavf libeth vrf vfio_pci vfio_pci_core vfio_iommu_type1 vfio mlx5_ib mlx5_core ice igb mlxfw nvme hpilo i2c_algo_bit libie
[45215.650522] CR2: 0000000000000051
[45215.650525] ---[ end trace 0000000000000000 ]---
[45215.941503] RIP: 0010:pick_task_fair+0x51/0x100
[45215.941513] Code: 00 00 4c 89 ed eb 3d eb 54 48 8b 5d 68 48 85 db 74 10 48 8b 73 70 48 89 ef e8 1b 7b ff ff 85 c0 75 16 48 89 ef e8 8f ab ff ff <80> 78 51 00 48 89 c3 75 51 48 85 c0 74 ba 48 8b ab a8 00 00 00 48
[45215.941516] RSP: 0018:ffffbb0aeee1b910 EFLAGS: 00010086
[45215.941519] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000003169
[45215.941521] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff98046373e400
[45215.941522] RBP: ffff98046373e400 R08: 0000000000000329 R09: 0000000000000002
[45215.941523] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9840be131900
[45215.941524] R13: ffff9840be131980 R14: 0000000000000002 R15: 0000000000000000
[45215.941526] FS:  000000c000783198(0000) GS:ffff9840be100000(0000) knlGS:0000000000000000
[45215.941527] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[45215.941529] CR2: 0000000000000051 CR3: 0000000135076003 CR4: 0000000000f72ef0
[45215.941530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[45215.941531] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[45215.941532] PKRU: 55555554
[45215.941534] Kernel panic - not syncing: Fatal exception
[45216.976349] Shutting down cpus with NMI
[45216.976365] Kernel Offset: 0x37800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
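
As a cross-check of the dump above, the following is a minimal Python sketch that decodes the instruction at RIP from the "Code:" line and compares the resulting effective address with CR2. The helper names (split_code, decode_group1_disp8, decode_pf_error_code) are made up for this illustration, and the decoder deliberately handles only the one encoding seen here (opcode 0x80 with a [reg + disp8] operand), so it is not a general disassembler.

```python
# Values copied from the oops above.
CODE_LINE = (
    "00 00 4c 89 ed eb 3d eb 54 48 8b 5d 68 48 85 db 74 10 48 8b 73 70 "
    "48 89 ef e8 1b 7b ff ff 85 c0 75 16 48 89 ef e8 8f ab ff ff "
    "<80> 78 51 00 48 89 c3 75 51 48 85 c0 74 ba 48 8b ab a8 00 00 00 48"
)
RAX = 0x0000000000000000  # from the register dump
CR2 = 0x0000000000000051  # faulting address
PF_ERROR_CODE = 0x0000    # from "#PF: error_code(0x0000)"

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
GROUP1 = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]


def split_code(line):
    """Split the Code: bytes into (bytes before RIP, bytes from RIP on)."""
    tokens = line.split()
    idx = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    before = bytes(int(t, 16) for t in tokens[:idx])
    at_rip = bytes(int(t.strip("<>"), 16) for t in tokens[idx:])
    return before, at_rip


def decode_group1_disp8(insn):
    """Decode 'opcode 0x80, ModRM, disp8, imm8' (no SIB, no prefixes)."""
    opcode, modrm, disp8, imm8 = insn[0], insn[1], insn[2], insn[3]
    if opcode != 0x80:
        raise ValueError("only opcode 0x80 is handled in this sketch")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the [reg + disp8] form is handled")
    disp = disp8 - 256 if disp8 >= 0x80 else disp8  # sign-extend disp8
    return GROUP1[reg], REG64[rm], disp, imm8


def decode_pf_error_code(code):
    """Decode the low three bits of an x86 page fault error code."""
    present = "protection violation" if code & 1 else "not-present page"
    access = "write" if code & 2 else "read"
    mode = "user" if code & 4 else "supervisor"
    return present, access, mode


before, at_rip = split_code(CODE_LINE)
op, base, disp, imm = decode_group1_disp8(at_rip)
effective = RAX + disp  # base register is rax in this encoding

print(f"faulting instruction: {op}b ${imm:#x}, {disp:#x}(%{base})")
print(f"effective address:    {effective:#x} (CR2 = {CR2:#x})")
print(f"byte 5 before RIP:    {before[-5]:#x} (0xe8 is call rel32)")
print("page fault type:      " + ", ".join(decode_pf_error_code(PF_ERROR_CODE)))
```

Under these assumptions the instruction is a byte compare at offset 0x51 from RAX, and RAX is 0, which matches CR2 = 0x51. The five bytes just before RIP look like a call, so the NULL value appears to be a pointer returned by that call and then dereferenced without a check. Which function the call targets cannot be determined from the dump alone; resolving it would need the matching vmlinux.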
