Bug 218227

Summary: page dumped because: VM_BUG_ON_PAGE(n > 0 && !((__builtin_constant_p(PG_head) && __builtin_constant_p((uintptr_t)(&page->flags) != (uintptr_t)((vo>
Product: Memory Management
Component: Page Allocator
Reporter: Luis Chamberlain (mcgrof)
Assignee: Andrew Morton (akpm)
Status: NEW
Severity: normal
Priority: P3
CC: matthew, vbabka
Hardware: All
OS: Linux
Kernel Version: v6.6-rc5
Subsystem:
Regression: No
Bisected commit-id:

Description Luis Chamberlain 2023-12-05 04:17:40 UTC
Ran into the following crash while testing a baseline for XFS on v6.6-rc5 with kdevops [0], against the following test sections and tests as defined in the kdevops config for XFS [1]:

  * xfs_reflink_4k: F:1/14 - means it fails one out of 14 loops
  - xfs/229: https://gist.github.com/mcgrof/ed4a3c025337434231303d70fefea684
  * xfs_crc_rtdev: F:1/20 - means it fails one out of 20 loops
  - https://gist.github.com/mcgrof/0831c332fa33157fd6dadf9bf4bfc068

So to reproduce you can use kdevops and run the test in a loop.

[0] https://github.com/linux-kdevops/
[1] https://github.com/linux-kdevops/kdevops/blob/master/playbooks/roles/fstests/templates/xfs/xfs.config

The reflink crash is below. It tells me that we can probably reproduce this easily by just triggering compaction during these tests, which means we should probably run fsstress + compaction in a loop as a new fstests test.
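
A minimal sketch of the "workload + compaction in a loop" idea, not an actual fstests test. It assumes root privileges and a kernel with CONFIG_COMPACTION, so that writing 1 to /proc/sys/vm/compact_memory requests compaction of all zones. The helper name compact_while_running, the 0.5 second interval, and the fsstress path and scratch directory in the closing comment are hypothetical placeholders.

```python
import subprocess
import sys
import time
from pathlib import Path

# procfs knob: writing 1 asks the kernel to compact all zones
# (needs root and CONFIG_COMPACTION).
COMPACT_MEMORY = Path("/proc/sys/vm/compact_memory")


def compact_while_running(workload, knob=COMPACT_MEMORY, interval=0.5):
    """Run `workload` (an argv list) and write to `knob` until it exits.

    Returns (workload exit code, number of compaction triggers written).
    The name and the default interval are illustrative assumptions.
    """
    proc = subprocess.Popen(workload)
    triggers = 0
    while proc.poll() is None:
        knob.write_text("1\n")  # request a compaction pass
        triggers += 1
        time.sleep(interval)
    return proc.returncode, triggers


# Hypothetical invocation; the scratch directory is a placeholder, and
# -d, -n, -p are fsstress's directory, op-count and process-count options:
#   compact_while_running(
#       ["fsstress", "-d", "/mnt/scratch/stress", "-n", "10000", "-p", "4"])
```

Wrapping this in an outer loop, as with the F:1/14 and F:1/20 failure rates above, would presumably be needed, since a single pass may not hit the race.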

Nov 04 15:38:18 base-xfs-crc-rtdev unknown: run fstests xfs/066 at 2023-11-04 15:38:18
Nov 04 15:38:19 base-xfs-crc-rtdev kernel: XFS (loop16): Mounting V5 Filesystem df0efea2-60d6-40f0-9fea-c873665e5884
Nov 04 15:38:19 base-xfs-crc-rtdev kernel: XFS (loop16): Ending clean mount
Nov 04 15:38:20 base-xfs-crc-rtdev kernel: XFS (loop5): Mounting V5 Filesystem 9ee2e5d4-eb61-4dd2-951a-e99719220010
Nov 04 15:38:20 base-xfs-crc-rtdev kernel: XFS (loop5): Ending clean mount
Nov 04 15:38:20 base-xfs-crc-rtdev kernel: XFS (loop5): Unmounting Filesystem 9ee2e5d4-eb61-4dd2-951a-e99719220010
Nov 04 15:38:20 base-xfs-crc-rtdev kernel: XFS (loop5): Mounting V5 Filesystem 9ee2e5d4-eb61-4dd2-951a-e99719220010
Nov 04 15:38:20 base-xfs-crc-rtdev kernel: XFS (loop5): Ending clean mount
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: page:000000009006bf10 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x3f8a0 pfn:0x1035c0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: page_type: 0xffffff7f(buddy)
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: raw: 017fffc000000000 ffffe704c422f808 ffffe704c41ac008 0000000000000000
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: raw: 000000000003f8a0 0000000000000005 00000000ffffff7f 0000000000000000
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: page dumped because: VM_BUG_ON_PAGE(n > 0 && !((__builtin_constant_p(PG_head) && __builtin_constant_p((uintptr_t)(&page->flags) != (uintptr_t)((vo>
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: ------------[ cut here ]------------
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: kernel BUG at include/linux/page-flags.h:314!
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: CPU: 6 PID: 2435641 Comm: md5sum Not tainted 6.6.0-rc5 #2
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RIP: 0010:folio_flags+0x65/0x70
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: Code: a8 40 74 de 48 8b 47 48 a8 01 74 d6 48 83 e8 01 48 39 c7 75 bd eb cb 48 8b 07 a8 40 75 c8 48 c7 c6 d8 a7 c3 89 e8 3b c7 fa ff <0f> 0b 66 0f >
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RSP: 0018:ffffad51c0bfb7a8 EFLAGS: 00010246
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RAX: 000000000000015f RBX: ffffe704c40d7000 RCX: 0000000000000000
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RDX: 0000000000000000 RSI: ffffffff89be8040 RDI: 00000000ffffffff
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RBP: 0000000000103600 R08: 0000000000000000 R09: ffffad51c0bfb658
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: R10: 0000000000000003 R11: ffffffff89eacb08 R12: 0000000000000035
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: R13: ffffe704c40d7000 R14: 0000000000000000 R15: ffffad51c0bfb930
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: FS:  00007f350c51b740(0000) GS:ffff9b62fbd80000(0000) knlGS:0000000000000000
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: CR2: 0000555860919508 CR3: 00000001217fe002 CR4: 0000000000770ee0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: PKRU: 55555554
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: Call Trace:
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  <TASK>
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? die+0x32/0x80
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? do_trap+0xd6/0x100
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? folio_flags+0x65/0x70
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? do_error_trap+0x6a/0x90
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? folio_flags+0x65/0x70
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? exc_invalid_op+0x4c/0x60
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? folio_flags+0x65/0x70
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? asm_exc_invalid_op+0x16/0x20
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? folio_flags+0x65/0x70
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? folio_flags+0x65/0x70
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  PageHuge+0x67/0x80
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  isolate_migratepages_block+0x1c5/0x13b0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? __pv_queued_spin_lock_slowpath+0x16c/0x370
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  compact_zone+0x746/0xfc0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  compact_zone_order+0xbb/0x100
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  try_to_compact_pages+0xf0/0x2f0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  __alloc_pages_direct_compact+0x78/0x210
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  __alloc_pages_slowpath.constprop.0+0xac1/0xdb0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? prepare_alloc_pages.constprop.0+0xff/0x1b0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  __alloc_pages+0x218/0x240
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  folio_alloc+0x17/0x50
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  page_cache_ra_order+0x15a/0x340
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  filemap_get_pages+0x136/0x6c0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? update_load_avg+0x7e/0x780
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? current_time+0x2b/0xd0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  filemap_read+0xce/0x340
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? do_sched_setscheduler+0x111/0x1b0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? nohz_balance_exit_idle+0x16/0xc0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? trigger_load_balance+0x302/0x370
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ? preempt_count_add+0x47/0xa0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  xfs_file_buffered_read+0x52/0xd0 [xfs]
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  xfs_file_read_iter+0x73/0xe0 [xfs]
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  vfs_read+0x1b1/0x300
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  ksys_read+0x63/0xe0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  do_syscall_64+0x38/0x90
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RIP: 0033:0x7f350c615a5d
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d a6 60 0a 00 e8 99 08 02 00 66 0f 1f 84 00 00 00 00 00 80 3d 81 3b 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 >
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RSP: 002b:00007ffca3ef5108 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RAX: ffffffffffffffda RBX: 000055ff140712f0 RCX: 00007f350c615a5d
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RDX: 0000000000008000 RSI: 000055ff140714d0 RDI: 0000000000000003
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: RBP: 00007f350c6ee600 R08: 0000000000000900 R09: 000000000b9bc05b
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: R10: 000055ff140794d0 R11: 0000000000000246 R12: 000055ff140714d0
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: R13: 0000000000008000 R14: 0000000000000a68 R15: 00007f350c6edd00
Nov 04 15:38:23 base-xfs-crc-rtdev kernel:  </TASK>
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: Modules linked in: dm_zero dm_thin_pool dm_persistent_data dm_bio_prison sd_mod sg scsi_mod scsi_common dm_snapshot dm_bufio dm_flakey xfs sunrpc >
Nov 04 15:38:23 base-xfs-crc-rtdev kernel: ---[ end trace 0000000000000000 ]---

Comment 1 Luis Chamberlain 2023-12-05 04:23:13 UTC
Sorry, that should have been:

  * xfs_crc_rtdev: F:1/20
  - xfs/066: https://gist.github.com/mcgrof/0831c332fa33157fd6dadf9bf4bfc068

Comment 2 Vlastimil Babka 2023-12-08 17:57:56 UTC
I believe this is due to 9c5ccf2db04b ("mm: remove HUGETLB_PAGE_DTOR").
Let's continue on the thread of said patch: https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/