Bug 217123

Summary: NFS Server seems to write to locked file-system
Product: File System Reporter: Flole (flole)
Component: NFSAssignee: Chuck Lever (cel)
Status: RESOLVED CODE_FIX    
Severity: normal CC: cel, jack, jlayton
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.15.0-60 Subsystem:
Regression: No Bisected commit-id:

Description Flole 2023-03-04 02:20:46 UTC
When freezing a file-system on a server that's exported and accessed over NFS the following warning occasionally appears:

[1210423.486500] ------------[ cut here ]------------
[1210423.486503] WARNING: CPU: 8 PID: 1524732 at fs/ext4/ext4_jbd2.c:75 ext4_journal_check_start+0x68/0xb0
[1210423.486512] Modules linked in: blksnap(OE) nls_iso8859_1 uas usb_storage bdevfilter(OEK) nvidia_uvm(POE) nvidia(POE) rpcsec_gss_krb5 mpt3sas raid_class scsi_transport_sas mptctl mptbase dell_rbu nft_chain_nat xt_REDIRECT xt_MASQUERADE xt_owner xt_nat nf_nat nft_counter xt_LOG nf_log_syslog xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp nft_compat nf_tables nfnetlink binfmt_misc intel_rapl_msr rc_tt_1500 snd_hda_codec_hdmi ts2020 intel_rapl_common snd_hda_intel m88ds3103 sb_edac snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec i2c_mux x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_core kvm_intel dvb_usb_dw2102 dvb_usb dvb_core snd_hwdep snd_pcm joydev mc input_leds snd_timer kvm snd soundcore ipmi_ssif rapl dcdbas intel_cstate mei_me mei mac_hid acpi_power_meter ipmi_si sch_fq_codel ipmi_watchdog ipmi_devintf ipmi_msghandler 8021q garp mrp stp llc nfsd parport_pc ppdev auth_rpcgss nfs_acl lockd lp grace parport sunrpc ramoops efi_pstore reed_solomon
[1210423.486612]  pstore_blk pstore_zone ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear bonding tls mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crct10dif_pclmul hid_generic crc32_pclmul ghash_clmulni_intel rc_core usbhid cdc_ether aesni_intel usbnet igb crypto_simd mii ahci hid dca cryptd drm lpc_ich libahci megaraid_sas i2c_algo_bit wmi [last unloaded: blksnap]
[1210423.486662] CPU: 8 PID: 1524732 Comm: nfsd Tainted: P        W  OE K   5.15.0-60-generic #66-Ubuntu
[1210423.486666] Hardware name: Dell Inc. PowerEdge R720/0XH7F2, BIOS 2.9.0 12/06/2019
[1210423.486668] RIP: 0010:ext4_journal_check_start+0x68/0xb0
[1210423.486673] Code: 02 00 00 31 c0 48 85 d2 74 07 8b 02 83 e0 02 75 24 4c 8b 65 f8 c9 c3 cc cc cc cc 4c 8b 65 f8 b8 fb ff ff ff c9 c3 cc cc cc cc <0f> 0b eb ce b8 e2 ff ff ff eb dc 44 8b 42 10 68 08 e8 20 b7
45 31
[1210423.486676] RSP: 0018:ffffbc372216fac0 EFLAGS: 00010246
[1210423.486679] RAX: ffff97fd44420000 RBX: 0000000000000001 RCX: 0000000000000002
[1210423.486681] RDX: 0000000000000000 RSI: 0000000000001739 RDI: ffff97fd5e0b4800
[1210423.486683] RBP: ffffbc372216fac8 R08: 0000000000000000 R09: ffffffffb608e08a
[1210423.486686] R10: ffff97fd5cd4b630 R11: ffff9803db98ba10 R12: ffff97fd5e0b4800
[1210423.486688] R13: 0000000000000000 R14: 0000000000000008 R15: ffff97fd5e0b4800
[1210423.486690] FS:  0000000000000000(0000) GS:ffff98086f900000(0000) knlGS:0000000000000000
[1210423.486693] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1210423.486696] CR2: 00007fdcd824cfe8 CR3: 0000000216410001 CR4: 00000000000606e0
[1210423.486698] Call Trace:
[1210423.486700]  <TASK>
[1210423.486702]  __ext4_journal_start_sb+0x35/0x130
[1210423.486707]  ext4_dirty_inode+0x3a/0x80
[1210423.486713]  __mark_inode_dirty+0x5e/0x330
[1210423.486718]  generic_update_time+0x6c/0xd0
[1210423.486723]  file_update_time+0x127/0x140
[1210423.486727]  file_modified+0x27/0x40
[1210423.486731]  ext4_buffered_write_iter+0x5f/0x180
[1210423.486735]  ext4_file_write_iter+0x43/0x60
[1210423.486738]  do_iter_readv_writev+0x14a/0x1b0
[1210423.486741]  do_iter_write+0x8c/0x160
[1210423.486745]  vfs_iter_write+0x19/0x30
[1210423.486748]  nfsd_vfs_write+0x2d5/0x610 [nfsd]
[1210423.486794]  nfsd_write+0x7e/0x140 [nfsd]
[1210423.486821]  nfsd3_proc_write+0xe8/0x190 [nfsd]
[1210423.486867]  nfsd_dispatch+0x16c/0x270 [nfsd]
[1210423.486911]  svc_process_common+0x3da/0x720 [sunrpc]
[1210423.486972]  ? nfsd_svc+0x190/0x190 [nfsd]
[1210423.487017]  svc_process+0xbc/0x100 [sunrpc]
[1210423.487070]  nfsd+0xed/0x150 [nfsd]
[1210423.487101]  ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
[1210423.487132]  kthread+0x12a/0x150
[1210423.487137]  ? set_kthread_struct+0x50/0x50
[1210423.487142]  ret_from_fork+0x22/0x30
[1210423.487148]  </TASK>
[1210423.487150] ---[ end trace c1947abfeca4e0b1 ]---
[1210424.220978] ------------[ cut here ]------------
Comment 1 Chuck Lever 2023-03-06 14:26:19 UTC
I need an ext4 developer to explain the significance of this warning.
Comment 2 Flole 2023-03-06 14:33:44 UTC
I believe this warning means, that the FS was frozen and "something" still managed to write to it, which could potentially cause severe issues. But that's just what I managed to figure out from my short research, so someone with more knowledge might be able to provide a better or more accurate explanation.
Comment 3 Jan Kara 2023-03-06 15:12:58 UTC
Yes, correct. The code in nfsd_vfs_write() needs to call file_start_write() before it calls into vfs_iter_write() and call file_end_write() after it returns. Ideally it would be done inside vfs_iter_write() itself but it would not work for CODA and splice code. So we have what we have...
Comment 4 Chuck Lever 2023-03-06 15:16:44 UTC
Thanks, Jan. I'll send a patch to Linus this week.
Comment 5 Chuck Lever 2024-01-16 14:21:40 UTC
See commit fd9a2e1d5138 ("NFSD: Protect against filesystem freezing")