Bug 205563

Summary: nfs4_open_prepare causes unable to handle kernel paging request at ffffffffffffffb0

| Field | Value |
|---|---|
| Product | File System |
| Component | NFS |
| Status | NEW |
| Severity | normal |
| Priority | P1 |
| Hardware | All |
| OS | Linux |
| Kernel Version | 5.0.0-1025-gcp |
| Subsystem | |
| Regression | No |
| Bisected commit-id | |
| Reporter | harald.schilly |
| Assignee | Trond Myklebust (trondmy) |
| CC | bfields, bruno.santos, jasu, pmenzel+bugzilla.kernel.org |
| Attachments | Linux 5.4.39 messages (`dmesg` excerpt) |

Description
harald.schilly
2019-11-18 17:43:09 UTC
Please reproduce without any ZFS or other proprietary modules loaded.

Here's my reproduction without ZFS or proprietary modules. There are other out-of-tree drivers though: vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE).

/proc/version:
Linux version 5.4.6-arch1-1 (linux@archlinux) (gcc version 9.2.0 (GCC)) #1 SMP PREEMPT Sat, 21 Dec 2019 16:34:41 +0000

Mounts:
redacted.redacted.com:/srv/export on /mnt/nasroot type nfs4 (rw,nosuid,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=192.168.1.8,local_lock=none,addr=192.168.1.20)
redacted.redacted.com:/mnt/cryptroot/data on /mnt/nascrypt type nfs4 (rw,nosuid,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=192.168.1.8,local_lock=none,addr=192.168.1.20)

[ 8157.118266] BUG: unable to handle page fault for address: ffffffffffffffb0
[ 8157.118269] #PF: supervisor read access in kernel mode
[ 8157.118270] #PF: error_code(0x0000) - not-present page
[ 8157.118270] PGD 53a20f067 P4D 53a20f067 PUD 53a211067 PMD 0
[ 8157.118272] Oops: 0000 [#1] PREEMPT SMP PTI
[ 8157.118274] CPU: 6 PID: 19255 Comm: kworker/u16:0 Tainted: G OE 5.4.6-arch1-1 #1
[ 8157.118275] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3501 06/23/2017
[ 8157.118288] Workqueue: rpciod rpc_async_schedule [sunrpc]
[ 8157.118301] RIP: 0010:nfs4_get_valid_delegation+0x7/0x30 [nfsv4]
[ 8157.118303] Code: 6f fd 5c e5 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 f0 80 4f 48 08 c3 0f 1f 44 00 00 0f 1f 44 00 00 41 54 <4c> 8b 67 b0 31 f6 4c 89 e7 e8 3b fa ff ff 84 c0 b8 00 00 00 00 4c
[ 8157.118304] RSP: 0018:ffffb59d883f7dd8 EFLAGS: 00010202
[ 8157.118305] RAX: ffff9729768a3780 RBX: ffff972c14ec4800 RCX: 0000000000000000
[ 8157.118305] RDX: 0000000000028000 RSI: 0000000000000001 RDI: 0000000000000000
[ 8157.118306] RBP: ffff973150409c00 R08: 0000646f69637072 R09: 8080808080808080
[ 8157.118307] R10: ffff972bff432a2c R11: 0000000000000018 R12: ffff97314f848400
[ 8157.118307] R13: 0000000000000004 R14: 0000000000004281 R15: ffffffffc058e8b0
[ 8157.118308] FS: 0000000000000000(0000) GS:ffff973156b80000(0000) knlGS:0000000000000000
[ 8157.118309] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8157.118310] CR2: ffffffffffffffb0 CR3: 000000038613a006 CR4: 00000000003606e0
[ 8157.118311] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8157.118311] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8157.118312] Call Trace:
[ 8157.118320] nfs4_open_prepare+0x85/0x1f0 [nfsv4]
[ 8157.118329] __rpc_execute+0x7c/0x3a0 [sunrpc]
[ 8157.118338] rpc_async_schedule+0x29/0x40 [sunrpc]
[ 8157.118340] process_one_work+0x1e2/0x3b0
[ 8157.118342] worker_thread+0x4a/0x3d0
[ 8157.118343] kthread+0xfb/0x130
[ 8157.118345] ? process_one_work+0x3b0/0x3b0
[ 8157.118346] ? kthread_park+0x90/0x90
[ 8157.118348] ret_from_fork+0x35/0x40
[ 8157.118350] Modules linked in: tun fuse rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache intel_rapl_msr intel_rapl_common amdgpu ath9k ath9k_common x86_pkg_temp_thermal intel_powerclamp ath9k_hw coretemp gpu_sched snd_usb_audio nls_iso8859_1 i2c_algo_bit snd_hda_codec_hdmi nls_cp437 ttm vfat joydev mei_hdcp iTCO_wdt kvm_intel snd_hda_intel ath fat snd_usbmidi_lib snd_intel_nhlt drm_kms_helper iTCO_vendor_support btusb mac80211 snd_hda_codec snd_rawmidi btrtl kvm btbcm snd_seq_device drm snd_hda_core mc irqbypass btintel eeepc_wmi snd_hwdep asus_wmi intel_cstate battery bluetooth intel_uncore cfg80211 wacom input_leds sparse_keymap wmi_bmof r8169 intel_rapl_perf agpgart snd_pcm mxm_wmi realtek syscopyarea ecdh_generic sysfillrect snd_timer rfkill mousedev ecc e1000e libphy snd libarc4 sysimgblt mei_me i2c_i801 mei fb_sys_fops soundcore evdev mac_hid vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) auth_rpcgss sg sunrpc crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2
[ 8157.118372] dm_crypt dm_mod hid_generic usbhid hid uas usb_storage sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ahci serio_raw libahci aesni_intel libata xhci_pci crypto_simd cryptd scsi_mod glue_helper xhci_hcd i8042 wmi atkbd libps2 serio
[ 8157.118380] CR2: ffffffffffffffb0
[ 8157.118381] ---[ end trace 91c52b50d9ef8bfb ]---
[ 8157.118390] RIP: 0010:nfs4_get_valid_delegation+0x7/0x30 [nfsv4]
[ 8157.118392] Code: 6f fd 5c e5 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 f0 80 4f 48 08 c3 0f 1f 44 00 00 0f 1f 44 00 00 41 54 <4c> 8b 67 b0 31 f6 4c 89 e7 e8 3b fa ff ff 84 c0 b8 00 00 00 00 4c
[ 8157.118393] RSP: 0018:ffffb59d883f7dd8 EFLAGS: 00010202
[ 8157.118393] RAX: ffff9729768a3780 RBX: ffff972c14ec4800 RCX: 0000000000000000
[ 8157.118394] RDX: 0000000000028000 RSI: 0000000000000001 RDI: 0000000000000000
[ 8157.118395] RBP: ffff973150409c00 R08: 0000646f69637072 R09: 8080808080808080
[ 8157.118395] R10: ffff972bff432a2c R11: 0000000000000018 R12: ffff97314f848400
[ 8157.118396] R13: 0000000000000004 R14: 0000000000004281 R15: ffffffffc058e8b0
[ 8157.118397] FS: 0000000000000000(0000) GS:ffff973156b80000(0000) knlGS:0000000000000000
[ 8157.118398] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8157.118398] CR2: ffffffffffffffb0 CR3: 000000038613a006 CR4: 00000000003606e0
[ 8157.118399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8157.118399] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Reproduced on Linux 5.4.39 with XFS on the remote server.

(In reply to Paul Menzel from comment #3)
> Reproduced on Linux 5.4.39 with XFS on the remote server.

Could we see the message from the kernel logs?

Created attachment 296295
Linux 5.4.39 messages (`dmesg` excerpt)
Sure, please find them attached.
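
A note on reading the oops quoted earlier in this thread: the faulting address can be recomputed from the register dump. The sketch below assumes that the marked bytes in the `Code:` line, `<4c> 8b 67 b0`, decode to a 64-bit load from RDI plus a signed 8-bit displacement (`mov -0x50(%rdi),%r12`); that decoding is a reading of the dump, not something stated in the thread. The helper names are illustrative only.

```python
MASK64 = (1 << 64) - 1  # x86-64 addresses wrap modulo 2**64


def to_signed8(byte):
    """Interpret one byte as a signed 8-bit displacement."""
    return byte - 0x100 if byte & 0x80 else byte


def effective_address(base, disp8):
    """Base register plus signed 8-bit displacement, wrapped to 64 bits."""
    return (base + to_signed8(disp8)) & MASK64


rdi = 0x0000000000000000   # RDI from the register dump
disp = 0xB0                # last byte of the marked instruction (assumed disp8)

fault = effective_address(rdi, disp)
print(hex(fault))          # prints 0xffffffffffffffb0
```

With RDI equal to zero, the result matches the CR2 value in the oops, which is what one would expect if a NULL pointer was adjusted by a fixed negative offset before being read.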
The crash pattern seems similar to one that occurs with kernels 4.18.0-193.el8.x86_64 (CentOS 8 and RHEL 8) and 5.4.0-33.37 on Ubuntu Focal:

* https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1885010
* https://access.redhat.com/solutions/5606431 (I am unable to see the solution)

The patch with the commit message "nfs: fix NULL deference in nfs4_get_valid_delegation" seems to be what fixes this issue, as mentioned on the bug tracker for Ubuntu's Linux kernels and in Oracle's kernel errata: https://linux.oracle.com/errata/ELSA-2020-2427.html

And I quote:

[4.18.0-193.5.1_2]
- [fs] nfs: fix NULL deference in nfs4_get_valid_delegation ('J. Bruce Fields') [1837969 1831553]

The referenced commit was added in Linux 5.7-rc6 [1].

$ git describe 29fe839976266bc7c55b927360a1daae57477723
v5.7-rc5-2-g29fe83997626

It was applied to the Linux 5.4 stable series in v5.4.42 (commit d1538d8d), so it had not yet been applied when we experienced the crash.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29fe839976266bc7c55b927360a1daae57477723
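
For anyone triaging a similar crash on the 5.4 stable series, the version comparison above can be sketched as a small check. This is only a rough aid: the threshold (v5.4.42) is taken from the comment above, the function names are hypothetical, and the check is meaningful only for kernels that follow kernel.org stable numbering (such as 5.4.6-arch1-1 or 5.4.39). Distribution kernels that backport fixes under their own versioning (for example the RHEL 4.18.0-193 series mentioned above) cannot be judged by version number this way.

```python
import re

# Assumption from the thread: the fix reached the 5.4 stable branch in v5.4.42.
FIXED_IN_5_4 = (5, 4, 42)


def parse_release(release):
    """Return (major, minor, patch) from a `uname -r` style string."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(g or 0) for g in m.groups())


def predates_fix_in_5_4(release):
    """True or False for 5.4.x releases; None if not on the 5.4 branch."""
    version = parse_release(release)
    if version[:2] != (5, 4):
        return None  # this sketch says nothing about other branches
    return version < FIXED_IN_5_4


for rel in ("5.4.6-arch1-1", "5.4.39", "5.4.42"):
    print(rel, predates_fix_in_5_4(rel))
```

Both kernels on which the crash was reproduced in this thread (5.4.6-arch1-1 and 5.4.39) come out as predating the fix.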