Bug 217877

Summary: nfs4_schedule_state_manager task stuck in tight loop
Product: File System Reporter: Christian Theune (ct)
Component: NFSAssignee: Trond Myklebust (trondmy)
Status: NEW ---    
Severity: normal    
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:

Description Christian Theune 2023-09-06 13:43:10 UTC
Today we encountered a Qemu VM on 6.1.45 with a kernel thread consuming a lot of CPU and being iowaited:

root      315344 44.5  0.0      0     0 ?        D    Sep05 781:38  \_ [172.22.56.83-manager]

After some initial shock of a potential intrusion (I haven't seen this process before) I traced this to the nfs4_schedule_state_manager() function.

There was no suspicious activity when this started and it went away after a reboot. The VM and NFS became quite unresponsive.

Googling and searching the bug tracker did not reveal a previous instance as did a changelog search.

The only changelog entry is a past one talking about swap-on-NFS issues, but it's not obvious to me whether this would be related at all. (f859788868c4fb586a6e5ae05d148ac309241d02)
Comment 1 Christian Theune 2024-01-22 12:58:00 UTC
Quick update: we downgraded to the 5.15 series due to other XFS issues in the 6.1 series and are experiencing this issue as well. Currently observed on 5.15.145.

Let me know if we can help debugging/analyzing this.