Bug 212295

Summary: pipe deadlocks since kernel v5.8 after resizing (race condition)
Product: File System
Component: Other
Status: NEW
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.8-latest
Subsystem:
Regression: Yes
Bisected commit-id:
Reporter: Lukas Schauer (kernel.org)
Assignee: fs_other
CC: brauner, dhowells, jgoerzen, kernel.org, me, sam
Attachments:
  Code to reproduce the issue
  Patch fixing the race condition
  [PATCH] fs/pipe: wakeup wr_wait after setting max_usage

Description Lukas Schauer 2021-03-15 18:00:06 UTC
Created attachment 295871
Code to reproduce the issue


Since kernel v5.8 I've been seeing pipes occasionally get stuck in a deadlock when they are resized.

A child process is stuck in pipe_read:

  [<0>] pipe_read+0x2ca/0x410
  [<0>] new_sync_read+0x18d/0x1a0
  [<0>] vfs_read+0xf1/0x180
  [<0>] ksys_read+0xb5/0xd0
  [<0>] do_syscall_64+0x33/0x80
  [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

While the parent process is stuck in the corresponding pipe_write:

  [<0>] pipe_write+0x274/0x5c0
  [<0>] new_sync_write+0x19c/0x1b0
  [<0>] vfs_write+0x184/0x250
  [<0>] ksys_write+0xb5/0xd0
  [<0>] do_syscall_64+0x33/0x80
  [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

The bug is only triggered if pipes get resized, which seemingly very few processes actually do.

A git bisect landed on the following commit:

  commit c73be61cede5882f9605a852414db559c0ebedfd
  Author: David Howells <dhowells@redhat.com>
  Date:   Tue Jan 14 17:07:11 2020 +0000

      pipe: Add general notification queue support

I've attached some code that reproduces the bug for me (it may take a few hundred loops). Removing the fcntl() call for F_SETPIPE_SZ makes the pipe_read/pipe_write deadlocks go away, so I guess the bug is somewhere in the resizing logic.
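
For readers unfamiliar with the call in question, here is a minimal sketch of how a process resizes a pipe. It is not the attached reproducer and does not try to trigger the race; `resize_pipe` is a hypothetical helper name, and any size passed to it is an arbitrary choice.

```c
#define _GNU_SOURCE   /* exposes F_SETPIPE_SZ / F_GETPIPE_SZ in <fcntl.h> */
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the attachment): ask the kernel to
 * resize the pipe behind fd to at least `bytes`. Returns the capacity the
 * kernel reports afterwards, or -1 if either fcntl() call fails.
 */
static int resize_pipe(int fd, int bytes)
{
    /* The kernel may round the request up, so re-read the real capacity. */
    if (fcntl(fd, F_SETPIPE_SZ, bytes) == -1)
        return -1;
    return fcntl(fd, F_GETPIPE_SZ);
}
```

Calling `resize_pipe(fds[1], 1 << 17)` on a freshly created pipe should report a capacity of at least 128 KiB. Unprivileged processes cannot grow a pipe beyond the limit in /proc/sys/fs/pipe-max-size.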

Comment 1 Christian Schwarz 2021-03-15 22:49:51 UTC
I can reproduce the issue using the provided code.

Comment 2 Lukas Schauer 2021-03-16 03:14:14 UTC
Created attachment 295881
Patch fixing the race condition

I've found the race condition.

When a pipe is resized, the wakeup for pipe_write is issued before the max_usage value for that pipe is actually raised.

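Depending on whether the pipe was full before resizing, this can result in a deadlock: a writer that is woken while max_usage still has its old value sees a full pipe and goes back to sleep, and no further wakeup follows once the limit is raised.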
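
The ordering problem can be illustrated with a small single-threaded model. This is only a sketch under simplifying assumptions, not the kernel code or the attached patch: `struct pipe_model` and all function names are made up for illustration, and the "wakeup" is modelled as the writer simply re-checking whether the pipe is full.

```c
#include <stdbool.h>

/* Simplified stand-in for a pipe's ring state; NOT the kernel structure. */
struct pipe_model {
    unsigned int head;       /* next slot to be written */
    unsigned int tail;       /* next slot to be read */
    unsigned int max_usage;  /* number of slots a writer may fill */
};

/* Fullness test in the usual ring-buffer form: occupancy >= limit. */
static bool model_pipe_full(const struct pipe_model *p)
{
    return p->head - p->tail >= p->max_usage;
}

/* A woken writer re-checks the pipe: true means it can write,
 * false means it goes back to sleep and waits for another wakeup. */
static bool writer_proceeds_on_wakeup(const struct pipe_model *p)
{
    return !model_pipe_full(p);
}

/* Problematic order: wake the writer first, raise the limit afterwards. */
static bool resize_wake_then_raise(struct pipe_model *p, unsigned int new_max)
{
    bool proceeded = writer_proceeds_on_wakeup(p); /* still sees the old limit */
    p->max_usage = new_max;                        /* nobody is woken after this */
    return proceeded;
}

/* Intended order: raise the limit first, then wake the writer. */
static bool resize_raise_then_wake(struct pipe_model *p, unsigned int new_max)
{
    p->max_usage = new_max;
    return writer_proceeds_on_wakeup(p);
}
```

With a model pipe that is full before the resize (for example head = 16, tail = 0, max_usage = 16), `resize_wake_then_raise` leaves the writer asleep while `resize_raise_then_wake` lets it continue. If the pipe was not full beforehand, both orders let the writer proceed, which matches the observation that the deadlock depends on the pipe's state before resizing.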

I've attached a patch for this to this issue. It's built against v5.8 because that's what I've been using for testing. If necessary, please let me know and I'll rebase it onto a newer version.

Comment 3 Lukas Schauer 2021-03-24 14:29:42 UTC
Created attachment 296031
[PATCH] fs/pipe: wakeup wr_wait after setting max_usage

I revised the patch to better address the regression instead of weirdly pasting code around, and I also sent it to the linux-kernel mailing list with Alan Cox and David Howells in Cc.

Comment 4 John Goerzen 2022-06-20 13:42:28 UTC
What is the current status of getting this merged? I recently encountered it in the wild. Thanks.