Bug 218312

Summary: stackdepot, KASAN (tags): use percpu-rwsem instead of rwlock
Product: Memory Management Reporter: Andrey Konovalov (andreyknvl)
Component: SanitizersAssignee: MM/Sanitizers virtual assignee (mm_sanitizers)
Status: NEW ---    
Severity: normal CC: kasan-dev, melver
Priority: P3    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:

Description Andrey Konovalov 2023-12-25 15:48:40 UTC
Both stack depot and KASAN stack ring (for the tag-based modes) require taking the read lock much more often than the write lock. Thus, using percpu-rwsem would likely be faster than the rwlock that is used by them right now.

We need to implement the irqsave/irqrestore API flavor for percpu-rwsem and switch stack depot and KASAN to percpu-rwsem.

Suggested by Marco in https://lore.kernel.org/lkml/CA+fCnZdAUo1CKDK4kiUyR+Fxc_F++CFezanPDVujx3u7fBmw=A@mail.gmail.com/.
Comment 1 Andrey Konovalov 2023-12-25 15:50:46 UTC
(Note that the stack depot only starts using the rwlock with the "stackdepot: allow evicting stack traces" series, which will likely be merged into 6.8.)
Comment 2 Marco Elver 2024-01-09 08:28:47 UTC
It looks like we may not be able to use percpu-rwsem if we can't sleep. From another discussion:

> > We need to figure out whether it's possible and how to add a
> > irqdisable flavor for percpu-rwsem. I now see that it's a sleeping
> > lock, so I don't know whether adding this flavor is possible or makes
> > sense.
> 
> Right, if we can't sleep in the paths we want to take the lock then it
> might not even be possible to use percpu-rwsem.

> > We also need to check whether it actually improves performance: I see
> > in [1] that taking a write lock might take "hundreds of milliseconds".
> 
> If taking the write lock is rare, this should not be a problem.
> 
> But I notice that the write lock is taken on _every_
> stack_depot_put(). With more usage of stack_depot_put(), this will
> introduce tons of contention.
> 
> The least we can do is to take the read lock first, and only upgrade
> to a write lock if the refcount has reached 0 to free the stack.