Bug 216661 - fail-nth: support multiple failures
Summary: fail-nth: support multiple failures
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Sanitizers
Hardware: All Linux
Importance: P1 normal
Assignee: MM/Sanitizers virtual assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-04 17:40 UTC by Dmitry Vyukov
Modified: 2022-11-04 17:43 UTC

See Also:
Kernel Version: ALL
Subsystem:
Regression: No
Bisected commit-id:


Description Dmitry Vyukov 2022-11-04 17:40:15 UTC
Some functions first try to allocate with GFP_ATOMIC and, if that fails, retry the allocation with GFP_KERNEL. The current fail-nth cannot make such an allocation fail as a whole: it fails only one allocation, so the retry succeeds.
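
A minimal userspace sketch of the problem, assuming a single-shot countdown that fails exactly one call. The names fail_nth, should_fail_alloc, fake_alloc and alloc_with_fallback are hypothetical stand-ins for illustration, not kernel APIs:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in: 0 = disarmed, N = fail the Nth allocation from now. */
static int fail_nth;

/* Single-shot injection: counts down and fails exactly one call. */
static bool should_fail_alloc(void)
{
	if (fail_nth == 0)
		return false;
	return --fail_nth == 0;
}

/* Stand-in for an allocator with a fault-injection hook. */
static void *fake_alloc(size_t size)
{
	return should_fail_alloc() ? NULL : malloc(size);
}

/* The pattern from the report: a first attempt, then a retry. */
static void *alloc_with_fallback(size_t size)
{
	void *p = fake_alloc(size);	/* stands in for the GFP_ATOMIC attempt */

	if (!p)
		p = fake_alloc(size);	/* stands in for the GFP_KERNEL retry */
	return p;
}
```

With fail_nth set to 1, the first attempt fails and disarms the injection, so the retry goes through and the caller's error path is never exercised.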

We could support arming fail-nth for several allocations, e.g. "5,7", or "5-100", or even "3,6,8-12,15". But it probably does not make sense to support this in full generality if that makes the implementation too complex.
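
As a rough illustration of how such a spec could be parsed, here is a userspace sketch. The struct nth_range type and the parse_nth_spec function are hypothetical names, and the exact syntax (optional spaces after commas, no zero indices) is an assumption rather than anything decided in this report:

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical type: one inclusive range of allocation indices. */
struct nth_range {
	unsigned long first;
	unsigned long last;
};

/*
 * Parse a spec such as "3,6,8-12,15" into out[0..max).
 * Returns the number of ranges parsed, or -1 if the input is
 * malformed or holds more than max ranges.
 */
static int parse_nth_spec(const char *s, struct nth_range *out, int max)
{
	int n = 0;

	while (*s) {
		char *end;
		unsigned long first, last;

		if (!isdigit((unsigned char)*s))
			return -1;
		errno = 0;
		first = strtoul(s, &end, 10);
		last = first;			/* a single N means N-N */
		if (*end == '-') {
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			last = strtoul(s, &end, 10);
		}
		if (errno || first == 0 || last < first || n == max)
			return -1;
		out[n].first = first;
		out[n].last = last;
		n++;
		if (*end == ',') {
			end++;
			while (*end == ' ')	/* tolerate "3, 7-11" */
				end++;
			if (*end == '\0')
				return -1;	/* trailing comma */
		} else if (*end != '\0') {
			return -1;
		}
		s = end;
	}
	return n;
}
```

The caller supplies the output array, so the parser itself never allocates; a bounded max also keeps the accepted generality in check.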

Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/all/Y2RbCUdEY2syxRLW@nvidia.com/
Comment 1 Dmitry Vyukov 2022-11-04 17:43:56 UTC
A reasonable compromise between generality and simplicity may be to support a fixed number of ranges (say, 4) and convert every single failure into a range (N is converted to N-N).

Namely, task_struct will have 4 ranges encoded as pairs and, for example, "3, 7-11, 15" will be encoded as:
[{3-3}, {7-11}, {15-15}, {0-0}]

This requires no memory allocation, and checking such a data structure for a match is simple and fast.
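
The encoding and the match check could look roughly like the sketch below. This is only an illustration under assumptions: struct fail_nth_state is a hypothetical stand-in for the fields that would live in task_struct, and fail_nth_should_fail is an invented name, not an existing kernel function:

```c
#include <stdbool.h>

#define FAIL_NTH_RANGES 4	/* the "say, 4" from the comment above */

/* One inclusive range; a single failure N is stored as N-N. */
struct fail_nth_range {
	unsigned int first;
	unsigned int last;
};

/* Hypothetical stand-in for per-task state kept in task_struct. */
struct fail_nth_state {
	struct fail_nth_range ranges[FAIL_NTH_RANGES];
	unsigned int count;	/* injection points seen since arming */
};

/*
 * Count this call and report whether it falls into any armed range.
 * Counting starts at 1, so an unused 0-0 slot never matches.
 */
static bool fail_nth_should_fail(struct fail_nth_state *st)
{
	unsigned int i, n = ++st->count;

	for (i = 0; i < FAIL_NTH_RANGES; i++) {
		const struct fail_nth_range *r = &st->ranges[i];

		if (r->first <= n && n <= r->last)
			return true;
	}
	return false;
}
```

For the encoding of "3, 7-11, 15", the check fails calls 3, 7 through 11, and 15, and lets every other call through. The loop has a fixed bound of 4 iterations and touches no memory outside the state itself.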
