Bug 80961 - [PATCH]register_bcache return value does not distinguish between transient and permanent failures
Summary: [PATCH]register_bcache return value does not distinguish between transient an...
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Jens Axboe
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-23 03:36 UTC by Ian Pilcher
Modified: 2014-07-30 00:25 UTC (History)
1 user (show)

See Also:
Kernel Version: 3.15.6
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Make register_bcache distinguish (possibly) non-fatal errors (1.03 KB, patch)
2014-07-23 03:36 UTC, Ian Pilcher
Details | Diff

Description Ian Pilcher 2014-07-23 03:36:17 UTC
Created attachment 143981 [details]
Make register_bcache distinguish (possibly) non-fatal errors

When registering a bcache device (cache or backing device), register_bcache attempts to acquire exclusive access to the device, and returns an error if it fails to do so.  In the error path, it checks whether the device is already registered as a bcache device, and it logs a different error message in this case ("device already registered" vs. "device busy"), but it does not return a different value.  It returns -EINVAL for all errors.

This means that userspace has no way of distinguishing between the "device busy" case, which should potentially be retried and the "already registered" case (or other errors), which are presumably fatal.

This is a real problem.  I have found that registration of a partition on an MD RAID (imsm) device fails about 50% of the time when triggered by udev.  Changing register_bcache's return value to distinguish between the two cases and modifying the udev helper to retry in the "device busy" case appears to fix the problem.

The attached patch changed register_bcache to return -EBUSY in the case where it cannot obtain exclusive access to the device, but it is not already registered with bcache.  It still returns -EINVAL for all other errors.
Comment 1 Alan 2014-07-29 15:21:50 UTC
Please see Documentation/SubmittingPatches

We can't take patches via bugzilla as we need a Signed-off-by:
Comment 2 Ian Pilcher 2014-07-30 00:25:06 UTC
(In reply to Alan from comment #1)
> We can't take patches via bugzilla as we need a Signed-off-by:

It never occurred to me that the patch would go in as is.  I don't even know if it applies against the latest development tree (and I've long since given up playing kernel patch Whack-a-Mole).

This is a bug report, that happens to include a suggested approach to fixing the issue.

Since all roads seem to lead to bcache-devel, and no one there appears interested ...

Note You need to log in before you can comment on or make changes to this bug.