Bug 110231
Summary: | [Regression] Crash at blk_queue_split+0x22a/0x490 | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Greg White (gwhite) |
Component: | Block Layer | Assignee: | Jens Axboe (axboe) |
Status: | RESOLVED CODE_FIX | ||
Severity: | blocking | CC: | kbusch, marcos.souza.org, szg00000 |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 4.4-rc7 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Segment split patch
Fix for split on first bio vector
Re-attaching as a patch.
patch submitted to list |
Description
Greg White
2016-01-01 15:35:29 UTC
The block device in question is an Intel 750 NVME SSD:

02:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01) (prog-if 02 [NVM Express])
	Subsystem: Intel Corporation Device 370d
	Flags: bus master, fast devsel, latency 0, NUMA node 0
	Memory at dfe10000 (64-bit, non-prefetchable) [size=16K]
	Expansion ROM at dfe00000 [disabled] [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
	Capabilities: [60] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [150] Virtual Channel
	Capabilities: [180] Power Budgeting <?>
	Capabilities: [190] Alternative Routing-ID Interpretation (ARI)
	Capabilities: [270] Device Serial Number 55-cd-2e-41-4c-88-d1-e8
	Capabilities: [2a0] #19
	Kernel driver in use: nvme

Reverting that single commit seems to fix the problem with mainline. I have what seems to be a consistent way to reproduce this (building the kernel, aptly enough.)

Thanks, we'll take a look.

Created attachment 198601 [details]
Segment split patch
Can you try this patch?
Thanks. It no longer seems to reproduce with that patch applied.

Created attachment 198681 [details]
Fix for split on first bio vector
Thanks for the catch. This fails xfstests as well.
I have attached an alternative fix that still splits the command. It's preferable for performance with this hardware that such commands are split.
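For readers following along, here is a minimal sketch of what splitting at the hardware's preferred alignment could look like. It assumes a hypothetical `struct limits` holding a maximum transfer size and an optional power-of-two chunk size, both in sectors; none of these names come from the attached patch or from the kernel. The point it illustrates is that the cut lands on the alignment limit even when that falls inside a vector, including the first one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits, for illustration only (not the kernel's queue limits). */
struct limits {
    unsigned max_sectors;   /* largest transfer the hardware accepts */
    unsigned chunk_sectors; /* preferred alignment, power of two; 0 if none */
};

/* Sectors allowed for an I/O starting at `start`: run up to the next
 * chunk boundary, capped by the maximum transfer size. */
static unsigned max_io_sectors(const struct limits *lim, unsigned long long start)
{
    unsigned max = lim->max_sectors;

    if (lim->chunk_sectors) {
        unsigned into_chunk = (unsigned)(start & (lim->chunk_sectors - 1));
        unsigned to_boundary = lim->chunk_sectors - into_chunk;

        if (to_boundary < max)
            max = to_boundary;
    }
    return max;
}

/* Size in sectors of the first piece of an I/O made of `nvecs` vectors.
 * Equals the total size when no split is needed. */
static unsigned split_sectors(const struct limits *lim, unsigned long long start,
                              const unsigned *vec_sectors, size_t nvecs)
{
    unsigned limit, total = 0;

    assert(lim->max_sectors > 0);
    assert((lim->chunk_sectors & (lim->chunk_sectors - 1)) == 0);

    limit = max_io_sectors(lim, start);
    for (size_t i = 0; i < nvecs; i++) {
        /* total <= limit always holds here, so the subtraction is safe. */
        if (vec_sectors[i] > limit - total)
            return limit; /* cut inside vector i, even when i == 0 */
        total += vec_sectors[i];
    }
    return total;
}
```

With a 256-sector maximum, a 256-sector chunk, and an I/O starting at sector 200, the first piece comes out as 56 sectors, which is a cut inside the first 128-sector vector.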
Created attachment 198691 [details]
Re-attaching as a patch.
Retested with patch #2. This also seems to work.

Great, thanks! I'll sync with Jens this week to see which route to go. I recommend mine for a couple of reasons. A bio can be split in the middle of a vector, so we might as well use the preferred alignment instead of requiring the driver accept the entire vector. And I think there's an issue in Jens' (perhaps only in theory) if the first bio vector's length is greater than the h/w's max transfer size.

I think there's potential for my patch to report the wrong segment count. I'll fix that up and resend to the mailing list after a successful xfstests run.

Keith, your approach is the best one, for sure. Let me know when you have the segment part tested, and I can queue up the fix.

Created attachment 198751 [details]
patch submitted to list
This one passed xfstests that was failing before.
The previous patch passed too, but I think that was more of a coincidence: we still need to split at SG page gaps, which wasn't taken into account before.
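To make the SG page gap condition concrete, here is a rough sketch under stated assumptions. It assumes the device requires every vector except the first to start on a boundary and every vector except the last to end on one, expressed as a mask of boundary size minus one. `struct vec`, `gap_between` and `first_gap` are made-up names for illustration; this is not the code from the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one bio vector: byte offset and byte length. */
struct vec {
    unsigned offset;
    unsigned len;
};

/* Nonzero when `next` cannot follow `prev` in the same command: `prev`
 * must end on the boundary and `next` must start on it. */
static int gap_between(const struct vec *prev, const struct vec *next,
                       unsigned mask)
{
    return ((prev->offset + prev->len) & mask) != 0 ||
           (next->offset & mask) != 0;
}

/* Index of the first vector that has to begin a new piece, or n if the
 * whole list can go out as one command. */
static size_t first_gap(const struct vec *v, size_t n, unsigned mask)
{
    assert((mask & (mask + 1)) == 0); /* mask must be 2^k - 1 */

    for (size_t i = 1; i < n; i++)
        if (gap_between(&v[i - 1], &v[i], mask))
            return i;
    return n;
}
```

With a 4K boundary (mask 4095), a 512-byte vector followed by a full page has a gap at the second vector, so a split is needed there regardless of the size limits.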
Thanks, I saw that right after writing here. Looks good to me, queued up.

Closing as fixed.