Bug 213607
Summary: | unable to mount a nfs v3 file system exported from a machine running kernel 5.13 | ||
---|---|---|---|
Product: | File System | Reporter: | az0123456 |
Component: | NFS | Assignee: | bfields |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | admin, admin |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.13.0 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | kernel configuration |
Description
az0123456
2021-06-28 11:53:55 UTC
5.13 works just fine for me, but assigning to bfields in case there is a regression somewhere...

Nothing I can think of. Could you give us any details about how the mount actually fails? If this happened to me, the first thing I might do is watch the traffic under wireshark, and work out what happens differently in the 5.13 case as compared to the 5.12.13 case.

Created attachment 297645
kernel configuration
The machine in question now runs the previous kernel, so there are no network traces, sorry. The mount takes a long time on the client side until it is put into the background. On the server side I see only one of 4 mount requests:

Jun 28 13:24:42 srv rpc.mountd[3828]: authenticated mount request from fd5d:5ce:f267:d8c4::10:831 for /usr/local/dvd (/usr/local/dvd)

The logs on the client side show timeouts:

Jun 28 13:24:09 ac2 mount[3663]: mount to NFS server 'srv' failed: timed out, retrying

I have exactly the same problem here using nfsv4. Kernel 5.12.13 on the server side works fine. With 5.13 on the server side, nfs mount requests hang. As this is a production system I unfortunately have no possibility to do other tests. On the client side I have tested using kernel 5.13 and MacOSX.

Just for information. I have seen a post on the lkml saying that the problem might have been introduced between 5.13-rc7 and 5.13. Citation: "It's likely this regression is due to a last minute change to alloc_pages_bulk_array() done just before v5.13." The full post can be found here: http://lkml.iu.edu/hypermail/linux/kernel/2106.3/04707.html

Just for information, I tried bisecting the differences and can confirm that applying the following patch makes nfs work again for me:

--- linux-5.13/mm/page_alloc.c	2021-06-28 00:21:11.000000000 +0200
+++ linux-5.13-rc7/mm/page_alloc.c	2021-06-21 00:03:15.000000000 +0200
@@ -5053,13 +5053,9 @@
 	 * Skip populated array elements to determine if any pages need
 	 * to be allocated before disabling IRQs.
 	 */
-	while (page_array && nr_populated < nr_pages && page_array[nr_populated])
+	while (page_array && page_array[nr_populated] && nr_populated < nr_pages)
 		nr_populated++;
 
-	/* Already populated array? */
-	if (unlikely(page_array && nr_pages - nr_populated == 0))
-		return 0;
-
 	/* Use the single page allocator for one page. */
 	if (nr_pages - nr_populated == 1)
 		goto failed;

Same problem here with kernel 5.13.0. With kernel 5.12.13 everything is OK, and I confirm that the @David Arendt patch works too...

I can confirm that this problem is fixed in kernel 5.13.1
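Note on the patch above: the lines it removes make the 5.13 allocator return 0 when every slot of the array is already populated. The following is a minimal user-space sketch of why that can hang a caller, assuming the caller retries until the returned count equals the array size. The thread does not show the NFS server's calling code, so that retry loop is an assumption, and every name here (fake_page, bulk_fill_returns_zero, bulk_fill_returns_count, fill_until_complete) is hypothetical and not a kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct page. */
struct fake_page { int unused; };

enum { POOL_SIZE = 64 };
static struct fake_page pool[POOL_SIZE];
static size_t pool_next;

/* Count leading populated slots, testing the bound before the element. */
static size_t count_populated(struct fake_page **arr, size_t nr)
{
	size_t n = 0;

	while (n < nr && arr[n])
		n++;
	return n;
}

/* Fill empty slots starting at index n; return how many slots are populated. */
static size_t fill_from(struct fake_page **arr, size_t nr, size_t n)
{
	for (; n < nr; n++) {
		if (arr[n])
			continue;		/* slot already populated */
		if (pool_next == POOL_SIZE)
			break;			/* model pool exhausted */
		arr[n] = &pool[pool_next++];
	}
	return n;
}

/* Models the check removed by the patch: report 0 if nothing is left to do. */
static size_t bulk_fill_returns_zero(struct fake_page **arr, size_t nr)
{
	assert(arr != NULL);
	size_t n = count_populated(arr, nr);

	if (nr - n == 0)
		return 0;
	return fill_from(arr, nr, n);
}

/* Alternative: always report the number of populated slots. */
static size_t bulk_fill_returns_count(struct fake_page **arr, size_t nr)
{
	assert(arr != NULL);
	return fill_from(arr, nr, count_populated(arr, nr));
}

typedef size_t (*bulk_fill_fn)(struct fake_page **, size_t);

/*
 * Assumed caller: retry until the array is reported full.
 * Returns the number of attempts used, or -1 after max_attempts
 * (a real caller without a bound would simply never finish).
 */
static int fill_until_complete(bulk_fill_fn fill, struct fake_page **arr,
			       size_t nr, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (fill(arr, nr) == nr)
			return attempt;
	}
	return -1;
}
```

In this model, calling fill_until_complete() with bulk_fill_returns_zero on an array that is already full never succeeds, while bulk_fill_returns_count succeeds on the first attempt. Separately, the loop in count_populated() keeps the 5.13 ordering of the conditions (bound first, element second); the rc7 ordering restored by the patch reads page_array[nr_populated] before checking the bound, which touches one element past the end when the array is full.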