Bug 195983
| Summary: | [f2fs] zombie processes and freezing application related to schedule and f2fs_issue_flush | | |
|---|---|---|---|
| Product: | File System | Reporter: | mwohah (hmqxfmxe) |
| Component: | Other | Assignee: | fs_other |
| Status: | ASSIGNED --- | | |
| Severity: | high | CC: | chao, justincase, kernel-NTEO, me, rulatir, yuchaochina |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 4.12.3 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | Example backtrace and log of freezing dockerd | | |
Description
mwohah
2017-06-04 19:17:34 UTC
For me it's 100% reproducible when running `npm install` with the following `package.json`:

```
{
  "name": "foo",
  "version": "0.0.1",
  "description": "",
  "author": "bar",
  "license": "MIT",
  "dependencies": {
    "bulma": "^0.4.1",
    "js-cookie": "^2.1.3",
    "lodash": "^4.17.4",
    "vue": "^2.2.1",
    "vue-i18n": "^6.0.0-alpha.6",
    "vue-multiselect": "^2.0.0-beta.14",
    "vue-nprogress": "^0.1.5",
    "vue-resource": "^1.2.1",
    "vue-router": "^2.3.0",
    "vue-shortkey": "^2.1.0"
  },
  "devDependencies": {
    "clean-webpack-plugin": "^0.1.16",
    "css-loader": "^0.26.1",
    "extract-text-webpack-plugin": "^2.0.0-rc.3",
    "node-sass": "^4.5.0",
    "sass-loader": "^6.0.2",
    "vue-loader": "^11.1.3",
    "vue-template-compiler": "^2.2.1",
    "webpack": "^2.2.1",
    "webpack-bundle-tracker": "^0.2.0",
    "webpack-merge": "^4.1.0"
  },
  "jshintConfig": {
    "esversion": 6,
    "strict": "global",
    "asi": true,
    "browser": true,
    "browserify": true,
    "jquery": false
  }
}
```

The npm process gets stuck forever.

Dear all,

The same issue occurs with ceph (mon directory on f2fs) for the root partition.

Best regards
Tobias

Arch just shipped 4.12.3. I didn't experience issues at first, but then I tried to trigger the problem artificially by starting and stopping docker a few times and, lo and behold, it resurfaced. I've switched back to 4.9 LTS again, as it's been running for two months without problems now. My problems seem to be mostly limited to annoying application freezes and zombie processes, but a friend's system became completely corrupted after some time, with the logs showing the issue multiple times (he didn't know he had the issue before).

Helllloooooooooooooo f2fs devs? How can we help you tackle this issue?

Sorry for the late reply. :(

I suspect that this is a bug in the flush_merge feature. In the recently released 4.14-rc1 kernel we have just fixed some potential issues in this feature that could sometimes cause userspace apps to hang, so I'd suggest trying the latest f2fs code in that kernel to see whether we have fixed the issue.
Also, could you follow this issue in the f2fs mailing list thread: https://sourceforge.net/p/linux-f2fs/mailman/message/36037901/

Thanks for your reply! I will try switching to the currently stable Linux 4.13 in Arch and activating the noflush_merge option to see if the issue still appears.

I've been trying to resurface the issue by going wild with docker a bit, like before, but this didn't trigger it anymore. I've been using the noflush_merge option for about a day now and so far the issue has not occurred again. This seems to indicate that the issue is indeed in the flush merging functionality. I'll try to stay on 4.13 with noflush_merge for a while and see if anything bad happens. If not, at least I have a way to use the more recent kernels with F2FS :-).

The issue occurs rarely. I believe that we need **more testing** before declaring it fixed. @mwohah, should we publish testing instructions on the ArchLinux forum in order to get more testers? However, I am not sure what exactly I need to do in order to test the fix.

@me I agree it needs more testing. Some use cases do seem to expose the issue fairly often, though, such as npm and docker, possibly due to the amount of I/O involved. We could post it on the Arch forums, though I already made a topic there (see the OP) as well as on the bug tracker, both of which link to this ticket, since it is a kernel issue. However, I do agree it could use a bit more visibility (perhaps on the Arch wiki page on f2fs?): users who are installing or want to install f2fs now should know that this issue exists, since it can cause instability and, provided that noflush_merge fixes it, the workaround is rather trivial.

@Chao Yu I updated to 4.14 some days ago and switched noflush_merge back to flush_merge two days ago. So far, I haven't encountered any problems. Here's to hoping the problem is solved! Should the problem return, I'll post something here.
I wanted to remind others of the fixes in 4.14, as their experience might differ. Thanks!

@mwohah, thanks for your testing and feedback. :)

Hey, sorry to bother two years later, but I am considering switching to F2FS again and I wonder if lack of further activity in this bug is because the problem got fixed or because everyone affected migrated away from F2FS (I certainly did).

(In reply to Szczepan from comment #11)
> Hey, sorry to bother two years later, but I am considering switching to F2FS
> again and I wonder if lack of further activity in this bug is because the
> problem got fixed or because everyone affected migrated away from F2FS (I
> certainly did).

I am still using F2FS.

(In reply to Szczepan from comment #11)
> Hey, sorry to bother two years later, but I am considering switching to F2FS
> again and I wonder if lack of further activity in this bug is because the
> problem got fixed or because everyone affected migrated away from F2FS (I
> certainly did).

I guess it's worth giving f2fs another try, as we have added lots of features and made the code more stable over the last two years. Also, I know there are users running f2fs from kernel v5.6-rc1 as their root partition filesystem; apart from one task hang issue that we have fixed in our git tree, I haven't received any further bug reports.

(In reply to me from comment #12)
> (In reply to Szczepan from comment #11)
> > Hey, sorry to bother two years later, but I am considering switching to F2FS
> > again and I wonder if lack of further activity in this bug is because the
> > problem got fixed or because everyone affected migrated away from F2FS (I
> > certainly did).
>
> I am still using F2FS.

Cool, thanks for the trust.
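
For anyone testing the workaround discussed in this thread, it can help to confirm which F2FS mounts actually carry `flush_merge` or `noflush_merge`. Below is a minimal sketch, not something posted by the f2fs developers. It parses text in the `/proc/mounts` format and reports the setting per F2FS mount point. The function names `f2fs_flush_merge_status` and `read_status` are made up for illustration, and whether a given kernel lists either option in `/proc/mounts` is an assumption, so mounts that show neither are reported as "unspecified".

```python
def f2fs_flush_merge_status(mounts_text):
    """Map each f2fs mount point to its flush-merge setting.

    mounts_text is expected in /proc/mounts format:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    status = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not f2fs.
        if len(fields) < 4 or fields[2] != "f2fs":
            continue
        options = fields[3].split(",")
        if "noflush_merge" in options:
            status[fields[1]] = "noflush_merge"
        elif "flush_merge" in options:
            status[fields[1]] = "flush_merge"
        else:
            # Assumption: the kernel may not list either option.
            status[fields[1]] = "unspecified"
    return status


def read_status(path="/proc/mounts"):
    """Hypothetical helper: read the mount table from a file and classify it."""
    with open(path) as mounts_file:
        return f2fs_flush_merge_status(mounts_file.read())
```

Calling `read_status()` on a Linux system returns a dictionary keyed by mount point; an empty dictionary means no F2FS filesystems are mounted.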