Bug 13761
Summary: | kmemleak + BUG: unable to handle kernel paging request | ||
---|---|---|---|
Product: | Other | Reporter: | Márton Németh (nm127) |
Component: | Other | Assignee: | other_other |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | alan, catalin.marinas |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.31-rc2 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
kernel configuration
kmemleak: Do not acquire scan_mutex in kmemleak_open() kmemleak: Protect the seq start/next/stop sequence by rcu_read_lock() |
Description
Márton Németh
2009-07-11 06:49:34 UTC
Created attachment 22307 [details]
kernel configuration
I haven't managed to trigger this bug yet (I'll try a bit more) but I have a patch in my branch which sorts out mutex locking in the seq functions. http://www.linux-arm.org/git?p=linux-2.6.git;a=commitdiff_plain;h=b87324d082d9d898e3c06b2a07a2b94b2430b8ba Do you still get this bug with the above patch (the full branch is at http://www.linux-arm.org/git?p=linux-2.6.git;a=shortlog;h=kmemleak)? Even if I haven't reproduced it, it seems that it is possible for a kmemleak_object.object_list.next to point to a freed object (filled with 0x6b poison). I'll have a look and fix this. Thanks. The kmemleak_seq_next() function does list traversal where elements can be freed but the rcu_read_lock only protects a small loop. My understanding of the seq_file is that locks can be held between start() and stop() as the seq code doesn't sleep. In this case, the rcu_read_lock() is called in start() and rcu_read_unlock() in stop(): diff --git a/mm/kmemleak.c b/mm/kmemleak.c index b2426af..7f33ca9 100644 --- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -1213,7 +1213,6 @@ static void *kmemleak_seq_start(struct seq_file *seq, loff_t *pos) } object = NULL; out: - rcu_read_unlock(); return object; } @@ -1229,13 +1228,11 @@ static void *kmemleak_seq_next(struct seq_file *seq, void *v, loff_t *pos) ++(*pos); - rcu_read_lock(); list_for_each_continue_rcu(n, &object_list) { next_obj = list_entry(n, struct kmemleak_object, object_list); if (get_object(next_obj)) break; } - rcu_read_unlock(); put_object(prev_obj); return next_obj; @@ -1251,6 +1248,7 @@ static void kmemleak_seq_stop(struct seq_file *seq, void *v) * kmemleak_seq_start may return ERR_PTR if the scan_mutex * waiting was interrupted, so only release it if !IS_ERR. */ + rcu_read_unlock(); mutex_unlock(&scan_mutex); if (v) put_object(v); It was not easy for me also to reproduce this problem. Here is the way how I can trigger this problem: 1. mount -t debugfs nodev /sys/kernel/debug/ 2. Open two xterm windows 3. On the first xterm window execute the command: "while true; do echo -n .; echo scan >/sys/kernel/debug/kmemleak; done" 4. On the second xterm window execute the command: "while true; do clear; cat /sys/kernel/debug/kmemleak; done" (In reply to comment #4) > diff --git a/mm/kmemleak.c b/mm/kmemleak.c > index b2426af..7f33ca9 100644 > --- a/mm/kmemleak.c > +++ b/mm/kmemleak.c Could you please also attach the patch to this bug? It seems that bugzilla converted all the TABs to SPACEs. Exactly which patch should I apply? The one you mentioned in comment #2 or the one at comment #4 or both? Created attachment 22319 [details]
kmemleak: Do not acquire scan_mutex in kmemleak_open()
Created attachment 22320 [details]
kmemleak: Protect the seq start/next/stop sequence by rcu_read_lock()
Both should be applied in the attached order. I attached kmemleak-scan-mutex.patch and kmemleak-seq-lock.patch. Thanks. The first patch was just merged into mainline, so only the second is needed. Please let me know if it solves the issue for you. Thanks. Both patches now merged into mainline, so I'd like to close this bug. Please re-open if you think the problem still exists. Thanks. |