The kernel panics every time a sync from the command line is executed.
I have bisected it to commit dd8544661947ad6d8d87b3c9d4333bfa1583d1bc (take bdi setup/destruction into cifs_mount/cifs_umount)
I'm mounting the cifs fs with the following line from fstab:
//fritz.box/FRITZ.NAS/Verbatim-STORENGO-01 /mnt/fritz cifs credentials=/path/to/cred/file,user,gid=disk,nounix,file_mode=0664,dir_mode=0775,noauto,comment=systemd.automount 0 0
comment=systemd.automount means that it will be mounted on first access through an autofs mount point.
Attached is a screenshot of the kernel panic.
This is probably related to the bug report here: https://lkml.org/lkml/2011/7/4/289
Created attachment 64962 [details]
Screenshot of kernel panic
Like Suresh, I'm not able to reproduce this either...
Assuming your offsets line up with the ones in my kernel (which is probably the case), this crashed here:
(gdb) list *(bdi_queue_work+0x40)
0xffffffff8115666e is in bdi_queue_work (include/linux/spinlock.h:290).
288 static inline void spin_lock_bh(spinlock_t *lock)
...that would probably indicate that bdi.lock was NULL, which would be the case if the bdi_setup_and_register never happened. I don't see how that could occur though -- Al's patch is pretty straightforward. Does the mount otherwise work before you issue a sync?
Sorry... that would indicate that bdi.wb_lock was NULL, not bdi.lock.
First-Bad-Commit : dd8544661947ad6d8d87b3c9d4333bfa1583d1bc
Yeah, I'm just not seeing a bug here.
My initial thought was that maybe we have a cifs_sb that didn't have bdi_setup_and_register run on it, but I don't see how that could happen. Perhaps it's some sort of more generic memory corruption then?
One (semi-remote) possibility is that it's related to some other mount fixes that I sent to Steve this week:
...I'm not sure what he's waiting on wrt pushing them, but it may be worthwhile to test those before we dig into this more deeply.
Are you comfortable adding patches and rebuilding cifs.ko? If so, we may be able to add some debug code around this to isolate it further, since I am also having problems reproducing this (although in my case that's due to other problems I am hitting in the radeon and virtualbox drivers on 3.0-rc).
Created attachment 65012 [details]
trace when touching a file
I just tested it with Steve French's tree, but unfortunately it still panics when syncing.
It also panics on other write operations (though strangely not on rm or mkdir/rmdir); touching a file triggers a panic as well.
Reading seems to work fine.
(In reply to comment #6)
> Are you comfortable adding patches and rebuilding cifs.ko? If so, may be
> able to add some debug code around this to isolate further since I am also
> having problems reproducing this (although in my case due to other problems
> I am hitting in radeon and virtualbox drivers on 3.0-rc)
Yes, I can try some patches. (I have cifs compiled in, maybe that's relevant?)
Interesting -- sounds like the same issue that Adam Nielsen reported to the list yesterday. I've not been able to reproduce that either:
...I doubt that the module vs. built-in matters here, but at this point anything is possible.
Reassigning to Steve since he's working on a debug patch...
A workaround seems to be to enable CONFIG_CIFS_DFS_UPCALL. After enabling it, I can sync without panics.
Maybe the problem is that the call to bdi_setup_and_register is inside an #ifdef CONFIG_CIFS_DFS_UPCALL block (connect.c:3006), which means that without CONFIG_CIFS_DFS_UPCALL it is never called.
Created attachment 65082 [details]
patch -- move bdi_setup_and_register outside CONFIG_CIFS_DFS_UPCALL
Well spotted. That's almost certainly the bug, and this patch ought to fix it. Can you test it out?
The patch in #12 fixes the problem for me; all cifs operations work again!
Thanks a lot!
Yep, I can also confirm that #12 fixes the problem.
Thanks for testing it. Patch sent to Steve F. and email@example.com. It should make 3.0, assuming Steve pushes it to Linus soon.
Handled-By : Jeff Layton <firstname.lastname@example.org>
Patch : https://bugzilla.kernel.org/attachment.cgi?id=65082
*** Bug 39042 has been marked as a duplicate of this bug. ***
Fixed by commit 20547490c12b0ee3d32152b85e9f9bd183aa7224.