Bug 27652

Summary: 38-rc1: umount+rmmod cause ext4 error.
Product: File System Reporter: Maciej Rutecki (maciej.rutecki)
Component: ext4Assignee: fs_ext4 (fs_ext4)
Status: CLOSED CODE_FIX    
Severity: normal CC: florian, maciej.rutecki, rjw, sandeen, tm
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.38-rc1 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 27352    

Description Maciej Rutecki 2011-01-26 20:26:48 UTC
Subject    : 38-rc1: umount+rmmod cause ext4 error.
Submitter  : Tao Ma <tm@tao.ma>
Date       : 2011-01-19 6:53
Message-ID : 4D368A70.5030103@tao.ma
References : http://marc.info/?l=linux-ext4&m=129542001031750&w=2

This entry is being used for tracking a regression from 2.6.37. Please don't
close it until the problem is fixed in the mainline.
Comment 1 Eric Sandeen 2011-01-26 20:41:05 UTC
I believe patches have been sent that should fix this; there were
2 module unloading regressions in .37, both fixed.

If you can test with:

http://marc.info/?l=linux-ext4&m=129546975702198&w=2
and
http://marc.info/?l=linux-ext4&m=129527644524410&w=2

that'd be great.

Thanks,
-Eric
Comment 2 Tao Ma 2011-01-28 06:01:10 UTC
ok, so I have applied the 2 patches. And now I did hit another bug every time, I have tried to revert these 2 patches. And it still show up every time. I use 38-rc2 now. So Maciej, is it necessary to open another bug for it?

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff8233d343>] _raw_spin_lock_irqsave+0x11/0x22
PGD 12292c067 PUD 10fdb6067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq
CPU 1 
Modules linked in: ext4(-) jbd2 ipv6 autofs4 hidp rfcomm l2cap crc16 bluetooth rfkill ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs lockd sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror video sbs sbshc battery acpi_memhotplug ac lp sg snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm option usb_wwan usbserial rtc_cmos tpm_tis snd_timer snd soundcore tpm pcspkr i2c_i801 parport_pc e1000e serio_raw dcdbas shpchp rtc_core rtc_lib i2c_core parport snd_page_alloc button tpm_bios dm_region_hash dm_log dm_mod usb_storage ahci libahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]

Pid: 4015, comm: rmmod Not tainted 2.6.38-rc2+ #30 0V4W66/OptiPlex 780                 
RIP: 0010:[<ffffffff8233d343>]  [<ffffffff8233d343>] _raw_spin_lock_irqsave+0x11/0x22
RSP: 0018:ffff88010fdbbe48  EFLAGS: 00010046
RAX: 0000000000000246 RBX: ffff88010fdbbe98 RCX: 00000000c0000100
RDX: 0000000000000100 RSI: ffff88010fdbbe98 RDI: 0000000000000020
RBP: ffff88010fdbbe48 R08: ffff88010fdba000 R09: ffff8800cfa0da70
R10: ffffffff82069498 R11: 0000000000000000 R12: 0000000000000020
R13: 0000000000000002 R14: 00007fffa52c5c60 R15: 0000000000000880
FS:  00007f7dd0d016e0(0000) GS:ffff8800cfa40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000011dc48000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 4015, threadinfo ffff88010fdba000, task ffff88012271e180)
Stack:
 ffff88010fdbbe88 ffffffff82055ee2 ffffffff82069498 ffff88010fdbbe98
 ffff88010fdbbe98 ffffffffa04d8440 0000000000000880 00007fffa52c5c60
 ffff88010fdbbed8 ffffffffa04ca2cc 0000000000000000 ffff88012271e180
Call Trace:
 [<ffffffff82055ee2>] prepare_to_wait+0x25/0x7b
 [<ffffffff82069498>] ? __try_stop_module+0x0/0x3d
 [<ffffffffa04ca2cc>] ext4_exit_fs+0x94/0x112 [ext4]
 [<ffffffff82055cdf>] ? autoremove_wake_function+0x0/0x3d
 [<ffffffff82067bce>] sys_delete_module+0x1b5/0x218
 [<ffffffff8207506f>] ? audit_syscall_entry+0x187/0x1ba
 [<ffffffff82002a6b>] system_call_fastpath+0x16/0x1b
Code: 1f 44 00 00 b8 00 01 00 00 f0 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c9 c3 55 48 89 e5 0f 1f 44 00 00 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c9 c3 55 48 89 e5 
RIP  [<ffffffff8233d343>] _raw_spin_lock_irqsave+0x11/0x22
 RSP <ffff88010fdbbe48>
CR2: 0000000000000020
---[ end trace ec9d834b678544aa ]---
Comment 3 Tao Ma 2011-01-28 09:51:52 UTC
Just tested with 37+the 2 patches. Still the same error.
So it isn't a 38 problem really.


BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffff8233dabb>] _raw_spin_lock_irqsave+0x11/0x22
PGD 122b58067 PUD 122594067 PMD 0 
Oops: 0002 [#1] SMP 
last sysfs file: /sys/devices/virtual/misc/autofs/dev
CPU 1 
Modules linked in: ext4(-) jbd2 ipv6 autofs4 hidp rfcomm l2cap crc16 bluetooth rfkill ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs cpufreq_ondemand acpi_cpufreq dm_mirror video output sbs sbshc battery acpi_memhotplug ac lp sg snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_seq_dummy snd_seq_oss option usb_wwan snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss usbserial snd_mixer_oss snd_pcm snd_timer rtc_cmos snd rtc_core tpm_tis tpm i2c_i801 shpchp parport_pc parport serio_raw button soundcore rtc_lib tpm_bios pcspkr dcdbas e1000e i2c_core snd_page_alloc dm_region_hash dm_log dm_mod usb_storage ahci libahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd

Pid: 4717, comm: rmmod Not tainted 2.6.37+ #2 0V4W66/OptiPlex 780                 
RIP: 0010:[<ffffffff8233dabb>]  [<ffffffff8233dabb>] _raw_spin_lock_irqsave+0x11/0x22
RSP: 0018:ffff88011f7ade48  EFLAGS: 00010046
RAX: 0000000000000246 RBX: ffff88011f7ade98 RCX: 00000000c0000100
RDX: 0000000000000100 RSI: ffff88011f7ade98 RDI: 0000000000000020
RBP: ffff88011f7ade48 R08: ffff88011f7ac000 R09: ffff8800cfa0da70
R10: ffff8800cfa51a00 R11: ffff88011f7add48 R12: 0000000000000020
R13: 0000000000000002 R14: 00007fff3d3ad400 R15: 0000000000000880
FS:  00007fe07ff0e6e0(0000) GS:ffff8800cfa40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000011ee14000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process rmmod (pid: 4717, threadinfo ffff88011f7ac000, task ffff88012358f8d0)
Stack:
 ffff88011f7ade88 ffffffff82056d4e ffffffff8206a6e4 ffff88011f7ade98
 ffff88011f7ade98 ffffffffa0441500 0000000000000880 00007fff3d3ad400
 ffff88011f7aded8 ffffffffa0433428 0000000000000000 ffff88012358f8d0
Call Trace:
 [<ffffffff82056d4e>] prepare_to_wait+0x25/0x7b
 [<ffffffff8206a6e4>] ? __try_stop_module+0x0/0x3d
 [<ffffffffa0433428>] ext4_exit_fs+0x94/0x112 [ext4]
 [<ffffffff82056b4b>] ? autoremove_wake_function+0x0/0x3d
 [<ffffffff82069127>] sys_delete_module+0x1b5/0x218
 [<ffffffff820c26f9>] ? do_munmap+0x2e2/0x318
 [<ffffffff820767db>] ? audit_syscall_entry+0x1d6/0x209
 [<ffffffff82002aeb>] system_call_fastpath+0x16/0x1b
Code: 1f 44 00 00 b8 00 01 00 00 f0 66 0f c1 07 38 e0 74 06 f3 90 8a 07 eb f6 c9 c3 55 48 89 e5 0f 1f 44 00 00 9c 58 fa ba 00 01 00 00 <f0> 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c9 c3 55 48 89 e5 
RIP  [<ffffffff8233dabb>] _raw_spin_lock_irqsave+0x11/0x22
 RSP <ffff88011f7ade48>
CR2: 0000000000000020
---[ end trace e0bdca5906fdcbe6 ]---
Comment 4 Eric Sandeen 2011-01-28 17:15:01 UTC
Is that last oops with or without the patches I pointed at?

That might be in here:

static void ext4_destroy_lazyinit_thread(void)
{
        /*
         * If thread exited earlier
         * there's nothing to be done.
         */
        if (!ext4_li_info)
                return;

        ext4_clear_request_list();

        while (ext4_li_info->li_task) {
                wake_up(&ext4_li_info->li_wait_daemon);
                wait_event(ext4_li_info->li_wait_task,
                           ext4_li_info->li_task == NULL);
        }
}

> BUG: unable to handle kernel NULL pointer dereference at 0000000000000020

li_wait_task is 0x20 into the ext4_lazy_init struct... did ext4_li_info go away?
It does get freed and set to NULL when the lazyinit thread exits.  I'm guessing we need a little judicial application of ext4_li_mtx in the teardown thread.
Comment 5 Eric Sandeen 2011-01-28 17:32:13 UTC
I could also be off base ;)  Is the filesystem under test created with the lazy_itable_init feature?
Comment 6 Eric Sandeen 2011-01-28 22:07:53 UTC
Ok it is in fact lazyinit uninit that's falling down, the wait_event is waiting on something that gets freed.

This fixes it up for me:

Index: linux-2.6/fs/ext4/super.c
===================================================================
--- linux-2.6.orig/fs/ext4/super.c
+++ linux-2.6/fs/ext4/super.c
@@ -77,6 +77,7 @@ static struct dentry *ext4_mount(struct 
 		       const char *dev_name, void *data);
 static void ext4_destroy_lazyinit_thread(void);
 static void ext4_unregister_li_request(struct super_block *sb);
+static void ext4_clear_request_list(void);
 
 #if !defined(CONFIG_EXT3_FS) && !defined(CONFIG_EXT3_FS_MODULE) && defined(CONFIG_EXT4_USE_FOR_EXT23)
 static struct file_system_type ext3_fs_type = {
@@ -2704,6 +2705,8 @@ static void ext4_unregister_li_request(s
 	mutex_unlock(&ext4_li_info->li_list_mtx);
 }
 
+static struct task_struct *ext4_lazyinit_task;
+
 /*
  * This is the function where ext4lazyinit thread lives. It walks
  * through the request list searching for next scheduled filesystem.
@@ -2772,6 +2775,10 @@ cont_thread:
 		if (time_before(jiffies, next_wakeup))
 			schedule();
 		finish_wait(&eli->li_wait_daemon, &wait);
+		if (kthread_should_stop()) {
+			ext4_clear_request_list();
+			goto exit_thread;
+		}
 	}
 
 exit_thread:
@@ -2796,6 +2803,7 @@ exit_thread:
 	wake_up(&eli->li_wait_task);
 
 	kfree(ext4_li_info);
+	ext4_lazyinit_task = NULL;
 	ext4_li_info = NULL;
 	mutex_unlock(&ext4_li_mtx);
 
@@ -2818,11 +2826,10 @@ static void ext4_clear_request_list(void
 
 static int ext4_run_lazyinit_thread(void)
 {
-	struct task_struct *t;
-
-	t = kthread_run(ext4_lazyinit_thread, ext4_li_info, "ext4lazyinit");
-	if (IS_ERR(t)) {
-		int err = PTR_ERR(t);
+	ext4_lazyinit_task = kthread_run(ext4_lazyinit_thread,
+					 ext4_li_info, "ext4lazyinit");
+	if (IS_ERR(ext4_lazyinit_task)) {
+		int err = PTR_ERR(ext4_lazyinit_task);
 		ext4_clear_request_list();
 		del_timer_sync(&ext4_li_info->li_timer);
 		kfree(ext4_li_info);
@@ -2976,13 +2983,7 @@ static void ext4_destroy_lazyinit_thread
 	if (!ext4_li_info)
 		return;
 
-	ext4_clear_request_list();
-
-	while (ext4_li_info->li_task) {
-		wake_up(&ext4_li_info->li_wait_daemon);
-		wait_event(ext4_li_info->li_wait_task,
-			   ext4_li_info->li_task == NULL);
-	}
+	kthread_stop(ext4_lazyinit_task);
 }
 
 static int ext4_fill_super(struct super_block *sb, void *data, int silent)
Comment 7 Florian Mickler 2011-01-29 12:02:12 UTC
Patch: https://bugzilla.kernel.org/show_bug.cgi?id=27652#c6 

Tao Ma, does this work for you too?
Comment 8 Tao Ma 2011-01-30 02:36:58 UTC
(In reply to comment #7)
> Patch: https://bugzilla.kernel.org/show_bug.cgi?id=27652#c6 
> 
> Tao Ma, does this work for you too?

Hi sorry for the delay.
So the patch works for the kernel panic.
Reported-and-Tested-by: Tao Ma <boyu.mt@taobao.com>

As for the original bug, it doesn't help. I can still get the error message.

slab error in kmem_cache_destroy(): cache `ext4_inode_cache': Can't free all objects
Pid: 4674, comm: rmmod Not tainted 2.6.38-rc2+ #1
Call Trace:
 [<ffffffff820d61ac>] ? kmem_cache_destroy+0x83/0xc7
 [<ffffffffa03a9000>] ? destroy_inodecache+0x15/0x17 [ext4]
 [<ffffffffa03c7264>] ? ext4_exit_fs+0x30/0x60 [ext4]
 [<ffffffff82067bce>] ? sys_delete_module+0x1b5/0x218
 [<ffffffff8207506f>] ? audit_syscall_entry+0x187/0x1ba
 [<ffffffff82002a6b>] ? system_call_fastpath+0x16/0x1b
SLAB: cache with size 888 has lost its name

And as I pointed in the original mail, I have done some basic investigation and it looks as if it is introduced by Nick's new ext4_i_callback.
Comment 9 Tao Ma 2011-01-30 02:41:57 UTC
An off-topic question. How can I reopen this bug? I can't find a button to reopen it to 'New' or 'Not Fixed'?
Comment 10 Florian Mickler 2011-01-30 19:47:35 UTC
Well, the bugzilla could use a little love. Only the reporter or an obscure group of people can change the status of bugs (as far as I know). But it is also not possible to change the reporter. Don't know. Here you go...
Comment 11 Eric Sandeen 2011-01-31 16:07:56 UTC
Seems we have about 4 regressions all tracked in 1 bugzilla, sadly.  1 bug per regression would work better... though partly my fault for getting OT on the original bug.
Comment 12 Florian Mickler 2011-01-31 17:55:30 UTC
No problem, the regression team is up to a severe game of last man standing :)
Comment 13 Florian Mickler 2011-01-31 17:57:09 UTC
Ignore-Patch: https://bugzilla.kernel.org/show_bug.cgi?id=27652#c6
Comment 14 Tao Ma 2011-02-09 02:01:04 UTC
http://marc.info/?l=linux-fsdevel&m=129718595331581&w=2

I have tested this patch and it can resolve this bug. So please close it. Thanks.
Comment 15 Florian Mickler 2011-02-09 04:50:23 UTC
Patch: http://marc.info/?l=linux-fsdevel&m=129718595331581&w=2
Comment 16 Rafael J. Wysocki 2011-02-09 09:42:53 UTC
Handled-By : Boaz Harrosh <bharrosh@panasas.com>
Comment 17 Rafael J. Wysocki 2011-02-12 22:38:59 UTC
Fixed by commit d863b50ab01333659314c2034890cb76d9fdc3c7 .