Bug 11175

Summary: ext3 BUG in add_dirent_to_buf+0x6c/0x269
Product: File System Reporter: Asheesh Laroia (kernel.bugzilla)
Component: ext3Assignee: Andrew Morton (akpm)
Status: REJECTED INVALID    
Severity: normal CC: bunk
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.25.8-vs2.3.x-vs2.3.0.34.12 Subsystem:
Regression: --- Bisected commit-id:

Description Asheesh Laroia 2008-07-29 17:00:09 UTC
Latest working kernel version: Didn't see this problem a few months ago but the filesystem is new
Earliest failing kernel version: 2.6.25.8-vs2.3.x-vs2.3.0.34.12 (same as submission)
Distribution: Debian unstable
Hardware Environment: Dell-brand ICH8-based Core 2 Duo
Software Environment: Debian unstable
Problem Description:

I have a 1T disk I store backups on using rsync and hardlinks.  Some usage information:

paulproteus@sf:~ $ df -k /dev/sdb1
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sdb1            961432072 635100984 326331088  67% /media/asheesh-big-backups
paulproteus@sf:~ $ df -i /dev/sdb1
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sdb1            122109952 9786584 112323368    9% /media/asheesh-big-backups

Sometimes, the system will hang on writes to that filesystem until I reboot.  Strangely, the problem seems to persist after an fsck -f.

I reboot and fsck, and I can write again, but then I get it again.

Steps to reproduce:

* Wait a while
* Write to the filesystem
* Note that you get a BUG in dmesg and further writes to the filesystem hang

ksymoops has this to say about the BUG:

ksymoops 2.4.11 on i686 2.6.25.8-vs2.3.x-vs2.3.0.34.12.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.6.25.8-vs2.3.x-vs2.3.0.34.12/ (default)
     -m /boot/System.map-2.6.25.8-vs2.3.x-vs2.3.0.34.12 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (regular_file): read_ksyms stat /proc/ksyms failed
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
lo: Disabled Privacy Extensions
e1000e: Intel(R) PRO/1000 Network Driver - 0.2.0
e1000e: Copyright (c) 1999-2007 Intel Corporation.
ehci_hcd 0000:00:1a.7: debug port 1
ehci_hcd 0000:00:1d.7: debug port 1
dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
warning: `dnsmasq' uses deprecated v2 capabilities in a way that may be insecure.
warning: `avahi-daemon' uses 32-bit capabilities (legacy support in use)
Pid: 8603, comm: smtpd Not tainted 2.6.25.8-vs2.3.x-vs2.3.0.34.12 #3
 [<c01634a7>] oom_kill_process+0x43/0x18d
 [<c01638fb>] out_of_memory+0x158/0x18d
 [<c0165c6d>] __alloc_pages+0x22f/0x2b7
 [<c01755a8>] read_swap_cache_async+0x71/0xe0
 [<c0175654>] swapin_readahead+0x3d/0x6e
 [<c016de6e>] handle_mm_fault+0x402/0x8cd
 [<c0128846>] ? irq_exit+0x74/0x77
 [<c0110689>] ? smp_apic_timer_interrupt+0x6e/0x7c
 [<c0105600>] ? apic_timer_interrupt+0x28/0x30
 [<c030021e>] do_page_fault+0x37c/0x725
 [<c01373a8>] ? enqueue_hrtimer+0xc8/0xd3
 [<c0137bf1>] ? hrtimer_start+0x154/0x16e
 [<c011e8c1>] ? hrtick_set+0x98/0xe6
 [<c02ffea2>] ? do_page_fault+0x0/0x725
 [<c0104446>] ? do_notify_resume+0x735/0x74e
 [<c0171d34>] ? mmap_region+0x187/0x405
 [<c0171eeb>] ? mmap_region+0x33e/0x405
 [<c010a8c2>] ? set_tls_desc+0x13d/0x156
 [<c010ad25>] ? do_set_thread_area+0xa6/0xba
 [<c02ffea2>] ? do_page_fault+0x0/0x725
 [<c02fe952>] error_code+0x72/0x78
 [<c02f0000>] ? clip_ioctl+0x3f5/0x45c
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    0: hi:  186, btch:  31 usd: 149
CPU    1: hi:  186, btch:  31 usd: 183
CPU    0: hi:  186, btch:  31 usd: 136
CPU    1: hi:  186, btch:  31 usd: 106
191968 total pagecache pages
521809 pages of RAM
292433 pages of HIGHMEM
5290 reserved pages
38300 pages shared
189736 pages swap cached
4584 pages slab
1504 pages pagetables
BUG: unable to handle kernel paging request at f8000000
*pde = 00000000 
Oops: 0000 [#1] PREEMPT SMP 
Pid: 12197, comm: rsync Not tainted (2.6.25.8-vs2.3.x-vs2.3.0.34.12 #3)
EIP: 0060:[<f89c6854>] EFLAGS: 00210246 CPU: 1
Using defaults from ksymoops -t elf32-i386 -a i386
EAX: 00000000 EBX: f7fffe90 ECX: f7fffffe EDX: 00000000
ESI: 00000000 EDI: 00000800 EBP: c248ddd0 ESP: c248dd5c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Stack: f7d6b000 c248de64 c9960e80 dbc20504 c248ddb4 00001000 ddb78860 f0f2b4b0 
       f8000000 f7d6a000 f7fff000 0000002e f7fffff8 c9ada62a 00000f78 c248ddd4 
       f89c6b1a f7d6a000 f7fffe90 00000058 c9bd73f4 875f4d2a 00000002 f7b56068 
Call Trace:
 [<f89c6b1a>] ? add_dirent_to_buf+0x6c/0x269 [ext3]
 [<f89c70df>] ? ext3_add_entry+0x3c8/0x77f [ext3]
 [<c012a34a>] ? capable+0x10/0x12
 [<c0184f38>] ? generic_permission+0x8c/0xc6
 [<c02fe715>] ? _spin_unlock+0xf/0x23
 [<f898ea69>] ? start_this_handle+0x2a7/0x2c6 [jbd]
 [<f89c7a71>] ? ext3_add_nondir+0x15/0x42 [ext3]
 [<f89c7c3a>] ? ext3_link+0xb4/0xea [ext3]
 [<c0185c50>] ? vfs_link+0x132/0x1c3
 [<c01889ef>] ? sys_linkat+0xb2/0xee
 [<c0181c36>] ? sys_lstat64+0x22/0x28
 [<c0188a3f>] ? sys_link+0x14/0x16
 [<c0104ad4>] ? sysenter_past_esp+0x6d/0xa5
Code: 00 00 00 89 79 f8 89 f1 3b 4d d4 77 d7 85 c0 74 07 8b 75 bc 31 c0 eb ee 8b 55 b8 8b 7d a0 8b 5d d4 d1 ef 8d 4c d3 fe 31 f6 31 d2 <8b> 19 83 e9 08 89 d8 66 d1 e8 0f b7 c0 8d 04 02 39 f8 77 08 0f 


>>EIP; f89c6854 <END_OF_CODE+3852d854/????>   <=====

Trace; f89c6b1a <END_OF_CODE+3852db1a/????>
Trace; f89c70df <END_OF_CODE+3852e0df/????>
Trace; c012a34a <capable+10/12>
Trace; c0184f38 <generic_permission+8c/c6>
Trace; c02fe715 <_spin_unlock+f/23>
Trace; f898ea69 <END_OF_CODE+384f5a69/????>
Trace; f89c7a71 <END_OF_CODE+3852ea71/????>
Trace; f89c7c3a <END_OF_CODE+3852ec3a/????>
Trace; c0185c50 <vfs_link+132/1c3>
Trace; c01889ef <sys_linkat+b2/ee>
Trace; c0181c36 <sys_lstat64+22/28>
Trace; c0188a3f <sys_link+14/16>
Trace; c0104ad4 <sysenter_past_esp+6d/a5>

Code;  f89c6829 <END_OF_CODE+3852d829/????>
00000000 <_EIP>:
Code;  f89c6829 <END_OF_CODE+3852d829/????>
   0:   00 00                     add    %al,(%eax)
Code;  f89c682b <END_OF_CODE+3852d82b/????>
   2:   00 89 79 f8 89 f1         add    %cl,-0xe760787(%ecx)
Code;  f89c6831 <END_OF_CODE+3852d831/????>
   8:   3b 4d d4                  cmp    -0x2c(%ebp),%ecx
Code;  f89c6834 <END_OF_CODE+3852d834/????>
   b:   77 d7                     ja     ffffffe4 <_EIP+0xffffffe4>
Code;  f89c6836 <END_OF_CODE+3852d836/????>
   d:   85 c0                     test   %eax,%eax
Code;  f89c6838 <END_OF_CODE+3852d838/????>
   f:   74 07                     je     18 <_EIP+0x18>
Code;  f89c683a <END_OF_CODE+3852d83a/????>
  11:   8b 75 bc                  mov    -0x44(%ebp),%esi
Code;  f89c683d <END_OF_CODE+3852d83d/????>
  14:   31 c0                     xor    %eax,%eax
Code;  f89c683f <END_OF_CODE+3852d83f/????>
  16:   eb ee                     jmp    6 <_EIP+0x6>
Code;  f89c6841 <END_OF_CODE+3852d841/????>
  18:   8b 55 b8                  mov    -0x48(%ebp),%edx
Code;  f89c6844 <END_OF_CODE+3852d844/????>
  1b:   8b 7d a0                  mov    -0x60(%ebp),%edi
Code;  f89c6847 <END_OF_CODE+3852d847/????>
  1e:   8b 5d d4                  mov    -0x2c(%ebp),%ebx
Code;  f89c684a <END_OF_CODE+3852d84a/????>
  21:   d1 ef                     shr    %edi
Code;  f89c684c <END_OF_CODE+3852d84c/????>
  23:   8d 4c d3 fe               lea    -0x2(%ebx,%edx,8),%ecx
Code;  f89c6850 <END_OF_CODE+3852d850/????>
  27:   31 f6                     xor    %esi,%esi
Code;  f89c6852 <END_OF_CODE+3852d852/????>
  29:   31 d2                     xor    %edx,%edx
Code;  f89c6854 <END_OF_CODE+3852d854/????>   <=====
  2b:   8b 19                     mov    (%ecx),%ebx   <=====
Code;  f89c6856 <END_OF_CODE+3852d856/????>
  2d:   83 e9 08                  sub    $0x8,%ecx
Code;  f89c6859 <END_OF_CODE+3852d859/????>
  30:   89 d8                     mov    %ebx,%eax
Code;  f89c685b <END_OF_CODE+3852d85b/????>
  32:   66 d1 e8                  shr    %ax
Code;  f89c685e <END_OF_CODE+3852d85e/????>
  35:   0f b7 c0                  movzwl %ax,%eax
Code;  f89c6861 <END_OF_CODE+3852d861/????>
  38:   8d 04 02                  lea    (%edx,%eax,1),%eax
Code;  f89c6864 <END_OF_CODE+3852d864/????>
  3b:   39 f8                     cmp    %edi,%eax
Code;  f89c6866 <END_OF_CODE+3852d866/????>
  3d:   77 08                     ja     47 <_EIP+0x47>
Code;  f89c6868 <END_OF_CODE+3852d868/????>
  3f:   

EIP: [<f89c6854>] do_split+0x1ee/0x419 [ext3] SS:ESP 0068:c248dd5c
Warning (Oops_read): Code line not seen, dumping what data is available


>>EIP; f89c6854 <END_OF_CODE+3852d854/????>   <=====


2 warnings and 1 error issued.  Results may not be reliable.
Comment 1 Asheesh Laroia 2008-07-29 17:04:28 UTC
The filesystem is private, and I'm not sure I want to send the metadata either since it will contain filenames.  But I'm open to discussion/suggestions.

I neglected to say what source tree I built this kernel from:

It uses the http://vserver.13thfloor.at/Experimental/patch-2.6.25.8-vs2.3.0.34.12.diff patch on top of stock 2.6.25.8.  My .config is at http://paulproteus.acm.jhu.edu/bug-report/2008-07-29/dot-config .
Comment 2 Anonymous Emailer 2008-07-29 17:12:14 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 29 Jul 2008 17:00:11 -0700 (PDT)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11175
> 
>            Summary: ext3 BUG in add_dirent_to_buf+0x6c/0x269
>            Product: File System
>            Version: 2.5
>      KernelVersion: 2.6.25.8-vs2.3.x-vs2.3.0.34.12
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: ext3
>         AssignedTo: akpm@osdl.org
>         ReportedBy: kernel.bugzilla@asheesh.org
> 
> 
> Latest working kernel version: Didn't see this problem a few months ago but
> the
> filesystem is new
> Earliest failing kernel version: 2.6.25.8-vs2.3.x-vs2.3.0.34.12 (same as
> submission)
> Distribution: Debian unstable
> Hardware Environment: Dell-brand ICH8-based Core 2 Duo
> Software Environment: Debian unstable
> Problem Description:
> 
> I have a 1T disk I store backups on using rsync and hardlinks.  Some usage
> information:
> 
> paulproteus@sf:~ $ df -k /dev/sdb1
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/sdb1            961432072 635100984 326331088  67%
> /media/asheesh-big-backups
> paulproteus@sf:~ $ df -i /dev/sdb1
> Filesystem            Inodes   IUsed   IFree IUse% Mounted on
> /dev/sdb1            122109952 9786584 112323368    9%
> /media/asheesh-big-backups
> 
> Sometimes, the system will hang on writes to that filesystem until I reboot. 
> Strangely, the problem seems to persist after an fsck -f.
> 
> I reboot and fsck, and I can write again, but then I get it again.
> 
> Steps to reproduce:
> 
> * Wait a while
> * Write to the filesystem
> * Note that you get a BUG in dmesg and further writes to the filesystem hang
> 
> ksymoops has this to say about the BUG:
> 
> ksymoops 2.4.11 on i686 2.6.25.8-vs2.3.x-vs2.3.0.34.12.  Options used
>      -V (default)
>      -k /proc/ksyms (default)
>      -l /proc/modules (default)
>      -o /lib/modules/2.6.25.8-vs2.3.x-vs2.3.0.34.12/ (default)
>      -m /boot/System.map-2.6.25.8-vs2.3.x-vs2.3.0.34.12 (default)
> 
> Warning: You did not tell me where to find symbol information.  I will
> assume that the log matches the kernel and modules that are running
> right now and I'll use the default options above for symbol resolution.
> If the current kernel and/or modules do not match the log, you can get
> more accurate output by telling me the kernel version and where to find
> map, modules, ksyms etc.  ksymoops -h explains the options.
> 
> Error (regular_file): read_ksyms stat /proc/ksyms failed
> No modules in ksyms, skipping objects
> No ksyms, skipping lsmod
> lo: Disabled Privacy Extensions
> e1000e: Intel(R) PRO/1000 Network Driver - 0.2.0
> e1000e: Copyright (c) 1999-2007 Intel Corporation.
> ehci_hcd 0000:00:1a.7: debug port 1
> ehci_hcd 0000:00:1d.7: debug port 1
> dcdbas dcdbas: Dell Systems Management Base Driver (version 5.6.0-3.2)
> warning: `dnsmasq' uses deprecated v2 capabilities in a way that may be
> insecure.
> warning: `avahi-daemon' uses 32-bit capabilities (legacy support in use)
> Pid: 8603, comm: smtpd Not tainted 2.6.25.8-vs2.3.x-vs2.3.0.34.12 #3
>  [<c01634a7>] oom_kill_process+0x43/0x18d
>  [<c01638fb>] out_of_memory+0x158/0x18d
>  [<c0165c6d>] __alloc_pages+0x22f/0x2b7
>  [<c01755a8>] read_swap_cache_async+0x71/0xe0
>  [<c0175654>] swapin_readahead+0x3d/0x6e
>  [<c016de6e>] handle_mm_fault+0x402/0x8cd
>  [<c0128846>] ? irq_exit+0x74/0x77
>  [<c0110689>] ? smp_apic_timer_interrupt+0x6e/0x7c
>  [<c0105600>] ? apic_timer_interrupt+0x28/0x30
>  [<c030021e>] do_page_fault+0x37c/0x725
>  [<c01373a8>] ? enqueue_hrtimer+0xc8/0xd3
>  [<c0137bf1>] ? hrtimer_start+0x154/0x16e
>  [<c011e8c1>] ? hrtick_set+0x98/0xe6
>  [<c02ffea2>] ? do_page_fault+0x0/0x725
>  [<c0104446>] ? do_notify_resume+0x735/0x74e
>  [<c0171d34>] ? mmap_region+0x187/0x405
>  [<c0171eeb>] ? mmap_region+0x33e/0x405
>  [<c010a8c2>] ? set_tls_desc+0x13d/0x156
>  [<c010ad25>] ? do_set_thread_area+0xa6/0xba
>  [<c02ffea2>] ? do_page_fault+0x0/0x725
>  [<c02fe952>] error_code+0x72/0x78
>  [<c02f0000>] ? clip_ioctl+0x3f5/0x45c
> CPU    0: hi:    0, btch:   1 usd:   0
> CPU    1: hi:    0, btch:   1 usd:   0
> CPU    0: hi:  186, btch:  31 usd: 149
> CPU    1: hi:  186, btch:  31 usd: 183
> CPU    0: hi:  186, btch:  31 usd: 136
> CPU    1: hi:  186, btch:  31 usd: 106
> 191968 total pagecache pages
> 521809 pages of RAM
> 292433 pages of HIGHMEM
> 5290 reserved pages
> 38300 pages shared
> 189736 pages swap cached
> 4584 pages slab
> 1504 pages pagetables
> BUG: unable to handle kernel paging request at f8000000
> *pde = 00000000 
> Oops: 0000 [#1] PREEMPT SMP 
> Pid: 12197, comm: rsync Not tainted (2.6.25.8-vs2.3.x-vs2.3.0.34.12 #3)
> EIP: 0060:[<f89c6854>] EFLAGS: 00210246 CPU: 1
> Using defaults from ksymoops -t elf32-i386 -a i386
> EAX: 00000000 EBX: f7fffe90 ECX: f7fffffe EDX: 00000000
> ESI: 00000000 EDI: 00000800 EBP: c248ddd0 ESP: c248dd5c
>  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
> Stack: f7d6b000 c248de64 c9960e80 dbc20504 c248ddb4 00001000 ddb78860
> f0f2b4b0 
>        f8000000 f7d6a000 f7fff000 0000002e f7fffff8 c9ada62a 00000f78
>        c248ddd4 
>        f89c6b1a f7d6a000 f7fffe90 00000058 c9bd73f4 875f4d2a 00000002
>        f7b56068 
> Call Trace:
>  [<f89c6b1a>] ? add_dirent_to_buf+0x6c/0x269 [ext3]
>  [<f89c70df>] ? ext3_add_entry+0x3c8/0x77f [ext3]
>  [<c012a34a>] ? capable+0x10/0x12
>  [<c0184f38>] ? generic_permission+0x8c/0xc6
>  [<c02fe715>] ? _spin_unlock+0xf/0x23
>  [<f898ea69>] ? start_this_handle+0x2a7/0x2c6 [jbd]
>  [<f89c7a71>] ? ext3_add_nondir+0x15/0x42 [ext3]
>  [<f89c7c3a>] ? ext3_link+0xb4/0xea [ext3]
>  [<c0185c50>] ? vfs_link+0x132/0x1c3
>  [<c01889ef>] ? sys_linkat+0xb2/0xee
>  [<c0181c36>] ? sys_lstat64+0x22/0x28
>  [<c0188a3f>] ? sys_link+0x14/0x16
>  [<c0104ad4>] ? sysenter_past_esp+0x6d/0xa5
> Code: 00 00 00 89 79 f8 89 f1 3b 4d d4 77 d7 85 c0 74 07 8b 75 bc 31 c0 eb ee
> 8b 55 b8 8b 7d a0 8b 5d d4 d1 ef 8d 4c d3 fe 31 f6 31 d2 <8b> 19 83 e9 08 89
> d8
> 66 d1 e8 0f b7 c0 8d 04 02 39 f8 77 08 0f 
> 
> 
> >>EIP; f89c6854 <END_OF_CODE+3852d854/????>   <=====
> 
> Trace; f89c6b1a <END_OF_CODE+3852db1a/????>
> Trace; f89c70df <END_OF_CODE+3852e0df/????>
> Trace; c012a34a <capable+10/12>
> Trace; c0184f38 <generic_permission+8c/c6>
> Trace; c02fe715 <_spin_unlock+f/23>
> Trace; f898ea69 <END_OF_CODE+384f5a69/????>
> Trace; f89c7a71 <END_OF_CODE+3852ea71/????>
> Trace; f89c7c3a <END_OF_CODE+3852ec3a/????>
> Trace; c0185c50 <vfs_link+132/1c3>
> Trace; c01889ef <sys_linkat+b2/ee>
> Trace; c0181c36 <sys_lstat64+22/28>
> Trace; c0188a3f <sys_link+14/16>
> Trace; c0104ad4 <sysenter_past_esp+6d/a5>
> 
> Code;  f89c6829 <END_OF_CODE+3852d829/????>
> 00000000 <_EIP>:
> Code;  f89c6829 <END_OF_CODE+3852d829/????>
>    0:   00 00                     add    %al,(%eax)
> Code;  f89c682b <END_OF_CODE+3852d82b/????>
>    2:   00 89 79 f8 89 f1         add    %cl,-0xe760787(%ecx)
> Code;  f89c6831 <END_OF_CODE+3852d831/????>
>    8:   3b 4d d4                  cmp    -0x2c(%ebp),%ecx
> Code;  f89c6834 <END_OF_CODE+3852d834/????>
>    b:   77 d7                     ja     ffffffe4 <_EIP+0xffffffe4>
> Code;  f89c6836 <END_OF_CODE+3852d836/????>
>    d:   85 c0                     test   %eax,%eax
> Code;  f89c6838 <END_OF_CODE+3852d838/????>
>    f:   74 07                     je     18 <_EIP+0x18>
> Code;  f89c683a <END_OF_CODE+3852d83a/????>
>   11:   8b 75 bc                  mov    -0x44(%ebp),%esi
> Code;  f89c683d <END_OF_CODE+3852d83d/????>
>   14:   31 c0                     xor    %eax,%eax
> Code;  f89c683f <END_OF_CODE+3852d83f/????>
>   16:   eb ee                     jmp    6 <_EIP+0x6>
> Code;  f89c6841 <END_OF_CODE+3852d841/????>
>   18:   8b 55 b8                  mov    -0x48(%ebp),%edx
> Code;  f89c6844 <END_OF_CODE+3852d844/????>
>   1b:   8b 7d a0                  mov    -0x60(%ebp),%edi
> Code;  f89c6847 <END_OF_CODE+3852d847/????>
>   1e:   8b 5d d4                  mov    -0x2c(%ebp),%ebx
> Code;  f89c684a <END_OF_CODE+3852d84a/????>
>   21:   d1 ef                     shr    %edi
> Code;  f89c684c <END_OF_CODE+3852d84c/????>
>   23:   8d 4c d3 fe               lea    -0x2(%ebx,%edx,8),%ecx
> Code;  f89c6850 <END_OF_CODE+3852d850/????>
>   27:   31 f6                     xor    %esi,%esi
> Code;  f89c6852 <END_OF_CODE+3852d852/????>
>   29:   31 d2                     xor    %edx,%edx
> Code;  f89c6854 <END_OF_CODE+3852d854/????>   <=====
>   2b:   8b 19                     mov    (%ecx),%ebx   <=====
> Code;  f89c6856 <END_OF_CODE+3852d856/????>
>   2d:   83 e9 08                  sub    $0x8,%ecx
> Code;  f89c6859 <END_OF_CODE+3852d859/????>
>   30:   89 d8                     mov    %ebx,%eax
> Code;  f89c685b <END_OF_CODE+3852d85b/????>
>   32:   66 d1 e8                  shr    %ax
> Code;  f89c685e <END_OF_CODE+3852d85e/????>
>   35:   0f b7 c0                  movzwl %ax,%eax
> Code;  f89c6861 <END_OF_CODE+3852d861/????>
>   38:   8d 04 02                  lea    (%edx,%eax,1),%eax
> Code;  f89c6864 <END_OF_CODE+3852d864/????>
>   3b:   39 f8                     cmp    %edi,%eax
> Code;  f89c6866 <END_OF_CODE+3852d866/????>
>   3d:   77 08                     ja     47 <_EIP+0x47>
> Code;  f89c6868 <END_OF_CODE+3852d868/????>
>   3f:   
> 
> EIP: [<f89c6854>] do_split+0x1ee/0x419 [ext3] SS:ESP 0068:c248dd5c
> Warning (Oops_read): Code line not seen, dumping what data is available
> 
> 
> >>EIP; f89c6854 <END_OF_CODE+3852d854/????>   <=====
> 
> 
> 2 warnings and 1 error issued.  Results may not be reliable.
> 
Comment 3 Asheesh Laroia 2008-07-29 18:09:12 UTC
On Tue, 29 Jul 2008, Andrew Morton wrote:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).

Great, will do.  I've rpepared an e2image -s now that follows this chain 
of events:

* ext3 BUG()
* reboot
* filesystem is mounted once
* e2image snapshots taken: both with and without -s

I have not run fsck between the BUG() and the snapshots I took, so it will 
hopefully still be useful.  I will publish the one with "-s"; I will stash 
away the one without "-s" in case it is helpful.

I then ran fsck on the filesystem so I could proceed with using the disk.

The full image is uploading now; I'll let you all know when it's done in 
2-3 days (it's ca. 1.5 gibibytes).

-- Asheesh.
Comment 4 Theodore Tso 2008-07-29 19:49:52 UTC
Hmm... disassembling the code, it's pretty clear the problem is here
in do_split(), around line 1208:

	map = (struct dx_map_entry *) (data2 + blocksize);
	count = dx_make_map ((struct ext3_dir_entry_2 *) data1,
			     blocksize, hinfo, map);
	map -= count;
	dx_sort_map (map, count);
	/* Split the existing block in the middle, size-wise */
	size = 0;
	move = 0;
	for (i = count-1; i >= 0; i--) {
		/* is more than half of this entry in 2nd half of the block? */
		if (size + map[i].size/2 > blocksize/2)  <====
			break;
		size += map[i].size;
		move++;
	}

I suspect dx_make_map returned 0, and since i and count are an
unsigned int, things started to go downhill after i was assigned the
value of ((unsigned) -1).

How could this happen?  The only way thing I can think of is a
directory block with a large number of deleted directory entries that
were not coalesced for some reason.  That shouldn't happen normally,
but that's the only explanation I can think of why we might be trying
to split an directory block where dx_make_map() is returning 0.

Asheesh, are you willing to recompile a kernel with some debugging
code inserted?

						- Ted
Comment 5 Eric Sandeen 2008-07-29 20:03:52 UTC
Theodore Tso wrote:
> Hmm... disassembling the code, it's pretty clear the problem is here
> in do_split(), around line 1208:
> 
>       map = (struct dx_map_entry *) (data2 + blocksize);
>       count = dx_make_map ((struct ext3_dir_entry_2 *) data1,
>                            blocksize, hinfo, map);
>       map -= count;
>       dx_sort_map (map, count);
>       /* Split the existing block in the middle, size-wise */
>       size = 0;
>       move = 0;
>       for (i = count-1; i >= 0; i--) {
>               /* is more than half of this entry in 2nd half of the block? */
>               if (size + map[i].size/2 > blocksize/2)  <====

You sure this isn't our old friend
https://bugzilla.redhat.com/show_bug.cgi?id=451068 ?

which version of gcc compiled this?

-Eric
Comment 6 Theodore Tso 2008-07-29 21:04:36 UTC
On Tue, Jul 29, 2008 at 10:03:38PM -0500, Eric Sandeen wrote:
> Theodore Tso wrote:
> > Hmm... disassembling the code, it's pretty clear the problem is here
> > in do_split(), around line 1208:
> > 
> >     map = (struct dx_map_entry *) (data2 + blocksize);
> >     count = dx_make_map ((struct ext3_dir_entry_2 *) data1,
> >                          blocksize, hinfo, map);
> >     map -= count;
> >     dx_sort_map (map, count);
> >     /* Split the existing block in the middle, size-wise */
> >     size = 0;
> >     move = 0;
> >     for (i = count-1; i >= 0; i--) {
> >             /* is more than half of this entry in 2nd half of the block? */
> >             if (size + map[i].size/2 > blocksize/2)  <====
> 
> You sure this isn't our old friend
> https://bugzilla.redhat.com/show_bug.cgi?id=451068 ?
> 
> which version of gcc compiled this?

As we discussed on IRC, I think you're theory is dead on.  %ecx is at
the very end of the page-2, which would correspond to
map[count-1].size.  And size (%esi) is zero, which rules out my scenario.

This very much looks like a GCC bug.  Asheesh, can you confirm which
version of GCC you used to build your kernel?

Longer term, do_split() was coded in a very non-robust fashion.
Looking at do_split(), it was pretty easy to imagine corrupted
directory blocks that might force count to be 0 (causing the for loop
to do something insane, since i is unsigned), and adding some checks
to make sure that the split variable is neither 0 nor equal to count
might also be a really good idea.

					- Ted
Comment 7 Asheesh Laroia 2008-07-30 07:56:35 UTC
On Wed, 30 Jul 2008, Theodore Tso wrote:

> On Tue, Jul 29, 2008 at 10:03:38PM -0500, Eric Sandeen wrote:
>> Theodore Tso wrote:
>>> Hmm... disassembling the code, it's pretty clear the problem is here
>>> in do_split(), around line 1208:
>>>
>>>     map = (struct dx_map_entry *) (data2 + blocksize);
>>>     count = dx_make_map ((struct ext3_dir_entry_2 *) data1,
>>>                          blocksize, hinfo, map);
>>>     map -= count;
>>>     dx_sort_map (map, count);
>>>     /* Split the existing block in the middle, size-wise */
>>>     size = 0;
>>>     move = 0;
>>>     for (i = count-1; i >= 0; i--) {
>>>             /* is more than half of this entry in 2nd half of the block? */
>>>             if (size + map[i].size/2 > blocksize/2)  <====
>>
>> You sure this isn't our old friend
>> https://bugzilla.redhat.com/show_bug.cgi?id=451068 ?
>>
>> which version of gcc compiled this?
>
> As we discussed on IRC, I think you're theory is dead on.  %ecx is at
> the very end of the page-2, which would correspond to
> map[count-1].size.  And size (%esi) is zero, which rules out my scenario.
>
> This very much looks like a GCC bug.  Asheesh, can you confirm which
> version of GCC you used to build your kernel?

gcc --version indicates:

gcc (Debian 4.3.1-2) 4.3.1

dpkg -l gcc reports:

ii  gcc            4:4.3.1-1      The GNU C compiler

> Longer term, do_split() was coded in a very non-robust fashion.
> Looking at do_split(), it was pretty easy to imagine corrupted
> directory blocks that might force count to be 0 (causing the for loop
> to do something insane, since i is unsigned), and adding some checks
> to make sure that the split variable is neither 0 nor equal to count
> might also be a really good idea.

Thanks for the speedy replies, all.  I guess then you're not interested in 
those e2image dumps I took, then.

I'm recompiling with GCC 4.2 now; is there a straightforward(ish) test 
you've seen that can indicate if the GCC 4.3 in Debian unstable or Debian 
testing still has this bug?  FWIW their changelogs are at 
http://packages.debian.org/changelogs/pool/main/g/gcc-4.3/gcc-4.3_4.3.1-8/changelog 
.

-- Asheesh.
Comment 8 Eric Sandeen 2008-07-30 08:07:24 UTC
FWIW, the gcc bug is at:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36533

-Eric
Comment 9 Adrian Bunk 2008-07-30 09:01:50 UTC
(In reply to comment #7)
> gcc --version indicates:
> 
> gcc (Debian 4.3.1-2) 4.3.1

Then it's this bug.

Not a kernel bug, and fixed in Debians 4.3.1-3 package.