Bug 11805 - mounting XFS produces a segfault
Summary: mounting XFS produces a segfault
Status: CLOSED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: XFS
Hardware: All Linux
Importance: P1 blocking
Assignee: XFS Guru
URL:
Keywords:
Depends on:
Blocks: Regressions-2.6.26
Reported: 2008-10-21 18:00 UTC by Tiago Maluta
Modified: 2008-11-22 14:02 UTC (History)

See Also:
Kernel Version: 2.6.27-05577-g0cfd810-dirty
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Output from dmesg (3.62 KB, text/plain)
2008-10-21 18:01 UTC, Tiago Maluta
check for memory allocation failure in xlog_alloc_log (2.96 KB, patch)
2008-10-21 21:55 UTC, Dave Chinner

Description Tiago Maluta 2008-10-21 18:00:36 UTC
Latest working kernel version: 2.6.26.3 (at least in my tests)
Earliest failing kernel version:
Distribution: Gentoo
Hardware Environment: x86 (32 bits)
Problem Description:

I got a segfault when I tried to mount an XFS partition. When I tried once more, my machine froze.

Steps to reproduce:

~# modprobe xfs
~# mount -t xfs /dev/sda3 /mnt/folder
Segmentation Fault
Comment 1 Tiago Maluta 2008-10-21 18:01:19 UTC
Created attachment 18395
Output from dmesg
Comment 2 Dave Chinner 2008-10-21 20:36:48 UTC
Looks like allocating a buffer in xlog_alloc_log() via xfs_buf_get_noaddr()
is failing and we are not checking the return value before trying to
lock the buffer.

That implies you do not have enough memory available to mount the filesystem.
The only cause I can see for this is a failed call to
alloc_pages(GFP_KERNEL).

As it is, this is not a regression in XFS - we've never checked the return
value of xfs_buf_get_noaddr(). That is easily fixed - I'll have a patch in
a few minutes.
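
For readers following along, the missing check can be sketched in userspace C.
This is only an illustration of the pattern, not the attached patch: the names
sim_buf, sim_buf_get() and sim_log_alloc() are hypothetical stand-ins, and the
`fail` flag exists only to force the failure path.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a log buffer; not the real XFS buffer type. */
struct sim_buf {
    size_t size;
    int    locked;
};

/*
 * Hypothetical allocator: returns NULL on failure, the way a buffer
 * allocation returns nothing when the underlying page allocation fails.
 * `fail` forces the failure path for demonstration purposes.
 */
static struct sim_buf *sim_buf_get(size_t size, int fail)
{
    struct sim_buf *bp;

    if (fail)
        return NULL;

    bp = malloc(sizeof(*bp));
    if (bp) {
        bp->size = size;
        bp->locked = 0;
    }
    return bp;
}

/*
 * The pattern under discussion: test the allocation result before
 * touching the buffer, and report -ENOMEM to the caller instead of
 * dereferencing a NULL pointer.
 */
static int sim_log_alloc(size_t size, int fail, struct sim_buf **out)
{
    struct sim_buf *bp = sim_buf_get(size, fail);

    if (!bp)
        return -ENOMEM;   /* without this check, the next line would crash */

    bp->locked = 1;       /* "lock" the buffer only once we know it exists */
    *out = bp;
    return 0;
}
```

With the check in place, a failed allocation surfaces as an error return that
the caller (here, the mount path) can handle, rather than as a crash.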

That being said - your kernel reports itself as:

2.6.27maluta-05577-g0cfd810-dirty #10

which, IIRC, indicates that you've built it from a git tree that
contains local modifications (the -dirty bit). Have you modified the
kernel you are running and if so, could those modifications be responsible
for lack of memory in the machine?
Comment 3 Tiago Maluta 2008-10-21 21:25:04 UTC
Yes, I modified the kernel, but as far as I know the patch doesn't affect memory. In fact, the patch affects DMI (it was used to fix a bug in Lguest): http://marc.info/?l=linux-kernel&m=122445958110927&w=2
Comment 4 Dave Chinner 2008-10-21 21:55:24 UTC
Created attachment 18397
check for memory allocation failure in xlog_alloc_log

This patch should fix the mount ENOMEM problem. Can you try it?
Comment 5 Tiago Maluta 2008-10-22 08:09:41 UTC
Fixed. 
Thanks!
Comment 6 Rafael J. Wysocki 2008-10-26 04:06:33 UTC
On Sunday, 26 of October 2008, Dave Chinner wrote:
> On Sat, Oct 25, 2008 at 10:06:44PM +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.27.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=11805
> > Subject             : mounting XFS produces a segfault
> > Submitter   : Tiago Maluta <maluta_tiago@yahoo.com.br>
> > Date                : 2008-10-21 18:00 (5 days old)
> 
> Ah - this was reported as a 2.6.26 -> 2.6.27 regression, not a
> .27->.28-rcX regression.
> 
> Even so, it's not obviously an XFS regression as the problem is
> that alloc_pages(GFP_KERNEL) is the new failure on .27. The fact
> that XFS never handled the allocation failure is not a new bug
> or regression - it has never caught failures during log
> allocation...
> 
> So really, if you want to look for a regression here, it is the
> change of behaviour in the VM leading to a memory allocation failure
> where it has never, ever previously failed...
Comment 7 Rafael J. Wysocki 2008-10-26 04:08:31 UTC
Handled-By : Dave Chinner <dgc@sgi.com>
Patch : http://bugzilla.kernel.org/attachment.cgi?id=18397&action=view
Comment 8 Rafael J. Wysocki 2008-11-22 14:02:20 UTC
Fixed by commit 8f330f5149ef41ff943b04d914406cc417f62784.
