Bug 6831

Summary: io_getevents() says an extending write is complete before i_size is updated
Product: IO/Storage
Reporter: Rafal Wijata (wijata)
Component: AIO
Assignee: Zach Brown (zach.brown)
Status: CLOSED CODE_FIX
Severity: high
CC: jmoyer, suparna, zach.brown
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.9-34.0.1.ELsmp 2.6.16-gentoo-r6 2.6.16-1.2122_FC5smp
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: testcase showing the race

Description Rafal Wijata 2006-07-14 02:14:32 UTC
Most recent kernel where this bug did not occur:
Distribution: gentoo RHAS4.3 FC5
Hardware Environment: various
Software Environment: libaio
Problem Description:
After io_getevents() reports that the write/append is done, the data in the file is still
inaccessible (neither reported by fstat() nor readable via io_submit()).

Steps to reproduce:
gcc aiotest-wait.c -laio -lpthread
Run the attached test on an SMP machine a few times.
For UP, edit the source and initialize thread_write with fun_write1().

NOTE: if synchronization is introduced (not calling fstat() until the write's
io_submit() returns), all is fine. To test that, compile with
gcc aiotest-wait.c -DDO_SYNCH -laio -lpthread
Comment 1 Rafal Wijata 2006-07-14 02:15:13 UTC
Created attachment 8547
testcase showing the race
Comment 2 Zach Brown 2006-07-14 12:32:41 UTC
> After io_getevents() reports that the write/append is done, the data in the file is
> still inaccessible (neither reported by fstat() nor readable via io_submit())

Yeah, this looks like a real bug.  The path under generic_file_direct_IO() is
calling aio_complete() before its caller updates i_size.  I think this test is
seeing it because it has one thread waiting in io_getevents() while another
thread is submitting the extending write.  Usually the completion isn't seen
until io_submit() returns after having updated i_size.
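
The ordering problem described above can be sketched with a toy model. This is only an illustration, not kernel code: toy_inode, toy_waiter, toy_complete() and the two extending_write_*() functions are hypothetical names made up for this sketch. The "fixed" variant shows one ordering that avoids the race and is not claimed to be the actual patch. The waiter records the file size at the moment the completion is signalled, which stands in for a thread that returns from io_getevents() and immediately calls fstat().

```c
#include <stddef.h>

/* Toy stand-ins for kernel state; every name here is hypothetical. */
struct toy_inode {
    long i_size;
};

struct toy_waiter {
    const struct toy_inode *inode;
    long size_seen; /* what fstat() would report at wake-up time */
    int woken;
};

/* Models a thread returning from io_getevents() and calling fstat() at once. */
void toy_complete(struct toy_waiter *w)
{
    w->size_seen = w->inode->i_size;
    w->woken = 1;
}

/* Racy ordering: the completion is signalled before i_size is updated. */
void extending_write_buggy(struct toy_inode *ino, struct toy_waiter *w,
                           long pos, long len)
{
    toy_complete(w);
    if (pos + len > ino->i_size)
        ino->i_size = pos + len;
}

/* Safe ordering: i_size is updated first, then the completion is signalled. */
void extending_write_fixed(struct toy_inode *ino, struct toy_waiter *w,
                           long pos, long len)
{
    if (pos + len > ino->i_size)
        ino->i_size = pos + len;
    toy_complete(w);
}

/* Appends 4096 bytes to an empty toy file; returns the size the waiter saw. */
long size_seen_by_waiter(int use_fixed_order)
{
    struct toy_inode ino = { 0 };
    struct toy_waiter w = { &ino, -1, 0 };

    if (use_fixed_order)
        extending_write_fixed(&ino, &w, 0, 4096);
    else
        extending_write_buggy(&ino, &w, 0, 4096);
    return w.size_seen;
}
```

In this model size_seen_by_waiter(0) reports the old size (0) even though the write has "completed", while size_seen_by_waiter(1) reports 4096. The real race is timing dependent, which is why the test needs a second thread already waiting in io_getevents().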

I won't have time to craft a patch today, I don't think, but I'll try during OLS
next week if no one beats me to it.
Comment 3 Zach Brown 2006-07-14 12:38:16 UTC
I reproduced the failure under 2.6.18-rc1-mm2 on a dual athlon.  (b.k.o yelled
at me when I tried to add it to the kernel version field.)
Comment 4 Rafal Wijata 2006-07-28 00:44:34 UTC
Zach, I'm not sure I understand you, since you are only watching this thread.
> I won't have time to craft a patch today,
Do you mean you are going to create the patch?
Comment 5 Zach Brown 2006-07-28 10:24:30 UTC
> Do you mean you are going to create the patch?

Yeah, I sent it off to the kernel mailing list but haven't gotten any response.  

http://lkml.org/lkml/2006/7/26/257

We need to make sure that this fix doesn't break other uses of O_DIRECT and aio
before we can merge it into the kernel.
Comment 6 Jeff Moyer 2006-09-13 13:40:21 UTC
Zach, I'm guessing that this patch is obsoleted by the patch series you sent out
on September 5th (which cleans up the error handling in the dio code).  Is that
right?
Comment 7 Zach Brown 2006-09-13 13:51:28 UTC
Yeah, the 'dio: clean up completion phase' patch series addresses this problem.
It's the final patch in the series that does it:

http://lkml.org/lkml/2006/9/5/268

The hunks that remove the EIOCBQUEUED translation from dio's callers are the
indication.

We're still working on testing the patch series.  We had an unrelated hardware
failure that is slowing things down :/.  It's been solid enough so far that I
might throw it in -mm once .19 opens.

- z
Comment 8 Rafal Wijata 2006-09-14 00:21:49 UTC
Zach - for the past few days I've been struggling with strange AIO behaviour.
I have an HDD with some bad blocks (or rather more than some ;). In the general case AIO
reads seem to operate OK (returning EIO), but very rarely a read returns an unmodified
read buffer and no error. The kernel is 2.6.9-34.ELsmp. I suspect an AIO bug, but have
no clear evidence. Do you happen to know of any such bug?
Comment 9 Zach Brown 2006-09-14 11:56:28 UTC
> but very rarely a read returns an unmodified
> read buffer and no error. The kernel is 2.6.9-34.ELsmp. I suspect an AIO bug, but have
> no clear evidence. Do you happen to know of any such bug?

Hmm, I could imagine cases where it might lose an error that was generated from
block lookups but I don't know of any bugs which would specifically do this.

Is it possible that file system corruption has caused the file size to shrink
and for reads to be issued past the end of the file?  That would lead to reads
that succeed with 0 bytes read.
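
As a rough illustration of the past-end-of-file case, the sketch below uses a synchronous pread() on a temporary file as a stand-in for an AIO read; that substitution, the helper name eof_read_leaves_buffer_untouched(), the 512-byte sizes and the 0xAA sentinel are all assumptions made for the example. It pre-fills the buffer with the sentinel so that an untouched buffer can be told apart from real data.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a read issued at end-of-file succeeded
 * with 0 bytes and left the buffer untouched, 0 if it behaved otherwise,
 * and -1 if the temporary file could not be set up.
 */
int eof_read_leaves_buffer_untouched(void)
{
    char path[] = "/tmp/eof-read-XXXXXX";
    char data[512];
    char buf[512];
    ssize_t n;
    size_t i;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path); /* the file disappears once fd is closed */

    memset(data, 'd', sizeof data);
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data) {
        close(fd);
        return -1;
    }

    /* Sentinel pattern: any byte still 0xAA afterwards was not written to. */
    memset(buf, 0xAA, sizeof buf);

    /* The offset equals the file size, so the read starts exactly at EOF. */
    n = pread(fd, buf, sizeof buf, (off_t)sizeof data);
    close(fd);

    if (n != 0)
        return 0;
    for (i = 0; i < sizeof buf; i++)
        if ((unsigned char)buf[i] != 0xAA)
            return 0;
    return 1;
}
```

The same sentinel trick can help tell "0 bytes read, no error" apart from "N bytes reported but the buffer was never filled", since only the second case leaves sentinel bytes inside the range the read claims to have filled.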

Are you in a position where you can try the patches to the kernel that were
referred to in comment 7?
Can you try the 
Comment 10 Rafal Wijata 2006-09-15 00:38:38 UTC
> Is it possible that file system corruption has caused the file size to shrink
> and for reads to be issued past the end of the file?  That would lead to reads
> that succeed with 0 bytes read.
I'm reading a raw device in my test. A read of 0 bytes with an unmodified buffer would be
fine with me; however, I'm getting a 4K read reported, a 4K unmodified buffer, and no error.

> Are you in a position where you can try the patches to the kernel that were
> referred to in comment 7?
I may try to try ;) That's because I'm not allowed to use a kernel other than the
official RH one and obviously can't run the test on another machine ;)
It's going to take a while.
Comment 11 Zach Brown 2008-01-03 14:46:19 UTC
The first race that this bug was filed for was fixed in mainline in commit 8459d86aff04fa53c2ab6a6b9f355b3063cc8014 about a year ago.

We have a test which makes sure this bug doesn't regress:

http://git.kernel.org/?p=linux/kernel/git/zab/aio-dio-regress.git;a=blob;f=c/aio-dio-extend-stat.c

So this bug can be closed.  Any further bugs with distribution kernels should be taken up with the distribution provider.