Bug 7650

Summary: Data corruption with sendfile and SMP kernels
Product: Networking
Component: Other
Status: REJECTED INVALID
Severity: normal
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.19
Subsystem:
Regression: ---
Bisected commit-id:
Reporter: Mark Groves (mjgroves)
Assignee: Arnaldo Carvalho de Melo (acme)
CC: nacc, rdunlap
Attachments: Test case with readme file

Description Mark Groves 2006-12-08 12:22:57 UTC
Most recent kernel where this bug did *NOT* occur: Unknown
Hardware Environment: Itanium or x86 chip with SMP enabled
Software Environment: 2.6 kernel with SMP enabled
Problem Description: The received data is corrupted: it contains a mix of the
expected data and data from a previous sendfile call.

Steps to reproduce:
1. Allocate a file buffer and mmap it.

2. Next, the buffer is filled with the contents of a file, or manually
filled with the same character via a loop.

3. A very small header is appended to the buffer (some blank space was
left during the previous step.)

4. The buffer is sent via sendfile()

5. The client receives the buffer and checks it for corruption.
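
The steps above can be sketched roughly as follows. This is not the attached
test case; it is a minimal single-threaded illustration that assumes a local
AF_UNIX socketpair in place of a real client connection, so it is not expected
to show the corruption itself. The names run_round and run_rounds, the sizes
BUF_SIZE and HDR_SIZE, the header layout, and the placement of the header in
the reserved space at the end of the buffer are all assumptions made for the
example.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum { BUF_SIZE = 4096, HDR_SIZE = 8 };  /* hypothetical sizes */

/* Steps 2-5 for one round. Returns the number of mismatching bytes,
 * or -1 on an I/O error. */
static long run_round(int file_fd, char *map, int tx, int rx, char fill)
{
    char expected[BUF_SIZE];
    char received[BUF_SIZE];
    off_t offset = 0;
    size_t got = 0;
    long bad = 0;

    /* Step 2: fill the payload, leaving HDR_SIZE bytes free at the end. */
    memset(map, fill, BUF_SIZE - HDR_SIZE);

    /* Step 3: write a tiny header into the reserved space (assumed layout). */
    memset(map + BUF_SIZE - HDR_SIZE, 0, HDR_SIZE);
    map[BUF_SIZE - HDR_SIZE] = 'H';
    map[BUF_SIZE - HDR_SIZE + 1] = fill;
    memcpy(expected, map, BUF_SIZE);

    /* Step 4: send the whole file-backed buffer; sendfile advances offset. */
    while (offset < BUF_SIZE) {
        ssize_t n = sendfile(tx, file_fd, &offset, (size_t)(BUF_SIZE - offset));
        if (n <= 0)
            return -1;
    }

    /* Step 5: receive the buffer and compare it byte by byte. */
    while (got < BUF_SIZE) {
        ssize_t n = read(rx, received + got, BUF_SIZE - got);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    for (size_t i = 0; i < BUF_SIZE; i++)
        if (received[i] != expected[i])
            bad++;
    return bad;
}

/* Step 1 plus a loop over rounds. Returns the total number of
 * mismatching bytes, or -1 on error. */
static long run_rounds(int rounds)
{
    char path[] = "/tmp/sendfile_sketch_XXXXXX";
    int sv[2];
    long total = 0;

    /* Step 1: create a file-backed buffer and mmap it. */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, BUF_SIZE) != 0) {
        close(fd);
        return -1;
    }
    char *map = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        munmap(map, BUF_SIZE);
        close(fd);
        return -1;
    }

    for (int r = 0; r < rounds; r++) {
        long bad = run_round(fd, map, sv[0], sv[1], (char)('a' + r % 26));
        if (bad < 0) {
            total = -1;
            break;
        }
        total += bad;
    }

    close(sv[0]);
    close(sv[1]);
    munmap(map, BUF_SIZE);
    close(fd);
    return total;
}
```

Calling run_rounds(4) from a main() performs four rounds with a different fill
character each time and returns the total number of mismatched bytes. Each
round here reads the data back before the buffer is refilled, which differs
from a setup where the sender and the client run concurrently.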
Comment 1 Mark Groves 2006-12-08 12:28:24 UTC
Created attachment 9765 [details]
Test case with readme file
Comment 2 Nishanth Aravamudan 2006-12-08 14:19:20 UTC
2.6.15 is rather old, can you confirm this happens with 2.6.19?

Thanks, Nish
Comment 3 Mark Groves 2006-12-09 16:52:05 UTC
I can confirm that the bug appears on the 2.6.17 kernel. Unfortunately, I don't
have access to a machine with 2.6.19, and Red Hat hasn't updated the yum repo
yet. I will see if I can manually update the kernel.
Comment 4 Mark Groves 2006-12-22 09:06:22 UTC
Sorry that took so long, the upgrade was not an easy one.

Anyway, I can confirm that the bug is still present in 2.6.19.
Comment 5 Mark Groves 2007-01-23 09:41:45 UTC
I believe that this bug may be related to a similar one that was recently fixed,
where memory pages were not being properly marked as dirty, due to concurrent
access. (http://lkml.org/lkml/2006/12/29/44)

I applied the patch and got no change. Is there anywhere else where pages are
marked dirty, or not properly protected from multi-threaded access?