Bug 8143

Summary: System freeze when two applications access a NAS while it is awakening
Product: Drivers Reporter: Eric Lacombe (tuxiko)
Component: NetworkAssignee: Francois Romieu (romieu)
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: blocking CC: akpm, protasnb, sfrench, tuxiko
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.20.1 Subsystem:
Regression: --- Bisected commit-id:
Attachments: sequence that kill the host with the r8169 during the NAS awakening

Description Eric Lacombe 2007-03-07 14:12:16 UTC
Most recent kernel where this bug did *NOT* occur: none

Distribution: Gentoo

Hardware Environment: x86_64 Core 2 Duo, D-Link System Inc DGE-528T Gigabit
Ethernet Adapter 

Problem Description: 

http://lkml.org/lkml/2007/1/30/110

Steps to reproduce:

1. I wait for the NAS to sleep.

2. I access the NAS with the Amarok application to read a mp3 file. This 
action triggers the awakening of the NAS (~3 secs).

3. Before the NAS awakes, I access the NAS from another application : a file 
manager (Konqueror).
I click on the directory that contains the mp3 (which is the same directory 
that is accessed by amarok - this directory contains the mounted cifs volume 
of the NAS).

4. The system freezes.
Comment 1 Andrew Morton 2007-03-07 14:52:11 UTC
Well, this could be a net driver bug or it could be a CIFS bug.

One way to determine that would be to try the same operation 
from a machine whcih has a different type of NIC.  Or swap
out the NIC in that machine.  Is that possible?
Comment 2 Eric Lacombe 2007-03-08 11:38:26 UTC
I have tried 6-7 times to trigger the bug from my laptop since yesterday (IBM
T42 with an Intel Corporation 82540EP Gigabit Ethernet Controller) with both the
2.6.19.3 and 2.6.20.1 kernel, with no success.
It seems to be a net driver related bug.
Comment 3 Francois Romieu 2007-03-08 15:38:46 UTC
You have two options:
1. (shot in the dark) try http://bugzilla.kernel.org/attachment.cgi?id=10646
   against 2.6.21-rc3. Don't use netpoll.
2. start a 'tcpdump -s 0 -w /tmp/zoinx.pcap -i eth0' on your laptop and issue
   the same sequence that would otherwise kill the host with the r8169.

#2 is higher priority. I wonder what the NAS can push in the pipe when it
wakes up.

-- 
Ueimor
Comment 4 Eric Lacombe 2007-03-13 12:42:36 UTC
I just have a network switch and not a hub, so I can't monitor the connection 
between the NAS and the crashing system, from my laptop.

I will try to borrow a hub and do a tcpdump as soon as possible.
Comment 5 Eric Lacombe 2007-03-19 17:06:43 UTC
Created attachment 10854 [details]
sequence that kill the host with the r8169 during the NAS awakening

I have borrowed a Hub and so I have done a tcpdump.

I join the dump file, which contains the sequence that kill the host with the
r8169 during the NAS awakening.

Note that the NAS has the 192.168.1.60 address and the host that freezes has
the 192.168.1.10 address. (the laptop is 192.168.1.8).
Comment 6 Francois Romieu 2007-03-20 16:57:23 UTC
Thanks for your follow-up.

> 1. I wait for the NAS to sleep.
> 
> 2. I access the NAS with the Amarok application to read a mp3 file. This 
> action triggers the awakening of the NAS (~3 secs).

I am not a CIFS guru but it is probably:

17 00:51:55.366975 192.168.1.10 -> 192.168.1.60 SMB Trans2 Request,
QUERY_PATH_INFO, Query File Unix Basic, Path: \Pop Rock Alternatif
[...]
30 00:51:55.390461 192.168.1.10 -> 192.168.1.60 SMB Trans2 Request,
QUERY_PATH_INFO, Query File Unix Basic, Path: \Pop Rock
Alternatif\Radiohead\b-sides\The Bends\permanent daylight.mp3
31 00:51:55.390961 192.168.1.60 -> 192.168.1.10 SMB Trans2 Response, QUERY_PATH_INFO

> 3. Before the NAS awakes, I access the NAS from another application : a file 
> manager (Konqueror).
> I click on the directory that contains the mp3 (which is the same directory 
> that is accessed by amarok - this directory contains the mounted cifs volume 
> of the NAS).

36 00:51:58.182445 192.168.1.10 -> 192.168.1.60 SMB Trans2 Request,
QUERY_PATH_INFO, Query File Unix Basic, Path: \.directory

[lots of stuff related to 30 and 31 but not to 36 ?]
58 00:52:03.099267 192.168.1.60 -> 192.168.1.10 SMB Read AndX Response, FID:
0x2be2, 16384 bytes
59 00:52:03.099518 192.168.1.10 -> 192.168.1.60 TCP 35478 > microsoft-ds [ACK]
Seq=1419 Ack=11296 Win=5065 Len=0 
[...]
64 00:52:03.099772 192.168.1.10 -> 192.168.1.60 TCP 35478 > microsoft-ds [ACK]
Seq=1419 Ack=17583 Win=5065 Len=0 TSV=31819739 TSER=53387879
65 00:52:03.100017 192.168.1.10 -> 192.168.1.60 TCP 35478 > microsoft-ds
[FIN,ACK] Seq=1419 Ack=17583 Win=5065 Len=0 TSV=31819739 TSER=53387879
66 00:52:03.100019 192.168.1.10 -> 192.168.1.60 TCP 35478 > microsoft-ds
[RST,ACK] Seq=1420 Ack=17583 Win=5065 Len=0 TSV=31819739 TSER=53387879

Connexion closed by the client.

67 00:52:03.100020 192.168.1.10 -> 192.168.1.60 TCP 44317 > microsoft-ds [SYN]
Seq=0 Len=0 MSS=1460 TSV=31819739 TSER=0 WS=5

Connexion started again.

> 4. The system freezes.

68 00:52:03.100267 192.168.1.60 -> 192.168.1.10 TCP microsoft-ds > 44317
[SYN,ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 TSV=53387880 TSER=31819739 WS=1
69 00:52:07.098344 192.168.1.60 -> 192.168.1.10 TCP microsoft-ds > 44317
[SYN,ACK] Seq=0 Ack=1 Win=11584 Len=0 MSS=1460 TSV=53388280 TSER=31819739 WS=1
70 00:52:13.098335 192.168.1.60 -> 192.168.1.10 TCP microsoft-ds > 44317
[SYN,ACK] Seq=0 Ack=1 Win=11584 Len=0 MSS=1460 TSV=53388880 TSER=31819739 WS=1

Client dead.

Can you send the dump of the same sequence with the e1000 ?

You could try to add 16 bytes to RX_BUF_SIZE in the r8169 driver but I do not
expect it to make a difference. Feel free to ignore this suggestion for now if
you do not have much time.

-- 
Ueimor
Comment 7 Eric Lacombe 2007-03-21 18:12:18 UTC
I have monitor from the 192.168.1.10 machine, the communication between the NAS
(192.168.1.60) and my laptop (192.168.1.8) with the e1000. I have carried out
the same sequence.

Also, the file is just bigger than 1Mo (while I used the same tcpdump command
??!), so I can't upload it on bugzilla.kernel.org. Thus, I send you this dump by
email.
Comment 8 Francois Romieu 2007-03-22 15:36:18 UTC
I have compared the two sequences with 
1 - tethereal -l -V -t a -n -r nas-ko.pcap 'ip.addr == 192.168.1.10 or ip.addr
== 192.168.1.60'
2 - tethereal -l -V -t a -n -r nas-ok.pcap 'ip.addr == 192.168.1.8 or ip.addr ==
192.168.1.60'

(nas-ok.pcap received by e-mail as bugzilla barfed on it)

Before the host dies in the capture 1, the sequence looks like:
[...]
 56 00:52:03.097519 192.168.1.60 -> 192.168.1.10 TCP [TCP segment of a
reassembled PDU]
 57 00:52:03.098768 192.168.1.60 -> 192.168.1.10 TCP [TCP segment of a
reassembled PDU]
 58 00:52:03.099267 192.168.1.60 -> 192.168.1.10 SMB Read AndX Response, FID:
0x2be2, 16384 bytes

(this is tethereal output without the verbose -V flag)

Capture 2 seems close:
[...]
 57 01:48:31.185867 192.168.1.60 -> 192.168.1.8  TCP [TCP segment of a
reassembled PDU]
 58 01:48:31.187098 192.168.1.60 -> 192.168.1.8  TCP [TCP segment of a
reassembled PDU]
 59 01:48:31.188388 192.168.1.60 -> 192.168.1.8  TCP [TCP segment of a
reassembled PDU]
 60 01:48:31.189072 192.168.1.60 -> 192.168.1.8  SMB Read AndX Response, FID:
0x1cc7, 16384 bytes

However, the detailled output of frame #58 in 1 contains:
[...]
Frame 58 (561 bytes on wire, 561 bytes captured)
    Arrival Time: Mar 20, 2007 00:52:03.099267000
    [Time delta from previous packet: 0.000499000 seconds]
[...]
Internet Protocol, Src: 192.168.1.60 (192.168.1.60), Dst: 192.168.1.10
(192.168.1.10)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
        0000 00.. = Differentiated Services Codepoint: Default (0x00)
        .... ..0. = ECN-Capable Transport (ECT): 0
        .... ...0 = ECN-CE: 0
    Total Length: 547
[...]
Transmission Control Protocol, Src Port: 445 (445), Dst Port: 35478 (35478),
Seq: 17088, Ack: 1419, Len: 495
    Source port: 445 (445)
    Destination port: 35478 (35478)
[...]
NetBIOS Session Service
    Message Type: Session message
    Length: 16443
SMB (Server Message Block Protocol)
    SMB Header
        Server Component: SMB
        [Response to: 32]
        [Time from request: 7.707806000 seconds]
        SMB Command: Read AndX (0x2e)
        NT Status: STATUS_SUCCESS (0x00000000)
[...]
        Byte Count (BCC): 16384
        File Data: 4944330300000000046954414C42000000100000004D7920...
NetBIOS Session Service
    Message Type: Session message
    Length: 35
SMB (Server Message Block Protocol)
    SMB Header
        Server Component: SMB
        [Response to: 36]
        [Time from request: 4.916822000 seconds]
        SMB Command: Trans2 (0x32)
        NT Status: STATUS_OBJECT_NAME_NOT_FOUND (0xc0000034)
[...]

Two SMB back to back ?

Assuming I understand the output of tethereal correctly, the same SMB is never
piggybacked in stream 2. In stream 2, it appears in a packet of its own as:
[...]
 65 01:48:31.189732 192.168.1.60 -> 192.168.1.8  SMB Trans2 Response,
QUERY_PATH_INFO, Error: STATUS_OBJECT_NAME_NOT_FOUND

Steve, could you give a look at the attachment 10854 [details] above and tell if it is
completely safe for the CIFS stack to receive this thing

-- 
Ueimor



Comment 9 Eric Lacombe 2007-04-29 03:59:59 UTC
I've just tried to trigger the bug with the 2.6.21.
It is still present :/
Comment 10 Natalie Protasevich 2007-07-04 23:29:44 UTC
Any update on this bug? Is it still present in the latest kernel? (2.6.22+)
Thanks.
Comment 11 Eric Lacombe 2007-07-07 04:15:35 UTC
It seems that I can't trigger it anymore on a 2.6.22-rc7-g190045d5 ;)
Natalie Protasevich, do you know the "commit" that seems to have resolved the bug ?
Comment 12 Francois Romieu 2007-07-07 04:34:21 UTC
Can you try git bisect ?

There is no important change in the r8169 driver between 2.6.21.5 and
current mainline. I'd really like to know if this bug is fixed or if
it is simply hidden by some timing change (or such).

Btw expect some changes in the r8169 driver during 2.6.23-rc1.
Comment 13 Natalie Protasevich 2007-07-07 10:10:16 UTC
It is probably most likely that it's become hidden. I can only see some TCP patches, but nothing significant. (But there is a bit of hope someone fixed something that was causing the bug - directly or indirectly. git bisect would be the best)
I would check if any of bits and pieces have changed in your test environment, maybe something in user space that now is not triggering the bug.
Comment 14 Eric Lacombe 2007-07-07 16:26:07 UTC
yep, I'll do a git bisect as soon as possible and I'll recheck a 2.6.21 to validate my test environment again.
Comment 15 Eric Lacombe 2007-07-10 17:26:05 UTC
I didn't finish yet and I just begin to git-bisect but it takes lot of time and I have not a lot for now.

However, I checked a 2.6.21.5 and I triggered the bug.
I also checked a 2.6.22-rc1, and cannot be able to trigger it.
But maybe it is hidden as you said.

I will give you more results next time.
Comment 16 Eric Lacombe 2007-07-17 16:55:58 UTC
I have some partial results with git-bisect. I post them and hope that it would sufficiently restrict the set of commits (git-bisect say that there are 21 revisions left to test after the new one I don't test yet).

Note that for the commit 5cd471551 there is something strange: after I compiled the kernel and booted on it I see with 'uname -a': Linux 2.6.21-rc7 ??! (<- look at the 2 first tests)

Anyway via git bisect-visualize it seems there is a lot of commits from CIFS (and some from KVM but... ;) However I don't take time to well understand the figure shown by gitk, so I surely missed something.

----------

good -> I set "good" when I could trigger the bug.
bad -> I set "bad" when I couldn't

git-bisect start
# bad: [39403865d2e4590802553370a56c9ab93131e4ee] Linux 2.6.22-rc1
git-bisect bad 39403865d2e4590802553370a56c9ab93131e4ee
# good: [de46c33745f5e2ad594c72f2cf5f490861b16ce1] Linux 2.6.21
git-bisect good de46c33745f5e2ad594c72f2cf5f490861b16ce1
# good: [ea62ccd00fd0b6720b033adfc9984f31130ce195] Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
git-bisect good ea62ccd00fd0b6720b033adfc9984f31130ce195
# bad: [215d06780d13fd7de629b02b61b7b7bf88ce5039] Fix sunrpc warning noise
git-bisect bad 215d06780d13fd7de629b02b61b7b7bf88ce5039
# bad: [e7d709c096487078652a1384d7a2d0e4459e18b6] is_power_of_2 in fat
git-bisect bad e7d709c096487078652a1384d7a2d0e4459e18b6
# bad: [aeecf3142d82414d511135cc85f86caddfb58338] Convert h8/300 to generic timekeeping
git-bisect bad aeecf3142d82414d511135cc85f86caddfb58338
# bad: [6de410c2b0cc055ae9ee640c84331f6a70878d9b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect bad 6de410c2b0cc055ae9ee640c84331f6a70878d9b
# good: [5cd47155155a32e5b944ac9fc3f3dc578e429aa0] Merge branch 'ixp4xx' into devel
git-bisect good 5cd47155155a32e5b944ac9fc3f3dc578e429aa0
# good: [5008fdf5b6a31240da060c0867d8f16f08ce2384] KVM: Use kernel-standard types
git-bisect good 5008fdf5b6a31240da060c0867d8f16f08ce2384
Comment 17 Francois Romieu 2007-07-18 15:02:39 UTC
tuxiko@free.fr  2007-07-17 16:55 :
[...]
> I have some partial results with git-bisect. I post them and hope that it
> would
> sufficiently restrict the set of commits (git-bisect say that there are 21
> revisions left to test after the new one I don't test yet).

It is very frustrating.

You should really git reset --hard 2ff81f70b56dc1cdd3bf2f08414608069db6ef1a
and try it.

It will tell if the trigger is in KVM or in CIFS.

> Note that for the commit 5cd471551 there is something strange: after I
> compiled
> the kernel and booted on it I see with 'uname -a': Linux 2.6.21-rc7 ??! (<-
> look at the 2 first tests)

Sorry to be dense but why do you think it is strange ?

5cd471551 produces a 2.6.21-rc7 Makefile here.
Comment 18 Eric Lacombe 2007-07-18 19:09:20 UTC
Sorry I forget to precise in my previous message that I would give you more precise results when I'll have time to go on. I expected that my first results would give you some ideas about the problem waiting for the new tests.

Now I come with more precise results (it remains 4 not tested commits of which one is clearly not relevant ;)

Well, it's in KVM (I forgot that I have enabled it since the beginning so I thought I missed a test).

(About the commit 5cd471551, what makes me puzzled is that 2.6.21-rc7 is before my first good-mark 2.6.21. But I suppose that it stems from the merging of a different branch? anyway I will appreciate an insight about that).

------------
git-bisect start
# bad: [39403865d2e4590802553370a56c9ab93131e4ee] Linux 2.6.22-rc1
git-bisect bad 39403865d2e4590802553370a56c9ab93131e4ee
# good: [de46c33745f5e2ad594c72f2cf5f490861b16ce1] Linux 2.6.21
git-bisect good de46c33745f5e2ad594c72f2cf5f490861b16ce1
# good: [ea62ccd00fd0b6720b033adfc9984f31130ce195] Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
git-bisect good ea62ccd00fd0b6720b033adfc9984f31130ce195
# bad: [215d06780d13fd7de629b02b61b7b7bf88ce5039] Fix sunrpc warning noise
git-bisect bad 215d06780d13fd7de629b02b61b7b7bf88ce5039
# bad: [e7d709c096487078652a1384d7a2d0e4459e18b6] is_power_of_2 in fat
git-bisect bad e7d709c096487078652a1384d7a2d0e4459e18b6
# bad: [aeecf3142d82414d511135cc85f86caddfb58338] Convert h8/300 to generic timekeeping
git-bisect bad aeecf3142d82414d511135cc85f86caddfb58338
# bad: [6de410c2b0cc055ae9ee640c84331f6a70878d9b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect bad 6de410c2b0cc055ae9ee640c84331f6a70878d9b
# good: [5cd47155155a32e5b944ac9fc3f3dc578e429aa0] Merge branch 'ixp4xx' into devel
git-bisect good 5cd47155155a32e5b944ac9fc3f3dc578e429aa0
# good: [5008fdf5b6a31240da060c0867d8f16f08ce2384] KVM: Use kernel-standard types
git-bisect good 5008fdf5b6a31240da060c0867d8f16f08ce2384
# bad: [94dfbdb3894eda2f673b70e20da2743c4a8d3968] KVM: SVM: Only save/restore MSRs when needed
git-bisect bad 94dfbdb3894eda2f673b70e20da2743c4a8d3968
# good: [4d56c8a787aefb2e3fc4ac4be966db96c14d1ad8] KVM: VMX: Only save/restore MSR_K6_STAR if necessary
git-bisect good 4d56c8a787aefb2e3fc4ac4be966db96c14d1ad8
# good: [7807fa6ca5af2e5660a0eb3cd90276ca0c5bdfc8] KVM: Lazy FPU support for SVM
git-bisect good 7807fa6ca5af2e5660a0eb3cd90276ca0c5bdfc8
-------------

I plan to finish the last tests tomorrow but if you don't need more say it.
Comment 19 Eric Lacombe 2007-07-19 02:19:25 UTC
I forgot to post the 4 remaining commits:

2807696c3791d6dd1dcf20f022eaa2dc7615bc5d
2ab455ccceb07945368709ba852e49f4c3119331
25c4c2762e31a75403eca0dd59f2cab85e3a2532
(e0e5127d06957e76da3906b7a58d5d2665e81f59)
Comment 20 Eric Lacombe 2007-07-20 05:08:26 UTC
After finishing my test, I realized I made at least an error on a test, cause the git-bisect targeted commit was this one:

# bad: [94dfbdb3894eda2f673b70e20da2743c4a8d3968] KVM: SVM: Only save/restore
MSRs when needed

but it is really unrelated to the problem.
So I rechecked this one and I was able to trigger the bug.
So now it remains some commits to test but there are in a bunch of CIFS commits, what seems more suitable.

However I can only guarantee that I didn't make an error for the commits I tagged good, cause I was able to trigger the bug.
(- It seems the bug easily triggers itself when I do the right things, but maybe I act too quickly on the commit I was mistaken;
- Remember that to made a sole test I need to wait 15min for the NAS to sleep and then I proceed to trigger the bug so I usually did a single test per commit.)

I proceed now with git bisect on those CIFS commits (hoping I only do a single error).
I will give you the results as soon as possible.

the updated "git bisect log":
-----------
git-bisect start
# bad: [39403865d2e4590802553370a56c9ab93131e4ee] Linux 2.6.22-rc1
git-bisect bad 39403865d2e4590802553370a56c9ab93131e4ee
# good: [de46c33745f5e2ad594c72f2cf5f490861b16ce1] Linux 2.6.21
git-bisect good de46c33745f5e2ad594c72f2cf5f490861b16ce1
# good: [ea62ccd00fd0b6720b033adfc9984f31130ce195] Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
git-bisect good ea62ccd00fd0b6720b033adfc9984f31130ce195
# bad: [215d06780d13fd7de629b02b61b7b7bf88ce5039] Fix sunrpc warning noise
git-bisect bad 215d06780d13fd7de629b02b61b7b7bf88ce5039
# bad: [e7d709c096487078652a1384d7a2d0e4459e18b6] is_power_of_2 in fat
git-bisect bad e7d709c096487078652a1384d7a2d0e4459e18b6
# bad: [aeecf3142d82414d511135cc85f86caddfb58338] Convert h8/300 to generic timekeeping
git-bisect bad aeecf3142d82414d511135cc85f86caddfb58338
# bad: [6de410c2b0cc055ae9ee640c84331f6a70878d9b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect bad 6de410c2b0cc055ae9ee640c84331f6a70878d9b
# good: [5cd47155155a32e5b944ac9fc3f3dc578e429aa0] Merge branch 'ixp4xx' into devel
git-bisect good 5cd47155155a32e5b944ac9fc3f3dc578e429aa0
# good: [5008fdf5b6a31240da060c0867d8f16f08ce2384] KVM: Use kernel-standard types
git-bisect good 5008fdf5b6a31240da060c0867d8f16f08ce2384
# good: [94dfbdb3894eda2f673b70e20da2743c4a8d3968] KVM: SVM: Only save/restore MSRs when needed
git-bisect good 94dfbdb3894eda2f673b70e20da2743c4a8d3968
-------
Comment 21 Eric Lacombe 2007-07-20 09:10:21 UTC
I think we have it! ;)

It is this commit that make the bug vanish:

-------------
aaf737adb6937339494d5a7111f0433cd9676db8 is first bad commit
commit aaf737adb6937339494d5a7111f0433cd9676db8
Author: Igor Mammedov <qwerty0987654321@mail.ru>
Date:   Tue Apr 3 19:16:43 2007 +0000

    [CIFS] Switch cifsd to kthread_run from kernel_thread

    cifsd was the only cifs thread that had not been switched to the newer
    kthread interface

    Signed-off-by: Igor Mammedov <niallain at gmail.com>
    Signed-off-by: Wilhelm Meier <wilhelm.meier@fh-kl.de>
    Signed-off-by: Steve French <sfrench@us.ibm.com>

:040000 040000 b367eb72d44b719420a26661da434ed51f900297 3e64cb1424d3db57a640054e7bad21fe3a817726 M      fs
--------------

And here the complete git-bisect log:

Right after the commit 94dfbdb3894, git-bisect give me a commit inside a bunch of CIFS commits that "oopsed" during the sequence that triggers the bug but didn't crash the system.

Therefore I "git-reset --hard" to aaf737adb6 that seems a good target to test to me. I did 3 tests and was not able to trigger the bug. Then, I tested the last commit c33f8d32746 and was able to trigger the bug at the first test. Hence the outcome.

---------
git-bisect start
# bad: [39403865d2e4590802553370a56c9ab93131e4ee] Linux 2.6.22-rc1
git-bisect bad 39403865d2e4590802553370a56c9ab93131e4ee
# good: [de46c33745f5e2ad594c72f2cf5f490861b16ce1] Linux 2.6.21
git-bisect good de46c33745f5e2ad594c72f2cf5f490861b16ce1
# good: [ea62ccd00fd0b6720b033adfc9984f31130ce195] Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
git-bisect good ea62ccd00fd0b6720b033adfc9984f31130ce195
# bad: [215d06780d13fd7de629b02b61b7b7bf88ce5039] Fix sunrpc warning noise
git-bisect bad 215d06780d13fd7de629b02b61b7b7bf88ce5039
# bad: [e7d709c096487078652a1384d7a2d0e4459e18b6] is_power_of_2 in fat
git-bisect bad e7d709c096487078652a1384d7a2d0e4459e18b6
# bad: [aeecf3142d82414d511135cc85f86caddfb58338] Convert h8/300 to generic timekeeping
git-bisect bad aeecf3142d82414d511135cc85f86caddfb58338
# bad: [6de410c2b0cc055ae9ee640c84331f6a70878d9b] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm
git-bisect bad 6de410c2b0cc055ae9ee640c84331f6a70878d9b
# good: [5cd47155155a32e5b944ac9fc3f3dc578e429aa0] Merge branch 'ixp4xx' into devel
git-bisect good 5cd47155155a32e5b944ac9fc3f3dc578e429aa0
# good: [5008fdf5b6a31240da060c0867d8f16f08ce2384] KVM: Use kernel-standard types
git-bisect good 5008fdf5b6a31240da060c0867d8f16f08ce2384
# good: [94dfbdb3894eda2f673b70e20da2743c4a8d3968] KVM: SVM: Only save/restore MSRs when needed
git-bisect good 94dfbdb3894eda2f673b70e20da2743c4a8d3968
# bad: [aaf737adb6937339494d5a7111f0433cd9676db8] [CIFS] Switch cifsd to kthread_run from kernel_thread
git-bisect bad aaf737adb6937339494d5a7111f0433cd9676db8
# good: [c33f8d32746db12ba353b0a05b25f7893a0ac344] [CIFS] Remove unnecessary checks
git-bisect good c33f8d32746db12ba353b0a05b25f7893a0ac344
----------------
Comment 22 Natalie Protasevich 2007-07-20 20:02:29 UTC
How interesting, speaking of indirect fixes... :) Great job bisecting Eric. 
Comment 23 Eric Lacombe 2008-06-10 05:34:28 UTC
The bug reappeared with the 2.6.25.4 (maybe before).

I don't have time for the moment to do some new tests (+ I don't have access for the moment to the NAS).
Nonetheless, I hope that all the information I already provided (git bisect, tcpdump, etc.) will be sufficient to pinpoint the problem.

Best regards.
Comment 24 Eric Lacombe 2008-09-10 07:11:23 UTC
It seems that the bug disappeared... It didn't trigger with new kernel releases I've installed since then.