Bug 31662

Summary: CIFS client performance regression >=2.6.36
Product: File System Reporter: Till Schäfer (till2.schaefer)
Component: CIFSAssignee: Jeff Layton (jlayton)
Status: CLOSED CODE_FIX    
Severity: normal CC: florian, jlayton, kernel, sfrench
Priority: P1    
Hardware: All   
OS: Linux   
URL: https://bugs.gentoo.org/show_bug.cgi?id=357463
Kernel Version: 2.6.38 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: patch -- cifs: set ra_pages in BDI

Description Till Schäfer 2011-03-22 15:45:24 UTC
The CIFS Performance drops to 4-12 MiB/s (30-50 MiB/s in
gentoo-sources-2.6.34-r12) when using a Kernel >= 2.6.36 (36-r5, 37-r1 and 38
tested). This happens for gentoo-sources as for vanilla-sources. Some people
report 60 MiB/s if they use smbclient/get (non kernel implementation) and only
8 MiB/s with mount.cifs. 
(see also the forum discussion: https://forums.gentoo.org/viewtopic.php?p=6600761#6600761 and the gentoo bug report: https://bugs.gentoo.org/show_bug.cgi?id=357463)

Reproducible: Always

Steps to Reproduce:
1. install kernel >= 2.6.36
2. use mount.cifs
3. dd if=/samba/share of=/dev/null bs=1M count=1k
Actual Results:  
slow speed (4-12 MB/s)

Expected Results:  
higher speed (30-50 MB/s)
Comment 1 Till Schäfer 2011-03-22 15:54:50 UTC
In addition: 

- write speed is ok with 40MB/s. Only read is very slow. 
- a windows 7 client did 50MB/s for read on the same server.
Comment 2 Jeff Layton 2011-03-22 15:57:51 UTC
I looked over the other bugs but I don't see any numbers for the earlier kernels. 

You mentioned that this is a regression, in what kernel version do you first see the performance drop, and how large is that drop?
Comment 3 Till Schäfer 2011-03-22 16:06:23 UTC
In gentoo-sources-2.6.34-r12 i got 30-50MB/s.
In gentoo-sources-2.6.36-r5 it got about 4-12MB/s.

I did not test the 2.6.35 kernel yet. if it is needed -> tell me and i will try. 

so the drop is about 30MB/s or 80%.
Comment 4 Jeff Layton 2011-03-22 16:25:30 UTC
Please do try 2.6.35. If you can bisect this down to a narrow range of kernel versions then we may be able to determine what caused the issue.
Comment 5 Till Schäfer 2011-03-22 17:03:31 UTC
i did a complete new bench:
write dd if=/dev/zero of=/samba/share/testfile bs=1M count=1k
read: dd if=/samba/share/somefile of=/dev/null bs=1M with filesize 800 MB

gentoo-sources:
2.6.36-r5 -> 9MB/s read; 35MB/s write
2.6.35-r9 -> 22MB/s read; 35MB/s write
2.6.34-r12 -> 24MB/s read; 35MB/s write

So the speeds differ a bit from last time, but the tendency is the same.
Comment 6 Jeff Layton 2011-03-22 17:09:19 UTC
What do the -r numbers represent? Is there any way you could bisect this down to a particular mainline commit?
Comment 7 Jeff Layton 2011-03-22 17:11:32 UTC
Also, do you have CONFIG_CIFS_FSCACHE enabled? If so, could you turn it off an re-test?
Comment 8 Till Schäfer 2011-03-22 17:15:01 UTC
Symbol: CIFS_FSCACHE [=n]


I will test the vanilla sources (without gentoo patchset). They have the versioning of the kernel.org releases. it will take a while because i have to compile them.
Comment 9 Till Schäfer 2011-03-22 17:42:11 UTC
vanilla sources (kernel.org):
2.6.36.1 -> 9MB/s read; 36MB/s write
2.6.35.11 -> 22MB/s read; 35MB/s write
Comment 10 Jeff Layton 2011-03-23 19:22:16 UTC
Sorry, thought I had responded to this yesterday. Ok, that's good to know. I took a look through the cifs changelogs from that time and don't see any obvious patches that would affect this.

Perhaps what would be best at this point is to get some wire captures that compare the transfer between 2.6.36.1 and 2.6.35.11. Instructions on how to do that are here:

https://wiki.samba.org/index.php/LinuxCIFS_troubleshooting#Wire_Captures

If we can see any differences on the wire then that may help point out the problem.
Comment 11 Steve French 2011-03-23 20:22:36 UTC
Would also be helpful to look at the counters (how many of each type of SMB/CIFS network operation is sent):

cat /proc/fs/cifs/Stats

for each case (2.6.35 vs 2.6.36) to see if the number of writes, or number of some unexpected operation (open/close for example) or bytes written varies between them.
Comment 12 Till Schäfer 2011-03-24 18:52:52 UTC
tcpdumps:

http://www.blue-oranges.com/upload/data/tcpdump_35.11.pcap
http://www.blue-oranges.com/upload/data/tcpdump_36.1.pcap

command was: tcpdump -i eth0 -s0 -w /home/till/temp/tcpdum_35.11.pcap host dfs and port 445

I used a smaler file for testing because the dump would be very large otherwise. 

(The dfs is no dfs anymore. It is a samba server without distributed file system.)

cat /proc/fs/cifs/Stats:

35.11 before dd:

Resources in use
CIFS Session: 1
Share (unique mount targets): 2
SMB Request/Response Buffer: 1 Pool size: 5
SMB Small Req/Resp Buffer: 1 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 11 maximum at one time: 2

1) \\dfs\dfs
SMBs: 5 Oplock Breaks: 0
Reads:  0 Bytes: 0
Writes: 0 Bytes: 0
Flushes: 0
Locks: 0 HardLinks: 0 Symlinks: 0
Opens: 0 Closes: 0 Deletes: 0
Posix Opens: 0 Posix Mkdirs: 0
Mkdirs: 0 Rmdirs: 0
Renames: 0 T2 Renames 0
FindFirst: 0 FNext 0 FClose 0


35.11 after dd:

Resources in use
CIFS Session: 1
Share (unique mount targets): 2
SMB Request/Response Buffer: 1 Pool size: 5
SMB Small Req/Resp Buffer: 1 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 136 maximum at one time: 2

1) \\dfs\dfs
SMBs: 934 Oplock Breaks: 0
Reads:  921 Bytes: 15083650
Writes: 0 Bytes: 0
Flushes: 0
Locks: 0 HardLinks: 0 Symlinks: 0
Opens: 0 Closes: 1 Deletes: 0
Posix Opens: 1 Posix Mkdirs: 0
Mkdirs: 0 Rmdirs: 0
Renames: 0 T2 Renames 0
FindFirst: 0 FNext 0 FClose 0



36.1 before dd:

Resources in use
CIFS Session: 1
Share (unique mount targets): 2
SMB Request/Response Buffer: 1 Pool size: 5
SMB Small Req/Resp Buffer: 1 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 11 maximum at one time: 2

1) \\dfs\dfs
SMBs: 5 Oplock Breaks: 0
Reads:  0 Bytes: 0
Writes: 0 Bytes: 0
Flushes: 0
Locks: 0 HardLinks: 0 Symlinks: 0
Opens: 0 Closes: 0 Deletes: 0
Posix Opens: 0 Posix Mkdirs: 0
Mkdirs: 0 Rmdirs: 0
Renames: 0 T2 Renames 0
FindFirst: 0 FNext 0 FClose 0


36.1 after dd

Resources in use
CIFS Session: 1
Share (unique mount targets): 2
SMB Request/Response Buffer: 1 Pool size: 5
SMB Small Req/Resp Buffer: 1 Pool size: 30
Operations (MIDs): 0

0 session 0 share reconnects
Total vfs operations: 7387 maximum at one time: 2

1) \\dfs\dfs
SMBs: 3697 Oplock Breaks: 0
Reads:  3684 Bytes: 0
Writes: 0 Bytes: 0
Flushes: 0
Locks: 0 HardLinks: 0 Symlinks: 0
Opens: 0 Closes: 1 Deletes: 0
Posix Opens: 1 Posix Mkdirs: 0
Mkdirs: 0 Rmdirs: 0
Renames: 0 T2 Renames 0
FindFirst: 0 FNext 0 FClose 0
Comment 13 Jeff Layton 2011-03-24 19:03:52 UTC
Thanks. It looks like the more recent kernels are doing page-sized reads, whereas the older ones are doing "normal" sized reads (16k). That's probably the reason for the slowdown.

The question is -- what changed to cause it to do page-sized reads in this case? I'll have to look and see if I can reproduce this.

It just so happens that I'm working on asynchronous write support for CIFS at the moment. Once I was done with that I was going to start work on async reads too, so I may be doing some work in this area soon anyway...
Comment 14 Till Schäfer 2011-03-24 19:12:06 UTC
Thx too. I will keep the kernel versions installed on my system. If you can't reproduce the behaviour, i will stand by.
Comment 15 Jeff Layton 2011-03-25 18:08:57 UTC
Ok, looks like something changed at the VFS level. More recent kernels have generic_file_aio_read calling readpage instead of readpages.

From a 2.6.38-ish kernel with a dump_stack() in cifs_readpage:

[778531.763280] Pid: 25290, comm: dd Not tainted 2.6.38-1.fc15.x86_64.debug #1
[778531.770319] Call Trace:
[778531.772618]  [<ffffffffa016c140>] ? cifs_readpage+0xac/0x105 [cifs]
[778531.777579]  [<ffffffff810e9b22>] ? generic_file_aio_read+0x47f/0x5fb
[778531.779040]  [<ffffffff81133dae>] ? do_sync_read+0xbf/0xff
[778531.780289]  [<ffffffff8120419e>] ? security_file_permission+0x2e/0x33
[778531.783163]  [<ffffffff811340e5>] ? rw_verify_area+0xb6/0xd3
[778531.788766]  [<ffffffff81134472>] ? vfs_read+0xac/0xf3
[778531.793847]  [<ffffffff81135823>] ? fget_light+0x3a/0x83
[778531.799150]  [<ffffffff81134506>] ? sys_read+0x4d/0x74
[778531.802141]  [<ffffffff81009b82>] ? system_call_fastpath+0x16/0x1b

...and a 2.6.32-ish kernel (RHEL6.1 prerelease) with one in cifs_readpages:

Pid: 3761, comm: dd Not tainted 2.6.32-125.el6.x86_64.debug #1
Call Trace:
 [<ffffffffa037e40f>] ? cifs_readpages+0x11f/0x640 [cifs]
 [<ffffffff8116a6ca>] ? alloc_pages_current+0xaa/0x110
 [<ffffffff81138310>] ? __do_page_cache_readahead+0x1d0/0x260
 [<ffffffff811381ee>] ? __do_page_cache_readahead+0xae/0x260
 [<ffffffff8103624c>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff811383c1>] ? ra_submit+0x21/0x30
 [<ffffffff811385de>] ? ondemand_readahead+0x15e/0x2c0
 [<ffffffff81138843>] ? page_cache_sync_readahead+0x33/0x50
 [<ffffffff81123478>] ? generic_file_aio_read+0x598/0x740
 [<ffffffff8103624c>] ? kvm_clock_read+0x1c/0x20
 [<ffffffff8118c7da>] ? do_sync_read+0xfa/0x140
 [<ffffffff810930c0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81012779>] ? sched_clock+0x9/0x10
 [<ffffffff8109a695>] ? sched_clock_local+0x25/0x90
 [<ffffffff8122ef5b>] ? selinux_file_permission+0xfb/0x150
 [<ffffffff81223676>] ? security_file_permission+0x16/0x20
 [<ffffffff8118d205>] ? vfs_read+0xb5/0x1a0
 [<ffffffff810e2b02>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8118d341>] ? sys_read+0x51/0x90
 [<ffffffff8100b172>] ? system_call_fastpath+0x16/0x1b


...so something has changed to cause do_generic_file_read() to start using ->readpage instead of ->readpages. We'll probably have to do some brute-force debugging to figure out what that is.
Comment 16 Jeff Layton 2011-03-25 18:48:17 UTC
On a lark, I decided to try backing out commit 522440ed and it looks like it solved the problem. To be clear, that's this one:

commit 522440ed55d2cc8855ea5f82bc067e0483b2e1be
Author: Jeff Layton <jlayton@redhat.com>
Date:   Wed Sep 29 09:49:54 2010 -0400

    cifs: set backing_dev_info on new S_ISREG inodes
    

Till, could you try backing that patch out of your kernel and see whether it helps? Once we figure out whether it does, I'd still like to understand why but that should help point us in the right direction.
Comment 17 Jeff Layton 2011-03-25 19:10:23 UTC
I see why now...

inode_init_always does this:

    mapping->backing_dev_info = &default_backing_dev_info;

...but, the above commit overrides that with the BDI that hangs off the superblock. That BDI conspicuously does not set ra_pages. We probably *do* want to make sure that we use the superblock's BDI, but we probably also want to set ra_pages in it to a sane value. There may be other fields we've missed as well. I'll see if I can come up with a patch...
Comment 18 Jeff Layton 2011-03-25 19:24:56 UTC
Created attachment 51962 [details]
patch -- cifs: set ra_pages in BDI

This patch seems to fix it for me. Till, could you test it? This is stable material too I think...
Comment 19 Till Schäfer 2011-03-25 19:38:15 UTC
(In reply to comment #16)
> On a lark, I decided to try backing out commit 522440ed and it looks like it
> solved the problem. To be clear, that's this one:
> 
> commit 522440ed55d2cc8855ea5f82bc067e0483b2e1be
> Author: Jeff Layton <jlayton@redhat.com>
> Date:   Wed Sep 29 09:49:54 2010 -0400
> 
>     cifs: set backing_dev_info on new S_ISREG inodes
> 
> 
> Till, could you try backing that patch out of your kernel and see whether it
> helps? Once we figure out whether it does, I'd still like to understand why
> but
> that should help point us in the right direction.

ok i just tested to remove commit 522440ed55d2cc8855ea5f82bc067e0483b2e1be and it seems to solve the problem. 
 
-> 24MB/s read

i will try the patch in short
Comment 20 Till Schäfer 2011-03-25 20:06:53 UTC
the patch fixed it also. i tried with 2.6.38. THX

-> 23 MB/s
Comment 21 Jeff Layton 2011-03-25 20:27:24 UTC
Thanks for testing it. Patch sent upstream. Hopefully it will go in soon.
Comment 22 Steve French 2011-03-28 22:33:42 UTC
Good work narrowing this down. merged into cifs-2.6.git
Comment 23 Florian Mickler 2011-04-19 08:45:46 UTC
A patch referencing this bug report has been merged in v2.6.39-rc4:

commit 2b6c26a0a62cc0bab0ad487533d5581d7c293fef
Author: Jeff Layton <jlayton@redhat.com>
Date:   Fri Mar 25 16:25:57 2011 -0400

    cifs: set ra_pages in backing_dev_info