Bug 7013 - Accessing multiple cifs mounts at once locks kernel on an SMP machine
Summary: Accessing multiple cifs mounts at once locks kernel on an SMP machine
Status: REJECTED DUPLICATE of bug 7903
Alias: None
Product: File System
Classification: Unclassified
Component: CIFS
Hardware: i386 Linux
Importance: P2 high
Assignee: Steve French
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-08-16 08:12 UTC by Jussi Judin
Modified: 2007-02-21 13:09 UTC
CC List: 0 users

See Also:
Kernel Version: 2.6.17-5
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Jussi Judin 2006-08-16 08:12:42 UTC
Distribution:

Debian testing.

Hardware Environment:

Multiprocessor machine with two hyperthreaded Xeons (= 4 "processors"):

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Xeon(TM) CPU 2.66GHz
stepping        : 7
cpu MHz         : 2666.330
cache size      : 512 KB

00:00.0 Host bridge: Intel Corporation E7501 Memory Controller Hub (rev 01)
00:00.1 Class ff00: Intel Corporation E7500/E7501 Host RASUM Controller (rev 01)
00:02.0 PCI bridge: Intel Corporation E7500/E7501 Hub Interface B PCI-to-PCI Bridge (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #1) (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #2) (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #3) (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 42)
00:1f.0 ISA bridge: Intel Corporation 82801CA LPC Interface Controller (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801CA Ultra ATA Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 02)
01:1c.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
01:1d.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
01:1e.0 PIC: Intel Corporation 82870P2 P64H2 I/OxAPIC (rev 04)
01:1f.0 PCI bridge: Intel Corporation 82870P2 P64H2 Hub PCI Bridge (rev 04)
02:03.0 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
02:03.1 Ethernet controller: Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) (rev 01)
03:03.0 RAID bus controller: Adaptec (formerly DPT) SmartRAID V Controller (rev 01)
04:01.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)


Software Environment:

clanmax@shire:/mnt/sda$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,java,fortran,objc,obj-c++,ada,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu
--enable-libstdcxx-debug --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.1-1.4.2.0/jre --enable-mpfr
--with-tune=i686 --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.2 20060613 (prerelease) (Debian 4.1.1-5)


Problem Description:

The computer locks up many times a day. It seems to happen when heavily reading
a large number (hundreds) of small files (most under 2 kilobytes) on a CIFS
mount: the remote computer then locks up at random, so that all processes become
unusable, although it still answers ping.

The machines that this computer mounts shares from are running Windows XP, and
it has several mounts active at once.

Once, before the machine locked up, the following message appeared in dmesg:

BUG: soft lockup detected on CPU#2!
 <c012da2b> softlockup_tick+0x9b/0xa8  <c011dcba> update_process_times+0x38/0x5d
 <c010c38b> smp_apic_timer_interrupt+0x51/0x5a  <c014e820> generic_fillattr+0x68/0xa1
 <c010312c> apic_timer_interrupt+0x1c/0x24  <c014e820> generic_fillattr+0x68/0xa1
 <f9cf2f38> cifs_getattr+0x1f/0x27 [cifs]  <c014e9a2> vfs_fstat+0x25/0x36
 <c014f000> sys_fstat64+0x10/0x26  <c01026d3> syscall_call+0x7/0xb
BUG: soft lockup detected on CPU#3!
 <c012da2b> softlockup_tick+0x9b/0xa8  <c011dcba> update_process_times+0x38/0x5d
 <c010c38b> smp_apic_timer_interrupt+0x51/0x5a  <c014e820> generic_fillattr+0x68/0xa1
 <c010312c> apic_timer_interrupt+0x1c/0x24  <c014e820> generic_fillattr+0x68/0xa1
 <f9cf2f38> cifs_getattr+0x1f/0x27 [cifs]  <c014e9a2> vfs_fstat+0x25/0x36
 <c014f000> sys_fstat64+0x10/0x26  <c01026d3> syscall_call+0x7/0xb

Also, after this message appeared, the MySQL server stopped working, although
other programs still ran, and the machine had to be rebooted.

Steps to reproduce:

Run multiple programs that access small files over multiple CIFS mounts at
once, and watch the machine lock up at random during the day.
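
The steps above could be approximated with a small load generator. This is only
a sketch under assumptions, not the workload that was actually running: the
function names are hypothetical, threads stand in for the separate programs, and
the mount paths in the usage note are placeholders. Each worker walks one mount
point and opens, fstat()s and reads every file, since the traces go through
sys_fstat64.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def stat_small_files(root, passes=1):
    """Open, fstat and read every file under root; return how many were visited."""
    count = 0
    for _ in range(passes):
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                with open(path, "rb") as fh:
                    os.fstat(fh.fileno())  # same syscall family as in the traces
                    fh.read()
                count += 1
    return count


def hammer_mounts(mount_points, workers_per_mount=2, passes=1):
    """Run several concurrent readers against each mount point.

    Returns one visited-file count per worker, in submission order.
    """
    jobs = [m for m in mount_points for _ in range(workers_per_mount)]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        return list(pool.map(lambda m: stat_small_files(m, passes), jobs))
```

Usage would look like hammer_mounts(["/mnt/share1", "/mnt/share2"], passes=1000),
where both paths are hypothetical CIFS mount points. Because the lockup appears
at random, a single run not hanging says little either way.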
Comment 1 Jussi Judin 2006-08-16 08:20:23 UTC
The MySQL problem was just due to the disk filling up with its logs. It happened
to coincide with the "soft lockup detected" message, so ignore that =)
Comment 2 Brian Wang 2007-01-30 18:09:29 UTC
I believe this is the same lockup problem as bug 7903. We figured it out: it
only happens in an SMP/32-bit environment. The problem is that i_size_write is
called without holding the inode mutex.
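
For background, here is a toy model of why an unserialized i_size_write can hang
readers. On 32-bit SMP kernels, i_size is a 64-bit field guarded by a sequence
counter: i_size_write bumps the counter before and after the store, and
i_size_read (which generic_fillattr calls) retries until it sees an even,
unchanged counter. The increments are not atomic, so two overlapping writers can
lose one and leave the counter odd, after which every reader retries forever.
That would be consistent with the traces above, which show CPUs stuck in
generic_fillattr. The Python below is only a simulation with a hand-written
interleaving. The class and function names are hypothetical, and the retry limit
exists only so the example terminates.

```python
# Toy model (not kernel code) of a value guarded by a sequence counter.

class SeqProtectedSize:
    def __init__(self):
        self.seq = 0   # even: stable, odd: a write is in progress
        self.size = 0

    def read(self, max_retries=1000):
        """Retry until a stable snapshot is seen; give up after max_retries.

        The real reader has no retry limit, so it would spin indefinitely.
        """
        for _ in range(max_retries):
            start = self.seq
            value = self.size
            if start % 2 == 0 and start == self.seq:
                return value
        return None  # stands in for the soft lockup


def serialized_writes(obj, sizes):
    """Writers run one at a time, as they would while holding a lock."""
    for new_size in sizes:
        obj.seq += 1          # begin: counter becomes odd
        obj.size = new_size
        obj.seq += 1          # end: counter becomes even again


def racing_writes(obj, size_a, size_b):
    """Two unserialized writers whose 'begin' increments overlap.

    Each increment is a load followed by a store; the interleaving is
    spelled out by hand so that the outcome is deterministic.
    """
    a_loaded = obj.seq        # writer A loads the counter
    b_loaded = obj.seq        # writer B loads the same value
    obj.seq = a_loaded + 1    # A stores its increment
    obj.seq = b_loaded + 1    # B stores the same value: A's increment is lost
    obj.size = size_a
    obj.size = size_b
    obj.seq += 1              # A's end
    obj.seq += 1              # B's end: counter is now odd with no writer active
```

After serialized_writes the counter is even and read() returns the last size.
After racing_writes the counter stays odd, so read() exhausts its retries and
returns None, which in the kernel would be a reader that never gets out of its
retry loop.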
Comment 3 Steve French 2007-02-21 13:09:52 UTC
Brian,
I agree - looks like you are correct.

I wonder if the aio/vectored-io/mm changes altered the default locking for that
path.

*** This bug has been marked as a duplicate of 7903 ***
