Bug 3395 - crash on nfs-server
Summary: crash on nfs-server
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 high
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-09-14 07:45 UTC by Jan-Frode Myklebust
Modified: 2007-01-21 04:13 UTC

See Also:
Kernel Version: 2.6.8.1
Subsystem:
Regression: ---
Bisected commit-id:


Description Jan-Frode Myklebust 2004-09-14 07:45:59 UTC
Distribution: Whitebox linux 3.0
Hardware Environment: IBM xSeries 330, dual-CPU Pentium III, 2 GB memory, qla2300
FC-HBA.
Software Environment: A mostly plain Whitebox 3.0 install, plus the 2.6 kernel and
XFS bits. It works as the NFS server for the home directories of about 50 active
clients, mostly Linux but also Solaris.
Problem Description:

I got this oops on my NFS-server. I've had lots of problems with the 2.6 kernel
this summer, but after going back to 8K stacks it's been stable for about a
month. Unfortunately it crashed again today :(

Unable to handle kernel paging request at virtual address ffffff83
 printing eip:
c013e1e2
*pde = 00003067
*pte = 00000000
Oops: 0002 [#1]
SMP
Modules linked in: ipv6 lp ipt_REJECT ipt_state ip_conntrack iptable_filter
ip_tables ohci_hcd dm_mod
CPU:    1
EIP:    0060:[<c013e1e2>]    Not tainted
EFLAGS: 00010016   (2.6.8.1)
EIP is at free_block+0x52/0xe0
eax: ffffff7f   ebx: cea6b000   ecx: cea6b8c0   edx: f0c90000
esi: f7fff2e0   edi: 00000000   ebp: f7fff2f8   esp: c2135ed8
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 0, threadinfo=c2134000 task=c2131110)
Stack: f7fff308 0000000b c214c010 c214c010 c214c000 0000000b f7fff2e0 c013ea30
       f7fff2e0 f7fff360 c20225a0 c2135f34 c013eb84 c2134000 c2135f30 c20236a0
       00000001 c20225a0 c2135f34 c013ec89 c013ec60 c20236a0 c0124e34 c2135f34
Call Trace:
 [<c013ea30>] drain_array+0x70/0xa0
 [<c013eb84>] cache_reap+0x84/0x160
 [<c013ec89>] reap_timer_fnc+0x29/0x50
 [<c013ec60>] reap_timer_fnc+0x0/0x50
 [<c0124e34>] run_timer_softirq+0xc4/0x160
 [<c0120e35>] __do_softirq+0xb5/0xc0
 [<c0120e6d>] do_softirq+0x2d/0x30
 [<c01131dc>] smp_apic_timer_interrupt+0xcc/0x130
 [<c0103d30>] default_idle+0x0/0x40
 [<c010676e>] apic_timer_interrupt+0x1a/0x20
 [<c0103d30>] default_idle+0x0/0x40
 [<c0103d5d>] default_idle+0x2d/0x40
 [<c0103df6>] cpu_idle+0x46/0x50
 [<c011d3b6>] __call_console_drivers+0x56/0x60
 [<c011d4c0>] call_console_drivers+0x90/0x120
 [<c025ff67>] vscnprintf+0x17/0x30
Code: 89 50 04 89 02 8b 43 0c c7 03 00 01 10 00 31 d2 c7 43 04 00
 <0>Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing
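
A minimal sketch for reading the oops above, not a diagnosis. It decodes the "Oops: 0002" value using the standard x86 page-fault error-code bits and checks whether the first code bytes account for the faulting address. It assumes the "Code:" line starts at the faulting EIP (no marker is shown in this dump); the function name decode_oops_error_code is made up for illustration.

```python
def decode_oops_error_code(code: int) -> dict:
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        # bit 0: 0 = page not present, 1 = protection violation
        "cause": "protection violation" if code & 0x1 else "page not present",
        # bit 1: 0 = read access, 1 = write access
        "access": "write" if code & 0x2 else "read",
        # bit 2: 0 = kernel mode, 1 = user mode
        "mode": "user" if code & 0x4 else "kernel",
    }


# Value from the "Oops: 0002 [#1]" line.
oops = decode_oops_error_code(0x0002)

# Values copied from the register dump and the "paging request" line.
eax = 0xFFFFFF7F
fault_address = 0xFFFFFF83

# Assumption: the bytes "89 50 04" are the faulting instruction.
# They encode `mov %edx,0x4(%eax)`, a 32-bit store to eax + 4.
store_target = (eax + 0x4) & 0xFFFFFFFF

print(oops)
print(hex(store_target), store_target == fault_address)
```

Under that assumption, the fault is a kernel-mode write to an unmapped page at eax + 4, which matches the reported address ffffff83. The following bytes "89 02" encode `mov %eax,(%edx)`, so the pair looks like a linked-list unlink inside free_block that was handed a bad pointer in eax; where that pointer came from cannot be determined from the dump alone.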

Steps to reproduce:

Don't know..
Comment 1 Jan-Frode Myklebust 2004-09-30 05:48:46 UTC
Might this be related to this patch?

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&th=4af201717088ea8e&seekm=2JV7H-6J1-25%40gated-at.bofh.it&frame=off
Comment 2 Adrian Bunk 2007-01-20 09:17:37 UTC
Is this issue still present with recent kernels?
Comment 3 Jan-Frode Myklebust 2007-01-21 03:48:38 UTC
2.5 years later it's hard to remember how I resolved or worked around this
problem. Most likely it got fixed by a kernel upgrade. I seem to remember I was
chasing the bleeding edge for a while until this server got stable.

So, please close this bug.
