Bug 12331 - Server down after 1-2 days, many processes in state "D".
Summary: Server down after 1-2 days, many processes in state "D".
Status: CLOSED OBSOLETE
Alias: None
Product: Virtualization
Classification: Unclassified
Component: Xen
Hardware: All Linux
Importance: P1 normal
Assignee: virtualization_xen
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-12-30 09:14 UTC by Ján ONDREJ (SAL)
Modified: 2012-05-24 13:46 UTC
CC List: 1 user

See Also:
Kernel Version: 2.6.27.9-159.fc10.i686
Subsystem:
Regression: No
Bisected commit-id:



Description Ján ONDREJ (SAL) 2008-12-30 09:14:01 UTC
Latest working kernel version: kernel-xen-2.6.21.7-5.fc8.i686
Earliest failing kernel version: kernel-PAE-2.6.27.7-134.fc10.i686
Distribution: Fedora 10
Hardware Environment: this problematic guest is running on Fedora 8 dom0
Software Environment: kernel-xen-2.6.21.7-5.fc8, xen-3.1.2-5.fc8
Problem Description:
After some time (approx. 1-2 days) my server has many processes in state "D".
Mostly they are postfix, dovecot, crond and mysql.
They block my server, which hangs after some time.
The problem started after the upgrade to Fedora 10; on Fedora 8 there were no
problems with the same configuration.
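
For anyone trying to confirm the same symptom, here is a minimal sketch that lists processes in state "D" (uninterruptible sleep) by reading /proc/<pid>/stat. The function names parse_stat and d_state_processes are illustrative only and do not come from any existing tool; the sketch assumes a standard Linux /proc layout.

```python
import os


def parse_stat(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen].strip())
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm) for processes in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the entry is unreadable
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

Running it periodically (for example from a terminal left open before the hang) shows which daemons are blocked; a steadily growing list would match the behaviour described here.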

Curiously, after a reboot all my logged data is gone. This one is a copy
created before rebooting my machine.
The "sync" command also fails once this problem has started.

Maybe something is wrong with the xen blk driver. My dom0 is a fully updated
Fedora 8. The guest is a paravirtualized guest on LVM disk storage.
Steps to reproduce:
I can't reproduce this on demand. It only happens every 1-2 days.

The same problem occurs with kernel.x86_64 and kernel-PAE.i386.

This bug has also been reported in the Red Hat Bugzilla:
  https://bugzilla.redhat.com/show_bug.cgi?id=478414
Comment 1 Ján ONDREJ (SAL) 2008-12-30 09:15:47 UTC
Forgot to say that the logs are here:
  http://www.salstar.sk/fedora-error/

Before debugging there was nothing special in dmesg or messages. 
Comment 2 Ján ONDREJ (SAL) 2009-01-05 04:14:47 UTC
After downgrading to the fc8 kernel my system has been up for more than 3 days.
I think there is something wrong with the fc10 kernels.
