Bug 2316 - Resync of Linux-Software-RAID1 swamps out access to file system
Status: REJECTED WILL_NOT_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: MD
Hardware: i386 Linux
Importance: P2 normal
Assignee: Neil Brown
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-03-16 08:08 UTC by Hans-Peter Bock
Modified: 2008-03-03 18:41 UTC

See Also:
Kernel Version: 2.6.13
Subsystem:
Regression: ---
Bisected commit-id:



Description Hans-Peter Bock 2004-03-16 08:08:17 UTC
Distribution: Debian testing
Hardware Environment:
* Asus P4P800 Deluxe
* 2x Promise FastTrakS150TX4
* 6x SATA-HDD

Software Environment:
* vanilla-kernel 2.6.4
* gcc version 3.3.3 (Debian)

Problem Description:
Normal processes block on file access shortly after a resynchronisation of a
RAID-1 array starts.

Steps to reproduce:
* /boot is on /dev/md0
* / is on /dev/md1
* create (50GB) RAID-1 with "mdadm --create /dev/md1 --level=raid1
--raid-disks=2 /dev/sda7 /dev/sdc7"
* run (for example) "while [ 1 ]; do clear; cat /proc/mdstat; sleep 1; done";
this will block after a short period of time
* pings to and portscans of the RAID-machine still work
* after the resynchronization has finished, everything continues as usual

This problem does not seem to exist in vanilla kernel 2.6.3.
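
To put a number on the stalls described in the steps above, each read can be timed instead of just printed. The following is a minimal sketch, not part of the original report; the function name probe_read is hypothetical, and it assumes a date command that supports +%s (GNU and BSD both do).

```shell
#!/bin/sh
# Hypothetical helper (not from the report): time one read of a file.
# Prints "<elapsed seconds> <exit status of cat>".
# Assumes `date +%s` is available; resolution is whole seconds.
probe_read() {
    probe_file=$1
    probe_start=$(date +%s)
    cat "$probe_file" > /dev/null 2>&1
    probe_status=$?
    probe_end=$(date +%s)
    echo "$((probe_end - probe_start)) $probe_status"
}

# Example loop, left commented out because it never terminates:
# while :; do probe_read /proc/mdstat; sleep 1; done
```

A read that normally reports 0 seconds and then jumps to tens of seconds during the resync would match the blocking described here.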
Comment 1 Hans-Peter Bock 2004-03-16 08:40:43 UTC
The "/dev/md1" in the "mdadm --create" command is a typo and should be "/dev/md2".
Comment 2 Alasdair G Kergon 2005-07-29 07:01:24 UTC
Is this still an issue or can we close this?
Comment 3 Hans-Peter Bock 2005-10-10 02:22:51 UTC
I'm sorry, but it's still an issue, as I noticed last Saturday.
Comment 4 Neil Brown 2005-10-10 02:41:07 UTC
1/ May I introduce the program 'watch' to you? It is very useful for watching
  /proc/mdstat.

2/ Could you get me a copy of /proc/mdstat just before it stops responding?

3/ Can you try
   dd if=/dev/sda7 of=/dev/sdc7 bs=1024k &
  (to effectively do the copy by hand)
  and see how responsive the system is while that is happening?
 (This will of course make a mess of /dev/sdc7, don't do it if you
  value the data there).

Thanks,
NeilBrown
Comment 5 Hans-Peter Bock 2005-10-11 13:10:37 UTC
1/ I recently got introduced to watch, but thank you for the hint. =;)

3/ ok, let's see:

$ sudo mdadm /dev/md2 -r /dev/sde7
$ sudo dd bs=1024k if=/dev/sda7 of=/dev/sde7
This is not an issue.

$ sudo dd bs=1024k if=/dev/sdc7 of=/dev/sde7
This also is not an issue.

2/

$ sudo mdadm /dev/md2 -a /dev/sde7
$ sudo mdadm /dev/md2 -f /dev/sdc7; sudo mdadm /dev/md2 -r /dev/sdc7; sudo mdadm /dev/md2 -a /dev/sdc7

Every 2s: cat /proc/mdstat                              Tue Oct 11 21:53:19 2005

Personalities : [raid1] [raid5] [raid6]
md2 : active raid1 sdc7[2] sde7[3] sda7[0]
      51375744 blocks [2/1] [U_]
      [>....................]  recovery =  0.2% (130048/51375744) finish=19.6min speed=43349K/sec

md4 : active raid1 sde8[2] sdc8[0] sda8[1]
      51375744 blocks [2/2] [UU]

md6 : active raid1 sdc9[2] sde9[0] sda9[1]
      51375744 blocks [2/2] [UU]
[...]

Every 2s: cat /proc/mdstat                              Tue Oct 11 22:08:59 2005

Personalities : [raid1] [raid5] [raid6]
md2 : active raid1 sdc7[2] sde7[3] sda7[0]
      51375744 blocks [2/1] [U_]
      [============>........]  recovery = 60.5% (31095168/51375744) finish=8.0min speed=41885K/sec
[...]

This is an issue. The cat often blocks on reading /proc/mdstat for several
seconds, sometimes for up to a few minutes. Access to the filesystems also
blocks during that time.

BTW: All this is not an issue on another machine, which has 2 IDE drives and is
running kernel 2.4.27.
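
For logging the recovery progress alongside the stalls, the progress line can be pulled out of the mdstat text shown above. This is only a sketch; the function name recovery_status is hypothetical, and it assumes the progress line has the same layout as in the output quoted in this comment.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): read /proc/mdstat-style text
# on stdin and print "<percent> <finish> <speed>" for every recovery or
# resync progress line. Assumes the line layout quoted in this comment.
recovery_status() {
    awk '/(recovery|resync) *=/ {
        pct = ""; fin = ""; spd = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /%$/)       pct = $i             # e.g. 0.2%
            if ($i ~ /^finish=/) fin = substr($i, 8)  # text after "finish="
            if ($i ~ /^speed=/)  spd = substr($i, 7)  # text after "speed="
        }
        print pct, fin, spd
    }'
}

# Example: recovery_status < /proc/mdstat
```

Arrays that are not rebuilding produce no output, so the function prints nothing once the recovery has finished.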
Comment 6 Hans-Peter Bock 2005-10-21 11:11:59 UTC
The problem no longer seems to occur when using the cfq I/O scheduler
instead of the anticipatory I/O scheduler.
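
For anyone wanting to try the same workaround, the scheduler can be inspected and switched per disk at runtime, assuming the kernel exposes it at /sys/block/<dev>/queue/scheduler and that cfq is built in. The sketch below is not from the reporter; the function names and the SYSFS_ROOT variable are hypothetical (SYSFS_ROOT exists only so the helpers can be pointed at a test directory), and writing the scheduler file requires root.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread). SYSFS_ROOT is an assumed
# override for testing; it defaults to the real /sys.

# Extract the bracketed (active) name from a scheduler line on stdin,
# e.g. "noop [anticipatory] deadline cfq" gives "anticipatory".
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Print the active scheduler of a block device, e.g. current_scheduler sda
current_scheduler() {
    active_scheduler < "${SYSFS_ROOT:-/sys}/block/$1/queue/scheduler"
}

# Select a scheduler for a block device, e.g. set_scheduler sda cfq
# (needs root when operating on the real /sys).
set_scheduler() {
    echo "$2" > "${SYSFS_ROOT:-/sys}/block/$1/queue/scheduler"
}
```

In this setup the switch would have to be applied to every member disk of the array (sda, sdc and sde here), since the scheduler is selected per device.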
Comment 7 Natalie Protasevich 2007-10-16 05:15:07 UTC
Hans-Peter,
Does this sufficiently resolve the problem for you? Is this still true with the current kernel? If so, we can close this bug, provided there are no objections.
Thanks.
Comment 8 Hans-Peter Bock 2007-10-16 05:25:32 UTC
Hello Natalie,
yes, this workaround solves the problem for me.
Best regards, Hans-Peter
Comment 9 Natalie Protasevich 2008-03-03 18:41:49 UTC
Closing the bug. If someone objects and wishes to look further into the scheduler, please reopen.
