Bug 16219 - (Solved) GUID partition tables made with Ubuntu9.10 Cd produces slowdown and temporary freezes. 'umount' blocked for more than 120 seconds
(Solved) GUID partition tables made with Ubuntu9.10 Cd produces slowdown and ...
Status: RESOLVED CODE_FIX
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
All Linux
: P1 blocking
Assigned To: Jeff Garzik
:
: 16000 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-06-15 16:21 UTC by Manuel Iglesias
Modified: 2010-06-30 10:36 UTC (History)
3 users (show)

See Also:
Kernel Version: 2.6.31 & 2.6.32
Tree: Mainline
Regression: No


Attachments
kern.log May 20, 2010 after a bad system freeze. (110.36 KB, text/plain)
2010-06-15 16:21 UTC, Manuel Iglesias
Details
aio-stress 32 bits Please rename to aio-stress-bin to use with aio-stress shell (32.07 KB, application/octet-stream)
2010-06-15 18:06 UTC, Manuel Iglesias
Details
aio-stress 64 bits Please rename to aio-stress-bin to use with aio-stress shell (33.35 KB, application/octet-stream)
2010-06-15 18:07 UTC, Manuel Iglesias
Details
Totals and averages (3.87 KB, text/plain)
2010-06-15 18:09 UTC, Manuel Iglesias
Details
5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:11 UTC, Manuel Iglesias
Details
Log02.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:11 UTC, Manuel Iglesias
Details
Log03.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:12 UTC, Manuel Iglesias
Details
Log04.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:13 UTC, Manuel Iglesias
Details
Log05.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:13 UTC, Manuel Iglesias
Details
Log06.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:14 UTC, Manuel Iglesias
Details
Log07.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:14 UTC, Manuel Iglesias
Details
Log08.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:15 UTC, Manuel Iglesias
Details
Log09.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:15 UTC, Manuel Iglesias
Details
Log10.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:16 UTC, Manuel Iglesias
Details
Log11.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:16 UTC, Manuel Iglesias
Details
Log12.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:17 UTC, Manuel Iglesias
Details
Log13.txt 5 runs result (refer to ResultAverages.txt) (2.20 KB, text/plain)
2010-06-15 18:17 UTC, Manuel Iglesias
Details
aio-stress shell script (471 bytes, text/plain)
2010-06-15 18:19 UTC, Manuel Iglesias
Details
kern.log after system freeze. (141.93 KB, text/plain)
2010-06-19 14:38 UTC, Manuel Iglesias
Details

Description Manuel Iglesias 2010-06-15 16:21:47 UTC
Created attachment 26783 [details]
 kern.log May 20, 2010 after a bad system freeze.

System1 (new computer)
  Motherboard: Asus P7P55D LE
    BIOS updated 2 times (Same problems with all versions)
  CPU: Intel Core i5 661 3.33Ghz
  4 Gb Ram
  Graphics card: Asus Gforce GT 220, 1GB
  2* SATA Hd Western Digital Caviar Green 2 Terabytes
    /dev/sda GUID partition table
      /dev/sda1 swap 6GB
      /dev/sda2 ReiserFs 20GB
      /dev/sda3 ReiserFs 20GB
      /dev/sda4 ReiserFs 1816GB
    /dev/sdb GUID partition table
      /dev/sdb1 swap 6GB
      /dev/sdb2 ReiserFs 23GB
      /dev/sdb3 ReiserFs 1833GB
  OS: Ubuntu10.04 2.6.32-22-generic kernel 64 bits (Same problems with 2.6.31)
  Desktop: Gnome

System2 (old computer) (No problems used for testing)
  Motherboard: Abit IC7-G
  CPU: Pentium 4 Hyperthreading
  1 Gb Ram
  Graphics card: Ati Radeon 920
  2* SATA: Maxtor 300GB + Seagate 250 GB
    /dev/sda GUID partition table
      /dev/sda1 swap 1.86GB
      /dev/sda2 ReiserFs 9.4GB
      /dev/sda3 ReiserFs 4.7GB
      /dev/sda4 ReiserFs 283GB
    /dev/sdb GUID partition table
      /dev/sdb1 swap 0.5GB
      /dev/sdb2 ReiserFs 9.4GB
      /dev/sdb3 ReiserFs 224GB
  OS: Ubuntu10.04 2.6.31-17-generic kernel 32 bits
  Desktop: Gnome

Problems only with System1
  Temporary freezes:
    Opening applications
    Closing applications
    Opening application's dialogs (specially opening/saving documents)
    Opening and saving (gedit)
    Refreshing nautilus
    Audio playing stops temporarily
  Disk operations seem to take a long time, specially 'umount'
  Occasionally shutdown takes 10..15 minutes apparently waiting for 'umount'
  Freezes can be produced by copying big files
    -The slowdown of the system seems to be exponential,
     the longer it takes to copy the more the system slows down,
     the more the system slows down the longer it takes to copy... etc
    -When copying in nautilus progress dialog starts moving fast to
     slowdown and almost hang at the end.
  Sometimes when I double click on a text file in Nautilus both gedit and
   nautilus windows go grey.
  Sometimes when I double click on an audio file in Nautilus, and alsaplayer starts,
   I can not change focus to another file in nautilus window until alsaplayer finishes
   playing audio file

Checked:
  RAM memory checked several times: Ok
  CPU checked: running several stress tests: Ok
  When problems appear, all 4 cpus (2 cores + hyperthreading):
    -Have low loads (10..20%)
    -Temperature (sensors) low: 32°C
  Changed some settings in BIOS without any improvement
    Drives set to AHCI or IDE
    Intel SpeedStep (CPU speed controlled by OS) disabled
    Virtualization tech. disabled
  Hard disks are Ok:
    Tested them with palimpsest and smartctl
    Same problem booting from both disks (both have Linux)
  Problems happen with both free and non-free nvidia drivers
  No error entries in logs except for blocked process warnings
  (Idle-priority scheduling bug?):
    INFO: task application-name:1878 blocked for more than 120 seconds.
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    .........
    Call Trace:
    .........
  acpi does not seem to be functional
    acpi -V
      Cooling 0: Processor 0 of 0
      Cooling 1: Processor 0 of 0
      Cooling 2: Processor 0 of 0
      Cooling 3: Processor 0 of 0
    acpidump >/tmp/table.txt
      Wrong checksum for OEMB!
  Warning in kernel log at boot:
    ACPI Warning: Incorrect checksum in table [OEMB] - BA, should be B9 (20090903/tbutils-314)
  This 'wrong checksum' warning has happened with 3 different version of BIOS software.

  I have attached kernel messages corresponding to a day when there was a particularly bad
  freeze soon after boot (It usually happens soon after booting when anacron runs /etc/cron.daily).
Comment 1 Manuel Iglesias 2010-06-15 18:04:31 UTC
I have made a mistake in the bug description and can not edit it :-(
Both HDs in 'System2 (old computer)' have MBR Partition tables!
Please note:
  System1 (64 bits)
    2 HD with GUID Partition tables
  System2 (32 bits)
    2 HD with MBR Partition tables

I have always had problems with 'System1 (new computer)' so I was not sure if it had a hardware or a software problem.
I needed to find out to make use of the manufacturer (Motherboard?) guarantee in case of a hardware problem.

I suspected the problem was in hard drives access and decided to use the minimum software neccessary to test it:
A stress linux live CD created with ISO images (http://www.stresslinux.org/sl/downloads):
64 bits live CD:
  stresslinux_64bit.x86_64-0.5.113.iso.bz2
    kernel 2.6.31.12.0.2-default
32 bits live CD:
  stresslinux_32bit.i686-0.5.113.iso.bz2 
    kernel 2.6.31.12.02-pae
aio-stress. A program to test random writes in HDs which is used by phoronix-test-suite
I have used version 0.21 of aio-stress compiled for x86_64 and x86_32.
I have made attachments in case anyone wants to repeat my tests:
  aio-stress-bin_64bits & aio-stress-bin_32bits (rename to aio-stress-bin)
Have used a shell script to run 'aio-stress-bin -s 1g -r 64k -t 3 -o 2' (same as phoronix-test-suite uses. random write to a 1 GB file) 5 times. Also attached:
  aio-stress

aio-stress gives two numbers. I guess the first refers to the first time the 1 GB file is written and the second to several random writes to it. Here are the results of my tests:
System1
  64 bits
    /dev/sda System1 (SATA 2 TB, GUID Partition table, ReiserFs)
      Mb/s: 14.33; Mb/s totals: 6.34
    /dev/sda System2 (SATA 4.7GB, MBR Partition table, ReiserFs)
      Mb/s: 144.20; Mb/s totals: 20.77
  32 bits
    /dev/sda System1 (SATA 2 TB, GUID Partition table, ReiserFs)
      Mb/s: 2.01; Mb/s totals: 1.54
    /dev/sda System2 (SATA 4.7GB, MBR Partition table, ReiserFs)
      Mb/s: 14.12; Mb/s totals: 11.66
System2 32 bits
  /dev/sda System1 (SATA 2 TB, GUID Partition table, ReiserFs)
    Mb/s: 2.05; Mb/s totals: 1.56
  /dev/sda System2 (SATA 4.7GB, MBR Partition table, ReiserFs)
    Mb/s: 13.60; Mb/s totals: 11.11
Note: Other tests show that the results are similar for /dev/sda and /dev/sdb in both systems.

I have also tested an external disk connected to an Usb port:
System1 64 bits
  /dev/sdc External HD Usb (PATA 120GB, MBR Partition table, ReiserFs)
    Mb/s: 184.54; Mb/s totals: 26.36
System2 32 bits
  /dev/sdc External HD Usb (PATA 120GB, MBR Partition table, ReiserFs)
    Mb/s: 19.14; Mb/s totals: 15.56
The results are better than those of SATA drives probably because a buffer is used.

Attached:
  ResultAverages.txt (totals and averages)
  Log01.txt .. Log13.txt outputs of aio-stress shell

As you can see the results for the disk with GUID Partition tables are extremely bad compared to those of the disk with MBR Partition table. In both systems!
My conclusion is that the linux kernel (2.6.31 & 2.6.32) has a very poor performance with GUID Partition table.
When I was running the tests in tty1, I was checking the results in tty2 with 'cat aio-stress-log.txt'. When I was testing the disk with GUID Partition table, and the current run was midway, the 'cat' command sometimes was blocked.

The obvious solution to my problem would be to reformat the drives in my new computer to MBR Partition tables (2TB is the limit for MBR) but I'd prefer if this kernel's bug is solved.

Any help heartily welcome! Thanks in advance.
Comment 2 Manuel Iglesias 2010-06-15 18:06:05 UTC
Created attachment 26786 [details]
aio-stress 32 bits Please rename to aio-stress-bin to use with aio-stress shell
Comment 3 Manuel Iglesias 2010-06-15 18:07:06 UTC
Created attachment 26787 [details]
aio-stress 64 bits Please rename to aio-stress-bin to use with aio-stress shell
Comment 4 Manuel Iglesias 2010-06-15 18:09:50 UTC
Created attachment 26788 [details]
Totals and averages
Comment 5 Manuel Iglesias 2010-06-15 18:11:02 UTC
Created attachment 26789 [details]
5 runs result (refer to ResultAverages.txt)
Comment 6 Manuel Iglesias 2010-06-15 18:11:50 UTC
Created attachment 26790 [details]
Log02.txt 5 runs result (refer to ResultAverages.txt)
Comment 7 Manuel Iglesias 2010-06-15 18:12:22 UTC
Created attachment 26791 [details]
Log03.txt 5 runs result (refer to ResultAverages.txt)
Comment 8 Manuel Iglesias 2010-06-15 18:13:30 UTC
Created attachment 26792 [details]
Log04.txt 5 runs result (refer to ResultAverages.txt)
Comment 9 Manuel Iglesias 2010-06-15 18:13:59 UTC
Created attachment 26793 [details]
Log05.txt 5 runs result (refer to ResultAverages.txt)
Comment 10 Manuel Iglesias 2010-06-15 18:14:24 UTC
Created attachment 26794 [details]
Log06.txt 5 runs result (refer to ResultAverages.txt)
Comment 11 Manuel Iglesias 2010-06-15 18:14:52 UTC
Created attachment 26795 [details]
Log07.txt 5 runs result (refer to ResultAverages.txt)
Comment 12 Manuel Iglesias 2010-06-15 18:15:18 UTC
Created attachment 26796 [details]
Log08.txt 5 runs result (refer to ResultAverages.txt)
Comment 13 Manuel Iglesias 2010-06-15 18:15:42 UTC
Created attachment 26797 [details]
Log09.txt 5 runs result (refer to ResultAverages.txt)
Comment 14 Manuel Iglesias 2010-06-15 18:16:11 UTC
Created attachment 26798 [details]
Log10.txt 5 runs result (refer to ResultAverages.txt)
Comment 15 Manuel Iglesias 2010-06-15 18:16:39 UTC
Created attachment 26799 [details]
Log11.txt 5 runs result (refer to ResultAverages.txt)
Comment 16 Manuel Iglesias 2010-06-15 18:17:05 UTC
Created attachment 26800 [details]
Log12.txt 5 runs result (refer to ResultAverages.txt)
Comment 17 Manuel Iglesias 2010-06-15 18:17:27 UTC
Created attachment 26801 [details]
Log13.txt 5 runs result (refer to ResultAverages.txt)
Comment 18 Manuel Iglesias 2010-06-15 18:19:12 UTC
Created attachment 26802 [details]
aio-stress shell script
Comment 19 Tejun Heo 2010-06-15 18:34:53 UTC
Can you please attach parted dump of the partition table?  Please set the unit to sectors.  Thanks.
Comment 20 Manuel Iglesias 2010-06-15 18:44:01 UTC
parted -l
Model: ATA WDC WD20EARS-00S (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name             Flags
 1      17.4kB  6000MB  6000MB  linux-swap(v1)  MainLinux swap
 2      6000MB  26.0GB  20.0GB  reiserfs        MainLinux /      bios_grub
 3      26.0GB  46.0GB  20.0GB  reiserfs        MainLinux home
 4      46.0GB  2000GB  1954GB  reiserfs        MainLinux Store


Model: ATA WDC WD20EARS-00S (scsi)
Disk /dev/sdb: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system     Name                  Flags
 1      17.4kB  6000MB  6000MB  linux-swap(v1)  EmergencyLinux swap
 2      6000MB  29.0GB  23.0GB  reiserfs        EmergencyLinux /
 3      29.0GB  2000GB  1971GB  reiserfs        EmergencyLinux Store

fdisk -l -u /dev/sda

WARNING: GPT (GUID Partition Table) detected on '/dev/sda'! The util fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sda: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1  3907029167  1953514583+  ee  GPT

fdisk -l -u /dev/sdb

WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.


Disk /dev/sdb: 2000.4 GB, 2000398934016 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1  3907029167  1953514583+  ee  GPT



No edit option :-(
I want to add something to my previous message
"When I was running the tests in tty1, I was checking the results in tty2 with
'cat aio-stress-log.txt'. When I was testing the disk with GUID Partition
table, and the current run was midway, the 'cat' command sometimes was
blocked."
In both systems!
Comment 21 Tejun Heo 2010-06-15 18:48:37 UTC
You can tell parted to use sectors by running

# parted /dev/sda
(parted) u s
(parted) p

Can you please get the output of the above command?
Comment 22 Manuel Iglesias 2010-06-15 18:57:11 UTC
parted /dev/sda
GNU Parted 2.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) u s                                                              
(parted) p                                                                
Model: ATA WDC WD20EARS-00S (scsi)
Disk /dev/sda: 3907029168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start      End          Size         File system     Name             Flags
 1      34s        11718784s    11718751s    linux-swap(v1)  MainLinux swap
 2      11718785s  50781285s    39062501s    reiserfs        MainLinux /      bios_grub
 3      50781286s  89843786s    39062501s    reiserfs        MainLinux home
 4      89843787s  3907027380s  3817183594s  reiserfs        MainLinux Store

(parted) q                                                                

parted /dev/sdb
GNU Parted 2.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) u s                                                              
(parted) p                                                                
Model: ATA WDC WD20EARS-00S (scsi)
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start      End          Size         File system     Name                  Flags
 1      34s        11718784s    11718751s    linux-swap(v1)  EmergencyLinux swap
 2      11718785s  56640660s    44921876s    reiserfs        EmergencyLinux /
 3      56640661s  3907027379s  3850386719s  reiserfs        EmergencyLinux Store

(parted) q

Thanks for helping Tejun! :-)
Comment 23 Tejun Heo 2010-06-16 08:40:29 UTC
Hmmm... WD20EARS is 4k physical sector drive and your partition table isn't aligned which will result in significantly lower write performance and the green drives are not the fastest drives to begin with.  Can you please repartition so that sectors are properly aligned and see whether the problem goes away?

Fore more info on 4k sector issues,

  http://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues

Thanks.
Comment 24 Manuel Iglesias 2010-06-16 09:13:19 UTC
Thanks Tejun.

I did the current partitioning with the Ubuntu9.10 distro Cd (64 bits). I don't think there was a sector-size option.

I am not an expert and backing data and reformatting a 2 TB drive is not a trivial task.
Would you please give me step by step instructions on how to do it?

Now sector size seems to be 512 bytes, do you want me to change it to 4 KB?
Should I use gparted/parted?
Comment 25 Tejun Heo 2010-06-16 09:17:56 UTC
Heh... I can't really help you with backing up data, sorry.  :-P

But if you're on ubuntu, the new release 10.04 should align partitions properly for 4k drives, so if you do clean install (including repartitioning) of 10.04, it should work automatically.  I'll check whether it actually works.

Thanks.
Comment 26 Manuel Iglesias 2010-06-16 09:26:17 UTC
I did not mean I needed help with backing up. I only wanted to make sure I did the partitioning correctly first time because it is so much work to back up and recover ;-)

Are you going to check Ubuntu10.04 partitioning? Shall I wait until you give me the Ok?

I read this in  http://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues:

"L-1. Kernel support
...................
As of v2.6.33, Linux ATA drivers do not support drives with 4KiB logical sector size although there is a development branch containing experimental support[11]. For ATA drives connected via bridges to different buses - USB and IEEE 1394, as long as the bridges support 4KiB logical sector size correctly, the SCSI disk driver can handle them."
Comment 27 Tejun Heo 2010-06-16 10:03:51 UTC
Okay, just checked 10.04 and yeap it aligns partitions correctly, so if you delete existing partitions from 10.04 boot media and create new ones you should be fine.  The L-1 is for logical sector size.  There's no logical 4k drives on market now.  They're all logical 512b, physical 4k ones.

Thanks.
Comment 28 Manuel Iglesias 2010-06-16 10:06:23 UTC
Thanks a lot Tejun.
Will test it right now and post results.
Comment 29 Manuel Iglesias 2010-06-16 12:01:13 UTC
Hi Tejun!

Have reformatted /dev/sdb with Ubuntu10.04 CD (fresh install) and I have made some tests before restoring data:

Before
  System1
    64 bits
      /dev/sda System1 (SATA 2 TB, GUID Partition table, ReiserFs)
        Mb/s: 14.33; Mb/s totals: 6.34
Now
  System1
    64 bits
      /dev/sdb System1 (SATA 2 TB, GUID Partition table, ReiserFs)
      Mb/s: 187.45; Mb/s totals: 29.18

As you can see there has been an enormous improvement :-) But sector size is still 512 Bytes:
parted /dev/sdb
GNU Parted 2.2
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) u s                                                              
(parted) p                                                                
Model: ATA WDC WD20EARS-00S (scsi)
Disk /dev/sdb: 3907029168s
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start      End          Size         File system     Name  Flags
 1      2048s      11718655s    11716608s    linux-swap(v1)
 2      11718656s  56641535s    44922880s    reiserfs
 3      56641536s  3907028991s  3850387456s  reiserfs

(parted) q

And I could not get rid of a 1 MB gap between sdb1 (swap partition) and sdb2.

Do you think current partitioning is all right? Should I continue with restoring data and reformatting /dev/sda?

Thanks again for your help!
Comment 30 Tejun Heo 2010-06-16 12:04:07 UTC
Oh, the visible block size is 512byte although they're backed by 4k physical sectors.  All that's necessary is aligning partition tables to those 4k boundaries.  parted now aligns everything to 1M boundary (2048 sectors) so that gap is expected.  Yeap, everything looks good to me.

Good luck.
Comment 31 Manuel Iglesias 2010-06-17 15:19:43 UTC
Thanks to Tejun repartitioning advice, access to both HDs in my system is much faster. This are the new aio-stress results:
/dev/sdb2 (New), AHCI, StressLinux_LiveCd-tty1 (formatted with Ubuntu10.04)
	Mb/s	Mb/s totals				
	176.61	24.33			
	96.37	23.78			
	126.21	19.94			
	125.57	22.02			
	412.49	55.81		Mb/s	Mb/s totals
		Averages:	187.45	29.18	
					
/dev/sda3 (New), AHCI, StressLinux_LiveCd-tty1 (formatted with Ubuntu10.04)
	Mb/s	Mb/s totals
	126.50	19.91			
	105.49	19.41			
	88.63	17.92			
	98.46	18.33			
	310.04	46.40		Mb/s	Mb/s totals
		Averages:	145.82	24.39	

sda seems to be slower than sdb. Could it be because sda has more partitions (sdb: swap+2; sda: swap+3)?

It is too early to say but everything seems to go faster and I have not experienced any significant slowdown or freezes. :-D

But there is still a problem: 'umount -l' sometimes is blocked for a long time. I use shell scripts to run some tasks periodically (Have used them without problems for many years). If I am using Linux on sda some of this tasks imply mounting sdb partitions, reading/writing in those partitions and then umounting them. As several processes may need to access the same partition at the same time they use a 'Semaphore' script to mount/umount the partitions.
It is when these scripts invoke 'umount -l' when, sometimes, umount hangs. Last time it happened one of the scripts was calling 'umount -l' to unmount partition sdb2 while another was also calling 'umount -l' to umount sdb3. 'umount' took more than 10 minutes to complete both tasks! Here are the corresponding messages in kern.log:
{
INFO: task umount:13856 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 00000000ffffffff     0 13856  13855 0x00000000
ffff88008c841d98 0000000000000086 0000000000015bc0 0000000000015bc0
ffff8800986e1ab0 ffff88008c841fd8 0000000000015bc0 ffff8800986e16f0
0000000000015bc0 ffff88008c841fd8 0000000000015bc0 ffff8800986e1ab0
Call Trace:
[<ffffffff811650d0>] ? bdi_sched_wait+0x0/0x20
[<ffffffff811650de>] bdi_sched_wait+0xe/0x20
[<ffffffff8153f40f>] __wait_on_bit+0x5f/0x90
[<ffffffff811650d0>] ? bdi_sched_wait+0x0/0x20
[<ffffffff8153f4b8>] out_of_line_wait_on_bit+0x78/0x90
[<ffffffff81085360>] ? wake_bit_function+0x0/0x40
[<ffffffff81165094>] ? bdi_queue_work+0xa4/0xe0
[<ffffffff8116640f>] bdi_sync_writeback+0x6f/0x80
[<ffffffff81166440>] sync_inodes_sb+0x20/0x30
[<ffffffff81169f12>] __sync_filesystem+0x82/0x90
[<ffffffff81169ff9>] sync_filesystems+0xd9/0x130
[<ffffffff8115ece1>] sys_umount+0xb1/0xd0
[<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
INFO: task umount:19440 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 00000000ffffffff     0 19440  19439 0x00000000
ffff8800985ade88 0000000000000082 0000000000015bc0 0000000000015bc0
ffff8800986e31a0 ffff8800985adfd8 0000000000015bc0 ffff8800986e2de0
0000000000015bc0 ffff8800985adfd8 0000000000015bc0 ffff8800986e31a0
Call Trace:
[<ffffffff8153f8c7>] __mutex_lock_slowpath+0xe7/0x170
[<ffffffff8153f7bb>] mutex_lock+0x2b/0x50
[<ffffffff81169f3f>] sync_filesystems+0x1f/0x130
[<ffffffff8115ecd7>] sys_umount+0xa7/0xd0
[<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
INFO: task umount:19440 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
umount        D 00000000ffffffff     0 19440  19439 0x00000000
ffff8800985ade88 0000000000000082 0000000000015bc0 0000000000015bc0
ffff8800986e31a0 ffff8800985adfd8 0000000000015bc0 ffff8800986e2de0
0000000000015bc0 ffff8800985adfd8 0000000000015bc0 ffff8800986e31a0
Call Trace:
[<ffffffff8153f8c7>] __mutex_lock_slowpath+0xe7/0x170
[<ffffffff8153f7bb>] mutex_lock+0x2b/0x50
[<ffffffff81169f3f>] sync_filesystems+0x1f/0x130
[<ffffffff8115ecd7>] sys_umount+0xa7/0xd0
[<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
}

Before repartitioning I had also observed that sometimes 'umount -l' hangs for a long time when it is called by one of the mentioned processes while another is waiting for its chance (sleep loop) to mount a partition. All processes are spawned from a parent process (run by cron) and I am wondering if there is some scheduling problem that does not asign cpu time to 'umount -l' because another sibling process is running. This task running scheme (I call it CommonCron) runs perfectly on my old computer (2.6.31.17-generic kernel 32 bits).

Any ideas or suggestions?
Comment 32 Tejun Heo 2010-06-17 15:49:36 UTC
(cc'd Andrew)

The performance difference could be most likely due to different location and size of those partitions.  ISTR recently reading something about fixing a bug which makes umount take a very long time but I haven't really followed it.  Andrew, does anything ring a bell?

Thanks.
Comment 33 Manuel Iglesias 2010-06-19 14:37:28 UTC
Faster access to HDs makes a big difference but I am disappointed that some of the above described problems still persist :-(

This morning, soon after booting, the system froze. I had to wait a long time until I could switch to tty and the system was so slow that I could only check that CPUs had low loads (top).

The system unfroze somewhat after a few minutes. Kern.log shows the usual 'blocked for more than 120 seconds' messages plus some dealing with DMA and swap memory.

I don't know much about the Linux kernel and would be grateful if somebody could tell me what is going on.

I'll attach the corresponding kern.log (Kern19.06.10.log),
Thanks in advance.
Comment 34 Manuel Iglesias 2010-06-19 14:38:56 UTC
Created attachment 26861 [details]
kern.log after system freeze.
Comment 35 Tejun Heo 2010-06-19 14:45:13 UTC
Does it still happen w/o amule running?  It looks like amule is leaking memory very quickly filling up the memory and swap leading to massive thrashing.  Some distros limit the amount of memory a process can consume these days.  It looks like ubuntu isn't doing that.  Anyways, w/ memory bomb running w/o any resource limit, the kernel can do only so much.

Thanks.
Comment 36 Manuel Iglesias 2010-06-19 21:43:37 UTC
Thank you Tejun.
I'll check if amule is running next time my system freezes.
Normally amule does not use any swap memory. It is running now and this is the memory use:
free
             total       used       free     shared    buffers     cached
Mem:       4055212    3984380      70832          0     123036    3178964
-/+ buffers/cache:     682380    3372832
Swap:      5858296          0    5858296

There has only been the freeze mentioned above since I repartitioned the HDs. I am happy now. Using the computer before repartitioning was a nightmare. Thank you again.
Comment 37 Manuel Iglesias 2010-06-30 10:36:39 UTC
*** Bug 16000 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.