Bug 16562

Summary: 2.6.35: cpu_idle bug report / on i7 870 cpu (x86_64)
Product: Platform Specific/Hardware
Reporter: Maciej Rutecki (maciej.rutecki)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: RESOLVED OBSOLETE
Severity: normal
CC: alan, jpiszcz, maciej.rutecki, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.35
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 16055

Description Maciej Rutecki 2010-08-11 14:10:10 UTC
Subject    : 2.6.35: cpu_idle bug report / on i7 870 cpu (x86_64)
Submitter  : Justin Piszcz <jpiszcz@lucidpixels.com>
Date       : 2010-08-06 22:09
Message-ID : alpine.DEB.2.00.1008061800530.5241@p34.internal.lan
References : http://marc.info/?l=linux-kernel&m=128113260904048&w=2

This entry is being used for tracking a regression from 2.6.34. Please don't
close it until the problem is fixed in the mainline.

Comment 1 Justin Piszcz 2010-09-21 08:36:46 UTC
Hi,

Please close this bug for now. I have a case open with 3ware regarding the controllers, which appear to be causing I/O lockups. When the tests were performed (with cpu_idle), it turned out that cron jobs that run at certain times (rss2email) produce a certain kind of I/O that locks up the machine until the controller(s) reset. This happens with 2.6.35.x; with 2.6.34.x the problem does not occur.


[  593.967176] 3w-9xxx: scsi0: WARNING: (0x06:0x0037): Character ioctl (0x108) timed out, resetting card.
[  730.483812] 3w-9xxx: scsi0: WARNING: (0x06:0x0037): Character ioctl (0x108) timed out, resetting card.
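
To see how often the controller resets, the warnings above can be pulled out of a saved dmesg log and the gaps between them computed. The following is only a rough sketch: the function names (find_resets, reset_intervals) are made up for illustration, and the pattern is based solely on the two lines quoted in this report, so other 3w-9xxx message variants may not match.

```python
import re

# Pattern derived only from the two warnings quoted in this report; it is an
# assumption that other reset messages from the driver look the same.
RESET_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+3w-9xxx:\s+(?P<host>scsi\d+):"
    r"[^\[]*?timed out,\s+resetting card"
)


def find_resets(log_text):
    """Return (timestamp_seconds, scsi_host) for each reset warning."""
    # Collapse all whitespace so that messages wrapped across two lines
    # (as in a mailed or pasted log) still match.
    flat = " ".join(log_text.split())
    return [(float(m["ts"]), m["host"]) for m in RESET_RE.finditer(flat)]


def reset_intervals(resets):
    """Return the seconds elapsed between consecutive resets."""
    return [later[0] - earlier[0] for earlier, later in zip(resets, resets[1:])]


SAMPLE = """\
[  593.967176] 3w-9xxx: scsi0: WARNING: (0x06:0x0037): Character ioctl (0x108)
timed out, resetting card.
[  730.483812] 3w-9xxx: scsi0: WARNING: (0x06:0x0037): Character ioctl (0x108)
timed out, resetting card.
"""

if __name__ == "__main__":
    resets = find_resets(SAMPLE)
    for ts, host in resets:
        print(f"{host} reset at {ts:.6f}s")
    for gap in reset_intervals(resets):
        print(f"gap between resets: {gap:.3f}s")
```

For the two warnings quoted here the gap is roughly 136.5 seconds. Matching that against the cron schedule would be one way to check the rss2email theory.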

Comment 2 Justin Piszcz 2010-09-21 09:11:10 UTC
http://forums.storagereview.com/index.php/topic/28920-3ware-9650se-controller-resets-under-load-on-linux/page__st__30__gopid__264286&#entry264286

It may be CPU-related after all, according to this post that I found. I am going to test what he found and see whether the 2.6.35.x kernel is stable once the options he recommends disabling are turned off.

Comment 3 Rafael J. Wysocki 2010-09-21 18:14:44 UTC
Well, failing with CONFIG_NOHZ is still a bug, and the fact that it didn't occur with 2.6.34 means we've recently changed something that triggers it.
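
When comparing the 2.6.34 and 2.6.35 builds, it can help to confirm whether the tickless option is actually set in each kernel's config. The sketch below rests on several assumptions: that "CONFIG_NOHZ" above refers to the option spelled CONFIG_NO_HZ in the kernel's Kconfig, that the config is readable from /proc/config.gz (which needs CONFIG_IKCONFIG_PROC) or from /boot/config-<release>, and the helper names are hypothetical.

```python
import gzip
import os
from pathlib import Path


def option_state(config_text, option):
    """Return the option's value ('y', 'm', ...), 'n' if explicitly unset,
    or None if the option does not appear in the config text at all."""
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


def read_running_config():
    """Best-effort read of the running kernel's config; returns None when
    neither of the assumed locations exists on this system."""
    proc = Path("/proc/config.gz")
    if proc.exists():
        return gzip.decompress(proc.read_bytes()).decode()
    boot = Path("/boot/config-" + os.uname().release)
    if boot.exists():
        return boot.read_text()
    return None


if __name__ == "__main__":
    text = read_running_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        # Assumed spelling of the option; adjust if the kernel differs.
        print("CONFIG_NO_HZ:", option_state(text, "CONFIG_NO_HZ"))
```

Running it under each kernel and comparing the output shows whether the two builds differ in this option or whether the regression lies elsewhere.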