Bug 16052

Summary: one core is always at 100% while the others are at less than 5%
Product: Platform Specific/Hardware Reporter: Andrew (atswartz)
Component: x86-64 Assignee: platform_x86_64 (platform_x86_64)
Status: CLOSED CODE_FIX    
Severity: normal CC: akpm, kay, maciej.rutecki, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.34-git9 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 16055    
Attachments: sysrq-trigger

Description Andrew 2010-05-26 13:16:25 UTC
After compiling with a known-good config, a 2.6.34 kernel with the git9 patch applied uses 100% of one core at idle (and at all other times) on an AMD Phenom 9500.

Steps to reproduce:  
1)    compile 2.6.34 with the git9 patch, or linux-next-20100525
2)    reboot

Actual results:
one core pegged at 100%

Expected results: 
all four cores 0% at idle

Does not occur with 2.6.34
Comment 1 Andrew Morton 2010-05-26 14:34:20 UTC
How are you observing this CPU consumption?

Is there a process running on the CPU?  If so, which?

Does /proc/interrupts indicate that there's a lot of interrupt activity?

Do sysrq-p or sysrq-t enable you to see what's running on that CPU?

I'll mark this as a regression, thanks.
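
The "which process?" question above can also be answered without a graphical monitor by sampling /proc/<pid>/stat twice and comparing CPU ticks. The following is a minimal sketch, assuming the Linux procfs layout described in proc(5), where utime and stime are fields 14 and 15. The helper names (cpu_ticks, snapshot, busiest) are hypothetical and are not part of any tool mentioned in this report.

```python
import os
import time


def cpu_ticks(stat_line):
    """Return (comm, utime + stime) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    comm = stat_line[left + 1:right]
    rest = stat_line[right + 1:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11], rest[12].
    return comm, int(rest[11]) + int(rest[12])


def snapshot():
    """Map pid -> (comm, ticks) for every process that can be read."""
    out = {}
    for name in os.listdir("/proc"):
        if not name.isdigit():
            continue
        try:
            with open(f"/proc/{name}/stat") as f:
                out[int(name)] = cpu_ticks(f.read())
        except OSError:
            continue  # the process exited between listdir() and open()
    return out


def busiest(interval=1.0):
    """Return (pid, comm, ticks_used) for the heaviest CPU user."""
    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    deltas = [(ticks - before[pid][1], pid, comm)
              for pid, (comm, ticks) in after.items() if pid in before]
    used, pid, comm = max(deltas)
    return pid, comm, used


if __name__ == "__main__":
    print(busiest())
```

On a machine showing the symptom described here, a process that stays in state R and accumulates ticks at roughly one CPU's worth per second would be the one to inspect with sysrq-t.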
Comment 2 Andrew 2010-05-26 16:13:03 UTC
(In reply to comment #1)
> How are you observing this CPU consumption?
> 
GNOME system monitor initially alerted me to the problem.

> Is there a process running on the CPU?  If so, which?
> 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND     
  666 root      16  -4  6532 1044  408 R  100  0.1   5:17.84 udevd     
> Does /proc/interrupts indicate that there's a lot of interrupt activity?
> 
looks normal to me.
$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:        126          0          0          1   IO-APIC-edge      timer
  1:          0          0          0         53   IO-APIC-edge      i8042
  7:          1          0          0          0   IO-APIC-edge    
  8:          0          0          1        140   IO-APIC-edge      rtc0
  9:          0          0          0          1   IO-APIC-fasteoi   acpi
 16:          0          0          6       6461   IO-APIC-fasteoi   ohci_hcd:usb2, pata_marvell, hda_intel
 17:          0          0          0          2   IO-APIC-fasteoi   ohci_hcd:usb3, ohci_hcd:usb5
 18:          0          0          0          2   IO-APIC-fasteoi   ohci_hcd:usb4, ohci_hcd:usb6
 19:          0          0          0          2   IO-APIC-fasteoi   ehci_hcd:usb1, pata_marvell
 22:          0          0         17       8758   IO-APIC-fasteoi   ahci, ohci1394
 44:          0          0          9       2243   PCI-MSI-edge      radeon
 45:          0          0          5       1212   PCI-MSI-edge      sky2@pci:0000:03:00.0
 46:          0          0          0        262   PCI-MSI-edge      hda_intel
NMI:          0          0          0          0   Non-maskable interrupts
LOC:      55478     124463      58209      90241   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
PND:          0          0          0          0   Performance pending work
RES:      45147        866      32578      12309   Rescheduling interrupts
CAL:         38         63         66         37   Function call interrupts
TLB:       1046        113       1295       1663   TLB shootdowns
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:          1          1          1          1   Machine check polls
ERR:          1
MIS:          0
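
To check the "looks normal" reading of the table above, the per-CPU counts can be summed mechanically. This is a sketch only: the function name per_cpu_totals is hypothetical, and SAMPLE is a shortened excerpt of the output above rather than the full table.

```python
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
  0:        126          0          0          1   IO-APIC-edge      timer
 16:          0          0          6       6461   IO-APIC-fasteoi   ohci_hcd:usb2, pata_marvell, hda_intel
 22:          0          0         17       8758   IO-APIC-fasteoi   ahci, ohci1394
ERR:          1
"""


def per_cpu_totals(text):
    """Sum the interrupt counts in /proc/interrupts-style text per CPU."""
    lines = text.strip().splitlines()
    cpus = lines[0].split()
    totals = dict.fromkeys(cpus, 0)
    for line in lines[1:]:
        # Split on the first ':' only; device names may contain ':' too.
        _label, _, rest = line.partition(":")
        counts = rest.split()[:len(cpus)]
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue  # rows such as ERR/MIS carry one global count
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


if __name__ == "__main__":
    print(per_cpu_totals(SAMPLE))
```

Running it twice a few seconds apart on the live file and comparing the totals would show whether any CPU is taking interrupts at an unusual rate. In this report the counts are modest, which points away from an interrupt storm and toward a spinning process.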
> Do sysrq-p or sysrq-t enable you to see what's running on that CPU?
> 
<Alt><SysRq><t>  nothing 
echo t > /proc/sysrq-trigger nothing
<Alt><SysRq><p>
[root@asus ~]# 2010 May 26 11:10:42 asus [  571.044780] Stack:
2010 May 26 11:10:42 asus [  571.044811] Call Trace:
2010 May 26 11:10:42 asus [  571.044832] Code: 71 00 85 db 75 3c 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 7c e8 d5 3a 09 00 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 48 83 c4 08 5b 
> I'll mark this as a regression, thanks.
Comment 3 Andrew Morton 2010-05-26 17:17:17 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > Do sysrq-p or sysrq-t enable you to see what's running on that CPU?
> > 
> <Alt><SysRq><t>  nothing 
> echo t > /proc/sysrq-trigger nothing
> <Alt><SysRq><p>
> [root@asus ~]# 2010 May 26 11:10:42 asus [  571.044780] Stack:
> 2010 May 26 11:10:42 asus [  571.044811] Call Trace:
> 2010 May 26 11:10:42 asus [  571.044832] Code: 71 00 85 db 75 3c 65 48 8b 04 25 48 b5 00 00 83 a0 3c e0 ff ff fb 0f ae f0 48 8b 80 38 e0 ff ff a8 08 75 7c e8 d5 3a 09 00 fb f4 <65> 48 8b 04 25 48 b5 00 00 83 88 3c e0 ff ff 04 48 83 c4 08 5b

Please run `dmesg -n 1' then retry these.
Comment 4 Andrew 2010-05-26 18:01:52 UTC
Created attachment 26551
sysrq-trigger

Here is the output of <Alt><SysRq><T>
Comment 5 Andrew Morton 2010-05-26 18:16:16 UTC
Ah, OK, sorry, I didn't look at comment #2 closely enough: udevd is spinning. Your sysrq-t trace shows the udevd stack trace. Let me go and ask Kay and Greg.
Comment 6 Kay Sievers 2010-05-26 20:11:59 UTC
Probably this, a borked FIONREAD kernel bug:
  http://lkml.org/lkml/2010/5/23/100
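
For readers unfamiliar with the ioctl named above: FIONREAD asks the kernel how many bytes are ready to be read from a file descriptor. The sketch below only illustrates what a correct answer looks like on a pipe. It does not reproduce the linked kernel bug, and it makes no claim about how udevd uses the call. The helper name bytes_pending is hypothetical.

```python
import array
import fcntl
import os
import termios


def bytes_pending(fd):
    """Return the number of unread bytes the kernel reports for fd."""
    buf = array.array("i", [0])
    # With a mutable buffer, ioctl() writes the C int result into buf.
    fcntl.ioctl(fd, termios.FIONREAD, buf, True)
    return buf[0]


if __name__ == "__main__":
    r, w = os.pipe()
    print(bytes_pending(r))   # nothing has been written yet
    os.write(w, b"hello")
    print(bytes_pending(r))   # the five bytes just written
    os.close(r)
    os.close(w)
```

A program that sizes its reads from this value depends on the kernel reporting it correctly, which is why a wrong FIONREAD result can surface as a userspace misbehaviour rather than as an obvious kernel error.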
Comment 7 Andrew 2010-06-07 22:00:53 UTC
This problem is fixed for me in 2.6.34-git16 (20100530) and 2.6.35-rc1.
Comment 8 Andrew Morton 2010-06-07 22:13:06 UTC
Thanks.