Subject : latest -git: [x86/oprofile] BUG: using smp_processor_id() in preemptible
Submitter : "Vegard Nossum" <email@example.com>
Date : 2008-08-19 19:51
References : http://marc.info/?l=linux-kernel&m=121917562207756&w=4
This entry is being used for tracking a regression from 2.6.26. Please don't
close it until the problem is fixed in the mainline.
On Wednesday, 20 of August 2008, Vegard Nossum wrote:
> On Wed, Aug 20, 2008 at 2:44 PM, Rafael J. Wysocki <firstname.lastname@example.org> wrote:
> > On Wednesday, 20 of August 2008, Ingo Molnar wrote:
> >> * Vegard Nossum <email@example.com> wrote:
> >> > On Wed, Aug 20, 2008 at 11:20 AM, Ingo Molnar <firstname.lastname@example.org> wrote:
> >> > >
> >> > > * Andrew Morton <email@example.com> wrote:
> >> > >
> >> > >> A post-2.6.26 regression, I assume
> >> > >
> >> > > not sure about that, but fix is queued up already. Vegard, have you
> >> > > seen this with v2.6.26?
> >> >
> >> > I do believe it is a regression, if not from 2.6.26, at least from
> >> > 2.6.25. I _have_ been using oprofile lately, and this was the first
> >> > time I saw such a message. On the other hand, I tried to look for
> >> > changes which could have induced it, but found none. (That is not a
> >> > guarantee that such a change does not exist, however.) It seemed 100%
> >> > reproducible, but as the fix is already known, I guess bisecting will
> >> > be a waste of time.
> >> >
> >> > (Is the fix for this the same that Andi posted, was it yesterday? I
> >> > didn't realize it was the same issue, didn't look too closely, and
> >> > IIRC, the point of error was different; opcontrol --start vs. cpu
> >> > hot-unplug.)
> >> would be worth checking.
> > Well, let's assume it is a recent regression for now. If it turns out
> > otherwise, I'll just drop the bug from the list:
> So it does happen with 2.6.26 as well:
> BUG: using smp_processor_id() in preemptible  code: oprofiled/3965
> caller is get_stagger+0x9/0x30
> Pid: 3965, comm: oprofiled Not tainted 2.6.26 #13
> [<c036c93d>] debug_smp_processor_id+0xbd/0xc0
> [<c05750c9>] get_stagger+0x9/0x30
> [<c057578e>] p4_fill_in_addresses+0x1e/0x3a0
> [<c05744aa>] nmi_setup+0xda/0x1e0
> [<c057253a>] oprofile_setup+0x3a/0xc0
> [<c0573406>] event_buffer_open+0x56/0x80
> [<c01a1044>] __dentry_open+0xf4/0x1f0
> [<c01a1187>] nameidata_to_filp+0x47/0x60
> [<c05733b0>] ? event_buffer_open+0x0/0x80
> [<c01ad216>] do_filp_open+0x186/0x710
> [<c01a0d88>] ? get_unused_fd_flags+0xc8/0xf0
> [<c065a007>] ? _spin_unlock+0x27/0x50
> [<c01a0df9>] do_sys_open+0x49/0xe0
> [<c01a0ef9>] sys_open+0x29/0x40
> [<c0104cdb>] sysenter_past_esp+0x78/0xd1
> Will try 2.6.25 and then see if Andi's patch makes any difference here.
A patch posted in the referenced mail discussion was merged 5 days later.
I assume this fixed the issue:
Author: H. Peter Anvin <firstname.lastname@example.org>
Date: Mon Aug 25 17:07:14 2008 -0700
smp: have smp_call_function_single() detect invalid CPUs
Have smp_call_function_single() detect invalid CPU indices and return
-ENXIO. This function is already executed inside a
get_cpu()..put_cpu() which locks out CPU removal, so rather than
having the higher layers doing another layer of locking to guard
against unplugged CPUs do the test here.
Signed-off-by: H. Peter Anvin <email@example.com>
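
The idea in the commit message above can be sketched in plain userspace C. This is only an illustrative model, not the kernel code: the model_* names, the fixed CPU count, and the online table are all assumptions made up for this sketch. In the kernel, the check would sit inside the get_cpu()..put_cpu() region the commit message mentions, so the target CPU cannot be unplugged between the test and the call.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: 4 possible CPUs, CPU 2 is offline (unplugged). */
#define MODEL_NR_CPUS 4

static bool model_cpu_online[MODEL_NR_CPUS] = { true, true, false, true };

typedef void (*model_func_t)(void *info);

/* A CPU index is usable only if it is in range and currently online. */
static bool model_cpu_is_valid(int cpu)
{
    return cpu >= 0 && cpu < MODEL_NR_CPUS && model_cpu_online[cpu];
}

/*
 * Model of a "call func on one CPU" helper that does the validity test
 * itself, so callers need no extra locking or checks of their own.
 * Returns 0 on success and -ENXIO for an invalid or offline CPU.
 * Here func simply runs in the caller; a real implementation would
 * run it on the target CPU.
 */
static int model_call_function_single(int cpu, model_func_t func, void *info)
{
    if (!model_cpu_is_valid(cpu))
        return -ENXIO;

    func(info);
    return 0;
}

/* Example callback: counts how many times it was invoked. */
static void model_count_call(void *info)
{
    int *counter = info;

    (*counter)++;
}
```

With this model, calling model_call_function_single(0, model_count_call, &n) runs the callback and returns 0, while passing CPU 2 (offline) or CPU 7 (out of range) returns -ENXIO without running it.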