Subject    : latest -git: [x86/oprofile] BUG: using smp_processor_id() in preemptible
Submitter  : "Vegard Nossum" <vegard.nossum@gmail.com>
Date       : 2008-08-19 19:51
References : http://marc.info/?l=linux-kernel&m=121917562207756&w=4

This entry is being used for tracking a regression from 2.6.26. Please don't close it until the problem is fixed in the mainline.
On Wednesday, 20 of August 2008, Vegard Nossum wrote:
> On Wed, Aug 20, 2008 at 2:44 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Wednesday, 20 of August 2008, Ingo Molnar wrote:
> >>
> >> * Vegard Nossum <vegard.nossum@gmail.com> wrote:
> >>
> >> > On Wed, Aug 20, 2008 at 11:20 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >> > >
> >> > > * Andrew Morton <akpm@linux-foundation.org> wrote:
> >> > >
> >> > >> A post-2.6.26 regression, I assume
> >> > >
> >> > > not sure about that, but fix is queued up already. Vegard, have you seen
> >> > > this with v2.6.26?
> >> >
> >> > I do believe it is a regression, if not from 2.6.26, at least from
> >> > 2.6.25. I _have_ been using oprofile lately, and this was the first
> >> > time I saw such a message. On the other hand, I tried to look for
> >> > changes which could have induced it, but found none. (That is not a
> >> > guarantee that such a change does not exist, however.) It seemed 100%
> >> > reproducible, but as the fix is already known, I guess bisecting will
> >> > be a waste of time.
> >> >
> >> > (Is the fix for this the same that Andi posted, was it yesterday? I
> >> > didn't realize it was the same issue, didn't look too closely, and
> >> > IIRC, the point of error was different; opcontrol --start vs. cpu
> >> > hot-unplug.)
> >>
> >> would be worth checking.
> >
> > Well, let's assume it is a recent regression for now. If it turns out
> > otherwise, I'll just drop the bug from the list:
>
> So it does happen with 2.6.26 as well:
>
> BUG: using smp_processor_id() in preemptible [00000000] code: oprofiled/3965
> caller is get_stagger+0x9/0x30
> Pid: 3965, comm: oprofiled Not tainted 2.6.26 #13
> [<c036c93d>] debug_smp_processor_id+0xbd/0xc0
> [<c05750c9>] get_stagger+0x9/0x30
> [<c057578e>] p4_fill_in_addresses+0x1e/0x3a0
> [<c05744aa>] nmi_setup+0xda/0x1e0
> [<c057253a>] oprofile_setup+0x3a/0xc0
> [<c0573406>] event_buffer_open+0x56/0x80
> [<c01a1044>] __dentry_open+0xf4/0x1f0
> [<c01a1187>] nameidata_to_filp+0x47/0x60
> [<c05733b0>] ? event_buffer_open+0x0/0x80
> [<c01ad216>] do_filp_open+0x186/0x710
> [<c01a0d88>] ? get_unused_fd_flags+0xc8/0xf0
> [<c065a007>] ? _spin_unlock+0x27/0x50
> [<c01a0df9>] do_sys_open+0x49/0xe0
> [<c01a0ef9>] sys_open+0x29/0x40
> [<c0104cdb>] sysenter_past_esp+0x78/0xd1
> =======================
>
> Will try 2.6.25 and then see if Andi's patch makes any difference here.
A patch posted in the referenced mail-discussion got merged 5 days later... I assume this fixed the issue...

commit f73be6dedf4fa058ce80846dae604b08fa805ca1
Author: H. Peter Anvin <hpa@zytor.com>
Date:   Mon Aug 25 17:07:14 2008 -0700

    smp: have smp_call_function_single() detect invalid CPUs

    Have smp_call_function_single() return invalid CPU indicies and return
    -ENXIO.  This function is already executed inside a
    get_cpu()..put_cpu() which locks out CPU removal, so rather than
    having the higher layers doing another layer of locking to guard
    against unplugged CPUs do the test here.

    Signed-off-by: H. Peter Anvin <hpa@zytor.com>
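
For anyone reading this entry without the kernel source at hand, here is a rough user-space model of the two ideas mentioned above: the debug check that produces the "using smp_processor_id() in preemptible" message, and a validity test done inside a get_cpu()..put_cpu() section that returns -ENXIO. This is only a sketch, not the kernel code. Every `model_` name, the four-entry CPU table, and the counters are made up for illustration, and the real implementation differs (for example, the real call may run the function on another CPU).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* User-space model only: none of these names are real kernel symbols. */
#define MODEL_NR_CPUS 4

/* Hypothetical CPU table: CPU 2 is pretended to be unplugged. */
static const bool model_cpu_online[MODEL_NR_CPUS] = { true, true, false, true };

static int model_preempt_count;  /* > 0 means "preemption disabled" */
static int model_current_cpu;    /* CPU the caller pretends to run on */
static int model_bug_reports;    /* how many times the debug check fired */

/* Mimics the debug check: complain when called with preemption enabled. */
static int model_smp_processor_id(void)
{
    if (model_preempt_count == 0) {
        model_bug_reports++;
        fprintf(stderr, "model BUG: smp_processor_id() in preemptible code\n");
    }
    return model_current_cpu;
}

/* Disable "preemption" first, then read the CPU number safely. */
static int model_get_cpu(void)
{
    model_preempt_count++;
    return model_smp_processor_id();
}

/* Re-enable "preemption"; calls must be balanced with model_get_cpu(). */
static void model_put_cpu(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

/* Example callback: increments the int that info points to. */
static void model_count_hit(void *info)
{
    (*(int *)info)++;
}

/*
 * Validate the target CPU inside the get_cpu()..put_cpu() section, so the
 * caller does not need its own guard against unplugged CPUs.
 */
static int model_call_function_single(int cpu, void (*func)(void *), void *info)
{
    int err = 0;

    model_get_cpu();
    if (cpu < 0 || cpu >= MODEL_NR_CPUS || !model_cpu_online[cpu])
        err = -ENXIO;
    else
        func(info);  /* the model simply runs the function locally */
    model_put_cpu();

    return err;
}
```

In this model, calling model_smp_processor_id() directly (outside a model_get_cpu()..model_put_cpu() pair) bumps model_bug_reports, which corresponds to the situation the trace above reports for get_stagger(). Calling model_call_function_single() with CPU 2 or an out-of-range index returns -ENXIO without running the callback.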