Bug 5397 - init_workqueues() should be called before enabling timer interrupts
Summary: init_workqueues() should be called before enabling timer interrupts
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Timers
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 normal
Assignee: john stultz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-10-07 15:18 UTC by Boris Weissman
Modified: 2006-09-21 12:53 UTC

See Also:
Kernel Version: 2.6.13
Subsystem:
Regression: ---
Bisected commit-id:



Description Boris Weissman 2005-10-07 15:18:15 UTC
Most recent kernel where this bug did not occur:
Distribution: all 2.6 based ones
Hardware Environment: i386 & x64
Software Environment:
Problem Description:

Steps to reproduce:

Verified that this is present in 2.6.13 (and all earlier 2.6 kernels
that I checked).

There is a simple initialization order bug: workqueues are initialized
too late. init/main.c:init() executes initializers in this order:
	smp_prepare_cpus(max_cpus);  <-- start APIC timer interrupts

	do_pre_smp_initcalls();

	fixup_cpu_present_map();
	smp_init();
	sched_init_smp();

	cpuset_init_smp();

	/*
	 * Do this before initcalls, because some drivers want to access
	 * firmware files.
	 */
	populate_rootfs();

	do_basic_setup();            <-- calls init_workqueues() - too late

The problem is that APIC timer interrupts are enabled before
workqueues are initialized. This is bad because some timer interrupt
callbacks defer work via workqueues. For instance, con_init() schedules
a timer callback (due to fire in 10 minutes). This callback uses a workqueue:

	static void blank_screen_t(unsigned long dummy)
	{
		blank_timer_expired = 1;
		schedule_work(&console_work);   <-- the culprit
	}
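
To make the hazard concrete, here is a rough user-space model of it, not kernel code. Every name prefixed with model_ is hypothetical, and the NULL guard in model_schedule_work() is an addition of the sketch so that the too-early call can be observed as a return value; the report does not say the kernel has such a guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
#define MODEL_WQ_CAPACITY 8

struct model_work {
	void (*func)(void);
};

struct model_workqueue {
	struct model_work *pending[MODEL_WQ_CAPACITY];
	size_t count;
};

/* Stays NULL until model_init_workqueues() runs, like a queue that is
 * only set up late in boot. */
static struct model_workqueue *model_wq;
static struct model_workqueue model_wq_storage;

void model_init_workqueues(void)
{
	model_wq_storage.count = 0;
	model_wq = &model_wq_storage;
}

/* Returns false when the queue does not exist yet or is full. The guard
 * is an assumption of this model, added to make the early call visible. */
bool model_schedule_work(struct model_work *work)
{
	assert(work != NULL);
	if (model_wq == NULL || model_wq->count == MODEL_WQ_CAPACITY)
		return false;
	model_wq->pending[model_wq->count++] = work;
	return true;
}

int model_blank_timer_expired;
static struct model_work model_console_work;

/* Shaped like blank_screen_t(): set a flag, then defer the real work. */
bool model_blank_screen_t(void)
{
	model_blank_timer_expired = 1;
	return model_schedule_work(&model_console_work);
}
```

In this model, calling model_blank_screen_t() before model_init_workqueues() returns false because there is no queue to put the work on; the same call made afterwards returns true.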

There may be other similar callbacks now, or there may be some in the
future. Since workqueues are a mechanism to defer work, it is better to
have them initialized before timer (and other) interrupts are enabled.
A workqueue is a simple data structure, so it should be possible to move
init_workqueues() above smp_prepare_cpus() in init().
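
The ordering constraint behind this proposal can be written down as a small check. This is only a user-space sketch under the assumptions of this report (timer interrupts start in smp_prepare_cpus(), and init_workqueues() has no dependency that prevents moving it earlier); the enum and function names are hypothetical and it is not a patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the boot steps discussed in this report. */
enum model_boot_step {
	MODEL_INIT_WORKQUEUES,
	MODEL_SMP_PREPARE_CPUS,	/* timer interrupts start here, per the report */
	MODEL_DO_BASIC_SETUP	/* the rest of late initialization */
};

/* Returns true if workqueues are initialized before the step that
 * enables timer interrupts, false otherwise. */
bool model_boot_order_is_safe(const enum model_boot_step *steps, size_t n)
{
	bool workqueues_ready = false;
	size_t i;

	assert(steps != NULL);
	for (i = 0; i < n; i++) {
		if (steps[i] == MODEL_INIT_WORKQUEUES)
			workqueues_ready = true;
		else if (steps[i] == MODEL_SMP_PREPARE_CPUS && !workqueues_ready)
			return false;
	}
	return true;
}
```

With the current order (MODEL_SMP_PREPARE_CPUS, then MODEL_INIT_WORKQUEUES, then MODEL_DO_BASIC_SETUP) the check returns false; with MODEL_INIT_WORKQUEUES moved to the front it returns true.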

We are actually hitting this bug in VMware. In our case, the screen
blanking callback above is triggered before init_workqueues() runs. It
fires ahead of its scheduled 10-minute timeout because of Bug 5366, which
might cause jiffies to jump forward into the future:
http://bugzilla.kernel.org/show_bug.cgi?id=5366

Filing under "Timers", but this might not be the most appropriate category.
Comment 1 Andrew Morton 2005-10-10 20:13:39 UTC
hm, I thought this came up before and we fixed it.  Oh well.

Please send a patch ;)
Comment 2 john stultz 2006-02-01 13:47:39 UTC
Does this bug still exist now that #5366 is closed?
Comment 3 john stultz 2006-07-10 11:38:50 UTC
Boris: This bug is a bit stale. Now that bug #5366 is closed, do you still see
the issue w/ 2.6.16+ ?
Comment 4 john stultz 2006-09-21 12:52:56 UTC
No response for a while now. Closing.
