Bug 7598 - SMP kernel hangs up after load processor.ko on UP system
Summary: SMP kernel hangs up after load processor.ko on UP system
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: i386 Linux
Importance: P2 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-11-30 07:46 UTC by Yauhen Kharuzhy
Modified: 2008-11-16 22:39 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.19
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Full config (73.65 KB, text/plain)
2006-11-30 07:47 UTC, Yauhen Kharuzhy
acpidump from acer 9300 laptop (117.72 KB, text/plain)
2008-01-09 12:58 UTC, Sam Liddicott
dmesg of boot on 2.6.20 kernel (from ubuntu feisty) which doesn't hang (28.08 KB, text/plain)
2008-01-09 13:26 UTC, Sam Liddicott
dmesg (without noacpi !!) for 2.6.20 (from ubuntu feisty) which doesn't hang (25.88 KB, text/plain)
2008-01-09 13:50 UTC, Sam Liddicott
dmesg for 2.6.22 without noacpi (24.35 KB, text/plain)
2008-01-09 13:51 UTC, Sam Liddicott

Description Yauhen Kharuzhy 2006-11-30 07:46:41 UTC
Most recent kernel where this bug did *NOT* occur:
unknown (works well with a UP 2.6.16 kernel)

Distribution: Debian (vanilla kernel sources)

Hardware Environment: 

IBM T23 notebook

jek@jeknote:/tmp$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 11
model name      : Intel(R) Pentium(R) III Mobile CPU      1133MHz
stepping        : 1
cpu MHz         : 1130.500

jek@jeknote:/tmp$ sudo lspci
00:00.0 Host bridge: Intel Corporation 82830 830 Chipset Host Bridge (rev 02)
00:01.0 PCI bridge: Intel Corporation 82830 830 Chipset AGP Bridge (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #1) (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #2) (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801CA/CAM USB (Hub #3) (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev 41)
00:1f.0 ISA bridge: Intel Corporation 82801CAM ISA Bridge (LPC) (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801CAM IDE U100 (rev 01)
00:1f.3 SMBus: Intel Corporation 82801CA/CAM SMBus Controller (rev 01)
00:1f.5 Multimedia audio controller: Intel Corporation 82801CA/CAM AC'97 Audio Controller (rev 01)
01:00.0 VGA compatible controller: S3 Inc. SuperSavage IX/C SDR (rev 05)
02:00.0 CardBus bridge: Texas Instruments PCI1420
02:00.1 CardBus bridge: Texas Instruments PCI1420
02:02.0 Communication controller: Agere Systems WinModem 56k (rev 01)
02:08.0 Ethernet controller: Intel Corporation 82801CAM (ICH3) PRO/100 VE (LOM) Ethernet Controller (rev 41)
07:00.0 Ethernet controller: Atheros Communications, Inc. AR5212 802.11abg NIC (rev 01)


Software Environment:

Problem Description:
After loading the processor.ko module, the system hangs. Sometimes a kernel oops is raised:

ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
BUG: NMI Watchdog detected LOCKUP on CPU0, eip c01106c8, registers:
Modules linked in: processor fan
CPU: 0
EIP: 0060:[<c01106c8>] Not tainted VLI
EFLAGS: 00000002 (2.6.19 #2)
EIP is at send_IPI_mask_bitmask+0x4a/0x75
eax: 000018ef ebx: 00000001 ecx: 00000001 edx: 000000ef
esi: 00000082 edi: 000000ef ebp: 00000000 esp: cfe6fe08
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 390, ti=cfe6e000 task=cfe3c030 task.ti=cfe6e000)
Stack: 00000001 ffe4203a ffffffff c02e5f00 c02e5f00 00000000 00000000 c0106270 
 c0144849 c032a0e0 00000000 c032a108 cfe6fe78 c0145baa c03660f0 00000000 
 00000000 c010537d b7eb9000 00000001 c0117b50 00000000 c032a0e0 d082c35f 
Call Trace:
 [<c0106270>] timer_interrupt+0x65/0x6d
 [<c0144849>] handle_IRQ_event+0x1a/0x3f
 [<c0145baa>] handle_level_irq+0x9e/0xe8
 [<c010537d>] do_IRQ+0x7d/0xa4
 [<c0117b50>] do_page_fault+0x281/0x531
 [<c01036ee>] common_interrupt+0x1a/0x20
 [<c013713b>] lookup_symbol+0x18/0x2f
 [<c0137619>] __find_symbol+0x24/0x2af
 [<c0138fe7>] sys_init_module+0xd11/0x18fe
 [<c01c4372>] prio_tree_insert+0x1d/0x1ec
 [<c0102d67>] syscall_call+0x7/0xb
 =======================
Code: c0 c7 44 24 08 b1 00 00 00 c7 44 24 04 df d4 2a c0 c7 04 24 0c c4 2a c0 e8
 bc fb 00 00 e8 e6 3b ff ff eb 02 f3 90 a1 00 d3 ff ff <f6> c4 10 75 f4 c1 e3 18
 83 ff 02 89 1d 10 d3 ff ff b8 00 0c 00 
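
One possible reading of the dump above, offered as an assumption rather than a confirmed diagnosis: the marked bytes `<f6> c4 10 75 f4` decode to `test $0x10,%ah` followed by a backward `jne`, which is a loop on bit 12 of `eax`, and `eax` was just loaded from `0xffffd300` (`a1 00 d3 ff ff`). If that address is the local APIC interrupt command register, bit 12 is its delivery-status (busy) bit, which the kernel headers call `APIC_ICR_BUSY`. The sketch below only tests that bit in the `eax` value reported in the oops; the helper name `icr_busy` is hypothetical.

```python
# Delivery-status (busy) bit of the local APIC ICR, bit 12.
# Assumption: the value in eax was read from the ICR, as the address
# 0xffffd300 in the Code: bytes suggests.
APIC_ICR_BUSY = 0x01000


def icr_busy(eax: int) -> bool:
    """Return True if bit 12 (the bit tested by 'test $0x10,%ah') is set."""
    return bool(eax & APIC_ICR_BUSY)


# eax as printed in the register dump of the oops
eax = 0x000018EF
print(icr_busy(eax))
```

If this reading is right, the CPU was spinning while waiting for a previous IPI to be delivered, which would be consistent with the NMI watchdog reporting a lockup inside send_IPI_mask_bitmask.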

It's annoying for me, because Debian maintainers do not build UP kernel images.


Steps to reproduce:
1) Build an SMP kernel with ACPI (see config below).
2) Run the kernel on a UP system and load the processor.ko module.
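
The preconditions in these steps can be checked mechanically. This is a minimal sketch, assuming the kernel config is available as text (for example, the attached full config); the helper names `parse_kconfig` and `matches_report` are hypothetical, and the check covers only the two options this report singles out (an SMP build with the ACPI processor driver built as a module).

```python
def parse_kconfig(text):
    """Return a dict of the CONFIG_* options that are set in a kernel .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and comments such as "# CONFIG_FOO is not set" are skipped.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value.strip('"')
    return options


def matches_report(options):
    """True for an SMP build with the ACPI processor driver as a module."""
    return (options.get("CONFIG_SMP") == "y"
            and options.get("CONFIG_ACPI_PROCESSOR") == "m")


# A few lines taken from the config in this report
sample = """\
CONFIG_SMP=y
# CONFIG_X86_BIGSMP is not set
CONFIG_NR_CPUS=8
CONFIG_ACPI=y
CONFIG_ACPI_PROCESSOR=m
"""
opts = parse_kconfig(sample)
print(matches_report(opts))
```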


Kernel config:

CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_IPC_NS=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
# CONFIG_TASKSTATS is not set
CONFIG_UTS_NS=y
CONFIG_AUDIT=y
# CONFIG_AUDITSYSCALL is not set
# CONFIG_IKCONFIG is not set
CONFIG_CPUSETS=y
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
CONFIG_BLOCK=y
CONFIG_LBD=y
# CONFIG_BLK_DEV_IO_TRACE is not set
CONFIG_LSF=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_PPRO_FENCE=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_HPET_TIMER=y
CONFIG_NR_CPUS=8
CONFIG_SCHED_SMT=y
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_BKL is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=m
CONFIG_X86_MCE_P4THERMAL=y
CONFIG_VM86=y
CONFIG_TOSHIBA=m
CONFIG_I8K=m
# CONFIG_X86_REBOOTFIXUPS is not set
CONFIG_MICROCODE=m
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m

#
# Firmware Drivers
#
CONFIG_EDD=m
CONFIG_EFI_VARS=m
CONFIG_DELL_RBU=m
CONFIG_DCDBAS=m
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_ARCH_POPULATES_NODE_MAP=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_RESOURCES_64BIT is not set
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_EFI=y
CONFIG_IRQBALANCE=y
CONFIG_BOOT_IOREMAP=y
CONFIG_REGPARM=y
# CONFIG_SECCOMP is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_KEXEC=y
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x100000
CONFIG_HOTPLUG_CPU=y
CONFIG_COMPAT_VDSO=y
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
CONFIG_PM_LEGACY=y
# CONFIG_PM_DEBUG is not set
# CONFIG_PM_SYSFS_DEPRECATED is not set
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION=""
CONFIG_SUSPEND_SMP=y

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
# CONFIG_ACPI_SLEEP_PROC_SLEEP is not set
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_HOTKEY=m
CONFIG_ACPI_FAN=m
CONFIG_ACPI_DOCK=m
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_HOTPLUG_CPU=y
CONFIG_ACPI_THERMAL=m
CONFIG_ACPI_ASUS=m
CONFIG_ACPI_IBM=m
CONFIG_ACPI_TOSHIBA=m
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=m
CONFIG_ACPI_SBS=m
Comment 1 Yauhen Kharuzhy 2006-11-30 07:47:55 UTC
Created attachment 9688 [details]
Full config
Comment 2 ykzhao 2007-10-17 01:41:08 UTC
Hi, Yauhen
Does the system work well now?
Comment 3 Yauhen Kharuzhy 2007-10-17 03:45:59 UTC
Yes, kernels >= 2.6.21 work well.
Comment 4 Sam Liddicott 2007-10-20 05:56:17 UTC
2.6.22 and 2.6.23 both hang for me a short while after processor.ko loads.
They hang so tightly that even Alt-SysRq-* fails to do anything.

Many details are posted here:
https://bugs.launchpad.net/linux/+bug/144030

The easiest way to reproduce the hang on-demand is to:
insmod /somewhere-else/processor.ko
/etc/init.d/gdm start
Comment 5 Sam Liddicott 2007-10-20 07:02:08 UTC
BTW, I forgot to mention that this occurs with both SMP and UP kernels on a UP machine.
In fact, while in an X session, insmod processor.ko will hang the system right away.
Comment 6 Venkatesh Pallipadi 2007-11-20 14:53:19 UTC
Can you try the following two boot options and let me know if either helps?
- "max_cstate=1"
- "max_cstate=2"

Also, is 2.6.16 the last known kernel that did not have the problem? Can you provide the dmesg from that kernel (with ACPI_PROCESSOR loaded)?
Comment 7 Venkatesh Pallipadi 2007-11-20 14:54:49 UTC
Also, please attach the acpidump output from your system.
(acpidump is in the latest pmtools, available here: http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/ )
Comment 8 Sam Liddicott 2008-01-09 12:58:25 UTC
Created attachment 14386 [details]
acpidump from acer 9300 laptop

Here's the latest acpidump of my Acer 9300 laptop, which also hangs soon after boot.
Comment 9 Sam Liddicott 2008-01-09 13:10:32 UTC
With max_cstate=1 or max_cstate=2, Ubuntu still hangs (no Alt-SysRq) as soon as GDM starts.
Comment 10 Sam Liddicott 2008-01-09 13:26:00 UTC
Created attachment 14388 [details]
dmesg of boot on 2.6.20 kernel (from ubuntu feisty) which doesn't hang

The ubuntu feisty fawn kernel does not hang; here is a dmesg from that kernel.
Comment 11 Sam Liddicott 2008-01-09 13:50:19 UTC
Created attachment 14389 [details]
dmesg (without noacpi !!) for 2.6.20 (from ubuntu feisty) which doesn't hang

This time I booted the feisty kernel without noacpi.
Comment 12 Sam Liddicott 2008-01-09 13:51:53 UTC
Created attachment 14390 [details]
dmesg for 2.6.22 without noacpi 

This dmesg is the hanging kernel, before it hung.
(modprobe processor.ko from an X session is a sure way to hang it.
It takes a while to hang without X.)
Comment 13 Sam Liddicott 2008-01-12 08:51:22 UTC
I'm not getting a problem with 2.6.24-3 as part of Ubuntu Hardy Heron alpha 3.
Sadly the upgrade failed and I had to do a fresh install, making debugging the previous kernel sort of difficult... however, I will make a fresh feisty->gutsy installation to debug this further if you want.

Otherwise... it seems "fixed" in 2.6.24
Comment 14 Sam Liddicott 2008-03-14 01:35:47 UTC
I think this has come back with 2.6.24-12 as part of the Ubuntu Hardy Heron updates.
Comment 15 Venkatesh Pallipadi 2008-04-21 16:16:20 UTC
Is this still a problem with 2.6.25?
Comment 16 Zhang Rui 2008-07-14 20:42:46 UTC
Can you please verify whether the problem still exists with 2.6.26?
Comment 17 Sam Liddicott 2008-07-16 00:02:06 UTC
Is there a particular 2.6.26 set of debs you want me to try? I'll need 
the nvidia to go with it.

Sam
Comment 18 Zhang Rui 2008-07-16 00:37:58 UTC
No,
but you'd better try a clean kernel.org kernel.
Comment 19 Sam Liddicott 2008-07-16 22:56:14 UTC
Would you be happy with an intrepid ibex test?

Sam
Comment 20 ykzhao 2008-07-21 22:16:24 UTC
Hi, Sam
   Will you please add the boot options "hpet=off pci=nobios" on the latest kernel and see whether the problem still exists?
   Thanks.
Comment 21 ykzhao 2008-07-21 23:34:50 UTC
Sorry, the correct boot options should be "hpet=disable pci=nobios" ("hpet=off" is not a correct option).
Thanks.
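
Before drawing conclusions from a test run, it may help to confirm that both options actually reached the kernel. This is a minimal sketch, assuming the command line is available as a string (on a running system it can be read from /proc/cmdline); the helper names `parse_cmdline` and `has_workaround` and the example command line are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def has_workaround(cmdline):
    """True if both hpet=disable and pci=nobios are present."""
    params = parse_cmdline(cmdline)
    return params.get("hpet") == "disable" and params.get("pci") == "nobios"


# Hypothetical command line for illustration only.
example = "root=/dev/sda1 ro hpet=disable pci=nobios"
print(has_workaround(example))
```

Note that a command line carrying "hpet=off" instead of "hpet=disable" is reported as not matching, which mirrors the correction made in this comment.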
Comment 22 Sam Liddicott 2008-07-22 22:56:03 UTC
I'll try that on 2.6.24, but it may take a while to gather enough statistical evidence, as more recent kernels crashed less often.

I'll put those in the boot conf today.

Thanks for the tip.

Sam

Comment 23 Sam Liddicott 2008-07-25 23:47:00 UTC
> ------- Comment #21 from yakui.zhao@intel.com  2008-07-21 23:34 -------
> Sorry the correct boot option should be "hpet=disable pci=nobios".("hpet=off"
> is not correct option).
> Thanks.
>   
Did you want me to try this on 2.6.26 or 2.6.24?

Sam
Comment 24 ykzhao 2008-07-27 08:13:49 UTC
Hi, Sam 
   Either 2.6.24 or 2.6.26 is OK.
Comment 25 Zhang Rui 2008-09-09 19:34:59 UTC
Sam, any updates?
Comment 26 Sam Liddicott 2008-09-09 23:21:53 UTC
With an Intrepid kernel and the two boot options I was asked to use, there have been no hangs.

It's been long enough for me to be statistically sure that these two boot options made a difference.

Sam

-----Original Message-----
rui.zhang@intel.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO


Comment 27 Shaohua 2008-10-14 22:24:47 UTC
Does the latest kernel work for you without any boot options? There have been a lot of time/C-state updates recently.
Comment 28 Zhang Rui 2008-11-16 22:39:52 UTC
No response from the bug reporter.
Sam, please re-open the bug if the problem still exists in the latest kernel, e.g. 2.6.27.
