Bug 61501
Summary: | [BISECTED] Kernels greater than 3.9.11 (all 3.10 and 3.11 kernels I have tried) will not operate in SMP with more than one processor. | | |
---|---|---|---|
Product: | Platform Specific/Hardware | Reporter: | dustin (dustin.glidden) |
Component: | SPARC64 | Assignee: | platform_sparc64 |
Status: | RESOLVED CODE_FIX | | |
Severity: | high | CC: | adrien.dessemond, alan, joel.bertrand |
Priority: | P1 | | |
Hardware: | Sparc64 | | |
OS: | Linux | | |
Kernel Version: | >3.9.11 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Description
dustin
2013-09-16 17:37:43 UTC
It was this commit: e0972916e8fe943f342b0dd1c9d43dbf5bc261c2 which caused the issue. Unfortunately this was when the perf-core-for-linus branch was merged, and reverting the files breaks everything.

Same observation here with several T1000 machines and kernel 3.12.9. Have you found a workaround?

I have not, though not for lack of trying. I am currently running on 3.4.78.

(In reply to BERTRAND Joël from comment #2)
> Same observation here with several T1000 machines and kernel 3.12.9. Have you found
> a workaround?

Same problem here with a T1000 machine al

(Sorry for the previous bogus post, my mistake)

I do have a very similar issue here with a T1000 machine. I tried Linux 3.14-rc7: in my case the kernel boots without any trouble or special error message until it tries to initialize the SAS controller. The very last message is:

[ 36.843027] Copyright (c) 1999-2008 LSI Corporation
[ 36.843104] Fusion MPT SAS Host driver 3.04.20
[ 81.965701] Fusion MPT SAS Host driver 3.04.20
[ 81.966264] mptbase: ioc0: Initiating bringup
[ 115.865372] ioc0: LSISAS1064 A3: Capabilities={Initiator}

Then, a few seconds later, the kernel complains about CPU stalls:

[ 141.865227] INFO: rcu_sched detected stalls on CPUs/tasks: { 12 14 16 17 18 20 21 22 23 24 25 26 27 28 29 30 31} (detected by 0, t=60431 jiffies, g=18446744073709551320, c=18446744073709551319, q=1766)
[ 141.866285] * CPU[ 0]: TSTATE[0000000080001603] TPC[000000000042c174] TNPC[000000000042c178] TASK[swapper/0:0]
[ 141.866596] TPC[arch_cpu_idle+0x74/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[start_kernel+0x3b4/0x3c4]
[ 141.866892] CPU[ 1]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/1:0]
[ 141.867033] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
[ 141.867148] CPU[ 2]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/2:0]
[ 141.867286] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
[ 141.867399] CPU[ 3]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/3:0]
(..)
[ 186.891368] scsi0 : ioc0: LSISAS1064 A3, FwRev=010a0000h, Ports=1, MaxQ=511, IRQ=25
[ 186.949673] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x*******************
[ 186.952366] scsi 0:0:0:0: Direct-Access SEAGATE ST973401LSUN72G 0556 PQ: 0 ANSI: 3
[ 186.956645] sd 0:0:0:0: [sda] 143374738 512-byte logical blocks: (73.4 GB/68.3 GiB)
[ 186.958001] sd 0:0:0:0: [sda] Write Protect is off
[ 186.959832] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 1, phy 1, sas_addr 0x*******************
[ 186.960482] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 186.962365] scsi 0:0:1:0: Direct-Access SEAGATE ST973401LSUN72G 0556 PQ: 0 ANSI: 3
[ 186.966631] sd 0:0:1:0: [sdb] 143374738 512-byte logical blocks: (73.4 GB/68.3 GiB)
[ 186.968692] sd 0:0:1:0: [sdb] Write Protect is off
[ 186.969007] Fusion MPT misc device (ioctl) driver 3.04.20
[ 186.969298] mptctl: Registered with Fusion MPT base driver
[ 186.969361] mptctl: /dev/mptctl @ (major,minor=10,220)
[ 186.970212] sd 0:0:1:0: [sdb] Write cache: disabled, read cache: enabled, supports DPO and FUA
[ 186.970525] mousedev: PS/2 mouse device common for all mice
[ 186.974788] rtc-sun4v rtc-sun4v: rtc core: registered sun4v as rtc0
[ 186.976103] sda: sda1 sda2 sda3 sda4
[ 186.979675] TCP: cubic registered
[ 186.979723] NET: Registered protocol family 17
[ 186.979854] Key type dns_resolver registered
[ 186.980828] registered taskstats version 1
[ 186.983061] sd 0:0:0:0: [sda] Attached SCSI disk
[ 186.985611] rtc-sun4v rtc-sun4v: setting system clock to 2014-03-23 23:19:15 UTC (1395616755)
[ 187.010470] sdb: sdb1 sdb2 sdb3
[ 187.016016] sd 0:0:1:0: [sdb] Attached SCSI disk
[ 187.054958] EXT4-fs (sda4): mounted filesystem with ordered data mode. Opts: (null)
[ 187.055058] VFS: Mounted root (ext4 filesystem) readonly on device 8:4.
INIT: version 2.88 booting
[ 247.865477] INFO: rcu_sched detected stalls on CPUs/tasks: { 10} (detected by 0, t=60883 jiffies, g=18446744073709551325, c=18446744073709551324, q=424)
[ 247.865846] * CPU[ 0]: TSTATE[0000000080001603] TPC[000000000042c174] TNPC[000000000042c178] TASK[swapper/0:0]
[ 247.866181] TPC[arch_cpu_idle+0x74/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[start_kernel+0x3b4/0x3c4]
(...)
[ 141.878969] CPU[ 31]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/31:0]
[ 141.879105] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
[ 186.891368] scsi0 : ioc0: LSISAS1064 A3, FwRev=010a0000h, Ports=1, MaxQ=511, IRQ=25
[ 186.949673] mptsas: ioc0: attaching ssp device: fw_channel 0, fw_id 0, phy 0, sas_addr 0x********************
[ 186.952366] scsi 0:0:0:0: Direct-Access SEAGATE ST973401LSUN72G 0556 PQ: 0 ANSI: 3
[ 186.956645] sd 0:0:0:0: [sda] 143374738 512-byte logical blocks: (73.4 GB/68.3 GiB)
(....)
[ 187.016016] sd 0:0:1:0: [sdb] Attached SCSI disk
[ 187.054958] EXT4-fs (sda4): mounted filesystem with ordered data mode. Opts: (null)
[ 187.055058] VFS: Mounted root (ext4 filesystem) readonly on device 8:4.
INIT: version 2.88 booting
[ 247.865477] INFO: rcu_sched detected stalls on CPUs/tasks: { 10} (detected by 0, t=60883 jiffies, g=18446744073709551325, c=18446744073709551324, q=424)
[ 247.865846] * CPU[ 0]: TSTATE[0000000080001603] TPC[000000000042c174] TNPC[000000000042c178] TASK[swapper/0:0]
[ 247.866181] TPC[arch_cpu_idle+0x74/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[start_kernel+0x3b4/0x3c4]
[ 247.866450] CPU[ 1]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/1:0]
[ 247.866758] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
[ 247.867043] CPU[ 2]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/2:0]
[ 247.867328] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
(...)
[ 247.878323] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
[ 247.878437] CPU[ 31]: TSTATE[0000000080001602] TPC[000000000042c170] TNPC[000000000042c174] TASK[swapper/31:0]
[ 247.878574] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
[ 248.119922] random: nonblocking pool is initialized
[ 308.865618] INFO: rcu_sched detected stalls on CPUs/tasks: { 1 2 3 4 5 6 8 9 10 11 12 14 15 20 22 23 24 25 26 27 28 29 30 31} (detected by 0, t=60986 jiffies, g=18446744073709551326, c=18446744073709551325, q=423)
(...)
[ 369.947353] TPC[arch_cpu_idle+0x70/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[0x951378]
OpenRC 0.10 is starting up Funtoo Linux (sparc64)
* Mounting /proc ... [ ok ]
* Mounting /run ...
* /run/openrc: creating directory
* /run/lock: creating directory
* /run/lock: correcting owner
[ 430.865908] INFO: rcu_sched detected stalls on CPUs/tasks: { 1 4 8 15 16 24 25 26 27 28 29 30 31} (detected by 0, t=60651 jiffies, g=18446744073709551332, c=18446744073709551331, q=355)
[ 430.866927] * CPU[ 0]: TSTATE[0000000080001603] TPC[000000000042c174] TNPC[000000000042c178] TASK[swapper/0:0]
[ 430.867189] TPC[arch_cpu_idle+0x74/0xa0] O7[arch_cpu_idle+0x5c/0xa0] I7[cpu_startup_entry+0x114/0x1a0] RPC[start_kernel+0x3b4/0x3c4]
(...)

but it seems to stall there forever. With maxcpus=1 the kernel boots without any complaint, as explained in a previous comment.

Reported on my side as bug #72841, although we might have the exact same issue.

Also, I found a discussion here (started 2 days ago): http://www.spinics.net/lists/sparclinux/msg11805.html

I hope it will help you.

May I suggest you try Linux 3.14-rc8? It solved the issue on my side.

FYI: https://www.kernel.org/diff/diffview.cgi?file=%2Fpub%2Flinux%2Fkernel%2Fv3.x%2Ftesting%2Fpatch-3.14-rc8.xz;z=2363

I can confirm that 3.14-rc8 fixed the issue, thanks!
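
For anyone comparing boot logs between kernel versions, the stalled CPU lists can be pulled out of the rcu_sched lines with a short script. This is only a sketch: the function name `parse_rcu_stall` and the regular expression are assumptions based on the message format quoted in this report, not part of any kernel tool, and other kernel versions may word the message differently.

```python
import re
from typing import Optional

# Pattern assumed from the stall messages quoted in this bug report.
_STALL_RE = re.compile(
    r"rcu_sched detected stalls on CPUs/tasks: \{([^}]*)\} "
    r"\(detected by (\d+), t=(\d+) jiffies"
)


def parse_rcu_stall(line: str) -> Optional[dict]:
    """Return the stalled CPUs, detecting CPU and jiffies count, or None."""
    match = _STALL_RE.search(line)
    if match is None:
        return None  # not an rcu_sched stall line
    return {
        "cpus": [int(tok) for tok in match.group(1).split()],
        "detected_by": int(match.group(2)),
        "jiffies": int(match.group(3)),
    }


# Sample line copied from the log in this report.
sample = (
    "[ 247.865477] INFO: rcu_sched detected stalls on CPUs/tasks: { 10} "
    "(detected by 0, t=60883 jiffies, g=18446744073709551325, "
    "c=18446744073709551324, q=424)"
)
print(parse_rcu_stall(sample))
```

Feeding each line of a saved boot log through `parse_rcu_stall` and keeping the non-None results gives a quick view of which CPUs are reported as stalled at each warning.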