Bug 2317

Summary: How to disable HT
Product: ACPI
Reporter: Len Brown (lenb)
Component: Config-Processors
Assignee: acpi_config-processors
Status: CLOSED CODE_FIX
Severity: normal
CC: acpi-bugzilla
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.4, 2.6
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: 2.6 patch to move maxcpus=N to parse-time

Description Len Brown 2004-03-16 11:19:04 UTC
Kernel config adds HT support when the CPU family supports it 
and CONFIG_SMP is enabled.  It is not possible to disable 
CONFIG_HT if these two are true.  The intent of this design is to 
"do the right thing" for the typical user by minimizing config surprises 
for the common case.  However, this frustrates experts who wish to disable HT. 
 
Kernel boot flags used to include "noht" -- but it was broken so it was deleted.  
Further, "acpi=off" is (intentionally) not effective at disabling HT. 
 
So how can one disable HT? 
 
1. BIOS SETUP 
 
This is always the preferred method.  The reason is that if an HT implementation 
requires special initialization to optimally coalesce duplicated hardware 
resources, the BIOS will know how to do it.  OS methods may run 
too late to be effective, so maximum UP performance may not be reached. 
 
However, some OEMs do not provide a !HT BIOS SETUP option. 
Also, sometimes it is not practical to run BIOS SETUP -- 
such as on a large cluster of systems. 
 
2. Run a UniProcessor Kernel 
 
The UP kernel will never start up logical processors, but this has 2 problems. 
 
A. only appropriate for a single physical processor system. 
Systems with multiple physical processors will run just 1 processor 
when running the UP kernel. 
 
B. popular distro kernels exclude IOAPIC support from UP kernel. 
The workaround is to boot the SMP kernel with "maxcpus=1" so 
that it will run just 1 processor and still enable the IOAPIC. 
(n.b. "nosmp" and "maxcpus=0" will also run just 1 processor, 
 but will disable the IOAPIC, which is sub-optimal) 
 
3. build-time CPU enumeration order 
 
Intel's BIOS writer's guide suggests that vendors enumerate processors 
physical first, and then logical.  This is why you may see processor LAPIC 
numbers enumerated 0,6,1,7 -- for example.  0 and 6 are primary, and 
1 and 7 are secondary siblings. 
 
The CPU enumeration code will refuse to add more than NR_CPUS processors. 
If the system follows the guidelines, then NR_CPUS can 
be used to enable just the physical processors and exclude the logical siblings. 
However, this requires a kernel re-build to tune NR_CPUS. 
 
Note that not all BIOS vendors enumerate processors in this order. 
But if they do, the following boot messages confirm that HT is disabled 
(here 2 physical of 4 logical processors are started) 
 
Total of 2 processors activated (11091.96 BogoMIPS). 
WARNING: No sibling found for CPU 0. 
WARNING: No sibling found for CPU 1. 
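
Besides the boot messages, the result can be cross-checked from user space 
by reading /proc/cpuinfo, as Comment 2 does.  The following is only a sketch: 
the function names and the sample text are made up for illustration, and it 
assumes single-core packages, so that more than one logical CPU reporting 
the same "physical id" means an HT sibling is running. 

```python
from collections import defaultdict


def logical_cpus_per_package(cpuinfo_text):
    """Map each 'physical id' to the 'processor' numbers that report it."""
    packages = defaultdict(list)
    processor = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            processor = int(value)
        elif key == "physical id" and processor is not None:
            packages[int(value)].append(processor)
    return dict(packages)


def ht_siblings_running(cpuinfo_text):
    """True if any package has more than one logical CPU online.

    Assumption: one core per package, as on the HT Xeons discussed here.
    """
    per_package = logical_cpus_per_package(cpuinfo_text)
    return any(len(cpus) > 1 for cpus in per_package.values())


# Hypothetical sample, trimmed to the two fields used above.
SAMPLE = "processor\t: 0\nphysical id\t: 0\n\nprocessor\t: 1\nphysical id\t: 3\n"
```

On a live system the input would be open("/proc/cpuinfo").read(); with the 
sample above, each package has exactly one logical CPU, so no sibling is 
reported as running. 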
 
4. boot-time CPU enumeration order 
 
maxcpus=N currently runs at smpboot time, when the processors are started 
in LAPIC order, e.g. 0,1,6,7 -- so it is not effective for disabling siblings.  However, 
a patch attached to this bug report moves maxcpus=N processing to enumeration time, 
where it is effectively the boot-time equivalent of method #3 above. 
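
To make the difference concrete, here is a toy model (not the kernel code; 
started_cpus and its parameters are invented for this illustration) of where 
the maxcpus=N limit is applied.  It assumes the BIOS enumerates LAPIC ids as 
0,6,1,7 as in method #3. 

```python
def started_cpus(enumeration_order, maxcpus, limit_at_enumeration):
    """Return the LAPIC ids that end up running under a maxcpus limit.

    enumeration_order: LAPIC ids in the order the BIOS lists them.
    limit_at_enumeration: True models applying the limit while the
        processors are enumerated; False models applying it at smpboot
        time, when the processors are started in LAPIC-id order.
    """
    if limit_at_enumeration:
        return list(enumeration_order[:maxcpus])
    return sorted(enumeration_order)[:maxcpus]


bios_order = [0, 6, 1, 7]  # physical first, then logical siblings

at_smpboot = started_cpus(bios_order, 2, limit_at_enumeration=False)
at_enumeration = started_cpus(bios_order, 2, limit_at_enumeration=True)
```

Here at_smpboot is [0, 1] (one physical processor plus its sibling), while 
at_enumeration is [0, 6] (the two physical processors). 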
 
5. LAPIC id decoding at enumeration time 
 
Per the documentation on http://developer.intel.com/technology/hyperthread/ 
the package numbers and sibling numbers are encoded in the local-APIC id. 
 
It is possible at cpu-enumeration time to run CPUID to identify the number 
of siblings, and then decode the LAPIC numbers as given in the ACPI 
MADT, refusing to enumerate processors that are logical siblings. 
 
The problem with this is that the LAPIC-id is programmable, and may 
have been over-written by the OEM's BIOS such that it no longer 
properly encodes the package/sibling bits. 
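
A sketch of the decoding, assuming the LAPIC ids still carry the original 
package/sibling encoding and that the sibling number occupies the low bits 
of the id, with the field width derived from the sibling count that CPUID 
reports.  The function names are made up for this example. 

```python
def sibling_bits(siblings_per_package):
    """Number of low LAPIC-id bits needed to number the siblings."""
    return (siblings_per_package - 1).bit_length()


def decode_lapic_id(lapic_id, siblings_per_package):
    """Split a LAPIC id into (package number, sibling number)."""
    width = sibling_bits(siblings_per_package)
    package = lapic_id >> width
    sibling = lapic_id & ((1 << width) - 1)
    return package, sibling


def primary_only(lapic_ids, siblings_per_package):
    """Keep only the ids whose sibling number is 0, preserving order."""
    return [
        lapic_id
        for lapic_id in lapic_ids
        if decode_lapic_id(lapic_id, siblings_per_package)[1] == 0
    ]


# LAPIC ids from the example in method #3, two siblings per package.
kept = primary_only([0, 6, 1, 7], siblings_per_package=2)
```

With two siblings per package, ids 0 and 6 decode to sibling 0 and are kept, 
while 1 and 7 decode to sibling 1 and would not be enumerated.  If the BIOS 
has re-programmed the ids, this decoding gives wrong answers, which is the 
problem described above. 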
 
6. Run-time CPUID 
 
The RESET-LAPIC-id can be accessed by running the CPUID instruction. 
This is not re-programmable, and thus is not subject to errors such as those affecting the 
LAPIC-id in technique #5.  But there are two problems with this technique: 
 
A. CPUID must run on the local processor.  This means that the processor 
must be started up in order to learn if it should be started up or not... 
When Linux has reliable cpu offline features, this will be the preferred 
way to offline logical HT siblings.  But Linux does not currently have 
cpu offline features. 
 
B. exotic hardware, such as the IBM Summit, does not give an appropriate 
answer in the RESET-LAPIC-id.  This, however, is a CPU sub-architecture issue.
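
For reference, a sketch of pulling the relevant fields out of CPUID leaf 1, 
assuming the "RESET-LAPIC-id" above is the initial APIC ID that CPUID returns 
in EBX bits 31..24, with the logical processor count in EBX bits 23..16 
(meaningful only when the HTT flag, EDX bit 28, is set).  The register values 
used below are hypothetical, and the function name is invented here; actually 
executing CPUID on the target processor is outside this sketch. 

```python
HTT_FLAG = 1 << 28  # CPUID leaf 1, EDX bit 28


def decode_cpuid_leaf1(ebx, edx):
    """Return (initial APIC id, logical processors per package)."""
    initial_apic_id = (ebx >> 24) & 0xFF
    if edx & HTT_FLAG:
        logical_per_package = (ebx >> 16) & 0xFF
    else:
        logical_per_package = 1  # count field is not valid without HTT
    return initial_apic_id, logical_per_package


# Hypothetical register contents: initial APIC id 7, two logical CPUs.
example = decode_cpuid_leaf1(ebx=(7 << 24) | (2 << 16), edx=HTT_FLAG)
```

With these made-up values, example is (7, 2): an odd id with two logical 
processors per package, i.e. a secondary sibling. 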

Comment 1 Len Brown 2004-03-18 18:21:45 UTC
Created attachment 2367
2.6 patch to move maxcpus=N to parse-time

Comment 2 Robin Humble 2004-03-19 19:56:00 UTC
the patch works for us. machines are intel 'westville' dual Xeons.
http://support.intel.com/support/motherboards/server/se7500wv2/

maxcpus=2 with a vanilla 2.6.4 gives us one physical and one logical cpu.
maxcpus=2 with this patch gives us 2 physical cpus (physical id: 0 and 3).

make -j3 kernel compiles, and serial and parallel computationally intensive
codes verify that the machine is in the state that /proc/cpuinfo says.
they run slowly without the patch, and one on each physical cpu with the patch.

cheers,
robin

Comment 3 Len Brown 2004-04-13 23:18:00 UTC
integrated into 2.4.26 and 2.6.5 -- closing 
 
ps. the statement about "acpi=off" not disabling HT is incorrect. 
We fixed an issue where "acpi=off" was insufficient 
to avoid garbled ACPI tables.  Now "acpi=off" skips 
all table parsing, which does disable HT (and everything else).