SMP on x86
Latest revision as of 19:35, 22 March 2010
The primary specification for multiprocessor x86-based setups is the Intel MultiProcessor Specification (last updated, AFAIK, to revision-006 on 1995-05-15).
SMT
- HyperThreading (Intel's SMT) requires CPU, BIOS and OS support. It was introduced on the P4.
  - Found on (all) i7s, some Atoms, and some P4s (especially those with Xeon branding).
  - The Pentium-M and Core 2 lines lack SMT, as do most Celerons.
- Per-core resources are duplicated, split, or shared between logical processors:
  - Shared: reservation stations, data caches
  - Split: reorder buffers, load/store buffers
  - Duplicated: registers
- No programmable priority control is exported, and no implicit priorities are defined
Intel MP IDs
- Each logical processor is assigned a unique (but not necessarily sequential) 8-bit identifier at boot, the APIC ID
  - The initial APIC ID can be retrieved via cpuid (CPUID.1.EBX[31:24]), as can packaging data:
    - "logical cores per package" (CPUID.1.EBX[23:16]): Maximum number of logical processors in a physical package, as manufactured
    - "cores per package" (CPUID.4.EAX[31:26] + 1): Maximum number of physical processors (cores) in a physical package, as manufactured
    - "logical processors sharing a cache" (CPUID.4.EAX[25:14] + 1): Maximum number of logical processors in a physical package sharing a given cache level
    - Intel MP only addresses homogeneous setups, so these three values are (as of October 2009) identical on every processor in a system
    - The last two require cpuid leaf 4 support; if it is not provided, the package is a unicore
- The APIC ID is formed, from least to most significant bits, of SMT_ID|CORE_ID|PACKAGE_ID, with widths defined by the packaging data:
  - SMT_ID is 0 bits wide on a non-HyperThreaded processor
  - CORE_ID is 0 bits wide on a unicore package
  - All remaining bits are devoted to PACKAGE_ID
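The field layout above can be sketched in a few lines of Python. This is a simplified illustration, not Intel's reference algorithm: the function names are made up for this page, the two counts are assumed to have already been read via cpuid, and each field width is simply rounded up to the next whole bit.

```python
def field_width(count):
    """Bits needed to number `count` items (0 bits when count is 1)."""
    return max(count - 1, 0).bit_length()


def split_apic_id(apic_id, logical_per_package, cores_per_package):
    """Split an APIC ID into (smt_id, core_id, package_id).

    logical_per_package: CPUID.1.EBX[23:16]
    cores_per_package:   CPUID.4.EAX[31:26] + 1 (1 if leaf 4 is unsupported)
    SMT_ID occupies the lowest bits, CORE_ID the next, PACKAGE_ID the rest.
    """
    threads_per_core = max(logical_per_package // cores_per_package, 1)
    smt_bits = field_width(threads_per_core)
    core_bits = field_width(cores_per_package)
    smt_id = apic_id & ((1 << smt_bits) - 1)
    core_id = (apic_id >> smt_bits) & ((1 << core_bits) - 1)
    package_id = apic_id >> (smt_bits + core_bits)
    return smt_id, core_id, package_id


if __name__ == "__main__":
    # 4 logical processors and 4 cores per package, APIC ID 26
    print(split_apic_id(26, 4, 4))
```

With 4 logical processors and 4 cores per package, APIC ID 26 splits into SMT 0, core 2, package 6, which agrees with the "physical id : 6" / "core id : 2" / "apicid : 26" entry in the X7350 /proc/cpuinfo example further down.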
Interrupts
- IO-APIC routes hardware interrupts to various CPUs (Linux's IO-APIC.txt)
[recombinator](0) $ cat /proc/interrupts
           CPU0       CPU1
  0:        491          0   IO-APIC-edge      timer
  8:         87          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 16:     609652          0   IO-APIC-fasteoi   uhci_hcd:usb1, heci
 17:          0          0   IO-APIC-fasteoi   pata_marvell
 18:      33141          0   IO-APIC-fasteoi   sata_promise, uhci_hcd:usb5, ehci_hcd:usb6
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 21:   15985105          0   IO-APIC-fasteoi   uhci_hcd:usb2, ath
 23:    8730160          0   IO-APIC-fasteoi   uhci_hcd:usb3, ehci_hcd:usb7
 29:    1855556          0   PCI-MSI-edge      i915
 30:    1109316          0   PCI-MSI-edge      ahci
 31:      66952          0   PCI-MSI-edge      e1000
NMI:          0          0   Non-maskable interrupts
LOC:   14417376   16273096   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:     150950     182106   Rescheduling interrupts
CAL:        278        611   Function call interrupts
TLB:      33349      53705   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
ERR:          0
MIS:          0
[recombinator](0) $
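To see at a glance how interrupts are distributed, the per-CPU columns of /proc/interrupts can be summed. The following is a rough sketch only: the function name is invented here, and it assumes the usual layout of a header row of CPU names followed by "label: count count ... description" rows, skipping rows such as ERR and MIS that carry a single system-wide counter.

```python
def per_cpu_totals(text):
    """Sum the per-CPU interrupt counters from /proc/interrupts-style text."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()              # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        _label, sep, rest = line.partition(":")
        if not sep:
            continue
        counts = rest.split()[:len(cpus)]
        # Rows like "ERR: 0" have one global counter, not one per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        for i, count in enumerate(counts):
            totals[i] += int(count)
    return dict(zip(cpus, totals))


if __name__ == "__main__":
    with open("/proc/interrupts") as f:  # Linux only
        print(per_cpu_totals(f.read()))
```

In the output above nearly every device interrupt lands on CPU0, while the local timer and rescheduling interrupts are spread across both CPUs.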
Discovery
- ACPI's MADT table can supply multiprocessor configuration, and is usually used if present
- MP Table. The FreeBSD program mptable(1) can dump this:
[bryhlath](0) $ sudo mptable
===============================================================================
MPTable
-------------------------------------------------------------------------------
MP Floating Pointer Structure:
  location:                     BIOS
  physical address:             0x000fbd10
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.4
  checksum:                     0xc6
  mode:                         Virtual Wire
-------------------------------------------------------------------------------
MP Config Table Header:
  physical address:             0x000fbb10
  signature:                    'PCMP'
  base table length:            508
  version:                      1.4
  checksum:                     0x84
  OEM ID:                       'QEMUCPU '
  Product ID:                   '0.1 '
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  34
  local APIC address:           0xfee00000
  extended table length:        0
  extended table checksum:      0
-------------------------------------------------------------------------------
MP Config Base Table Entries:
--
Processors:  APIC ID  Version  State          Family  Model  Step  Flags
             0        0x11     BSP, usable    6       0      0     0x0201
             1        0x11     AP, unusable   6       0      0     0x0201
             2        0x11     AP, unusable   6       0      0     0x0201
             3        0x11     AP, unusable   6       0      0     0x0201
             4        0x11     AP, unusable   6       0      0     0x0201
             5        0x11     AP, unusable   6       0      0     0x0201
             6        0x11     AP, unusable   6       0      0     0x0201
             7        0x11     AP, unusable   6       0      0     0x0201
             8        0x11     AP, unusable   6       0      0     0x0201
             9        0x11     AP, unusable   6       0      0     0x0201
             10       0x11     AP, unusable   6       0      0     0x0201
             11       0x11     AP, unusable   6       0      0     0x0201
             12       0x11     AP, unusable   6       0      0     0x0201
             13       0x11     AP, unusable   6       0      0     0x0201
             14       0x11     AP, unusable   6       0      0     0x0201
             15       0x11     AP, unusable   6       0      0     0x0201
--
Bus:         Bus ID  Type
             0       ISA
--
I/O APICs:   APIC ID  Version  State    Address
             1        0x11     usable   0xfec00000
--
I/O Ints:    Type  Polarity  Trigger   Bus ID  IRQ  APIC ID  PIN#
             INT   conforms  conforms  0       0    1        0
             INT   conforms  conforms  0       1    1        1
             INT   conforms  conforms  0       2    1        2
             INT   conforms  conforms  0       3    1        3
             INT   conforms  conforms  0       4    1        4
             INT   conforms  conforms  0       5    1        5
             INT   conforms  conforms  0       6    1        6
             INT   conforms  conforms  0       7    1        7
             INT   conforms  conforms  0       8    1        8
             INT   conforms  conforms  0       9    1        9
             INT   conforms  conforms  0       10   1        10
             INT   conforms  conforms  0       11   1        11
             INT   conforms  conforms  0       12   1        12
             INT   conforms  conforms  0       13   1        13
             INT   conforms  conforms  0       14   1        14
             INT   conforms  conforms  0       15   1        15
===============================================================================
[bryhlath](0) $
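For the curious, locating the MP Floating Pointer Structure is straightforward: it is a 16-byte record on a 16-byte boundary, beginning with the signature '_MP_', whose bytes sum to zero modulo 256. The sketch below scans a byte buffer for such a record. It is an illustration under assumptions rather than what mptable(1) itself does: the function and field names are invented here, and obtaining the buffer (an image of the memory regions the specification says to search, e.g. the BIOS ROM area) is left to the caller.

```python
import struct

# signature, config table physical address, length (in 16-byte units),
# spec revision, checksum, five feature bytes: 16 bytes, little-endian
MP_FPS = struct.Struct("<4sIBBB5s")


def find_mp_floating_pointer(memory):
    """Return details of the first valid MP Floating Pointer Structure, or None."""
    for offset in range(0, len(memory) - MP_FPS.size + 1, 16):
        chunk = memory[offset:offset + MP_FPS.size]
        if chunk[:4] != b"_MP_":
            continue
        if sum(chunk) % 256 != 0:        # all bytes must sum to zero (mod 256)
            continue
        _sig, table, length, rev, checksum, _features = MP_FPS.unpack(chunk)
        return {
            "offset": offset,
            "config_table": table,       # physical address of the 'PCMP' table
            "length_bytes": length * 16,
            "version": "1.%d" % rev,
            "checksum": checksum,
        }
    return None
```

The "physical address", "length", "version" and "checksum" rows in the first block of the mptable output correspond to the fields decoded here.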
/proc/cpuinfo
On Linux kernels with the proc filesystem enabled, the mounted proc filesystem contains a file cpuinfo. FreeBSD kernels with the linprocfs module loaded provide the same file, although with less information than the native Linux /proc/cpuinfo. The file is independent of any CPU-related modules being loaded (particularly cpuid on Linux, or either OS's cpu module). Interpreting it, as it pertains to multiple execution units, can be difficult. The following applies to Linux 2.6 kernels:
- A physical id corresponds to a socket ("physical package"), of which there are >=1 per machine. Physical IDs are not necessarily contiguous, nor do they necessarily increase monotonically across processors (the first example below uses 0, 2, 4 and 6), and thus the maximum physical id does not by itself determine the number of sockets! Count the distinct values instead.
- A core id corresponds to a core within its package, of which there are >=1 per physical id. Core IDs are only unique within a package.
- A processor entry corresponds to one architectural state (a logical processor); a HyperThreaded core exposes 2 of them
- HyperThreading is in use only if 'siblings' != 'cpu cores' (from http://kbase.redhat.com/faq/FAQ_46_10715.shtm)
- The 'ht' processor capabilities bit corresponds not to HyperThreading, but to the ability to report sibling count
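The rules above can be applied mechanically. The following Python sketch counts sockets, cores and logical processors from cpuinfo text; the function names are invented for illustration, and it makes one assumption not stated above: when a kernel reports no physical id at all, each processor entry is treated as its own package.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per 'processor' entry."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "processor":           # each entry starts with this field
            records.append({})
        if records:
            records[-1][key] = value
    return records


def summarize_cpuinfo(text):
    """Count logical processors, cores and sockets; detect HyperThreading."""
    records = parse_cpuinfo(text)
    sockets, cores = set(), set()
    for r in records:
        # Assumption: without a 'physical id', treat the entry as its own package.
        package = r.get("physical id", "cpu" + r["processor"])
        sockets.add(package)
        cores.add((package, r.get("core id", "0")))   # core ids repeat across packages
    hyperthreading = any(
        "siblings" in r and "cpu cores" in r and r["siblings"] != r["cpu cores"]
        for r in records
    )
    return {
        "logical": len(records),
        "cores": len(cores),
        "sockets": len(sockets),
        "hyperthreading": hyperthreading,
    }


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:     # Linux, or FreeBSD with linprocfs
        print(summarize_cpuinfo(f.read()))
```

Counting distinct physical id values, rather than taking the maximum, is what keeps sparse numberings such as 0, 2, 4, 6 from inflating the socket count.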
Examples
I've removed all output from the following examples, save that related to SMP identification.
- EM64T Xeon, no HyperThreading support, 4 cores per socket, 4 sockets: 16 total execution units (2.6.26) (Dell R900)
[wopr](0) $ cat /proc/cpuinfo | egrep ^proc\|^model\ \|^phys\|^sib\|^core\|^cpu\ c\|^ap\|^init
processor       : 0
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
processor       : 1
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 2
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 8
initial apicid  : 8
processor       : 2
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 4
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 16
initial apicid  : 16
processor       : 3
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 6
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 24
initial apicid  : 24
processor       : 4
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 2
initial apicid  : 2
processor       : 5
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 2
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 10
initial apicid  : 10
processor       : 6
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 4
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 18
initial apicid  : 18
processor       : 7
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 6
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 26
initial apicid  : 26
processor       : 8
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 1
initial apicid  : 1
processor       : 9
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 2
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 9
initial apicid  : 9
processor       : 10
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 4
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 17
initial apicid  : 17
processor       : 11
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 6
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 25
initial apicid  : 25
processor       : 12
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 3
initial apicid  : 3
processor       : 13
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 2
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 11
initial apicid  : 11
processor       : 14
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 4
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 19
initial apicid  : 19
processor       : 15
model name      : Intel(R) Xeon(R) CPU X7350 @ 2.93GHz
physical id     : 6
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 27
initial apicid  : 27
[wopr](0) $
- Xeon E5520, HyperThreading enabled, 4 cores per socket, 2 sockets: 16 total execution units (Dell R710)
[dumbledore](0) $ egrep '^(proc|phys|sib|acpi|model )|core' /proc/cpuinfo
processor       : 0
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 0
cpu cores       : 4
processor       : 1
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
processor       : 2
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 1
cpu cores       : 4
processor       : 3
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 1
cpu cores       : 4
processor       : 4
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 2
cpu cores       : 4
processor       : 5
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 2
cpu cores       : 4
processor       : 6
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 3
cpu cores       : 4
processor       : 7
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
processor       : 8
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 0
cpu cores       : 4
processor       : 9
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 0
cpu cores       : 4
processor       : 10
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 1
cpu cores       : 4
processor       : 11
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 1
cpu cores       : 4
processor       : 12
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 2
cpu cores       : 4
processor       : 13
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 2
cpu cores       : 4
processor       : 14
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 1
siblings        : 8
core id         : 3
cpu cores       : 4
processor       : 15
model name      : Intel(R) Xeon(R) CPU E5520 @ 2.27GHz
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
[dumbledore](0) $
- Core 2 Duo, no HyperThreading support, 2 cores per socket, 1 socket: 2 total execution units (2.6.26)
[recombinator](0) $ cat /proc/cpuinfo
processor       : 0
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
processor       : 1
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
[recombinator](0) $
- P4 Xeon, HyperThreading enabled, 1 core per socket, 2 sockets: 4 total execution units (2.6.25-2-686)
scurdev@hrududu:~$ cat /proc/cpuinfo
processor       : 0
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
processor       : 1
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
processor       : 2
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 1
processor       : 3
model name      : Intel(R) Xeon(TM) CPU 2.40GHz
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 1
scurdev@hrududu:~$
- P4 Xeon, HyperThreading disabled, 1 core per socket, 2 sockets: 2 total execution units (2.6.25-2-686)
[aho](0) $ cat /proc/cpuinfo
processor       : 0
model name      : Intel(R) Xeon(TM) CPU 2.80GHz
processor       : 1
model name      : Intel(R) Xeon(TM) CPU 2.80GHz
[aho](0) $
- P4 Celeron, HyperThreading disabled, 1 core per socket, 1 socket: 1 total execution unit (2.6.25-2-686)
[knuth](0) $ cat /proc/cpuinfo
processor       : 0
model name      : Intel(R) Celeron(R) CPU 2.00GHz
[knuth](0) $
Sysctl
On FreeBSD, CPU/SMP information is primarily exported through the sysctl(8) interface. Seemingly relevant sysctls are listed below:
- kern.smp.cpus and hw.ncpu
- machdep.hyperthreading.allowed
- kern.threads.virtual_cpu
libvirt
- virsh(1)'s nodeinfo command can be pretty useful:
[wopr](0) $ virsh nodeinfo
CPU model:           x86_64
CPU(s):              16
CPU frequency:       1600 MHz
CPU socket(s):       4
Core(s) per socket:  4
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         66113480 kB
[wopr](0) $
[recombinator](0) $ sudo virsh nodeinfo
CPU model:           x86_64
CPU(s):              2
CPU frequency:       1596 MHz
CPU socket(s):       1
Core(s) per socket:  2
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         3908568 kB
[recombinator](0) $
See also
- This LKML thread provides much information
- The cpuid instruction can interrogate each processing unit
- CPUsets
- "Detecting Multicore Processors", Intel Software Network