SMP on x86

From dankwiki
 

Latest revision as of 19:35, 22 March 2010

The primary specification for multiprocessor x86-based setups is the Intel MultiProcessor Specification (last updated, AFAIK, to revision-006 on 1995-05-15).

SMT

  • HyperThreading (Intel's SMT implementation) requires CPU, BIOS and OS support. It was introduced on the P4.
    • Found on (all) i7s, some Atoms, and some P4s (especially those with Xeon branding).
    • Pentium-M and Celerons usually lack SMT, and Core 2 processors do not implement it.
  • Per-core resources are either duplicated, split, or shared between logical processors:
    • Shared: reservation stations, data caches
    • Split: reorder buffers, load/store buffers
    • Duplicated: registers
  • No programmable priority control is exported, and no implicit priorities are defined

Intel MP IDs

  • Each logical processor is assigned a unique (but not necessarily sequential) 8-bit identifier at boot, the APIC ID
    • The initial APIC ID can be retrieved via cpuid, as can packaging data:
      • "logical cores per package" (CPUID.1.EBX[23:16]): Maximum number of logical processors in a physical package, as manufactured
      • "cores per package" (CPUID.4.EAX[31:26] + 1): Maximum number of physical processors (cores) in a physical package, as manufactured
      • "logical processors sharing a cache" (CPUID.4.EAX[25:14] + 1): Maximum number of logical processors in a physical package sharing a given cachelevel
      • Intel MP only addresses homogeneous setups, so these three values are (as of October 2009) identical across all processors in a system
      • The latter two values ("cores per package" and "logical processors sharing a cache") require leaf level 4 cpuid support; if it is not provided, the package is a unicore
  • APIC ID is formed of SMT_ID|CORE_ID|PACKAGE_ID (listed from least to most significant bits), having widths defined by the packaging data:
    • SMT_ID is 0 bits on a non-HyperThreaded processor.
    • CORE_ID is 0 bits on a unicore package
    • All remaining bits are devoted to PACKAGE_ID
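
The field layout above can be sketched in a few lines of Python. This is only an illustration, not Intel's reference algorithm: the helper names (bits, field_width, split_apic_id) are made up for this page, the register values in the usage lines are invented, and field widths are assumed to be the smallest number of bits able to number the manufactured maximum.

```python
def bits(value, hi, lo):
    """Return bits hi..lo (inclusive) of a register value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def field_width(count):
    """Bits needed to number `count` items (0 when count <= 1)."""
    return max(count - 1, 0).bit_length()


def split_apic_id(apic_id, logical_per_package, cores_per_package):
    """Split an APIC ID into (package, core, smt) fields.

    SMT_ID occupies the lowest bits, CORE_ID the next ones, and
    PACKAGE_ID whatever remains above them.
    """
    threads_per_core = max(logical_per_package // cores_per_package, 1)
    smt_bits = field_width(threads_per_core)
    core_bits = field_width(cores_per_package)
    smt = apic_id & ((1 << smt_bits) - 1)
    core = (apic_id >> smt_bits) & ((1 << core_bits) - 1)
    package = apic_id >> (smt_bits + core_bits)
    return package, core, smt


# Invented register contents, for illustration only:
ebx_leaf1 = 0x0A040800   # CPUID.1.EBX
eax_leaf4 = 0x0C000121   # CPUID.4.EAX
logical = bits(ebx_leaf1, 23, 16)      # "logical cores per package"
cores = bits(eax_leaf4, 31, 26) + 1    # "cores per package"
print(logical, cores)                  # 4 4
print(split_apic_id(10, logical, cores))  # (2, 2, 0)
```

With 4 cores and no HyperThreading, APIC ID 10 splits into package 2, core 2, which agrees with the 'physical id' and 'core id' that Linux reports for apicid 10 in the Xeon X7350 /proc/cpuinfo output elsewhere on this page.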

Interrupts

  • The IO-APIC routes hardware interrupts to the various CPUs (see Linux's IO-APIC.txt, http://www.mjmwired.net/kernel/Documentation/x86/i386/IO-APIC.txt)

[recombinator](0) $ cat /proc/interrupts 
           CPU0       CPU1       
  0:        491          0   IO-APIC-edge      timer
  8:         87          0   IO-APIC-edge      rtc0
  9:          0          0   IO-APIC-fasteoi   acpi
 16:     609652          0   IO-APIC-fasteoi   uhci_hcd:usb1, heci
 17:          0          0   IO-APIC-fasteoi   pata_marvell
 18:      33141          0   IO-APIC-fasteoi   sata_promise, uhci_hcd:usb5, ehci_hcd:usb6
 19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 21:   15985105          0   IO-APIC-fasteoi   uhci_hcd:usb2, ath
 23:    8730160          0   IO-APIC-fasteoi   uhci_hcd:usb3, ehci_hcd:usb7
 29:    1855556          0   PCI-MSI-edge      i915
 30:    1109316          0   PCI-MSI-edge      ahci
 31:      66952          0   PCI-MSI-edge      e1000
NMI:          0          0   Non-maskable interrupts
LOC:   14417376   16273096   Local timer interrupts
SPU:          0          0   Spurious interrupts
RES:     150950     182106   Rescheduling interrupts
CAL:        278        611   Function call interrupts
TLB:      33349      53705   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
ERR:          0
MIS:          0
[recombinator](0) $ 
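
To see how device interrupts are spread across CPUs, the numbered rows of /proc/interrupts can be summed per column. The following is a rough sketch: the function name device_irq_totals is made up, and it assumes the input is the file's content alone (no shell prompt lines) in the layout shown above.

```python
def device_irq_totals(text):
    """Sum per-CPU counts over the numbered (device) IRQ rows."""
    lines = [line for line in text.splitlines() if line.strip()]
    ncpus = len(lines[0].split())      # header row: CPU0 CPU1 ...
    totals = [0] * ncpus
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if not label.strip().isdigit():
            continue                   # skip NMI, LOC, RES, ERR, ... rows
        for cpu, count in enumerate(rest.split()[:ncpus]):
            totals[cpu] += int(count)
    return totals


SAMPLE = """\
           CPU0       CPU1
  0:        491          0   IO-APIC-edge      timer
 16:     609652          0   IO-APIC-fasteoi   uhci_hcd:usb1, heci
LOC:   14417376   16273096   Local timer interrupts
ERR:          0
"""
print(device_irq_totals(SAMPLE))  # [610143, 0]
```

On a live Linux system the same function could be fed open("/proc/interrupts").read(). In the output above every device interrupt landed on CPU0; only the per-CPU rows (LOC, RES, CAL, TLB) show activity on CPU1.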

Discovery

  • ACPI's MADT (Multiple APIC Description Table) can supply multiprocessor configuration, and is usually used if present
  • The MP Table defined by the Intel MultiProcessor Specification. The FreeBSD program mptable(1) can dump it:
[bryhlath](0) $ sudo mptable

===============================================================================

MPTable

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:			BIOS
  physical address:		0x000fbd10
  signature:			'_MP_'
  length:			16 bytes
  version:			1.4
  checksum:			0xc6
  mode:				Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:		0x000fbb10
  signature:			'PCMP'
  base table length:		508
  version:			1.4
  checksum:			0x84
  OEM ID:			'QEMUCPU '
  Product ID:			'0.1         '
  OEM table pointer:		0x00000000
  OEM table size:		0
  entry count:			34
  local APIC address:		0xfee00000
  extended table length:	0
  extended table checksum:	0

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:	APIC ID	Version	State		Family	Model	Step	Flags
		 0	 0x11	 BSP, usable	 6	 0	 0	 0x0201
		 1	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 2	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 3	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 4	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 5	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 6	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 7	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 8	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		 9	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		10	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		11	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		12	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		13	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		14	 0x11	 AP, unusable	 6	 0	 0	 0x0201
		15	 0x11	 AP, unusable	 6	 0	 0	 0x0201
--
Bus:		Bus ID	Type
		 0	 ISA   
--
I/O APICs:	APIC ID	Version	State		Address
		 1	 0x11	 usable		 0xfec00000
--
I/O Ints:	Type	Polarity    Trigger	Bus ID	 IRQ	APIC ID	PIN#
		INT	 conforms    conforms	     0	   0	      1	   0
		INT	 conforms    conforms	     0	   1	      1	   1
		INT	 conforms    conforms	     0	   2	      1	   2
		INT	 conforms    conforms	     0	   3	      1	   3
		INT	 conforms    conforms	     0	   4	      1	   4
		INT	 conforms    conforms	     0	   5	      1	   5
		INT	 conforms    conforms	     0	   6	      1	   6
		INT	 conforms    conforms	     0	   7	      1	   7
		INT	 conforms    conforms	     0	   8	      1	   8
		INT	 conforms    conforms	     0	   9	      1	   9
		INT	 conforms    conforms	     0	  10	      1	  10
		INT	 conforms    conforms	     0	  11	      1	  11
		INT	 conforms    conforms	     0	  12	      1	  12
		INT	 conforms    conforms	     0	  13	      1	  13
		INT	 conforms    conforms	     0	  14	      1	  14
		INT	 conforms    conforms	     0	  15	      1	  15

===============================================================================

[bryhlath](0) $ 
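
The MP Floating Pointer Structure is small enough to rebuild by hand. Per the MP specification it is 16 bytes: the '_MP_' signature, the 32-bit physical address of the config table, a length in 16-byte units, the spec revision, a checksum, and five feature bytes; all bytes must sum to zero modulo 256. The sketch below uses made-up function names and assumes all five feature bytes are zero, an assumption that happens to reproduce the 0xc6 checksum in the dump above.

```python
import struct


def build_mp_floating_pointer(config_table_addr, spec_rev, features=bytes(5)):
    """Pack a 16-byte MP Floating Pointer Structure with a valid checksum."""
    # signature, config table address, length (in 16-byte units), revision
    head = struct.pack("<4sIBB", b"_MP_", config_table_addr, 1, spec_rev)
    partial = head + b"\x00" + bytes(features)
    checksum = (-sum(partial)) & 0xFF   # makes the byte sum 0 mod 256
    return head + bytes([checksum]) + bytes(features)


def is_valid_mp_floating_pointer(raw):
    """Check the signature, length field and checksum of a candidate."""
    if len(raw) < 16 or raw[:4] != b"_MP_":
        return False
    length = raw[8] * 16
    return 0 < length <= len(raw) and sum(raw[:length]) % 256 == 0


# Values from the mptable output above: config table at 0x000fbb10,
# version 1.4 (revision byte 4), feature bytes assumed to be zero.
mpfps = build_mp_floating_pointer(0x000FBB10, 4)
print(hex(mpfps[10]))                        # 0xc6
print(is_valid_mp_floating_pointer(mpfps))   # True
```

An OS locating the structure would run a check like is_valid_mp_floating_pointer over candidate 16-byte-aligned addresses in the BIOS areas named by the specification; that scan is not shown here.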

/proc/cpuinfo

On Linux kernels with the proc filesystem enabled, the mounted proc filesystem contains a file cpuinfo. FreeBSD kernels with the linprocfs module loaded provide the same file under the linprocfs mount, although with less information than native Linux /proc/cpuinfo. The file is present regardless of whether any CPU-related modules are loaded (particularly cpuid on Linux, or either OS's cpu module). Interpreting this file, as it pertains to multiple execution units, can be difficult. The following applies to Linux 2.6 kernels:

  • A 'physical id' corresponds to a socket ("physical package"), of which there are >=1 per machine. Physical IDs are not necessarily contiguous, nor do they necessarily increase monotonically across processors, so the maximum 'physical id' does not by itself determine the number of sockets; count the distinct values instead.
  • A 'core id' corresponds to a core ("physical processor"), of which there are >=1 per 'physical id'
  • A 'processor' ID corresponds to an architectural state, i.e. a logical processor (HyperThreading == 2 per HyperThreaded core)
  • HyperThreading is in use only if 'siblings' != 'cpu cores' (from http://kbase.redhat.com/faq/FAQ_46_10715.shtm)
  • The 'ht' processor capabilities bit corresponds not to HyperThreading, but to the ability to report sibling count
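
The rules above can be turned into a small parser. This is a sketch under stated assumptions: parse_cpuinfo and summarize are names invented here, the field names are those of Linux 2.6 x86 output, and a record with no 'physical id' at all is treated as its own single-core package (a guess that fits older kernels which omit the field).

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo text into one dict per 'processor' stanza."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "processor":
            records.append({})         # each stanza starts with this key
        if records:
            records[-1][key] = value
    return records


def summarize(records):
    """Count sockets, cores and logical processors; detect HyperThreading."""
    def package(rec):
        # Assumption: no 'physical id' means the processor is its own package.
        return rec.get("physical id", "cpu" + rec["processor"])

    sockets = {package(r) for r in records}
    cores = {(package(r), r.get("core id", "0")) for r in records}
    ht = any(r["siblings"] != r["cpu cores"]
             for r in records if "siblings" in r and "cpu cores" in r)
    return {"sockets": len(sockets), "cores": len(cores),
            "logical": len(records), "hyperthreading": ht}


# Four stanzas shaped like a dual-socket HyperThreaded P4 Xeon:
SAMPLE = "\n".join(
    f"processor\t: {n}\nphysical id\t: {pkg}\nsiblings\t: 2\n"
    f"core id\t\t: 0\ncpu cores\t: 1\n"
    for n, pkg in enumerate([0, 0, 3, 3])
)
print(summarize(parse_cpuinfo(SAMPLE)))
# {'sockets': 2, 'cores': 2, 'logical': 4, 'hyperthreading': True}
```

Sockets are counted as distinct 'physical id' values rather than from the maximum, and cores as distinct ('physical id', 'core id') pairs, since core IDs repeat across packages.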

Examples

I've removed all output from the following examples, save that related to SMP identification.

  • EM64T Xeon, no HyperThreading support, 4 cores per socket, 4 sockets: 16 total execution units (2.6.26) (Dell R900)
[wopr](0) $ cat /proc/cpuinfo | egrep ^proc\|^model\ \|^phys\|^sib\|^core\|^cpu\ c\|^ap\|^init
processor	: 0
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0

processor	: 1
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 2
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 8
initial apicid	: 8

processor	: 2
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 4
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 16
initial apicid	: 16

processor	: 3
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 6
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 24
initial apicid	: 24

processor	: 4
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 2
initial apicid	: 2

processor	: 5
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 2
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 10
initial apicid	: 10

processor	: 6
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 4
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 18
initial apicid	: 18

processor	: 7
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 6
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 26
initial apicid	: 26

processor	: 8
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 1
initial apicid	: 1

processor	: 9
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 2
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 9
initial apicid	: 9

processor	: 10
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 4
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 17
initial apicid	: 17

processor	: 11
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 6
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 25
initial apicid	: 25

processor	: 12
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3

processor	: 13
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 2
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 11
initial apicid	: 11

processor	: 14
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 4
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 19
initial apicid	: 19

processor	: 15
model name	: Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz
physical id	: 6
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 27
initial apicid	: 27
[wopr](0) $ 
  • Xeon E5520, HyperThreading enabled, 4 cores per socket, 2 sockets: 16 total execution units (Dell R710)
[dumbledore](0) $ egrep '^(proc|phys|sib|acpi|model )|core' /proc/cpuinfo 
processor	: 0
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 0
cpu cores	: 4
processor	: 1
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
processor	: 2
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 1
cpu cores	: 4
processor	: 3
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 1
cpu cores	: 4
processor	: 4
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 2
cpu cores	: 4
processor	: 5
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 2
cpu cores	: 4
processor	: 6
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 3
cpu cores	: 4
processor	: 7
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
processor	: 8
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 0
cpu cores	: 4
processor	: 9
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 0
cpu cores	: 4
processor	: 10
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 1
cpu cores	: 4
processor	: 11
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 1
cpu cores	: 4
processor	: 12
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 2
cpu cores	: 4
processor	: 13
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 2
cpu cores	: 4
processor	: 14
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 1
siblings	: 8
core id		: 3
cpu cores	: 4
processor	: 15
model name	: Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
physical id	: 0
siblings	: 8
core id		: 3
cpu cores	: 4
[dumbledore](0) $ 
  • Core 2 Duo, no HyperThreading support, 2 cores per socket, 1 socket: 2 total execution units (2.6.26)
[recombinator](0) $ cat /proc/cpuinfo 
processor	: 0
model name	: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0

processor	: 1
model name	: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
[recombinator](0) $
  • P4 Xeon, HyperThreading enabled, 1 core per socket, 2 sockets: 4 total execution units (2.6.25-2-686)
scurdev@hrududu:~$ cat /proc/cpuinfo 
processor	: 0
model name	: Intel(R) Xeon(TM) CPU 2.40GHz
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1

processor	: 1
model name	: Intel(R) Xeon(TM) CPU 2.40GHz
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 1

processor	: 2
model name	: Intel(R) Xeon(TM) CPU 2.40GHz
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1

processor	: 3
model name	: Intel(R) Xeon(TM) CPU 2.40GHz
physical id	: 3
siblings	: 2
core id		: 0
cpu cores	: 1
scurdev@hrududu:~$
  • P4 Xeon, HyperThreading disabled, 1 core per socket, 2 sockets: 2 total execution units (2.6.25-2-686)
[aho](0) $ cat /proc/cpuinfo 
processor	: 0
model name	: Intel(R) Xeon(TM) CPU 2.80GHz

processor	: 1
model name	: Intel(R) Xeon(TM) CPU 2.80GHz
[aho](0) $ 
  • P4 Celeron, no HyperThreading, 1 core per socket, 1 socket: 1 total execution unit (2.6.25-2-686)
[knuth](0) $ cat /proc/cpuinfo 
processor	: 0
model name	: Intel(R) Celeron(R) CPU 2.00GHz
[knuth](0) $ 

Sysctl

On FreeBSD, CPU/SMP information is primarily exported through the sysctl(8) interface. Seemingly relevant sysctls are listed below:

  • kern.smp.cpus and hw.ncpu
  • machdep.hyperthreading.allowed
  • kern.threads.virtual_cpu

libvirt

  • virsh(1)'s nodeinfo command can be pretty useful:
[wopr](0) $ virsh nodeinfo
CPU model:           x86_64
CPU(s):              16
CPU frequency:       1600 MHz
CPU socket(s):       4
Core(s) per socket:  4
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         66113480 kB
[wopr](0) $ 
[recombinator](0) $ sudo virsh nodeinfo
CPU model:           x86_64
CPU(s):              2
CPU frequency:       1596 MHz
CPU socket(s):       1
Core(s) per socket:  2
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         3908568 kB

[recombinator](0) $ 
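
The nodeinfo fields can be cross-checked against one another. The sketch below (function names invented here) parses the text output and multiplies NUMA cells, sockets, cores and threads. It assumes the labels are exactly as printed above, and that the product is the maximum CPU count as libvirt's node info structure describes it; on machines with offline CPUs or unusual NUMA layouts it need not equal "CPU(s)".

```python
def parse_nodeinfo(text):
    """Turn `virsh nodeinfo` output into a {label: value} dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def max_cpus(info):
    """NUMA cells x sockets x cores per socket x threads per core."""
    return (int(info["NUMA cell(s)"]) * int(info["CPU socket(s)"])
            * int(info["Core(s) per socket"]) * int(info["Thread(s) per core"]))


# Output copied from the wopr example above.
WOPR = """\
CPU model:           x86_64
CPU(s):              16
CPU frequency:       1600 MHz
CPU socket(s):       4
Core(s) per socket:  4
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         66113480 kB
"""
info = parse_nodeinfo(WOPR)
print(max_cpus(info), int(info["CPU(s)"]))  # 16 16
```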


See also