
Azure
Microsoft's commodity cloud platform, similar to Amazon's AWS or Google's GCP.


To get KeyVault stuff working with Debian's azure-cli 2.18.0, I had to run: <tt>pip3 install azure-keyvault==1.1.0</tt>
==CLI==
* On [[Debian]]-derived systems, install <tt>azure-cli</tt>
** <tt>az account list</tt> -- list subscriptions
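A minimal sketch of typical first steps (the subscription ID is a placeholder):
<pre>
# log in interactively (browser or device-code flow)
az login

# list the subscriptions visible to this account
az account list --output table

# select the subscription that subsequent az commands should target
az account set --subscription "<subscription-id>"
</pre>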
==Time==
Rather than using NTP with chrony or SNTP with systemd-timesyncd, use the Hyper-V-provided PTP source backed by the VMICTimeSync provider. Ensure the Hyper-V daemons are installed and enabled, and configure chrony thusly:
<pre>
makestep 1.0 -1
local stratum 2
refclock PHC /dev/ptp_hyperv poll 3 dpoll -2 offset 0
</pre>
The <tt>makestep</tt> directive steps the clock whenever it differs from upstream by more than a second, rather than slewing it gradually; the <tt>-1</tt> lifts the usual limit on how many of the initial updates may be stepped. The <tt>/dev/ptp_hyperv</tt> symlink ought be set up by udev.
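To confirm chrony is actually tracking the PHC, something like the following works (output formatting varies by chrony version):
<pre>
# the udev-provided symlink should point at the Hyper-V PTP clock
ls -l /dev/ptp_hyperv

# the PHC refclock should be listed and selected (marked with '*')
chronyc sources -v

# per-source offset/jitter statistics
chronyc sourcestats
</pre>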


==Networking==
Azure hosts use ConnectX devices from Mellanox (now NVIDIA). ConnectX-4 needs the [https://docs.kernel.org/networking/device_drivers/ethernet/mellanox/mlx5.html mlx5_core] driver, while ConnectX-3 needs mlx4_en. It is not possible as of February 2023 to force allocation of one or the other for a given VM, and the NIC type can change if the VM is moved. Azure documentation typically refers to both classes as "SmartNICs".
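Since the NIC family can change when the VM is moved, it can be worth checking which device and driver you actually got; a sketch (the interface name is an assumption):
<pre>
# which Mellanox function was passed through to this VM (requires pciutils)
lspci | grep -i mellanox

# which mlx driver is bound to the VF interface (here assumed to be eth1)
ethtool -i eth1 | grep ^driver

# or just look at the loaded modules
lsmod | grep -E '^mlx(4|5)'
</pre>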


The fundamental networking object in Azure is the VNet. VNets are manually created, scoped to a single region and subscription, and no larger than a /16. A VM's NICs are assigned to VNets. VNets ought use RFC 1918 or RFC 6598 addresses, and it is best practice to keep the addresses of your organization's VNets distinct, since VNets can be peered together.

VNets are <b>not a broadcast domain</b>, and indeed support neither broadcast nor multicast using typical mechanisms. ARP requests will be answered by the SDN, and never seen by the other VNet hosts. The reply will always contain the MAC address 12:34:56:78:9a:bc, which will make up most (if not all) of your neighbor table. The SDN does not answer for IPs not active in the VNet. MTUs within a VNet can be safely taken to around 8000 bytes.

VNets can be further broken down into distinct subnets. NICs can be assigned public IP addresses in addition to their private addresses within some subnet (and thus VNet), and connecting into the VNet from outside generally requires one or more such IPs. The first four addresses and the last address within each subnet are reserved (network, gateway, dns1, dns2, broadcast). IPv4 VNets must be at least /29s, and no larger than /2s. IPv6 VNets must be /64.
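For illustration, a hedged sketch of creating and peering VNets along these lines with the CLI (resource names are hypothetical; check flag spellings against <tt>az network vnet create --help</tt>):
<pre>
# create a /16 VNet carved from RFC 1918 space, with one /24 subnet
az network vnet create \
    --resource-group my-rg \
    --name my-vnet \
    --location eastus \
    --address-prefixes 10.10.0.0/16 \
    --subnet-name default \
    --subnet-prefixes 10.10.0.0/24

# peer it with another VNet in the organization (address spaces must not overlap)
az network vnet peering create \
    --resource-group my-rg \
    --name my-vnet-to-other \
    --vnet-name my-vnet \
    --remote-vnet other-vnet \
    --allow-vnet-access
</pre>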
 
You'll see traffic with the IP 168.63.129.16; it is used by numerous internal Microsoft services, including DHCP. No host can transmit with this address.


VMs can be assigned public IPs, but they will not be visible on the VM proper, and should not be added using <tt>ip-address</tt>. The incoming traffic will be directed to your VM by the SDN, and DNATted such that you see your internal IP as the destination address.
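If you need to discover the public IP from inside the VM, the instance metadata service can report it; a sketch (the API version shown is one of several valid values):
<pre>
# the public IP is not configured on any local interface, but IMDS knows about it
curl -s -H Metadata:true \
    "http://169.254.169.254/metadata/instance/network/interface?api-version=2021-02-01" \
    | python3 -m json.tool
</pre>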
Other than their definitions as L3 entities, VNets work not unlike VLANs. Most importantly, traffic within a VNet cannot be read by entities outside the VNet. VLANs are not supported within Azure.


===Accelerated Networking===
"Accelerated Networking" (AN) is available on most VM classes, and provides the VM with an [[SR-IOV]] Virtual Function; this will show up as a distinct interface within the VM. It can be selected at VM creation time, or whenever the VM is deallocated (stopped). Some VM classes *require* AN. The mapping of VF devices to synthetic devices is undefined, and ought be managed with [[Systemd#systemd-networkd|systemd-networkd]] or [[udev]] rules in the presence of multiple virtual interfaces. The synthetic interface and its associated VF interface will have the same MAC address. From the perspective of other network entities, these are a single interface. They can be distinguished by checking [[ethtool]] for the driver (the synthetic interface is "hv_netsvc") or checking <tt>ip-link</tt> output for the slave designator (the VF is the slave). From the Azure docs:
"Accelerated Networking" (AN) is available on most VM classes, and provides the VM with an [[SR-IOV]] Virtual Function; this will show up as a distinct interface within the VM. It can be selected at VM creation time, or whenever the VM is deallocated (stopped). Some VM classes *require* AN. The mapping of VF devices to synthetic devices is undefined, and ought be managed with [[Systemd#systemd-networkd|systemd-networkd]] or [[udev]] rules in the presence of multiple virtual interfaces. The synthetic interface and its associated VF interface will have the same MAC address. From the perspective of other network entities, these are a single interface. They can be distinguished by checking [[ethtool]] for the driver (the synthetic interface is "hv_netsvc") or checking [[iproute]] output for the slave designator (the VF is the slave). From the Azure docs:


<pre>Applications should interact only with the synthetic interface...Outgoing network packets are passed from the netvsc driver to the VF driver and then transmitted through the VF interface. Incoming packets are received and processed on the VF interface before being passed to the synthetic interface. Exceptions are incoming TCP SYN packets and broadcast/multicast packets that are processed by the synthetic interface only.</pre>
Verify presence of the VF using e.g. <tt>lsvmbus</tt>. Check [[ethtool]] stats on the synthetic interface with e.g. <tt>ethtool -S eth0</tt> and verify that <tt>vf_rx_bytes</tt> and <tt>vf_tx_bytes</tt> are increasing; if they are, the VF is being used. Note that hardware queues are managed independently for the synthetic and VF interfaces, and [[RSS]] ought be configured using the VF.
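Pulled together, a quick check might look like the following (the synthetic interface is assumed to be <tt>eth0</tt>):
<pre>
# the VF shows up on the vmbus as a PCI Express pass-through device
lsvmbus | grep -i 'pass-through'

# the synthetic interface reports the hv_netvsc driver; the VF reports mlx4/mlx5
ethtool -i eth0

# nonzero, growing VF counters on the synthetic interface mean the VF carries the traffic
ethtool -S eth0 | grep -E 'vf_(rx|tx)_bytes'
</pre>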
The <tt>netvsc</tt> driver is registered on the vmbus, which then loads the virtual PCIe driver (<tt>hv_pci</tt>) necessary for detection of the VF. The PCIe domain ID must be unique across all virtualized PCIe devices. The <tt>mlx4</tt> or <tt>mlx5</tt> driver detects the VF and initializes it.


==VMs==

* Fsv2: 8370C (Ice Lake), 8272CL (Cascade Lake), 8168 (Skylake), 3.4--3.7 GHz
* FX: 8246R (Cascade Lake), 4.0 GHz
* Dv2: 8370C (Ice Lake), 8272CL (Cascade Lake), 8171M (Skylake), 2673v3 (Haswell), 2673v4 (Broadwell)
* DSv2: 8370C (Ice Lake), 8272CL (Cascade Lake), 8171M (Skylake), 2673v3 (Haswell), 2673v4 (Broadwell)
* Eav4, Easv4: EPYC 7452
* H: 2667v3 without hyperthreading or SR-IOV, but with MPI
{| class="wikitable"
! Class !! vCPUs !! RAM (GiB) !! Local SSD (GiB) !! Max data disks !! NICs !! Network (Mbps)
|-
| F2sv2 || 2 || 4 || 16 || 4 || 2 || 5000
|-
| F4sv2 || 4 || 8 || 32 || 8 || 2 || 10000
|-
| F8sv2 || 8 || 16 || 64 || 16 || 4 || 12500
|-
| F16sv2 || 16 || 32 || 128 || 32 || 4 || 12500
|-
| F32sv2 || 32 || 64 || 256 || 32 || 8 || 16000
|-
| F48sv2 || 48 || 96 || 384 || 32 || 8 || 21000
|-
| F64sv2 || 64 || 128 || 512 || 32 || 8 || 28000
|-
| F72sv2 || 72 || 144 || 576 || 32 || 8 || 30000
|-
| FX4mds || 4 || 84 || 168 || 8 || 2 || 4000
|-
| FX12mds || 12 || 252 || 504 || 24 || 4 || 8000
|-
| FX24mds || 24 || 504 || 1008 || 32 || 4 || 16000
|-
| FX36mds || 36 || 756 || 1512 || 32 || 8 || 24000
|-
| FX48mds || 48 || 1008 || 2016 || 32 || 8 || 32000
|-
| D1v2 || 1 || 3.5 || 504 || 4 || 2 || 750
|-
| D2v2 || 2 || 7 || 100 || 8 || 2 || 1500
|-
| D3v2 || 4 || 14 || 200 || 16 || 4 || 3000
|-
| D4v2 || 8 || 28 || 400 || 32 || 8 || 6000
|-
| D5v2 || 16 || 56 || 800 || 64 || 8 || 12000
|-
| D11v2 || 2 || 14 || 100 || ? || 2 || 1500
|-
| D12v2 || 4 || 28 || 200 || ? || 4 || 3000
|-
| D13v2 || 8 || 56 || 400 || ? || 8 || 6000
|-
| D14v2 || 16 || 112 || 800 || ? || 8 || 12000
|-
| D15v2 || 20 || 140 || 1000 || ? || 8 || 25000
|-
| DS1v2 || 1 || 3.5 || 7 || 4 || 2 || 750
|-
| DS2v2 || 2 || 7 || 14 || 8 || 2 || 1500
|-
| DS3v2 || 4 || 14 || 28 || 16 || 4 || 3000
|-
| DS4v2 || 8 || 28 || 56 || 32 || 8 || 6000
|-
| DS5v2 || 16 || 56 || 112 || 64 || 8 || 12000
|-
| DS11v2 || 2 || 14 || 28 || 8 || 2 || 1500
|-
| DS12v2 || 4 || 28 || 56 || 16 || 4 || 3000
|-
| DS13v2 || 8 || 56 || 112 || 32 || 8 || 6000
|-
| DS14v2 || 16 || 112 || 224 || 64 || 8 || 12000
|-
| DS15v2 || 20 || 140 || 280 || 64 || 8 || 25000
|-
| H8 || 8 || 56 || 1000 || 32 || 2 || ?
|-
| H16 || 16 || 112 || 2000 || 64 || 4 || ?
|-
| H8m || 8 || 112 || 1000 || 32 || 2 || ?
|-
| H16m || 16 || 224 || 2000 || 64 || 4 || ?
|-
| H16r || 16 || 112 || 2000 || 64 || 4 || ?
|-
| H16mr || 16 || 224 || 2000 || 64 || 4 || ?
|}

NUMA (multipackage) machines are only available from the HB class.
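The size menu shifts over time; the CLI can report what a given region currently offers (region and size below are only examples):
<pre>
# enumerate VM sizes (vCPUs, memory, max disks) available in a region
az vm list-sizes --location eastus --output table

# check SKU details and restrictions for a particular family/size
az vm list-skus --location eastus --size Standard_F8s_v2 --output table
</pre>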

==External Links==
* [https://techcommunity.microsoft.com/t5/azure-compute-blog/accelerated-networking-on-hb-hc-hbv2-hbv3-and-ndv2/ba-p/2067965 Accelerated Networking on HB, HC, HBv2, HBv3, and NDv2] (mostly about [[InfiniBand]])
* [https://techcommunity.microsoft.com/t5/azure-high-performance-computing/performance-impact-of-enabling-accelerated-networking-on-hbv3/ba-p/2103451 Performance Impact of Enabling Accelerated Networking on HBv3, HBv2, and HC VMs]
* [https://learn.microsoft.com/en-us/azure/virtual-network/accelerated-networking-how-it-works Accelerated Networking: How it Works]
* [https://learn.microsoft.com/en-us/azure/virtual-network/virtual-networks-udr-overview Virtual network traffic routing]
* [https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/ipv6-overview Overview of IPv6 for Azure]
* [https://learn.microsoft.com/en-us/azure/virtual-network/virtual-networks-faq Azure Virtual Networks FAQ]
* [https://learn.microsoft.com/en-us/windows-server/virtualization/hyper-v/best-practices-for-running-linux-on-hyper-v Best Practices for Running Linux on Hyper-V]
* [https://learn.microsoft.com/en-us/azure/virtual-machines/linux/time-sync Time Sync for Linux VMs in Azure]