Check out my first novel, midnight's simulacra!
VXLAN
The Virtual eXtensible Local Area Network protocol is used to encapsulate virtual Layer 2 networks over Layer 3+4, designed for use among multitenant hypervisors in (potentially multi-DC) cloud networks. It was formalized in 2014's RFC 7348. It avoids use of 802.1D's Spanning Tree Protocol while facilitating a full broadcast domain, superseding 802.1Q VLANs and their 12-bit VLAN IDs (VXLAN uses a 24-bit ID, the VXLAN Network Identifier (VNI)). The virtual layer 2 network thus created is known as a "VXLAN segment" or "VXLAN overlay network". The agents adding or removing VXLAN encapsulation (commonly hypervisors or switches) are referred to as "VXLAN Tunnel Endpoints" or VTEPs, and play roles similar to bridges, learning MACs and selectively forwarding frames.
The clients within a VXLAN segment needn't know that VXLAN is in use, and use standard unicast/broadcast traffic to talk to other hosts within the segment. Upon receipt of a frame, the VTEP looks up the VTEP with which this destination MAC is associated. If the client does not know the destination's L2 address, ARP is performed via normal broadcast. Broadcasts within a VXLAN are carried over a multicast address (multicast also uses this same tree).
VTEPs are not allowed to fragment packets. It is thus important to choose a network MTU which allows the VXLAN header to be inserted without exceeding physical MTUs.
VXLAN runs over udp/4789 by default. VXLANs can be stacked.
VXLAN encapsulation
The outermost header is a standard Ethernet header. The source hardware address is initially set to the originating VTEP's MAC address. The destination hardware address is the hardware address of the destination VTEP, or the router by which said VTEP is reached. 802.1Q tags can be used here as they would in any other case. In the case where routing hops exist between the two VTEPs, the source and destination addresses will change with each hop.
The next header is an IPv4 or IPv6 header with the VTEPs' L3 addresses used as source and dest. These addresses persist across hops. Within is the UDP header and its VXLAN payload. This datagram's destination port is the VTEP's VXLAN port, by default 4789. The source port is arbitrary, though RFC 7348 recommends that it be constructed using a hash over encapsulated headers, over the domain 49152--65535.
The VXLAN payload consists of the VXLAN header, plus the original frame minus its Ethernet FCS. The VXLAN header is 8 bytes:
- 8 bits of flags, RRRRIRRR. All R bits must be 0. The I bit must be 1 for a valid VNI.
- 24 reserved bits. All must be 0.
- 24-bit VNI.
- 8 reserved bits. All must be 0.
The VTEP removes the original FCS, and adds its own.