Packet sockets
Packet sockets allow a program to more directly interface with the networking stack than standard Layer 4 Berkeley sockets (e.g. AF_INET/AF_INET6 + SOCK_STREAM/SOCK_DGRAM).
Linux packet sockets
The SOCK_PACKET socket type is strongly deprecated (see packet(7)), and thus not discussed here.
Socket type | CAP_NET_RAW/root required? | Zero-copy? | Layers wholly directly specified | Layers partly directly specified |
---|---|---|---|---|
PACKET_TX_RING-enabled (PACKET_MMAP) PF_PACKET, SOCK_RAW | Y | Y | 2+ | 2+ |
PF_PACKET, SOCK_RAW | Y | N | 2+ | 2+ |
PF_PACKET, SOCK_DGRAM | Y | N | 3+ | 2+ |
PF_INET, SOCK_RAW, IPPROTO_RAW | Y | N | 3+ | 3+ |
IP_HDRINCL-enabled PF_INET, SOCK_RAW (only one IP protocol per socket when protocol is other than IPPROTO_RAW, not valid for PF_INET6 sockets) | Y | N | 3+ | 3+ |
PF_INET, SOCK_RAW (only one IP protocol per socket when protocol is other than IPPROTO_RAW) | Y | N | 4+ | 3+ |
PF_INET (only one IP protocol per socket when protocol is other than IPPROTO_RAW, and only certain IP protocols are supported at all when type is not SOCK_RAW) | N | N | 5+ | 3+ |
PF_PACKET
As described in packet(7), this is a packet interface at the device level (Layer 2). The protocol field is either an IEEE 802.3 protocol number (found in linux/if_ether.h) or ETH_P_ALL, in network byte order. Opening a packet socket requires the CAP_NET_RAW capability or UID 0. bind(2) can be used to restrict the packet socket to a single interface.
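As a minimal sketch of the above, the following Python opens a PF_PACKET socket and optionally binds it to one interface. The helper name open_packet_socket is hypothetical, and the constants are copied from linux/if_ether.h rather than taken from any library. It assumes Linux, and returns None rather than failing when the socket cannot be created (for instance, when CAP_NET_RAW is missing).

```python
import socket

ETH_P_ALL = 0x0003  # linux/if_ether.h: every protocol
ETH_P_IP = 0x0800   # linux/if_ether.h: IPv4 only


def open_packet_socket(ifname=None, ethertype=ETH_P_ALL):
    """Open a PF_PACKET/SOCK_RAW socket (hypothetical helper).

    Returns None if AF_PACKET is unavailable or the socket cannot be
    created, e.g. for lack of CAP_NET_RAW or UID 0.
    """
    family = getattr(socket, "AF_PACKET", None)
    if family is None:
        return None  # not a Linux build of Python
    try:
        # socket() takes the protocol in network byte order, hence htons().
        sock = socket.socket(family, socket.SOCK_RAW, socket.htons(ethertype))
    except OSError:
        return None
    if ifname is not None:
        try:
            # Protocol 0 in the bind address leaves the protocol chosen
            # at socket() time unchanged.
            sock.bind((ifname, 0))
        except OSError:
            sock.close()
            raise
    return sock
```

Calling open_packet_socket("lo", ETH_P_IP) as root would, under these assumptions, yield a socket that sees only IPv4 frames on the loopback interface.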
SOCK_RAW
Raw packets including the link-level header.
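Since a SOCK_RAW packet socket expects the link-level header from the caller, the sender assembles the whole frame itself. The sketch below builds an Ethernet II frame by hand; build_ethernet_frame and mac_to_bytes are illustrative names, and Ethernet is assumed as the link type (other link types have different headers).

```python
import struct

ETH_P_IP = 0x0800  # linux/if_ether.h: IPv4


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to its 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError("expected a 6-byte MAC address")
    return raw


def build_ethernet_frame(dst_mac, src_mac, ethertype, payload):
    """Prepend a 14-byte Ethernet II header to payload."""
    # "!" selects network byte order: destination, source, EtherType.
    header = struct.pack("!6s6sH",
                         mac_to_bytes(dst_mac),
                         mac_to_bytes(src_mac),
                         ethertype)
    return header + payload
```

The resulting bytes could then be handed to send() on a bound PF_PACKET, SOCK_RAW socket.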
SOCK_DGRAM
Cooked packets: the link-level header is not part of the packet data; link-layer information is instead carried in a common, protocol-independent form in a sockaddr_ll.
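With SOCK_DGRAM, the destination travels in the address rather than in the packet bytes. In Python, a sockaddr_ll is expressed as the tuple (ifname, proto, pkttype, hatype, addr). The sketch below shows one way this might look; cooked_destination and send_cooked are hypothetical helpers, and Ethernet addressing is assumed.

```python
import socket

ETH_P_IP = 0x0800  # linux/if_ether.h: IPv4


def cooked_destination(ifname, ethertype, dst_mac):
    """Build the Python address tuple that maps onto a sockaddr_ll.

    pkttype and hatype are left at 0 when sending.
    """
    hw_addr = bytes.fromhex(dst_mac.replace(":", ""))
    return (ifname, ethertype, 0, 0, hw_addr)


def send_cooked(ifname, ethertype, dst_mac, payload):
    """Send a Layer 3 payload; the kernel builds the link-layer header.

    Needs Linux plus CAP_NET_RAW or UID 0.
    """
    with socket.socket(socket.AF_PACKET, socket.SOCK_DGRAM,
                       socket.htons(ethertype)) as sock:
        return sock.sendto(payload,
                           cooked_destination(ifname, ethertype, dst_mac))
```

Here payload would begin at the IP header, in contrast to the SOCK_RAW case, where it begins at the Ethernet header.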
AF_INET/AF_INET6
The protocol field restricts which IP protocols (as identified by the protocol number in the IP header) will be passed to the socket. An IPPROTO_RAW socket can send packets carrying arbitrary IP protocol values (for transmission, but not reception), and thus implies the IP_HDRINCL socket option. Even if IP_HDRINCL is set, the kernel might rewrite the following header fields:
- Total length (always filled in)
- Checksum (always filled in)
- Packet ID (given a random value if set to 0)
- Source address (selected via routing lookup if set to 0)
The IP_HDRINCL (IPV6_HDRINCL) option is not supported for IPv6 raw sockets.
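To illustrate the rewritten fields, here is a sketch of a caller-supplied IPv4 header for use with IP_HDRINCL. It leaves the total length, checksum, packet ID and source address at zero, on the assumption that the kernel fills them in as listed above. build_ipv4_header is an illustrative name, and IP options are not handled.

```python
import socket
import struct


def build_ipv4_header(dst, protocol, src="0.0.0.0", ident=0, ttl=64):
    """Pack a 20-byte IPv4 header, leaving kernel-filled fields at 0."""
    version_ihl = (4 << 4) | 5  # IPv4, header of 5 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl,
        0,                      # DSCP/ECN
        0,                      # total length: left for the kernel
        ident,                  # packet ID: 0 asks for a generated value
        0,                      # flags and fragment offset
        ttl,
        protocol,
        0,                      # checksum: left for the kernel
        socket.inet_aton(src),  # 0.0.0.0 asks for a routing lookup
        socket.inet_aton(dst),
    )
```

A header built this way would be prepended to the payload and passed to sendto() on a PF_INET, SOCK_RAW, IPPROTO_RAW socket.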
SOCK_RAW
Path MTU discovery is performed on outgoing packets unless disabled via the IP_MTU_DISCOVER sockopt, in which case packets might be fragmented if they exceed the outgoing device's MTU (unless IP_HDRINCL is used). Fragment reassembly is performed before delivering packets to receiving sockets. sendto() must be used with raw sockets, as opposed to a mere send(). When using an IPv6 raw socket, sin6_port must be set to 0 to avoid an EINVAL ("Invalid argument") error.
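The points above might be combined as in the following sketch. The IP_MTU_DISCOVER and IP_PMTUDISC_* values are copied from linux/in.h, since it is not assumed that Python's socket module exports them. raw_destination, set_pmtu_discovery and send_raw are hypothetical helpers, and hosts are assumed to be numeric addresses.

```python
import socket

# Values from linux/in.h (Linux-specific).
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DONT = 0  # never set DF; allow local fragmentation
IP_PMTUDISC_DO = 2    # always set DF


def raw_destination(host):
    """Address tuple for sendto(); the port is always 0 for raw sockets."""
    if ":" in host:
        return (host, 0, 0, 0)  # IPv6: sin6_port must be 0
    return (host, 0)


def set_pmtu_discovery(sock, enabled):
    """Turn path MTU discovery on or off for an IPv4 socket."""
    mode = IP_PMTUDISC_DO if enabled else IP_PMTUDISC_DONT
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, mode)


def send_raw(host, protocol, payload, pmtu_discovery=True):
    """Send payload over a raw socket (needs CAP_NET_RAW or UID 0)."""
    family = socket.AF_INET6 if ":" in host else socket.AF_INET
    with socket.socket(family, socket.SOCK_RAW, protocol) as sock:
        if family == socket.AF_INET:
            set_pmtu_discovery(sock, pmtu_discovery)
        # sendto() rather than send(): the socket is not connected.
        return sock.sendto(payload, raw_destination(host))
```

For example, send_raw("192.0.2.1", socket.IPPROTO_ICMP, icmp_bytes, pmtu_discovery=False) would send a caller-built ICMP message (icmp_bytes being a placeholder) while allowing fragmentation at the outgoing device.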
SOCK_PACKET
Obsolete since Linux 2.0.
See also
- Linux APIs and FreeBSD APIs
- "BPF, XDP, Packet Filters and UDP" at fly.io