Netlink: Difference between revisions

From dankwiki
(solved the ss NETLINK_NFLOG mystery)

Revision as of 20:00, 12 November 2019

Netlink sockets (PF_NETLINK) are a mechanism within Linux to retrieve and manage various aspects of the networking stacks -- they are a Linux-specific extension to the Berkeley Sockets model, and should not be used in portable programs. The information available via netlink sockets was previously available to userspace, if at all, via a collection of ioctl(2)s and a grab bag of custom-purpose get*(2) system calls; the majority of these are obsoleted by netlink sockets, but still implemented for backwards compatibility. RFC 3549 provides a snapshot current as of kernel 2.4.6; the netlink socket interface, however, is prone to change. That doesn't affect RFC 3549 as much as one might think, as it really has nothing to do with the netlink programming model; I suspect it to be a joke Andi Kleen perpetrated knowing that W. Richard Stevens wasn't around to call him out on it anymore.

The netlink(3) man page includes the following text:

NOTES
       It is often better to use netlink via libnetlink than via the low-level
       kernel interface.

It has been this author's experience that this is untrue; the cold hard reality is that just about anything involving netlink sockets is bound to be unpleasant, usually in the extreme, and libnetlink won't improve things in the slightest. libdank has grown a capable netlink module over the years, and I would advise its use.

Netlink Families

The netlink family is the third argument to socket(2); the full and current list of families can be found in your local netlink(7) man page. Here are the important ones:

  • NETLINK_ROUTE — pretty much everything corresponding to ip(8), also known as iproute, including:
    • RTM_NEWLINK, RTM_DELLINK, RTM_GETLINK — device tables (ifinfomsg and rtattr structs) (see netdevice(7))
    • RTM_NEWADDR, RTM_DELADDR, RTM_GETADDR — address tables (ifaddrmsg and rtattr structs)
    • RTM_NEWROUTE, RTM_DELROUTE, RTM_GETROUTE — routing tables (rtmsg and rtattr structs)
    • RTM_NEWNEIGH, RTM_DELNEIGH, RTM_GETNEIGH — neighbor (ARP, for IPv4) tables (ndmsg structs)
    • RTM_NEWRULE, RTM_DELRULE, RTM_GETRULE — rule tables for advanced routing (rtmsg structs)
    • See rtnetlink(7) for more info
  • NETLINK_SOCK_DIAG — socket snapshots, as used by ss(8)
    • Aliased as NETLINK_INET_DIAG
    • This can only generate snapshots (it was originally added to assist checkpointing). It cannot be subscribed to for streaming events.
  • NETLINK_NFLOG — iptables replacement for NETLINK_QUEUE since 2.6.14
    • Userspace provided by libnetfilter
    • ss(8) uses this when invoked with -E to print events continuously
  • NETLINK_QUEUE — obsolete iptables packet interface for userspace
    • Used the ip_queue kernel module and the QUEUE target
    • Userspace was provided by the libipq wrapper library.
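
To make the "third argument to socket(2)" business concrete, here is a minimal sketch of opening a NETLINK_ROUTE socket and asking for a dump of the device table via RTM_GETLINK. The struct and function names (getlink_req, open_netlink, build_getlink_dump, request_link_dump) are my own inventions for illustration, not anything from libnetlink or libdank, and error handling is cut to the bone. It does not read the reply.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical request layout: netlink header followed by an ifinfomsg. */
struct getlink_req {
    struct nlmsghdr nlh;
    struct ifinfomsg ifi;
};

/* Open and bind a netlink socket; `family` is the third argument to socket(2). */
int open_netlink(int family)
{
    struct sockaddr_nl sa;
    int fd = socket(AF_NETLINK, SOCK_RAW, family);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK; /* nl_pid left at 0: the kernel picks an id */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Fill in a request asking the kernel to dump the whole device table. */
void build_getlink_dump(struct getlink_req *req, unsigned int seq)
{
    memset(req, 0, sizeof(*req));
    req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
    req->nlh.nlmsg_type = RTM_GETLINK;
    req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    req->nlh.nlmsg_seq = seq;
    req->ifi.ifi_family = AF_UNSPEC;
}

/* Send the dump request to the kernel; returns 0 on success, -1 on error. */
int request_link_dump(int fd, unsigned int seq)
{
    struct getlink_req req;
    struct sockaddr_nl kernel;
    ssize_t n;

    build_getlink_dump(&req, seq);
    memset(&kernel, 0, sizeof(kernel));
    kernel.nl_family = AF_NETLINK; /* nl_pid 0 addresses the kernel */
    n = sendto(fd, &req, req.nlh.nlmsg_len, 0,
               (struct sockaddr *)&kernel, sizeof(kernel));
    return n == (ssize_t)req.nlh.nlmsg_len ? 0 : -1;
}
```

A caller would do something like open_netlink(NETLINK_ROUTE) followed by request_link_dump(fd, 1), then recv(2) in a loop until it sees NLMSG_DONE. Swapping in RTM_GETADDR or RTM_GETROUTE needs the matching payload struct from the list above.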

Netlink Stupidity

Each time I come into contact with a new piece of netlink or the code that uses it, I'm flabbergasted by the utter lack of design elegance or even basic good taste. Alexey Kuznetsov, the primary author, is almost famous for getting the bits from one end of the wire to another in the most efficient and ugliest way possible (but I wouldn't try to write a better networking stack). PF_NETLINK and all it touches feels like someone took all the untyped, unsafe ioctl(2) layers, wrapped them up with some message queues, stuck them in an Eastern European hellhole and waited for NATO air power to solve the design problem.

The "Big Tent" approach to socket(2) (from netlink(7)):

Netlink is a datagram-oriented service. Both SOCK_RAW and SOCK_DGRAM are valid values for socket_type. However, the netlink protocol does not distinguish between datagram and raw sockets.
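
A quick way to see this for yourself is a sketch like the following, which just tries to open a NETLINK_ROUTE socket with whichever socket type you hand it. The helper name netlink_type_works is made up for this page, and I'm assuming a Linux box where AF_NETLINK is available at all.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <unistd.h>

/* Return 1 if a NETLINK_ROUTE socket of the given type can be opened, else 0. */
int netlink_type_works(int socket_type)
{
    int fd = socket(AF_NETLINK, socket_type, NETLINK_ROUTE);

    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}
```

If the man page is to be believed, netlink_type_works(SOCK_RAW) and netlink_type_works(SOCK_DGRAM) should always agree.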

Things like this all over the place (taken from misc/ss.c in the iproute source package):

req.nlh.nlmsg_seq = 123456;
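
For contrast, netlink(7) describes nlmsg_seq as a field for tracking messages, and its own example bumps a counter for every request. A minimal sketch of that convention follows; the nl_seq_state struct and both helpers are hypothetical names of mine, and I'm assuming the usual behavior where a reply carries the sequence number of the request it answers.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <stdint.h>

/* Hypothetical per-socket state: the last sequence number handed out. */
struct nl_seq_state {
    uint32_t last_seq;
};

/* Stamp a request with the next sequence number and return that number. */
uint32_t nl_stamp_request(struct nl_seq_state *st, struct nlmsghdr *nlh)
{
    nlh->nlmsg_seq = ++st->last_seq;
    return nlh->nlmsg_seq;
}

/* Return 1 if a reply header carries the sequence number we are waiting on. */
int nl_reply_matches(const struct nlmsghdr *reply, uint32_t expected_seq)
{
    return reply->nlmsg_seq == expected_seq;
}
```

With a counter like this, a stale or unrelated reply can at least be told apart from the one you asked for, which a constant like 123456 can't do once more than one request is in flight.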

See Also