Fast UNIX Servers: Difference between revisions
From dankwiki
Revision as of 21:38, 25 June 2009
Everyone ought to start with Dan Kegel's classic site, "The C10K Problem" (still updated from time to time). Jeff Darcy's "High-Performance Server Architecture" covers much the same ground. Everything here is advanced follow-up material to these excellent works and, of course, to the books of W. Richard Stevens.
Queueing Theory
- "Introduction to Queueing"
- Leonard Kleinrock's peerless Queueing Systems (Volume 1: Theory, Volume 2: Computer Applications)
Event Cores
- epoll on Linux, /dev/poll on Solaris, kqueue on FreeBSD
- liboop, libev and libevent
- Ulrich Drepper's "The Need for Asynchronous, Zero-Copy Network I/O"
  - If nothing else, Drepper's plans tend to become sudden and crushing realities in the glibc world
Edge and Level Triggering
- Historic interfaces like POSIX.1g/POSIX.1-2001's select(2) and POSIX.1-2001's poll(2) were level-triggered
- epoll (via EPOLLET) and kqueue (via EV_CLEAR) provide edge-triggered semantics
- FIXME: a thorough comparison of these is sorely needed
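
The difference between the two triggering modes can be observed directly. What follows is a minimal sketch, assuming Linux and Python's select.epoll wrapper; the helper name poll_twice is hypothetical. It makes one byte readable on a pipe and then polls twice without reading it. A level-triggered registration should report the descriptor both times, while an EPOLLET registration should report it only once, until new data arrives.

```python
import os
import select


def poll_twice(edge_triggered):
    """Make a pipe readable, then poll it twice without draining it.

    Returns the number of events seen by the first and second poll.
    Hypothetical helper for illustration; Linux-only (epoll).
    """
    r, w = os.pipe()
    ep = select.epoll()
    try:
        flags = select.EPOLLIN
        if edge_triggered:
            flags |= select.EPOLLET
        ep.register(r, flags)
        os.write(w, b"x")           # one readiness transition
        first = len(ep.poll(0))     # timeout 0: return immediately
        second = len(ep.poll(0))    # nothing was read in between
        return first, second
    finally:
        ep.close()
        os.close(r)
        os.close(w)
```

The practical consequence is that an edge-triggered consumer must read until EAGAIN before polling again, or it may never be told about the data it left behind.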
A Garden of Interfaces
- readv(2), writev(2) (FreeBSD's sendfile(2) handily accepts struct iovec header and trailer arrays via its struct sf_hdtr argument)
- splice(2), vmsplice(2) and tee(2) on Linux since version 2.6.17
  - (When the first page of results for your interface centers largely on exploits, might it be time to reconsider your design assumptions?)
- sendfile(2) (with charmingly different interfaces on FreeBSD and Linux)
  - On Linux since 2.6.2x (FIXME get a link), sendfile(2) is implemented in terms of splice(2)
- aio_ and friends for asynchronous I/O
- mmap(2) and an entire associated bag of tricks (FIXME detail)
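
Two of these interfaces combine naturally when serving a file: a gather write for the response header, then sendfile(2) for the body. The sketch below assumes Linux and uses Python's os.writev and os.sendfile wrappers. The function name serve_file and the header contents are hypothetical, and a short write of the header is not handled, which a real server would need to do.

```python
import os
import socket
import tempfile


def serve_file(sock, header_parts, path):
    """Send header buffers with one writev(2), then the file with sendfile(2).

    Hypothetical helper for illustration. Returns the number of body
    bytes sent. Assumes a blocking socket and a small header.
    """
    # Gather write: the header pieces are never concatenated in userspace.
    os.writev(sock.fileno(), header_parts)
    with open(path, "rb") as f:
        remaining = os.fstat(f.fileno()).st_size
        offset = 0
        while remaining > 0:
            # The kernel moves file pages to the socket; no read()/write() copy loop.
            sent = os.sendfile(sock.fileno(), f.fileno(), offset, remaining)
            if sent == 0:
                break               # file was truncated underneath us
            offset += sent
            remaining -= sent
    return offset
```

On FreeBSD the header could ride along in the sendfile(2) call itself, which is exactly the interface difference noted in the list above.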
The Full Monty: A Theory of UNIX Servers
We must mix and match:
- Many event sources, of multiple types and possibly various triggering mechanisms (edge- vs level-triggered):
  - Socket descriptors, pipes
  - File descriptors referring to actual files (these usually have different blocking semantics)
  - Signals, perhaps being used for asynchronous I/O with descriptors (signalfd(2) on Linux unifies these with socket descriptors; kqueue supports EVFILT_SIGNAL events)
  - Timers (timerfd_create(2) on Linux unifies these with socket descriptors; kqueue supports EVFILT_TIMER events)
  - Condition variables and/or mutexes becoming available
  - Filesystem events (inotify(7) on Linux, EVFILT_VNODE with kqueue)
  - Networking events (netlink(7) (PF_NETLINK) sockets on Linux, EVFILT_NETDEV with kqueue)
- One or more event notifiers (epoll or kqueue fd)
- One or more event vectors, into which notifiers dump events
  - kqueue supports vectorized registration of event changes, extending the issue
- Threads: one event notifier per thread? One shared event notifier with one event vector per thread? One shared event notifier feeding one shared event vector? Work-stealing or handoff?
  - It is doubtful (but not, AFAIK, proven impossible) that one scheduling/sharing solution is optimal for all workloads
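
One point in this design space, a single shared event notifier with one event vector per thread, can be sketched as follows. This is an illustration under stated assumptions rather than a recommendation: it assumes Linux, uses EPOLLONESHOT so that each ready descriptor is handed to exactly one worker, and the names drain_with_workers, n_pipes and n_workers are hypothetical.

```python
import os
import select
import threading


def drain_with_workers(n_pipes=4, n_workers=2):
    """Share one epoll fd among worker threads, each with its own event vector.

    Hypothetical helper for illustration. Returns (sorted read fds,
    sorted fds actually handled) so the caller can compare them.
    """
    ep = select.epoll()
    pipes = [os.pipe() for _ in range(n_pipes)]
    handled = []                    # (worker id, fd) pairs
    lock = threading.Lock()
    for r, w in pipes:
        # EPOLLONESHOT disarms the fd after one delivery, so two workers
        # never service the same descriptor concurrently.
        ep.register(r, select.EPOLLIN | select.EPOLLONESHOT)
        os.write(w, b"x")

    def worker(wid):
        while True:
            # Per-thread event vector of at most 2 entries.
            events = ep.poll(0.2, 2)
            if not events:
                return              # idle for 200 ms: assume work is done
            for fd, _mask in events:
                os.read(fd, 1)
                with lock:
                    handled.append((wid, fd))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    ep.close()
    read_fds = sorted(r for r, _ in pipes)
    for r, w in pipes:
        os.close(r)
        os.close(w)
    return read_fds, sorted(fd for _, fd in handled)
```

A long-lived server would re-arm each descriptor with epoll's modify operation once a worker has finished with it. Which worker ends up handling which descriptor is left to the kernel and the scheduler, which is part of why no single arrangement is obviously best.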
DoS Prevention or, Maximizing Useful Service
- TCP SYN -- to Syncookie or nay? The "half-open session" isn't nearly as meaningful or important a concept on modern networking stacks as it was in 2000.
  - Syncookies cost long-fat-pipe options, allow fewer MSS values, etc., but recent work (in Linux, at least) has improved matters (my gut feeling: nay)
- Various attacks like slowloris, TCPPersist as written up in Phrack 0x0d-0x42-0x09, Outpost24 etc...
- What are the winning feedbacks? Fractals and queueing theory, oh my! (FIXME detail)
See Also
- "sendfile(): fairly sexy (nothing to do with ECN)" on lkml
- "mmap() sendfile()" on freebsd-hackers
- "sharing memory map between processes (same parent)" on comp.unix.programmer
- "some mmap observations compared to Linux 2.6/OpenBSD" on freebsd-hackers
- Stuart Cheshire's "Laws of Networkdynamics" and "It's the Latency, Stupid"
- "mremap help? or no support for FreeBSD?" on freebsd-hackers
- "Edge-triggered interfaces are too difficult?" on LWN, 2003-05-16
- "Edge- vs Level-Triggered Events" on Pierre Phaneuf's livejournal (pphaneuf)
- "edge-triggered vs level-triggered epoll in kernel 2.6" on comp.unix.programmer, 2004-12-01