Check out my first novel, midnight's simulacra!

UNIX Weapons School: Difference between revisions

From dankwiki
No edit summary
No edit summary
(43 intermediate revisions by the same user not shown)
Line 1: Line 1:
[[File:UWS.png|right]]
[[File:UWS.png|right]]
'''[http://nick-black.com/unix-weapons-school.html UNIX WEAPONS SCHOOL 2013 FLYER]''' ''(Update: CS4803 UWS may well get offered at GT this summer!)''


This syllabus is '''''TENTATIVE''''' and '''''PRELIMINARY'''''!
Lecture slides are available [http://nick-black.com/lectures.html here]. These slides are very much a work in progress; LaTeX source is available on [https://github.com/dankamongmen/unix-weapons-school GitHub].


''Accept the challenges so that you can feel the exhilaration of victory.'' -- George S. Patton
''Accept the challenges so that you can feel the exhilaration of victory.'' -- George S. Patton
[[File:topgun.png|right|thumb|On March 3, 1969, the United States Navy established an elite school for the top one percent of its pilots. Its purpose was to teach the lost art of aerial combat and to ensure that the handful of students who graduated were the best fighter pilots in the world. They succeeded. Today, the Navy calls it Fighter Weapons School. The flyers call it: TOP GUN.]]


UNIX Weapons School is an intense elective practicum. '''Its goal is to train formidably competent systems programmers.'''
UNIX Weapons School is an intense elective practicum. '''Its goal is to train formidably competent systems programmers.'''
* The name owes heritage, I realize, to Richard Bejtlich's and TaoSecurity's superb [http://www.taosecurity.com/training.html TCP/IP Weapons School] from BlackHat 2007. Thanks Richard!


GoogleGroups hosts our (permanent, public) [https://groups.google.com/forum/?fromgroups#!forum/unix-weapons-school mailing list].
''update from 2021: this class would now absolutely go into io_uring and eBPF. also SSDs and the resultant changes to the I/O paradigm. there would probably be some talk of Spectre/Meltdown in conjunction with the architecture lecture, also BIG.little and core scheduling. probably a bit more coverage of ARM details, though i'll not yet yeet the x86 stuff. the "future of systems programming section" would be updated. a bit more talk about distributed systems, probably, and i'd discard the current material about machine learning. other than that, this still pretty much stands.''
-----
 
If you feel the urge to cheat, '''please drop the class'''. I will seek out cheating via comparison of compiled object code, statistical analysis of unit test results, and other methods. If you cheat in my class, I will use any influence I have to see you removed from my Institute and blacklisted from my industry. I will haunt you, yelling obscenities and bizarre threats for your life's pitiful remainder. If you go to China, I will be hiding in a bowl of rice, ready to pop a cap in your ass.
==Lectures thus far==
-----
===2013-05-14 Greetings and Salutations===
* "[http://nick-black.com/intro.pdf Intro + official junk]"
** Motivation
** Class policies
** Self-assessment
* "[http://nick-black.com/bigo.pdf Big-O Ain't What it Used to Be]"
** Review of asymptotic complexity
** Case study: The impact of hardware upon matrix multiplication
 
===2013-05-16 Computer Architecture===
* "[http://nick-black.com/comparch.pdf Your Friend the Computer]"
** CMOS power equations and pipelining
** Superscaler and out-of-order
** Delays in dynamically pipelined processors
 
===2013-05-21 The x86===
* "[http://nick-black.com/x86.pdf The x86 is dead. Long live the x86!]"
** Guided tour of the x86's history
** Core frontend + backend
** x86+SSE+VEX instruction set
 
===2013-05-23 UNIX Development===
* "[http://nick-black.com/practicum.pdf Not Sucking in the UNIX Environment]"
** Shell
** C development: Autotools, gcc, binutils
** C++ development: CMake, g++/clang
** Debugging: gdb, valgrind, strace, ltrace
 
===2013-08-01 FINAL EXAM SUMMER 2013===
[[UWC Summer 2013 Final Exam]]


Without systems programmers, hardware would sit silent, stillborn. Without systems programmers, application designers shout mutely into a void.
==Cheating==
If you feel predisposed to cheating, '''I urge you to drop the class'''. I will seek out cheating via comparison of compiled object code, statistical analysis of unit test results, and other methods. If you cheat in my class, I will use any influence I have to see you removed from my Institute and blacklisted from my industry. I will haunt you, yelling obscenities and bizarre threats for your life's pitiful remainder. My children will target your children for beatings, for unto seventy times seven generations.


The reconfigurability of software combined with the system bounding of hardware allows the systems programmer the broadest space with which to innovate. Unfortunately, most of these possible innovations are just bugs, or (within the innermost ring) superbugs. When that signed vs unsigned comparison means the wrong hard drive sector is selected, it can ruin your whole day.
==Motivation==
<blockquote>"The TOPGUN course is designed to train already experienced Navy and Marine Corps aircrews at the graduate level in all aspects of strike-fighter aircraft employment, including tactics, hardware, techniques and the current world threat for air-to-air and air-to-ground missions. The course includes eighty hours of lectures and twenty-five flights that pit students against TOPGUN instructors. When a pilot or WSO completes the TOPGUN course he/she will return as a Training Officer carrying the latest tactical doctrine back to their operational squadron, or go directly to an FRS squadron to teach new aircrews."</blockquote>


Systems programming unites the most fundamental realms of our science: [[Programming Language Theory|programming language theory]], [[Compiler Design|compiler design]], [[architecture]], and algorithms come together in almost every project. The systems programmer, sitting in the neck of abstractions' hourglass, need never ask "and at the level below me, how does it work?" When the systems programmer fails, computations become uncomputable. It is a glorious role to play, but glory derives from risk, risk engenders fear, and fear stanches opportunity.
Without systems programmers, hardware sits silent, stillborn. Without systems programmers, application designers shout mutely into a void.


This class is about beginning to eliminate fear, at least as it pertains to programming. False fears can be combated with knowledge, and true fears with experience. Your fears will be true fears. Each project in this class could easily occupy a semester. You'll be expected to tie together literally dozens of concepts and data, many or most of them completely new to you. Underneath this assemblage lies the bedrock of algorithmic problem solving and the system's development tools. If your bedrock shifts, you may have a very difficult time indeed. If you complete the projects successfully, however, you will have experienced a great deal. Your fear will be slackened, bounded, and in time extirpated. At that point -- systems programming nirvana -- the computer's entirety opens up before you, a tapestry unraveled, a continuous whole. Processes and abstractions fall away, and all is unified, a single state machine. You will swoop among UNIX's disparate source codes, surgically eliding imperfections. You will improve the world one line of bug-free code at a time, one line in a strand of millions.
The reconfigurability of software combined with the system bounding of hardware allows the systems programmer the broadest space with which to innovate. Unfortunately, most of these possible innovations are just bugs, or (within the innermost ring) superbugs. When that signed vs unsigned comparison means the wrong hard drive sector is selected, it can ruin your whole day. Systems programming unites the most fundamental realms of our science: [[Programming Language Theory|programming language theory]], [[Compiler Design|compiler design]], [[architecture]], and algorithms come together in practical application. These theories are complex enough to demand classes of their own. Here, we will integrate these prerequisite theories to deliver efficient, robust systems software solutions. Furthermore, you'll become acquainted with the gritty details of cutting-edge technique.


<blockquote>"Classes typically lose around 70–80% of their trainees, either due to DORs or injuries sustained during training, but it is not always easy to predict which of the trainees will DOR during BUD/S. Winter class drop out rates are usually higher due to the cold. SEAL instructors say that in every class, approximately 10 percent of the students simply do not have the physical ability to complete the training. Another 10–15 percent will definitely make it through unless they sustain a serious physical injury. The other 75–80 percent is 'up for grabs' depending on their motivation."
==Grading==
<blockquote>"Classes typically lose around 70–80% of their trainees, either due to DORs (Drop-on-Requests) or injuries sustained during training, but it is not always easy to predict which of the trainees will DOR. SEAL instructors say that in every class, approximately 10 percent of the students simply do not have the physical ability to complete the training. Another 10–15 percent will definitely make it through unless they sustain a serious physical injury. The other 75–80 percent is 'up for grabs' depending on their motivation."
<br>            -- Basic Underwater Demolition/SEAL training, as described in Dick Crouch's ''The Warrior Elite: The Forging of SEAL Class 228'' (2001)</blockquote>
<br>            -- Basic Underwater Demolition/SEAL training, as described in Dick Crouch's ''The Warrior Elite: The Forging of SEAL Class 228'' (2001)</blockquote>


Because I am hard, you will not like me. But the more you hate me, the more you will learn.
<blockquote>"Qui si convien lasciare ogne sospetto; ogne viltà convien che qui sia morta."
<br>            -- Dante Alighieri, ''The Inferno'', Canto III, 14-15</blockquote>


Each project will be evaluated for both correctness (in terms of unit tests passed) and performance (on several reference platforms). Performance of incorrect code is, of course, moot. I will:
[https://www.youtube.com/watch?v=tHxf17yJsKs Because I am hard, you will not like me. But the more you hate me, the more you will learn.]
* Made the project public two weeks prior to its due date
* Provide a set of basic unit tests
* Provide a web-accessible oracular reference implementation (you can submit inputs and receive outputs)
* Collect projects
* Review them by hand, looking for lacunae. Any problems spotted will be added to the unit tests.
* Run all projects against the collected unit tests.
* Eliminate (award a score of zero to) any projects which irregularly halt, or do not halt within some time.
* Each unit test T_n will be successfully completed by some S_n projects, ordered (0--S_n-1) by decreasing time to completion (i.e. slowest is 0).
** If S_n is 1, the single successful project receives 1.5 points.
** Otherwise, each of the S_n successful projects receives points equal to |sortpos| / S_n-1.
** Thus, the fastest will receive 1 point, and those between always get more points for beating more people while being bested by fewer
*** And the slowest might as well have been wrong, which I love.
*** Unless, of course, the slowest was also the fastest. Good for you!


This scheme ought maximize competition and conflict between class members, which I hope will reduce cheating, lead to great personal exertions, and possibly get a chair thrown.
Each project will be evaluated for both correctness (in terms of unit tests passed) and performance (on several reference platforms). Performance of incorrect code is, of course, moot. I will make each project public two weeks prior to its due date, and provide a set of basic unit tests and a web-accessible oracular reference implementation (you can submit inputs and receive outputs). For details of grading, see the first week's slides. I am using a scheme designed to maximize competition and conflict between class members, which I hope will reduce cheating, lead to great personal exertions, and possibly get a chair thrown.


This is Sparta, young engineers. Come back with your shield, or upon it.
{{:UNIX Weapons School Weekplan}}


{{:UNIX Weapons School Weekplan}}
==Thanks==
* Bill Leahy, for causing GT to pick UWS up as a class, at a time when it was nothing more than a single-page PDF flyer.
* David Kanter of [http://realworldtech.com Real World Tech] for use of his superb processor diagrams, and his years of insightful writing besides.
* Richard Bejtlich's and TaoSecurity's [http://www.taosecurity.com/training.html TCP/IP Weapons School] from BlackHat 2007, for nomenclature.
** ...and through them, of course, to the United States Navy's [http://en.wikipedia.org/wiki/United_States_Navy_Fighter_Weapons_School Fighter Weapons School] aka "Top Gun".

Revision as of 09:40, 12 August 2022

Lecture slides are available here. These slides are very much a work in progress; LaTeX source is available on GitHub.

Accept the challenges so that you can feel the exhilaration of victory. -- George S. Patton

On March 3, 1969, the United States Navy established an elite school for the top one percent of its pilots. Its purpose was to teach the lost art of aerial combat and to ensure that the handful of students who graduated were the best fighter pilots in the world. They succeeded. Today, the Navy calls it Fighter Weapons School. The flyers call it: TOP GUN.

UNIX Weapons School is an intense elective practicum. Its goal is to train formidably competent systems programmers.

update from 2021: this class would now absolutely go into io_uring and eBPF. also SSDs and the resultant changes to the I/O paradigm. there would probably be some talk of Spectre/Meltdown in conjunction with the architecture lecture, also BIG.little and core scheduling. probably a bit more coverage of ARM details, though i'll not yet yeet the x86 stuff. the "future of systems programming section" would be updated. a bit more talk about distributed systems, probably, and i'd discard the current material about machine learning. other than that, this still pretty much stands.

Lectures thus far

2013-05-14 Greetings and Salutations

2013-05-16 Computer Architecture

  • "Your Friend the Computer"
    • CMOS power equations and pipelining
    • Superscaler and out-of-order
    • Delays in dynamically pipelined processors

2013-05-21 The x86

2013-05-23 UNIX Development

2013-08-01 FINAL EXAM SUMMER 2013

UWC Summer 2013 Final Exam

Cheating

If you feel predisposed to cheating, I urge you to drop the class. I will seek out cheating via comparison of compiled object code, statistical analysis of unit test results, and other methods. If you cheat in my class, I will use any influence I have to see you removed from my Institute and blacklisted from my industry. I will haunt you, yelling obscenities and bizarre threats for your life's pitiful remainder. My children will target your children for beatings, for unto seventy times seven generations.

Motivation

"The TOPGUN course is designed to train already experienced Navy and Marine Corps aircrews at the graduate level in all aspects of strike-fighter aircraft employment, including tactics, hardware, techniques and the current world threat for air-to-air and air-to-ground missions. The course includes eighty hours of lectures and twenty-five flights that pit students against TOPGUN instructors. When a pilot or WSO completes the TOPGUN course he/she will return as a Training Officer carrying the latest tactical doctrine back to their operational squadron, or go directly to an FRS squadron to teach new aircrews."

Without systems programmers, hardware sits silent, stillborn. Without systems programmers, application designers shout mutely into a void.

The reconfigurability of software combined with the system bounding of hardware allows the systems programmer the broadest space with which to innovate. Unfortunately, most of these possible innovations are just bugs, or (within the innermost ring) superbugs. When that signed vs unsigned comparison means the wrong hard drive sector is selected, it can ruin your whole day. Systems programming unites the most fundamental realms of our science: programming language theory, compiler design, architecture, and algorithms come together in practical application. These theories are complex enough to demand classes of their own. Here, we will integrate these prerequisite theories to deliver efficient, robust systems software solutions. Furthermore, you'll become acquainted with the gritty details of cutting-edge technique.

Grading

"Classes typically lose around 70–80% of their trainees, either due to DORs (Drop-on-Requests) or injuries sustained during training, but it is not always easy to predict which of the trainees will DOR. SEAL instructors say that in every class, approximately 10 percent of the students simply do not have the physical ability to complete the training. Another 10–15 percent will definitely make it through unless they sustain a serious physical injury. The other 75–80 percent is 'up for grabs' depending on their motivation."
-- Basic Underwater Demolition/SEAL training, as described in Dick Crouch's The Warrior Elite: The Forging of SEAL Class 228 (2001)

"Qui si convien lasciare ogne sospetto; ogne viltà convien che qui sia morta."
-- Dante Alighieri, The Inferno, Canto III, 14-15

Because I am hard, you will not like me. But the more you hate me, the more you will learn.

Each project will be evaluated for both correctness (in terms of unit tests passed) and performance (on several reference platforms). Performance of incorrect code is, of course, moot. I will make each project public two weeks prior to its due date, and provide a set of basic unit tests and a web-accessible oracular reference implementation (you can submit inputs and receive outputs). For details of grading, see the first week's slides. I am using a scheme designed to maximize competition and conflict between class members, which I hope will reduce cheating, lead to great personal exertions, and possibly get a chair thrown.

Week 1: C/C++ development in the x86 UNIX environment

  • SHELL LIFE aka Things I Hope You Already Know
    • Job control
    • GNU readline
    • SSH tricks
    • effective use of interactive shells
    • shell scripting idioms.
    • rant: code is data and data is code so keep your home directory in source control
  • HOW COAL BECOMES CAT PICTURES aka Attack of the clone()s
    • UEFI. UNIX boot sequence.
    • Everything you wanted to know about /dev and /sys and /proc but never found out.
    • Where a process comes from, what composes it while alive, and where it goes when it dies.
    • lsof, netstat, memstat, etc
  • C/C++ UNIX DEVELOPMENT aka Onward Christian Soldiers
    • Highlights of GCC, G++, LLVM, Clang, ICC, and NVCC.
    • GNU Make
    • strace, ltrace, ptrace().
    • GDB tricks.
    • Profiling with perf.
    • UNIX resource management.
    • Linker tricks both stupid and less stupid.
    • The C and C++ machine models.
  • THE WORLD FATHER aka UNIX
    • The Linux virtual memory implementation on x86
    • The FreeBSD/Dragonfly virtual memory implementation
    • The Linux process schedulers
    • The Linux I/O schedulers
  • OUR EARTH MOTHERS aka C/C++
    • The system call interface.
    • Process-level memory management.
    • The C standard library.
    • Xmacros / The STL.
    • A glimpse of template metaprogramming

Week 2: Systems methods for efficient use of memory and buses

  • YOUR FRIEND THE COMPUTER aka Computer Architecture in Thirty Minutes.
    • Intel Core processors.
    • The memory hierarchy.
    • Branch prediction.
    • SIMD.
    • Memory fences.
    • Transactional memory.
    • Predication.
  • DOUBLE-COPIED I/O aka Definitely More Copying Than Required.
    • C/C++ I/O using stdio.h and streams. Interactions of standard library buffering and I/O.
    • Semantics and side-effects of the I/O model.
    • popen() and the seven thousand ways it can be incorrectly used.
  • SINGLE-COPIED I/O aka Not Good Enough, Try Again.
    • UNIX I/O using bytestreams, datagrams, and sequenced packets.
    • The AF_UNIX namespace.
    • That mysterious EAGAIN.
  • ZERO-COPY I/O aka Now We're Getting Somewhere.
    • Mmap and shared memory.
    • TLB invalidation and IPIs
    • CLONE_VM and a glimpse of threads.
    • RDMA. The PCIe bus.
  • NEGATIVE-COPY I/O aka Oh Shit! There Go My Pages!
    • COW games.
    • sendfile() and TCP.
    • splice() your way to success.
  • MORE MEMORY aka When in Doubt, Blame Memory.
    • Interactions of disk, disk paging, memory paging, caching, registers, and multiprocessing.
    • Memory characteristics of long-lived programs.
    • Life in a post-paged world.
    • Computational memory.
    • NUMA and you.

Week 3: Algorithmic methods for efficient use of CPU and memory

  • IN THE GRIM FUTURE OF WEEK 3 THERE ARE NO AKAs, ONLY ALGORITHMS
  • Searching small spaces: Constant sorts (sorting networks). Dancing links. Timer wheels.
  • Searching large spaces: Trees. PATRICIA tries. Skip lists. Suffix trees. Automata search. Interval trees.
  • Searching by content: Hashes. Algorithmic complexity attacks. Universal hashes. Cuckoo hashing. Adaptive perfect hashes.
  • Searching huge spaces: VLHU. Enumeration by method of linear congruence and other space-filling parlor tricks.
  • Real-time machine learning: Support vector machines. Non-negative matrix factorization. Hierarchal hashing. Hidden Markov models.
  • Three impossible things before breakfast: Detecting an infinite loop, transforming an infinite list, and computing without executing.
  • Yes, You Really Have to Learn Fourier Transforms.

Algorithms for event systems

  • select(), poll(), interaction with signals
  • Linux's epoll, FreeBSD's kqueue. Level- vs edge-triggered events.
    • Algorithms for oneshot edge-triggered interfaces. Asynchronous I/O on POSIX systems.
  • Everything as events: timerfd, signalfd, V_NODE
  • Wrapping pthread events. Multithreaded event engines. Work queues, work stealing, tasklets.
    • Digression: Kernel threads and interrupt handlers
  • Robust systems and judicious fork()ing
  • libev, libevent, libtorque

Week 4: Compilers and their limitations

  • SSA and Basic Blocks, aka Planet of the Compilers
  • The Banerjee Test and the Polyhedral Model, aka Beneath the Planet of the Compilers
  • The limits of compiler-based optimization, aka Escape from the Planet of the Compilers
  • Inline assembly, aka Conquest of the Planet of the Compilers
  • PGO and Genetic-O aka Battle for the Planet of the Compilers

Week 5

Allotrios

  • Windows aka Unfathomably Wretched Function Naming: I/O Completion Ports. With Fibers come Heaps.
  • Libraries: objdump, nm, and how to design a shared library.
  • Internationalization. UTF8
  • Unpleasant Details of the UNIX Environment aka What Stevens Forgot to Tell You.
    • The OOM killer.
    • Atime.
    • Fragmentation, a spectre haunting your userspace.
    • Hardware failures.
  • Build systems aka They're All Shite

Parallelism I: Hardware Parallelism

  • Models of parallelism aka They're All Shite
  • Bit-level parallelism
  • Parallelism within a register
  • Parallelism among instructions
  • Parallelism among memory accesses

Week 6: Parallelism II: Software Parallelism

  • Parallelism among tasks
  • Algorithms simulating parallelism and nondeterminism.
  • POSIX threads. Userspace threading. Coroutines.
  • Parallel languages and libraries. IPP and TBB.

Week 7: Effective use of intranets and the Internet

  • Sampling theory of Nyquist
  • Queueing theory of Kleinrock. The Linux packet queue disciplines.
  • The Internet backbone. Preserving service via anycast networking. Threats to the Internet. PMTUD / MSS black holes.
  • Bufferbloat. Perils of the end-user network. Hardware design of fast networking devices. The CODEL and CAKE queue disciplines.
  • IPv6. Algorithms for IPv6. Zeroconf. PXE. Ad-hoc and mesh networking. Algorithms for fragmentation and sequencing. Intranet threats.
  • Packet sockets. Linux's netlink(7).
  • Local topology discovery

Week 8: Heterogeneity

Hardware

  • TCP offload engines
  • SolarFlare's OpenPacket
  • Microsoft AMP
  • NVIDIA's CUDA
  • OpenCL

Software

  • System / guest emulation
  • Transmeta CMS (Code Morphing Software)
  • Popek and Goldberg virtualization requirements / KVM / Xen

Week 9: The future of systems programming

  • C++11
  • Amorphous computing
  • RAIN/RAIM, EMC-aware programming
  • COMA, ccNUMA, directories, SCI, SCA
  • Computational memory
  • Software-defined networking
  • MRAM / FeRam / PRAM / SONOS

Thanks

  • Bill Leahy, for causing GT to pick UWS up as a class, at a time when it was nothing more than a single-page PDF flyer.
  • David Kanter of Real World Tech for use of his superb processor diagrams, and his years of insightful writing besides.
  • Richard Bejtlich's and TaoSecurity's TCP/IP Weapons School from BlackHat 2007, for nomenclature.