Spooky tmux at a Distance

From dankwiki
Revision as of 04:27, 20 February 2021 by Dank (talk | contribs)

GitHub user ton did the noble service of filing Notcurses bug #1314 about a month ago (please, please, please file bugs when you encounter problems in my software), and i "fixed" it about a week later. i was uncomfortable with my solution, and i bet you know that shitty feeling. your reasoning about the issue doesn't even convince yourself...doesn't even seem plausible really. the applied change doesn't clearly rectify a wrong; perhaps you can't explain how, exactly, it has anything to do with the observed problems. perhaps you dress this up with a comment, a prolix row or two full of guarded conjecture, stressing that you've arranged things just this way because "maybe this other way causes this problem...". such a comment can only be necessary because a reasonable, even an informed person could at any point naturally change the code back, since after all, this thing here couldn't affect that thing over there (the irony of software engineering is that such a change might well happen anyway, doubly invalidating your cowardly, mealymouthed comment, which will be left in place to confuse the fuck out of whatever unhappy soul next reads the code, struggling for semantic consonance).

one feels bad at times like this because one is bad, and one ought feel bad. computer science and programming, at their best, are about the most wonderful jobs one can have. as fred brooks wrote:

The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by the exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures. Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.

brooks was extending a simile made famous decades before by g. h. hardy:

The mathematician’s patterns, like the painter’s or the poet’s must be beautiful; the ideas like the colours or the words, must fit together in a harmonious way. Beauty is the first test: there is no permanent place in the world for ugly mathematics. A mathematician, on the other hand, has no material to work with but ideas, and so his patterns are likely to last longer.

rather unlike the poet, however, coding also involves buckets of cash dumped on your head on a daily basis. as lil' jon said:

party like a rock star / fuck like a porn star / i don't give a damn / imma buy the whole bar

and indeed zero basketball teams are owned by Poets Laureate. so when you half-ass it like this, you're not only being paid phat stacks to spend the day jerking it, an embarrassment to yourself and everyone around you. you're not just being intellectually dishonest in a time when we need our most logical and rational to light the way. no, you're profaning a sacred craft. you were chosen to steward humanity's greatest achievement, the microprocessor, and you took a shit all over it and played with the shit and went up to other people with shit smeared on your grinning simpleton face and shit dribbling from your hand and said, "isn't this a runny, nasty shit i took? i applied all my ingenuity and craft, for which i was paid $3,000 per diem, which i used to bronze my anal leakage". then you wonder how AOC keeps getting elected.

jesus, it's embarrassing to read my comments on that bug. "does tmux possibly shim getc() or something?" well, you can run programs as root within tmux, so probably not, asshole! scandalous levels of stupidity in this analysis. "i have no idea where this behavior is coming from, and intend to get to the bottom of it." famous last words! i did not get to the bottom of it, and indeed blissfully went on, with no idea where the behavior came from, because i was wrong. my changes did absolutely fix the limited case reported, but not the fundamental problem.

essentially, within tmux, the code did some drawing, then prompted for input. upon prompting for input, the active colors were reset. Notcurses heavily optimizes its rasterization, never emitting a styling escape unless it's needed (just a foreground and background RGB escape represent 2800%(!!) overhead relative to a single-byte glyph, so these optimizations are essential), so this reset caused subsequent draws to be incorrectly styled (in this case, the background color was incorrect). analysis with strace(1) revealed the resets in my write(2) calls, but i had no write call anywhere nearby. indeed, the output was being provably generated within my getc(3) call! as a sane fucking human being, i excluded the possibility that ANSI C getc(3) spontaneously emits console control codes. as that old fraud sherlock holmes is so fond of patronizing poor watson:

How often have I said to you that when you have eliminated the impossible, whatever remains, however improbable, must be the truth?
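to make the failure mode concrete, here's a minimal sketch of the sort of escape elision described above. it is not Notcurses's actual code: rstate and emit_bg() are hypothetical names invented for illustration. the point is only that the tracker skips the escape whenever it believes the terminal already holds the requested color, so a reset it never heard about leaves it believing something false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

// hypothetical rasterizer state: what we *believe* the terminal's current
// background to be. not Notcurses's real structure.
typedef struct {
  bool bg_known;  // false until we've emitted a background ourselves
  uint32_t bg;    // last background we emitted, as 0xRRGGBB
} rstate;

// emit a background RGB escape only if it differs from what we last emitted.
// returns the number of bytes written to fp (0 if elided), or -1 on error.
int
emit_bg(rstate* rs, FILE* fp, uint32_t rgb){
  assert(rs && fp);
  if(rs->bg_known && rs->bg == rgb){
    return 0; // elided: we trust the terminal still holds this color
  }
  int r = fprintf(fp, "\x1b[48;2;%u;%u;%um",
                  (unsigned)((rgb >> 16) & 0xff),
                  (unsigned)((rgb >> 8) & 0xff),
                  (unsigned)(rgb & 0xff));
  if(r < 0){
    return -1;
  }
  rs->bg_known = true;
  rs->bg = rgb;
  return r;
}
```

emit_bg() returns the bytes written, so the cost is visible: a background of 255;255;255 is a 19-byte escape, and repeating the same color costs nothing. anything that resets the terminal out-of-band would need to clear bg_known, or the next draw of that color gets elided and shows up with the wrong background.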

i thus attributed the behavior to some "spooooooooky tmux action at a distance", however improbable, and called it a day. the difficulty gone unspoken in doyle's bromidic platitude, of course, is knowing what's truly impossible. i had assumed that getc(3) does not emit precipitous, mysterious writes like ink from a squid enraged by a plate of calamari. after all, from the DESCRIPTION section of the 2020-12-21 Glibc getc.3 man page:

fgetc() reads the next character from stream and returns it as an unsigned char cast to an int, or EOF on end of file or error. getc() is equivalent to fgetc() except that it may be implemented as a macro which evaluates stream more than once.

and the ANSI/ISO C18 standard makes 18 non-index uses of "flush", exactly zero of them suggesting that getc(3) might spontaneously vomit control sequences into what was a perfectly serviceable output FILE*. c might be harsh and unforgiving, but tourette's isn't its style.

time's arrow marched forward; cells divided; the weak force flipped down quarks on their asses; old men died and children were born. the earth quakes and the heavens rattle; the beasts of nature flock together and the nations of men flock apart; volcanoes usher up heat while elsewhere water becomes ice and melts; and then on other days it just rains. indeed many things do come to pass. eventually our good man ton had to read a second time...and thus #1350. i can't stress enough how lucky i am to have an incredible community around Notcurses; most people would have never bothered to file a first report, especially one of such high quality, and especially not a second one after such a dubious fucking of the dog.

same constraints: the Notcurses intro banner had to be disabled, the behavior only happened within tmux, but following input, colors were off for the duration of a single render. crazy constraints. intro banners? what the fuck all could those have to do with anything? how is stdin affecting stdout? why don't i understand how computers work? perhaps finally drinking the bathroom cleaner would put this problem to rest once and for all? flinging myself up and over my terrace in a graceless subballistic arc, less gravity's rainbow than gravity's belly flop, seemed not an unreasonable response. alas, he persevered.

you can read the bug for the gory details of epiphany, but the moment of truth came during a perusal of the Glibc Info pages, §12.20.2 "Flushing Buffers". now look, my occupation has for over twenty years been that of a badass systems programmer, of the take-no-quarters UNIX C variety, ready at any moment to eat death and ask for seconds. part of that is periodically rereading your GNU Info pages, and i'd surely seen this line before, probably multiple times. but it had slipped my mind, or never quite taken hold there:

There are many circumstances when buffered output on a stream is flushed automatically:

  • When you try to do output and the output buffer is full.
  • When the stream is closed. See Closing Streams.
  • When the program terminates by calling exit. See Normal Termination.
  • When a newline is written, if the stream is line buffered.
  • Whenever an input operation on any stream actually reads data from its file.

anaphora aside, that's pretty exhaustive. read that last line again. Whenever an input operation on any stream actually reads data from its file. in GNU libc (again, this is not part of ANSI C--though not prohibited, either), any successful read operation (this does not apply to the read(2) system call, despite being wrapped by glibc), on any FILE*, flushes all buffered output data.

is this the Least Astonishing thing it could do? no it is not. christ, what a waste of my time.
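the hazard underneath all this is ordinary stdio buffering: bytes handed to a FILE* sit in userspace until something flushes them. here's a minimal sketch, assuming a POSIX system (pipe(2), poll(2), fdopen(3)) and a libc that fully buffers a stream opened on a pipe, as the common ones do. readable_now() and buffered_write_demo() are names made up for this example. it doesn't try to reproduce the flush-on-input behavior itself, which depends on how the streams involved happen to be buffered; it only shows that the write isn't real until a flush happens, whoever triggers it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

// true if fd has bytes ready to read right now (zero-timeout poll).
bool
readable_now(int fd){
  struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
  return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN);
}

// write seq to a stdio stream wrapped around a pipe, and report whether the
// bytes had reached the pipe before and after an explicit fflush(3).
// returns 0 on success, -1 on failure.
int
buffered_write_demo(const char* seq, bool* before, bool* after){
  assert(seq && before && after);
  int fds[2];
  if(pipe(fds)){
    return -1;
  }
  // assumption: a pipe is not a tty, so libc makes this stream fully buffered
  FILE* out = fdopen(fds[1], "w");
  if(out == NULL){
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  int ret = 0;
  if(fputs(seq, out) == EOF){
    ret = -1;
  }
  *before = readable_now(fds[0]); // still sitting in the stdio buffer
  if(fflush(out) == EOF){
    ret = -1;
  }
  *after = readable_now(fds[0]);  // now handed to the kernel
  fclose(out);                    // also closes fds[1]
  close(fds[0]);
  return ret;
}
```

with a short string such as "\x1b[0m", *before should come back false and *after true. the defensive takeaway is presumably to fflush(3) explicitly wherever output must be visible, rather than leaving it to whichever input call happens along.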

in notcurses_init(), i reset the terminal, pretty reasonable. this reset ought have been flushed, and i'd never considered that it might not be. imagine my surprise when i took a closer look at reset_term_attributes():

static int
reset_term_attributes(notcurses* nc){
  int ret = 0;
  if(nc->tcache.op && term_emit(nc->tcache.op, nc->ttyfp, false)){
    ret = -1;
  }
  if(nc->tcache.sgr0 && term_emit(nc->tcache.sgr0, nc->ttyfp, false)){
    ret = -1;
  }
  if(nc->tcache.oc && term_emit(nc->tcache.oc, nc->ttyfp, true)){
    ret = -1;
  }
  return ret;
}

the third parameter to term_emit() controlled flushery, and our third call flushes, and the world is good, right? well of course not--that term_emit() was conditional on nc->tcache.oc. and guess what terminal definition lacks oc? here's a hint: it starts with t and rhymes with "fucks". furthermore, term_emit() writes to nc->ttyfp, a stdio FILE*, while tty_emit() writes to a file descriptor, and requires actually being attached to a terminal (there's precious little value in, say, clearing the screen when you're writing to a file or pipe). so there are at least 4 bugs in these 14 lines. put another way, the only bug-free operative lines of code are int ret = 0;, ret = -1;, and return ret;.

dijkstra is looking down at me from Dutch heaven, frowning, slowly chewing a De Ruijter Fruit Sprinkle, and saying "i do not like it."
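one way to take the flush off the conditional path is to emit whatever resets the terminal supports and then flush no matter what. this is a sketch of that idea only, not the fix actually committed to Notcurses: caps, emit_if_present(), and reset_attributes() are hypothetical stand-ins, and it sidesteps the separate question of whether the output is a terminal at all.

```c
#include <assert.h>
#include <stdio.h>

// hypothetical stand-in for the relevant terminfo capabilities; any of
// these may be NULL if the terminal definition lacks the capability.
typedef struct {
  const char* op;    // restore default color pair
  const char* sgr0;  // turn off all attributes
  const char* oc;    // restore default palette
} caps;

// hypothetical helper: write seq to fp if the capability exists.
// a missing capability is not an error.
int
emit_if_present(const char* seq, FILE* fp){
  if(seq == NULL){
    return 0;
  }
  return fputs(seq, fp) == EOF ? -1 : 0;
}

// emit every reset the terminal supports, then flush unconditionally, so
// the flush no longer hinges on whether the last capability is defined.
int
reset_attributes(const caps* c, FILE* fp){
  assert(c && fp);
  int ret = 0;
  if(emit_if_present(c->op, fp)){
    ret = -1;
  }
  if(emit_if_present(c->sgr0, fp)){
    ret = -1;
  }
  if(emit_if_present(c->oc, fp)){
    ret = -1;
  }
  if(fflush(fp) == EOF){
    ret = -1;
  }
  return ret;
}
```

with oc left NULL, as in the tmux case, the op and sgr0 sequences still get pushed out by the trailing fflush(3).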

so, to wrap everything up in one nice package:

  • tmux, alone among tested terminals, did not declare oc. everything else flushed the reset early on, as intended, hiding the bug.
  • a bug of mine meant that the reset was written using stdio, though admittedly this is merely a side-effect--the real bug was emitting control sequences to a non-terminal. this bug would otherwise not have been an issue, as write(2) is of course unbuffered.
  • if the banners were used, they flushed stdout (by virtue of writing a newline), hiding the bug.
  • stdout was not otherwise explicitly touched during the course of execution, but this unknown, supra-Standard behavior implied a buffer flush in an unrelated getc(3).
  • my previous fix did indeed address that limited case of the problem, changing a getc(3) to a read(2), bypassing this behavior, and hiding the bug.

everything is not only broken, but broken several different ways. you have no chance to survive. make your time.

i had considered naming this entry "a railroad's switch points", in reference to a beloved passage of primo levi's:

...better not to do than to do, better to meditate than to act, better his astrophysics, the threshold of the Unknowable, than my chemistry, a mess compounded of stenches, explosions and small futile mysteries. I thought of another moral, more down to earth and concrete, and I believe that every militant chemist can confirm it: that one must distrust the almost-the-same (sodium is almost the same as potassium, but with sodium nothing would have happened), the practically identical, the approximate, the or-even, all surrogates, and all patchwork. The differences can be small, but they can lead to radically different consequences, like a railroad's switch points; the chemist's trade consists in good part in being aware of these differences, knowing them close up, and foreseeing their effects. And not only the chemist's trade.

as an aside, it's interesting to consider that Levi wrote that particular simile after surviving the Holocaust. a railroad's switch points, indeed.

i learned something new in the course of this bug, and it will make me a better programmer. tomorrow i go again into battle; each day i learn new tricks, and grow nastier.

i am very tricky, and very nasty, now.

previously: "Threadripper L3 CPUID Strangeness" 2020-02-05