Check out my first novel, midnight's simulacra!

Theory and Practice of Sprixels: Difference between revisions

From dankwiki
No edit summary
No edit summary
Line 30: Line 30:
This isn't a problem with a single virtual plane, but imagine we have two planes, P0 and P1, with P0 above P1. They are of equal size, and P0 is transparent by default. P0 has a wide character at column 1. P1 has a wide character at column 0. What do we render? It's impossible to render the first half of P1, despite that being the logical render. Allah, the All-Powerful, has fucked us again!
This isn't a problem with a single virtual plane, but imagine we have two planes, P0 and P1, with P0 above P1. They are of equal size, and P0 is transparent by default. P0 has a wide character at column 1. P1 has a wide character at column 0. What do we render? It's impossible to render the first half of P1, despite that being the logical render. Allah, the All-Powerful, has fucked us again!


This difficulty is ratcheted up significantly in the case of sprixels. In the most complicated case, a single sprixel might need to be both "under" (obscured by) and "over" (partially obscuring) text, and the set of cells obscuring and being obscured can change every frame. Imagine, for instance, that we have the following friendly octopus (512x357px):
This difficulty is ratcheted up significantly in the case of sprixels. In the most complicated case, a single sprixel might need to be both "under" (obscured by) and "over" (partially obscuring) text, and the set of cells obscuring and being obscured can change every frame. Imagine, for instance, that we have the following friendly octopus (512x357px, and note that it has a transparent channel):


[[File:Octopus-transparent.png|frameless|200px]]
[[File:Octopus-transparent.png|frameless|200px]]


Let's assume a cell geometry of 11x20, which just happens to be my current cell geometry. 512 pixels then occupy a little over 46 columns (512 == 46 * 11 + 6), while 357 pixels occupy just about 18 rows (357 == 17 * 20 + 17).
Let's assume a cell geometry of 11x20, which just happens to be my current cell geometry. 512 pixels then occupy a little over 46 columns (512 == 46 * 11 + 6), while 357 pixels occupy just about 18 rows (357 == 17 * 20 + 17). We want to be able to draw this happy fellow atop a background of text:
 


[[File:Octopus-transparent-bg.png|frameless|200px]]


'''previously: "[[Spooky_tmux_at_a_Distance|spooky tmux at a distance]]" 2021-02-20'''
'''previously: "[[Spooky_tmux_at_a_Distance|spooky tmux at a distance]]" 2021-02-20'''

Revision as of 07:00, 29 March 2021

orcas on segways

dankblog! 2021-03-29, at the danktower

I've spent significant time over the past few weeks adding "sprixel" support to Notcurses, culminating (thus far) in yesterday's 2.2.4 release. What's a sprixel? Numerous terminal emulators support one or another "graphic protocol", by which bitmaps of arbitrary size can be written to the glyph-based viewing area. Showing that dynamic control of the font is approximately equivalent in power to such graphic protocols is left as an exercise for the reader. There are at least four major methods:

  • the venerable DEC Sixel protocol, introduced on the VT125, and also present on at least the VT240 and VT 330. Implemented by at least XTerm, Mlterm, and foot, with patches out for at least VTE, alacritty, and Windows Terminal.
  • Kitty's protocol, advanced by Kovid Goyal.
  • The "1337" iTerm2 protocol, also implemented by wezterm.
  • You can merrily shit all over the Linux framebuffer directly when in a framebuffer console.

Sixel is both the oldest and the most widely supported, though the majority of this support has come in the past year. Unfortunately, while passable for controlling early-Eighties nine-pin color dot matrix printers like the DEC LA50, it's pretty undesirable from a console bitmap standpoint. Nonetheless, combined with "sprite", it has loaned its name to "sprixels".

These protocols differ from each other in fundamental, important ways, and forming an abstraction from them isn't trivial. Integrating them wholly into the z-ordered semantics of Notcurses was still more difficult. Notcurses works via piles (rendering contexts) of planes (drawing surfaces), totally ordered on a z axis. Higher planes obscure or blend with lower ones. When Notcurses renders a pile, it projects these planes down onto a single plane, solving a single surface; it rasterizes this virtual surface to the physical viewing area by encoding it as a set of terminal escape codes and UTF-8 data. Only cells which have changed need to be rasterized, and eliding unchanged cells is a critical optimization, usually cutting the output size down by 90% or more. Every output byte is multiplied several times before becoming visible (we must copy it to a kernel buffer, the terminal must read it from a kernel buffer, and the terminal must display the resulting cell), so trading some computation for fewer output bytes is well worth it.

The elegant model of planes of fixed-width, independent cells solving for a single cell matrix already breaks down in the presence of Unicode wide glyphs, perhaps foremost among them U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM ﷽, a single glyph occupying anywhere from one to who-knows (I've seen 9) cells, depending on font, font engine, and terminal emulator. Wide characters cannot be printed in part, and printing another character anywhere atop a wide character obliterates all of the old character. Even restricting ourselves to characters two cells wide, a single character can annihilate four columns containing two characters:

  • column 0: left side of character A
  • column 1: right side of character A
  • column 2: left side of character B
  • column 3: right side of character B

Print a wide character C at column 1, and we end up with

  • column 0: space
  • column 1: left side of character C
  • column 2: right side of character C
  • column 3: space

This isn't a problem with a single virtual plane, but imagine we have two planes, P0 and P1, with P0 above P1. They are of equal size, and P0 is transparent by default. P0 has a wide character at column 1. P1 has a wide character at column 0. What do we render? It's impossible to render the first half of P1, despite that being the logical render. Allah, the All-Powerful, has fucked us again!

This difficulty is ratcheted up significantly in the case of sprixels. In the most complicated case, a single sprixel might need to be both "under" (obscured by) and "over" (partially obscuring) text, and the set of cells obscuring and being obscured can change every frame. Imagine, for instance, that we have the following friendly octopus (512x357px, and note that it has a transparent channel):

Let's assume a cell geometry of 11x20, which just happens to be my current cell geometry. 512 pixels then occupy a little over 46 columns (512 == 46 * 11 + 6), while 357 pixels occupy just about 18 rows (357 == 17 * 20 + 17). We want to be able to draw this happy fellow atop a background of text:

previously: "spooky tmux at a distance" 2021-02-20