Check out my first novel, midnight's simulacra!

Theory and Practice of Sprixels: Difference between revisions

From dankwiki
No edit summary
Line 51: Line 51:
Sixel writes the provided glyph as a unit, starting at the current cursor position (assuming we've entered DECSDM aka "Sixel Display Mode", which we always do—otherwise all sixels are emitted at the upper left corner of the display). Bits not specified in the sixel are transparent (assuming the Device Control String has used the value 2 for parameter <tt>P2</tt>), and transparent pixels leave undisturbed whatever was already present at the time of emission. Writing an entirely transparent region is a no-op. Writing a new character (or new Sixel pixels) anywhere in the graphic's region will wipe out the entirety of that cell. The Sixel data thus annihilated is not recoverable. This means that updating the text underneath a Sixel will blow the obscuring graphics away, requiring that the affected Sixel pixels be rerendered. In the event of stacked sprixels, we would potentially need to rerender the entire stack. In a naive implementation, changing one text cell "underneath" the octopus could result in painting 781 cells (the text cell actually being changed, and the 780 cells of the octopus). This is plainly undesirable.
Sixel writes the provided glyph as a unit, starting at the current cursor position (assuming we've entered DECSDM aka "Sixel Display Mode", which we always do—otherwise all sixels are emitted at the upper left corner of the display). Bits not specified in the sixel are transparent (assuming the Device Control String has used the value 2 for parameter <tt>P2</tt>), and transparent pixels leave undisturbed whatever was already present at the time of emission. Writing an entirely transparent region is a no-op. Writing a new character (or new Sixel pixels) anywhere in the graphic's region will wipe out the entirety of that cell. The Sixel data thus annihilated is not recoverable. This means that updating the text underneath a Sixel will blow the obscuring graphics away, requiring that the affected Sixel pixels be rerendered. In the event of stacked sprixels, we would potentially need to rerender the entire stack. In a naive implementation, changing one text cell "underneath" the octopus could result in painting 781 cells (the text cell actually being changed, and the 780 cells of the octopus). This is plainly undesirable.


With Kitty, a graphic is positional. All text is assumed to be at z=0. Graphics with non-negative coordinates obscure text. Graphics with negative coordinates sit underneath text. Note that this means we must choose whether a Kitty sprixel is going to have text above it, or under it, but no arbitrary single sprixel can do both. Thankfully, unlike Sixel, we have the ability to delete graphics in the Kitty protocol (Sixel graphics can only be cleared along with a larger screen region, or entirely obscured by other output), so we can "cut" chunks out of our bitmap, delete the visible sprixel, and load the new one; doing so effectively allows us to make text underneath a "z=-1" sprixel dynamically available. We cut the chunk out by setting the [https://en.wikipedia.org/wiki/Alpha_compositing α (alpha)] parameter to 0 across the breadth of the affected RGBA region. This function, <tt>sprite_kitty_cell_wipe()</tt>, can be found in [https://github.com/dankamongmen/notcurses/blob/master/src/lib/kitty.c src/lib/kitty.c]. Easy peasy. We must keep the original α values somewhere, in case the damage need be recovered (we needn't duplicate the RGB values; we leave those as they are). Since we normalize the α values on input to either 0 or 255, we only need to keep one bit of state for each pixel, and map back that bit times 0xff. For a 1024x1024 pixel image (4MiB), this would be be a maximum 1Mb bit vector, representing 3.125% overhead.
With Kitty, a graphic is positional. All text is assumed to be at z=0. Graphics with non-negative coordinates obscure text. Graphics with negative coordinates sit underneath text. Note that this means we must choose whether a Kitty sprixel is going to have text above it, or under it, but no arbitrary single sprixel can do both.


===Techniques===
===Techniques===
Thankfully, unlike Sixel, we have the ability to delete graphics in the Kitty protocol (Sixel graphics can only be cleared along with a larger screen region, or entirely obscured by other output), so we can "cut" chunks out of our bitmap, delete the visible sprixel, and load the new one; doing so effectively allows us to make text underneath a "z=-1" sprixel dynamically available. We cut the chunk out by setting the [https://en.wikipedia.org/wiki/Alpha_compositing α (alpha)] parameter to 0 across the breadth of the affected RGBA region. This function, <tt>sprite_kitty_cell_wipe()</tt>, can be found in [https://github.com/dankamongmen/notcurses/blob/master/src/lib/kitty.c src/lib/kitty.c]. Easy peasy. We must keep the original α values somewhere, in case the damage need be recovered (we needn't duplicate the RGB values; we leave those as they are). Since we normalize the α values on input to either 0 or 255, we only need to keep one bit of state for each pixel, and map back that bit times 0xff. For a 1024x1024 pixel image (4MiB), this would be be a maximum 1Mb bit vector, representing 3.125% overhead.


'''previously: "[[Spooky_tmux_at_a_Distance|spooky tmux at a distance]]" 2021-02-20'''
'''previously: "[[Spooky_tmux_at_a_Distance|spooky tmux at a distance]]" 2021-02-20'''

Revision as of 08:01, 29 March 2021

orcas on segways

dankblog! 2021-03-29, at the danktower

I've spent significant time over the past few weeks adding "sprixel" support to Notcurses, culminating (thus far) in yesterday's 2.2.4 release. What's a sprixel? Numerous terminal emulators support one or another "graphic protocol", by which bitmaps of arbitrary size can be written to the glyph-based viewing area. Showing that dynamic control of the font is approximately equivalent in power to such graphic protocols is left as an exercise for the reader. There are at least four major methods:

  • the venerable DEC Sixel protocol, introduced on the VT125, and also present on at least the VT240 and VT 330. Implemented by at least XTerm, Mlterm, and foot, with patches out for at least VTE, alacritty, and Windows Terminal.
  • Kitty's protocol, advanced by Kovid Goyal.
  • The "1337" iTerm2 protocol, also implemented by wezterm.
  • You can merrily shit all over the Linux framebuffer directly when in a framebuffer console.

Sixel is both the oldest and the most widely supported, though the majority of this support has come in the past year. Unfortunately, while passable for controlling early-Eighties nine-pin color dot matrix printers like the DEC LA50, it's pretty undesirable from a console bitmap standpoint. Nonetheless, combined with "sprite", it has loaned its name to "sprixels".

These protocols differ from each other in fundamental, important ways, and forming an abstraction from them isn't trivial. Integrating them wholly into the z-ordered semantics of Notcurses was still more difficult. Notcurses works via piles (rendering contexts) of planes (drawing surfaces), totally ordered on a z axis. Higher planes obscure or blend with lower ones. When Notcurses renders a pile, it projects these planes down onto a single plane, solving a single surface; it rasterizes this virtual surface to the physical viewing area by encoding it as a set of terminal escape codes and UTF-8 data. Only cells which have changed need to be rasterized, and eliding unchanged cells is a critical optimization, usually cutting the output size down by 90% or more. Every output byte is multiplied several times before becoming visible (we must copy it to a kernel buffer, the terminal must read it from a kernel buffer, and the terminal must display the resulting cell), so trading some computation for fewer output bytes is well worth it.

The elegant model of planes of fixed-width, independent cells solving for a single cell matrix already breaks down in the presence of Unicode wide glyphs, perhaps foremost among them U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM ﷽, a single glyph occupying anywhere from one to who-knows (I've seen 9) cells, depending on font, font engine, and terminal emulator. Wide characters cannot be printed in part, and printing another character anywhere atop a wide character obliterates all of the old character. Even restricting ourselves to characters two cells wide, a single character can annihilate four columns containing two characters:

  • column 0: left side of character A
  • column 1: right side of character A
  • column 2: left side of character B
  • column 3: right side of character B

Print a wide character C at column 1, and we end up with

  • column 0: space
  • column 1: left side of character C
  • column 2: right side of character C
  • column 3: space

This isn't a problem with a single virtual plane, but imagine we have two planes, P0 and P1, with P0 above P1. They are of equal size, and P0 is transparent by default. P0 has a wide character at column 1. P1 has a wide character at column 0. What do we render? It's impossible to render the first half of P1, despite that being the logical render. Allah, the All-Powerful, has fucked us again!

This difficulty is ratcheted up significantly in the case of sprixels. In the most complicated case, a single sprixel might need to be both "under" (obscured by) and "over" (partially obscuring) text, and the set of cells obscuring and being obscured can change every frame. Imagine, for instance, that we have the following friendly octopus (512x357px, and note that it has a transparent channel):

Let's assume a cell geometry of 11x20, which just happens to be my current cell geometry. 512 pixels then occupy a little over 46 columns (512 == 46 * 11 + 6), while 357 pixels occupy just about 18 rows (357 == 17 * 20 + 17). We want to be able to draw this happy fellow atop a background of text:

Note that the text "underneath" the octopus can change while the octopus is present, and this change must be reflected if the text is visible. We also want to draw on top of the octopus, and this text too can change while the octopus is displayed, or even go away, in which case the octopus must be restored.

We might want to write atop the sprixel, but with a transparent background. Unfortunately, this is only really possible with Kitty (I'm excluding ideas like sampling the glyph and interpolating an RGB background for the text cell), where we can place an image at "z=-1" in the internal kitty image Z-axis to place it underneath text. Here I've done so, placing the sprixel underneath the layer of 'a's, but still annihilating the sprixel with some other text.

Sixel doesn't really provide this capability; writing text over a sixel graphic will annihilate affected cells in their entirety. Here we come to the most meaningful difference between Sixel and Kitty/iTerm: the former is temporal, the latter positional.

Temporal vs Positional

Sixel writes the provided glyph as a unit, starting at the current cursor position (assuming we've entered DECSDM aka "Sixel Display Mode", which we always do—otherwise all sixels are emitted at the upper left corner of the display). Bits not specified in the sixel are transparent (assuming the Device Control String has used the value 2 for parameter P2), and transparent pixels leave undisturbed whatever was already present at the time of emission. Writing an entirely transparent region is a no-op. Writing a new character (or new Sixel pixels) anywhere in the graphic's region will wipe out the entirety of that cell. The Sixel data thus annihilated is not recoverable. This means that updating the text underneath a Sixel will blow the obscuring graphics away, requiring that the affected Sixel pixels be rerendered. In the event of stacked sprixels, we would potentially need to rerender the entire stack. In a naive implementation, changing one text cell "underneath" the octopus could result in painting 781 cells (the text cell actually being changed, and the 780 cells of the octopus). This is plainly undesirable.

With Kitty, a graphic is positional. All text is assumed to be at z=0. Graphics with non-negative coordinates obscure text. Graphics with negative coordinates sit underneath text. Note that this means we must choose whether a Kitty sprixel is going to have text above it, or under it, but no arbitrary single sprixel can do both.

Techniques

Thankfully, unlike Sixel, we have the ability to delete graphics in the Kitty protocol (Sixel graphics can only be cleared along with a larger screen region, or entirely obscured by other output), so we can "cut" chunks out of our bitmap, delete the visible sprixel, and load the new one; doing so effectively allows us to make text underneath a "z=-1" sprixel dynamically available. We cut the chunk out by setting the α (alpha) parameter to 0 across the breadth of the affected RGBA region. This function, sprite_kitty_cell_wipe(), can be found in src/lib/kitty.c. Easy peasy. We must keep the original α values somewhere, in case the damage need be recovered (we needn't duplicate the RGB values; we leave those as they are). Since we normalize the α values on input to either 0 or 255, we only need to keep one bit of state for each pixel, and map back that bit times 0xff. For a 1024x1024 pixel image (4MiB), this would be be a maximum 1Mb bit vector, representing 3.125% overhead.


previously: "spooky tmux at a distance" 2021-02-20

External Links