
@ta: adding encode/decode #6

Open · wants to merge 1 commit into base: master
Conversation

@arthyn (Contributor) commented Feb 21, 2024

This adds capabilities to encode and decode the @ta aura for strings.

@arthyn arthyn requested a review from Fang- February 21, 2024 20:20

size-limit report 📦

Path                             Size
dist/aura.cjs.production.min.js  5.72 KB (+4.94% 🔺)
dist/aura.esm.js                 5.94 KB (+4.67% 🔺)

@Fang- (Member) commented Feb 23, 2024

Looking at this again, my naming of the original functions here has been somewhat confusing, to the point where I even tripped myself up.

tldr:

  • We should use the more complete implementation from tloncorp/tlon-apps#3274 ("channels: support non-knot text in searches"), and
  • we should rename this file to t.ts, containing formatT() and parseT().
  • decodeT() doesn't support non-ascii cases correctly.
    • Consider ~~~2605.~1f920.yeehaw~1f468.~200d.~1f467.~200d.~1f466., which should decode to ★🤠yeehaw👨‍👧‍👦. We probably want to treat code points above U+FFFF separately too, because each of those needs a surrogate pair, i.e. two JS string characters.
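To make the surrogate-pair point concrete, here is a minimal sketch of how a parseT() might handle the ~<hex>. escapes (the name parseT is taken from the rename suggestion above; this is not the library's implementation, and it deliberately handles only the hex escapes, not the full set of rules for ., ~., and ~~ inside the body):

```typescript
// Minimal sketch, assumed escape format: "~<hex>." encodes one Unicode
// code point, and a leading "~~" marks an encoded @t.
function parseT(knot: string): string {
  // Strip the "~~" prefix that marks an encoded @t.
  const body = knot.startsWith("~~") ? knot.slice(2) : knot;
  // String.fromCodePoint emits a surrogate pair (two JS string
  // characters) for code points above U+FFFF, which is exactly the
  // ">2-byte case" mentioned above.
  return body.replace(/~([0-9a-f]+)\./g, (_match, hex: string) =>
    String.fromCodePoint(parseInt(hex, 16))
  );
}
```

With this, the example above round-trips: parseT("~~~2605.~1f920.yeehaw~1f468.~200d.~1f467.~200d.~1f466.") produces ★🤠yeehaw👨‍👧‍👦, with the family emoji correctly assembled from surrogate pairs joined by U+200D.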

Additional detail, for clarification and for the record:

  • The version of encodeTa that lives in tloncorp/tlon-apps#3274 ("channels: support non-knot text in searches") evolved from the implementation here (which presumably was lifted from the debug dashboard code, or wherever else the snippet ended up). The version from the other PR implements the TODO, and as such is a more complete implementation. We should use that instead.
  • encodeTa and stringToTa are both slightly misleading names, but perhaps all of the functions in this repo are in some sense. Consider the following:
    • @ta is the aura for "string with only a small set of url-safe characters" (0-9, a-z, -, ., ~, _). You may think of this subset as "$path-safe characters".
    • +scot exists to encode atoms with arbitrary auras into strings that fit (+sanely) inside a @ta. If it's a known/standard aura, it does so in a way that lets you distinguish the original aura from the way the value is encoded. (Falling back to normal hex encoding otherwise.) +slav is the inverse, "decode string into atom, expecting this aura".
      • Curiously, @uw is the sole exception to this. ((sane %ta) (scot %uw eny)) produces false.
    • Notably, any such encoded string is valid hoon syntax for writing a literal value with that aura.
    • @t is slightly exceptional in that, in hoon code, we usually write 'some string' instead of using its "@ta-encoded" variant, ~~some.string. The latter is what we want to use when encoding arbitrary strings into paths (or other @ta-esque contexts).
      • Note here also the different leading characters, ~~ indicating an encoded @t, rather than ~. which indicates an encoded @ta. (Which wouldn't really be "encoded" at all, beyond having its value prefixed with ~..)
    • Given all of the above, arguably the bulk of aura-js should be accessible through scot(aura, value) and slav(aura, string) functions. This to more closely match the look & feel of the matching hoon-side operations, and make more obvious that the encoding target throughout this library is (a slightly stricter version of) hoon's atom literal syntax, which consists of @ta-compatible strings.
      • Will be even more appropriate if we ever add utils here (or elsewhere) that do all these operations to/from nockjs Atoms.
    • But that's obviously way out of scope here. Renaming ta.ts to t.ts (and renaming the functions to match and have consistency with the other functions) will be plenty.
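For the record, the encoding direction can be sketched under the same assumptions (the name formatT comes from the rename suggestion above; the escape rules here are my reading of the encoding described in this thread, and should be checked against the implementation in tloncorp/tlon-apps#3274 before use):

```typescript
// Sketch only, under assumed escaping rules: pass @ta-safe characters
// through, encode space as ".", escape "." and "~" as "~." and "~~",
// and hex-escape everything else as "~<hex>.". The "~~" prefix marks
// an encoded @t (vs. "~." for an encoded @ta, per the note above).
// Handling of "_" and uppercase is an assumption, not verified.
function formatT(str: string): string {
  let out = "~~";
  for (const ch of str) { // for..of iterates by code point, not UTF-16 unit
    if (/^[0-9a-z-]$/.test(ch)) out += ch;       // @ta-safe, pass through
    else if (ch === " ") out += ".";             // space becomes dot
    else if (ch === ".") out += "~.";            // literal dot
    else if (ch === "~") out += "~~";            // literal sig
    else out += "~" + ch.codePointAt(0)!.toString(16) + ".";
  }
  return out;
}
```

This reproduces the examples in this comment: formatT("some string") gives ~~some.string, and formatT("★🤠yeehaw") gives ~~~2605.~1f920.yeehaw.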
