`uni` queries the Unicode database from the commandline. It supports Unicode
15.1 (September 2023) and has good support for emojis.
There are four commands: `identify` codepoints in a string, `search` for
codepoints, `print` codepoints by class, block, or range, and `emoji` to find
emojis.
There are binaries on the [releases] page, and [packages] for a number of
platforms. You can also [run it in your browser][uni-wasm].

Compile from source with:

    % go install zgo.at/uni/v2@latest

which will give you a `uni` binary in `~/go/bin`.

README index:
- [Integrations](#integrations)
- [Usage](#usage)
- [Identify](#identify)
- [Search](#search)
- [Print](#print)
- [Emoji](#emoji)
- [JSON](#json)
- [ChangeLog](#changelog)
- [Development](#development)
- [Alternatives](#alternatives)
[uni-wasm]: https://arp242.github.io/uni-wasm/
[releases]: https://github.com/arp242/uni/releases
[packages]: https://repology.org/project/uni/versions
Integrations
------------
- [dmenu], [rofi], and [fzf] script at [`dmenu-uni`](/dmenu-uni). See the top of
the script for some options you may want to frob with.
- For a Vim command see [`uni.vim`](/uni.vim); just copy/paste it in your vimrc.
[dmenu]: http://tools.suckless.org/dmenu
[rofi]: https://github.com/davatorium/rofi
[fzf]: https://github.com/junegunn/fzf
Usage
-----
*Note: the alignment is slightly off for some entries due to the way GitHub
renders wide characters; in terminals it should be aligned correctly.*
### Identify
Identify characters in a string, as a kind of Unicode-aware `hexdump`:
{{example "identify" "€"}}
`i` is a shortcut for `identify`:
{{example "i" "h€ý"}}
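
As a rough illustration of where those columns come from, here is a minimal Python sketch that derives them with the standard library. This is not how `uni` is implemented; the `identify` function and its dictionary keys are made up for this example, and the names come from whatever Unicode version your Python ships with.

```python
import html.entities
import unicodedata


def identify(text):
    """Return one row per codepoint, loosely mirroring the columns uni prints."""
    rows = []
    for ch in text:
        cp = ord(ch)
        # Prefer a named HTML entity when Python knows one, else a hex reference.
        entity = html.entities.codepoint2name.get(cp)
        rows.append({
            "char": ch,
            "cpoint": f"U+{cp:04X}",
            "dec": cp,
            "utf8": ch.encode("utf-8").hex(" "),  # bytes.hex(sep) needs Python 3.8+
            "html": f"&{entity};" if entity else f"&#x{cp:x};",
            "name": unicodedata.name(ch, "<unknown>"),
        })
    return rows


if __name__ == "__main__":
    for row in identify("h€ý"):
        print(row["char"], row["cpoint"], row["dec"], row["utf8"], row["html"], row["name"])
```

The point is only that every column is a different view of the same codepoint number.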
It reads from stdin:

    % head -c5 README.md | uni i
         CPoint  Dec    UTF8        HTML       Name
    '`'  U+0060  96     60          &grave;    GRAVE ACCENT [backtick, backquote]
    'u'  U+0075  117    75          &#x75;     LATIN SMALL LETTER U
    'n'  U+006E  110    6e          &#x6e;     LATIN SMALL LETTER N
    'i'  U+0069  105    69          &#x69;     LATIN SMALL LETTER I
    '`'  U+0060  96     60          &grave;    GRAVE ACCENT [backtick, backquote]

    % echo 'U+1234 U+1111' | uni p
         CPoint  Dec    UTF8        HTML       Name
    'ᄑ' U+1111  4369   e1 84 91    &#x1111;   HANGUL CHOSEONG PHIEUPH [P]
    'ሴ'  U+1234  4660   e1 88 b4    &#x1234;   ETHIOPIC SYLLABLE SEE

You can use `-compact` (or `-c`) to suppress the header, and `-format` (or `-f`)
to control the output format:
{{example "i" "-f" "%unicode %name" "a€🧟"}}
If the format string starts with `+` it will automatically be prepended with the
character, codepoint, and name:
{{example "i" "-f" "+%unicode" "a€🧟"}}
You can add more advanced options with `%(name flags)`; for example to generate
an aligned codepoint to X11 keysym mapping:
{{example "i" "-c" "-f" "0x%(hex l:auto f:0): %(keysym l:auto q:\":\",) // %name" "h€ý"}}
See `uni help` for more details on the `-format` flag; this flag can also be
added to other commands.
### Search
Search description:
{{example "search" "euro"}}
The `s` command is a shortcut for `search`. Multiple words are matched
individually:
{{example "s" "globe" "earth"}}
Use shell quoting for more literal matches:
{{trim 3 (example "s" "rightwards" "black" "arrow")}}
{{example "s" "rightwards black arrow"}}
Add `-or` or `-o` to combine the search terms with "OR" instead of "AND":
{{example "s" "-o" "globe" "milky"}}
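
The AND/OR behaviour can be sketched in a few lines of Python. This is only an illustration under the assumption that each term is matched as a case-insensitive substring of the name; `uni` itself may match differently, and the `matches` helper and the short `NAMES` list are made up for this example.

```python
def matches(name, terms, use_or=False):
    """Report whether a codepoint name matches the search terms.

    Each term is a case-insensitive substring. A quoted shell argument such as
    "rightwards black arrow" arrives as a single term, so it has to match as
    one contiguous piece of the name.
    """
    name = name.lower()
    hits = [term.lower() in name for term in terms]
    return any(hits) if use_or else all(hits)


# A handful of real Unicode names to search through.
NAMES = [
    "EARTH GLOBE EUROPE-AFRICA",
    "GLOBE WITH MERIDIANS",
    "MILKY WAY",
    "BLACK RIGHTWARDS ARROW",
    "RIGHTWARDS BLACK ARROW",
]


def search(terms, use_or=False):
    """Return every name in NAMES that matches the terms."""
    return [n for n in NAMES if matches(n, terms, use_or)]
```

With this, `search(["rightwards", "black", "arrow"])` finds both arrows, while `search(["rightwards black arrow"])` finds only the one with the words in that order.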
### Print
Print specific codepoints or groups of codepoints:
{{example "print" "U+2042"}}
Print a custom range; `U+2042`, `U2042`, and `2042` are all identical:
{{example "print" "2042..2044"}}
You can also use hex, octal, and binary numbers: `0x2042`, `0o20102`, or
`0b10000001000010`.
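
To make the equivalence of these notations concrete, here is a small Python sketch that reads all of them as the same number. It is not `uni`'s parser; `parse_codepoint` and `parse_range` are hypothetical helpers, and treating a bare number as hexadecimal is an assumption based on `2042` being the same as `U+2042` above.

```python
def parse_codepoint(text):
    """Parse one codepoint written as U+2042, U2042, 2042, 0x2042, 0o20102, or 0b..."""
    t = text.strip().lower()
    if t.startswith(("0x", "0o", "0b")):
        return int(t, 0)  # base is inferred from the prefix
    if t.startswith("u+"):
        t = t[2:]
    elif t.startswith("u"):
        t = t[1:]
    return int(t, 16)  # assumption: bare numbers are hexadecimal


def parse_range(text):
    """Parse "2042..2044" into an inclusive list of codepoints."""
    start, sep, end = text.partition("..")
    first = parse_codepoint(start)
    last = parse_codepoint(end) if sep else first
    return list(range(first, last + 1))


if __name__ == "__main__":
    for spec in ["U+2042", "U2042", "2042", "0x2042", "0o20102", "0b10000001000010"]:
        print(spec, "->", f"U+{parse_codepoint(spec):04X}")
```

Note that the prefix check comes first: a string like `0b10` would otherwise also be valid hexadecimal.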
General category:
{{trim 3 (example "p" "Po")}}
Blocks:
{{trim 5 (example "p" "arrows" "box drawing")}}
Print as table, and with a shorter name:
{{example "p" "-as" "table" "box"}}
Or more compact table:
{{example "p" "-as" "table" "box" "-compact"}}
### Emoji
The `emoji` command (shortcut: `e`) is the real reason I wrote this:
{{example "e" "cry"}}
By default both the name and CLDR data are searched; the CLDR data is a list of
keywords for an emoji; prefix with `name:` or `n:` to search on the name only:
{{trim 3 (example "e" "smile")}}
{{example "e" "name:smile"}}
As you can see, the CLDR data is pretty useful: searching on the name only for
"smile" gives just one result, as most emojis use "smiling".
Prefix with `group:` to search by group:
{{example "e" "group:hands"}}
Group and search can be combined, and `group:` can be abbreviated to `g:`:
{{example "e" "g:cat-face" "grin"}}
Like with `search`, use `-or` to OR the parameters together instead of AND:
{{example "e" "-or" "g:face-glasses" "g:face-hat"}}
Apply skin tone modifiers with `-tone`:
{{example "e" "-tone" "dark" "g:hands"}}
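
For the curious, a skin tone is just an extra codepoint (U+1F3FB to U+1F3FF) appended to the base emoji. The Python sketch below shows the idea for single-codepoint emojis only; it is not `uni`'s code, `apply_tone` is a hypothetical helper, and the tone names other than `light` and `dark` are labels I picked for this example.

```python
# Emoji modifiers from the Unicode standard (Fitzpatrick types 1-2 through 6).
TONES = {
    "light": "\U0001F3FB",
    "mediumlight": "\U0001F3FC",  # key name is an assumption for this sketch
    "medium": "\U0001F3FD",
    "mediumdark": "\U0001F3FE",   # key name is an assumption for this sketch
    "dark": "\U0001F3FF",
}


def apply_tone(emoji, tone):
    """Append a skin tone modifier to a single-codepoint base emoji.

    A trailing variation selector (U+FE0F) is dropped first, because the
    modifier takes its place directly after the base character.
    """
    base = emoji.rstrip("\uFE0F")
    return base + TONES[tone]


if __name__ == "__main__":
    print(apply_tone("\U0001F44B", "dark"))  # WAVING HAND SIGN with dark skin tone
```

Sequences joined with a zero-width joiner need the modifier placed after the right component inside the sequence, which this sketch doesn't attempt.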
The handshake emoji supports setting individual skin tones per hand since
Unicode 14, but this isn't supported, mostly because I can't really think of
a good CLI interface for setting this without breaking compatibility (there are
some other emojis too, like "holding hands" and "kissing" where you can set both
the gender and skin tone of both sides individually). Maybe for uni v3 someday.
The default is to display only the gender-neutral "person", but this can be
changed with the `-gender` option:
{{example "e" "-gender" "man" "g:person-gesture"}}
Both `-tone` and `-gender` accept multiple values. `-gender women,man` will
display both the female and male variants, and `-tone light,dark` will display
both a light and dark skin tone; use `all` to display all skin tones or genders:
{{example "e" "-tone" "light,dark" "-gender" "f,m" "shrug"}}
Like `print` and `identify`, you can use `-format`:
{{example "e" "g:cat-face" "-c" "-format" "%(name): %(emoji)"}}
See `uni help` for more details on the `-format` flag.
### JSON
With `-as json` or `-as j` you can output the data as JSON:
{{example "i" "-as" "json" "h€ý"}}
All the columns listed in `-f` will be included; you can use `-f all` to include
all columns:
{{example "i" "-as" "json" "-f" "all" "h€ý"}}
This also works for the `emoji` command:
{{example "e" "-as" "json" "-f" "all" "kissing cat"}}
All values are always a string, even numerical values. This makes things a bit
easier/consistent as JSON doesn't support hex literals and such. Use `jq` or
some other tool if you want to process the data further.
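
If you'd rather use a script than `jq`, converting the string values back to numbers is straightforward. The Python sketch below works on a hand-written sample; the key names (`char`, `cpoint`, `dec`, `name`) are assumptions for illustration, so check the actual output of `uni i -as json` for the real ones.

```python
import json

# Hand-written sample; key names are assumed and may not match uni's output.
SAMPLE = """[
  {"char": "h", "cpoint": "U+0068", "dec": "104", "name": "LATIN SMALL LETTER H"},
  {"char": "\u20ac", "cpoint": "U+20AC", "dec": "8364", "name": "EURO SIGN"}
]"""


def load_rows(text):
    """Parse the JSON and turn the numeric string columns into integers."""
    rows = json.loads(text)
    for row in rows:
        row["dec"] = int(row["dec"])                 # "8364" -> 8364
        row["value"] = int(row["cpoint"][2:], 16)    # "U+20AC" -> 0x20AC
    return rows


if __name__ == "__main__":
    for row in load_rows(SAMPLE):
        print(row["char"], row["dec"], hex(row["value"]), row["name"])
```

In practice you would feed it the output of the command, for example by reading `sys.stdin`.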
ChangeLog
---------
Moved to [CHANGELOG.md](/CHANGELOG.md).
Development
-----------
Re-generate the Unicode data with `go generate unidata`. Files are cached in
`unidata/.cache`, so clear that if you want to update the files from remote.
This requires zsh and GNU awk (gawk).
Alternatives
------------
Note this is from ~2017/2018 when I first wrote this; I don't re-evaluate every
program every year, and I don't go finding newly created tools every year
either.
### CLI/TUI
- https://github.com/philpennock/character
More or less similar to uni, but very different CLI, and has some additional
features. Seems pretty good.
- https://github.com/sindresorhus/emoj
Doesn't support emoji sequences (e.g. MAN SHRUGGING is PERSON SHRUGGING +
MAN, FIREFIGHTER is PERSON + FIRE TRUCK, etc); quite slow for a CLI program
(`emoj smiling` takes 1.8s on my system, sometimes a lot longer), search
results are pretty bad (`shrug` returns unamused face, thinking face, eyes,
confused face, neutral face, tears of joy, and expressionless face ... but not
the shrugging emoji), not a fan of npm (has 1862 dependencies).
- https://github.com/Fingel/tuimoji
Grouping could be better, doesn't support emoji sequences, only interactive
TUI, feels kinda slow-ish especially when searching.
- https://github.com/pemistahl/chr
Only deals with codepoints, not emojis.
### GUI
- gnome-characters
Uses Gnome interface/window decorations and won't work well with other WMs,
doesn't deal with emoji sequences, I don't like the grouping/ordering it uses,
requires two clicks to copy a character.
- gucharmap
Doesn't display emojis, just unicode blocks.
- KCharSelect
Many KDE-specific dependencies (106M). Didn't try it.
- https://github.com/Mange/rofi-emoji and https://github.com/fdw/rofimoji
Both are pretty similar to the dmenu/rofi integration of uni with some minor
differences, and both seem to work well with no major issues.
- gtk3 emoji picker (Ctrl+; or Ctrl+. in gtk 3.93 or newer)
Only works in GTK, doesn't work with `GTK_IM_MODULE=xim` (needed for compose
key), for some reason the emojis look ugly, doesn't display emoji sequences,
doesn't have a tooltip or other text description about what the emoji actually
is, the variation selector doesn't seem to work (never displays skin tone?),
doesn't work in Firefox.
This is so broken on my system that I suspect I'm missing something that's
needed for it to work.
- https://github.com/rugk/awesome-emoji-picker
Only works in Firefox; takes a tad too long to open; doesn't support skin
tones.
### Didn't investigate (yet)
Some alternatives people have suggested that I haven't looked at; make an issue
or email me if you know of any others.
- https://github.com/cassidyjames/ideogram
- https://github.com/OzymandiasTheGreat/emoji-keyboard
- https://github.com/salty-horse/ibus-uniemoji
- https://fcitx-im.org/wiki/Unicode
- http://kassiopeia.juls.savba.sk/~garabik/software/unicode/ and https://github.com/garabik/unicode (same?)
- https://billposer.org/Software/unidesc.html
- https://github.com/NoraCodes/charpicker (rofi)