Commit 11398e0

Decode XML entities in CLDR data
1 parent ad3a8d5 commit 11398e0

3 files changed: +44 -40 lines changed

README.md

Lines changed: 35 additions & 35 deletions
@@ -233,24 +233,24 @@ The `emoji` command (shortcut: `e`) is is the real reason I wrote this:

 % uni e cry
 Name CLDR
-🥹 face holding back tears [admiration, angry, aw, aww, cry, embarrassed, feelings, grateful, gratitude, please, proud, resist, sad, sadness, tears of joy]
+🥹 face holding back tears [admiration, aww, cry, embarrassed, feelings, grateful, gratitude, joy, please, proud, resist, sad]
 😢 crying face [awful, feels, miss, sad, tear, triste, unhappy]
 😭 loudly crying face [bawling, sad, sob, tear, tears, unhappy]
-😿 crying cat [animal, crying cat face, face, sad, tear]
-🔮 crystal ball [fairy tale, fairytale, fantasy, fortune, future, magic, tool]
+😿 crying cat [animal, face, sad, tear]
+🔮 crystal ball [fairy, fairytale, fantasy, fortune, future, magic, tale, tool]

 By default both the name and CLDR data are searched; the CLDR data is a list of
 keywords for an emoji; prefix with `name:` or `n:` to search on the name only:

 % uni e smile
 Name CLDR
 😀 grinning face [cheerful, cheery, happy, laugh, nice, smile, smiling, teeth]
-😃 grinning face with big eyes [awesome, happy, mouth, open, smile, smiling, smiling face with open mouth, teeth, yay]
+😃 grinning face with big eyes [awesome, happy, mouth, open, smile, smiling, teeth, yay]


 % uni e name:smile
 Name CLDR
-😼 cat with wry smile [animal, cat face with wry smile, face, ironic]
+😼 cat with wry smile [animal, face, ironic]

 As you can see, the CLDR is pretty useful, as "smile" only gives one result as
 most emojis use "smiling".
@@ -259,43 +259,43 @@ Prefix with `group:` to search by group:

 % uni e group:hands
 Name CLDR
-👏 clapping hands [applause, approval, awesome, congrats, congratulations, excited, good job, great, homie, nice, prayed, well done, yay]
+👏 clapping hands [applause, approval, awesome, congrats, congratulations, excited, good, great, homie, job, nice, prayed, well, yay]
 🙌 raising hands [celebration, gesture, hooray, praise, raised]
-🫶 heart hands [<3, love, love you]
-👐 open hands [hug, jazz hands, swerve]
-🤲 palms up together [cupped hands, dua, pray, prayer, wish]
+🫶 heart hands [<3, love, you]
+👐 open hands [hug, jazz, swerve]
+🤲 palms up together [cupped, dua, hands, pray, prayer, wish]
 🤝 handshake [agreement, deal, meeting]
-🙏 folded hands [appreciate, ask, beg, blessed, bow, cmon, five, gesture, high 5, high five, please, pray, thank, thank you, thanks, thx]
+🙏 folded hands [appreciate, ask, beg, blessed, bow, cmon, five, gesture, high, please, pray, thanks, thx]

 Group and search can be combined, and `group:` can be abbreviated to `g:`:

 % uni e g:cat-face grin
 Name CLDR
-😺 grinning cat [animal, face, mouth, open, smile, smiling cat face with open mouth]
-😸 grinning cat with smiling eyes [animal, face, grinning cat face with smiling eyes, smile]
+😺 grinning cat [animal, face, mouth, open, smile, smiling]
+😸 grinning cat with smiling eyes [animal, face, smile]

 Like with `search`, use `-or` to OR the parameters together instead of AND:

 % uni e -or g:face-glasses g:face-hat
 Name CLDR
 🤠 cowboy hat face [cowgirl]
-🥳 partying face [birthday, celebrate, celebration, excited, happy bday, happy birthday, hat, hooray, horn]
+🥳 partying face [bday, birthday, celebrate, celebration, excited, happy, hat, hooray, horn]
 🥸 disguised face [eyebrow, glasses, incognito, moustache, mustache, nose, person, spy, tache, tash]
-😎 smiling face with sunglasses [awesome, beach, bright, bro, chillin, cool, eye, eyewear, fly, rad, relaxed, shades, slay, smile, stunner, style, swag, swagger, win, winning, yeah]
+😎 smiling face with sunglasses [awesome, beach, bright, bro, chilling, cool, rad, relaxed, shades, slay, smile, style, swag, win]
 🤓 nerd face [brainy, clever, expert, geek, gifted, glasses, intelligent, smart]
 🧐 face with monocle [classy, fancy, rich, stuffy, wealthy]

 Apply skin tone modifiers with `-tone`:

 % uni e -tone dark g:hands
 Name CLDR
-👏🏿 clapping hands: dark skin tone [applause, approval, awesome, congrats, congratulations, excited, good job, great, homie, nice, prayed, well done, yay]
+👏🏿 clapping hands: dark skin tone [applause, approval, awesome, congrats, congratulations, excited, good, great, homie, job, nice, prayed, well, yay]
 🙌🏿 raising hands: dark skin tone [celebration, gesture, hooray, praise, raised]
-🫶🏿 heart hands: dark skin tone [&lt;3, love, love you]
-👐🏿 open hands: dark skin tone [hug, jazz hands, swerve]
-🤲🏿 palms up together: dark skin tone [cupped hands, dua, pray, prayer, wish]
+🫶🏿 heart hands: dark skin tone [<3, love, you]
+👐🏿 open hands: dark skin tone [hug, jazz, swerve]
+🤲🏿 palms up together: dark skin tone [cupped, dua, hands, pray, prayer, wish]
 🤝🏿 handshake: dark skin tone [agreement, deal, meeting]
-🙏🏿 folded hands: dark skin tone [appreciate, ask, beg, blessed, bow, cmon, five, gesture, high 5, high five, please, pray, thank, thank you, thanks, thx]
+🙏🏿 folded hands: dark skin tone [appreciate, ask, beg, blessed, bow, cmon, five, gesture, high, please, pray, thanks, thx]

 The handshake emoji supports setting individual skin tones per hand since
 Unicode 14, but this isn't supported, mostly because I can't really really think
@@ -308,27 +308,27 @@ changed with the `-gender` option:

 % uni e -gender man g:person-gesture
 Name CLDR
-🙍‍♂️ man frowning [annoyed, disappoint, disgruntled, disturbed, frustrated, gesture, irritated, not happy, person frowning, upset, woman frowning]
-🙎‍♂️ man pouting [disappoint, downtrodden, frown, gesture, grimace, person pouting, scowl, sulk, upset, whine, woman pouting]
-🙅‍♂️ man gesturing NO [exclude, forbidden, gesture, hand, no, nope, not, not a chance, person gesturing NO, prohibit, prohibited, woman gesturing NO]
-🙆‍♂️ man gesturing OK [exercise, gesture, hand, omg, person gesturing OK, woman gesturing OK]
-💁‍♂️ man tipping hand [fetch, gossip, hair flick, hair flip, help, information, person tipping hand, sarcasm, sarcastic, sassy, seriously, whatever, woman tipping hand]
-🙋‍♂️ man raising hand [gesture, hands, happy, I can help, i know, me, over here, person raising hand, pick me, question, raised, right here, woman raising hand]
-🧏‍♂️ deaf man [accessibility, deaf person, ear, hear]
-🙇‍♂️ man bowing [apology, beg, forgive, gesture, meditate, meditation, person bowing, pity, regret, sorry]
-🤦‍♂️ man facepalming [disbelief, exasperation, not again, oh no, omg, person, person facepalming, shock, smh]
-🤷‍♂️ man shrugging [doubt, dunno, i dunno, I guess, idk, ignorance, indifference, maybe, person, person shrugging, whatever, who knows]
+🙍‍♂️ man frowning [annoyed, disappointed, disgruntled, disturbed, frustrated, gesture, irritated, person, upset]
+🙎‍♂️ man pouting [disappointed, downtrodden, frown, grimace, person, scowl, sulk, upset, whine]
+🙅‍♂️ man gesturing NO [forbidden, gesture, hand, not, person, prohibit]
+🙆‍♂️ man gesturing OK [exercise, gesture, hand, omg, person]
+💁‍♂️ man tipping hand [fetch, flick, flip, gossip, person, sarcasm, sarcastic, sassy, seriously, whatever]
+🙋‍♂️ man raising hand [gesture, here, know, me, person, pick, question, raise]
+🧏‍♂️ deaf man [accessibility, ear, gesture, hear, person]
+🙇‍♂️ man bowing [apology, ask, beg, favor, forgive, gesture, meditate, meditation, person, pity, regret, sorry]
+🤦‍♂️ man facepalming [again, bewilder, disbelief, exasperation, no, not, oh, omg, person, shock, smh]
+🤷‍♂️ man shrugging [doubt, dunno, guess, idk, ignorance, indifference, knows, maybe, person, whatever, who]

 Both `-tone` and `-gender` accept multiple values. `-gender women,man` will
 display both the female and male variants, and `-tone light,dark` will display
 both a light and dark skin tone; use `all` to display all skin tones or genders:

 % uni e -tone light,dark -gender f,m shrug
 Name CLDR
-🤷🏻‍♂️ man shrugging: light skin tone [doubt, dunno, i dunno, I guess, idk, ignorance, indifference, maybe, person, person shrugging, whatever, who knows]
-🤷🏻‍♀️ woman shrugging: light skin tone [doubt, dunno, i dunno, I guess, idk, ignorance, indifference, maybe, person, person shrugging, whatever, who knows]
-🤷🏿‍♂️ man shrugging: dark skin tone [doubt, dunno, i dunno, I guess, idk, ignorance, indifference, maybe, person, person shrugging, whatever, who knows]
-🤷🏿‍♀️ woman shrugging: dark skin tone [doubt, dunno, i dunno, I guess, idk, ignorance, indifference, maybe, person, person shrugging, whatever, who knows]
+🤷🏻‍♂️ man shrugging: light skin tone [doubt, dunno, guess, idk, ignorance, indifference, knows, maybe, person, whatever, who]
+🤷🏻‍♀️ woman shrugging: light skin tone [doubt, dunno, guess, idk, ignorance, indifference, knows, maybe, person, whatever, who]
+🤷🏿‍♂️ man shrugging: dark skin tone [doubt, dunno, guess, idk, ignorance, indifference, knows, maybe, person, whatever, who]
+🤷🏿‍♀️ woman shrugging: dark skin tone [doubt, dunno, guess, idk, ignorance, indifference, knows, maybe, person, whatever, who]

 Like `print` and `identify`, you can use `-format`:

@@ -464,8 +464,8 @@ This also works for the `emoji` command:

 % uni e -as json -f all 'kissing cat'
 [{
-"cldr": "animal, eye, face, kissing cat face with closed eyes",
-"cldr_full": "animal, cat, eye, face, kiss, kissing cat, kissing cat face with closed eyes",
+"cldr": "animal, closed, eye, eyes, face",
+"cldr_full": "animal, cat, closed, eye, eyes, face, kiss, kissing",
 "cpoint": "U+1F63D",
 "emoji": "😽",
 "group": "Smileys & Emotion",

unidata/gen/emojis.go

Lines changed: 6 additions & 2 deletions
@@ -49,11 +49,15 @@ func readCLDR(f string) map[string][]string {
 	}
 	zli.F(xml.Unmarshal(d, &cldr))

-	out := make(map[string][]string)
+	var (
+		// "Good enough" XML entity removal.
+		tr  = strings.NewReplacer("&lt;", "<", "&gt;", ">", "&amp;", "&")
+		out = make(map[string][]string)
+	)
 	for _, a := range cldr.Annotations {
 		if a.Type != "tts" {
 			a.CP = strings.ReplaceAll(a.CP, "\u200d", "")
-			out[a.CP] = strings.Split(a.Names, " | ")
+			out[a.CP] = strings.Split(tr.Replace(a.Names), " | ")
 		}
 	}
 	return out
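
For readers who want to try the decoding step in isolation, here is a minimal standalone sketch of what the changed lines do. The names `entityDecoder` and `splitKeywords` are hypothetical and not part of the repository; only the `strings.NewReplacer` arguments and the `" | "` separator are taken from the diff above.

```go
package main

import (
	"fmt"
	"strings"
)

// entityDecoder uses the same replacement pairs as the replacer in the diff.
// A strings.Replacer makes a single pass over its input without overlapping
// matches, so "&amp;lt;" becomes "&lt;" and is not decoded a second time.
var entityDecoder = strings.NewReplacer("&lt;", "<", "&gt;", ">", "&amp;", "&")

// splitKeywords is a hypothetical helper: it decodes the three entities and
// then splits a " | "-separated annotation string into keywords.
func splitKeywords(names string) []string {
	return strings.Split(entityDecoder.Replace(names), " | ")
}

func main() {
	fmt.Printf("%q\n", splitKeywords("&lt;3 | love | love you"))
	// prints ["<3" "love" "love you"]
	fmt.Printf("%q\n", splitKeywords("R&amp;B | &amp;lt;"))
	// prints ["R&B" "&lt;"]
}
```

Only three entities are handled, which matches the "good enough" comment in the commit. If other named or numeric entities ever turned up in the data, the standard library's `html.UnescapeString` would be one possible replacement; whether that is needed here is not something this diff shows.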

unidata/gen_emojis.go

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default.
