Releases: arp242/uni
v2.8.0
v2.7.0
- Improve the `-format` flag:

  - Add `%name` as an alias for `%(name l:auto)`; this is a lot less typing and requires less shell quoting, and >90% of the time this is what you want.

  - Automatically prepend character, codepoint, and name if the format flag starts with `+`; for example:

        % uni identify -f +'%unicode %plane' a
                     Name                  Unicode  Plane
        'a'  U+0061  LATIN SMALL LETTER A  1.1      Basic Multilingual Plane

    This should make printing some property a lot quicker.

- Align and colourize JSON output.

- Update CLDR information, adding significantly more aliases for emojis.

- Add `cells` column, which returns how many cells a codepoint will display at (0, 1, or 2).

- Add `aliases` column, which lists the alias names. Also add this to the default output:

      % uni s factorial
           CPoint  Dec  UTF8  HTML  Name              Aliases
      '!'  U+0021  33   21    !     EXCLAMATION MARK  [factorial, bang]

- Add `refs` column, which references other related/similar codepoints:

      % uni p -q U+46 -f '%(name): %(refs)'
      LATIN CAPITAL LETTER F: U+2109, U+2131, U+2132

      % uni p -q U+46 -f '%(refs)' | uni p
           CPoint  Dec   UTF8      HTML  Name               Aliases
      '℉'  U+2109  8457  e2 84 89  ℉     DEGREE FAHRENHEIT
      'ℱ'  U+2131  8497  e2 84 b1  ℱ     SCRIPT CAPITAL F   [Fourier transform]
      'Ⅎ'  U+2132  8498  e2 84 b2  Ⅎ     TURNED CAPITAL F   [Claudian digamma inversum]

- Allow arguments to `print` to start or end with a comma or slash. This comes up when copy/pasting a list of codepoints from another source; there's no real reason to error out on this.

- Allow listing Unicode versions with `uni list unicode` and planes with `uni list planes`.

- `uni list` without arguments now errors, instead of listing everything.

- Add `h` format flag to not print the header for this column.
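To make the `%name` alias concrete, here is a minimal Go sketch of how such a shorthand could be expanded to the long form. It is not uni's actual implementation: the helper `expandShortColumns` and the pattern it uses are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"regexp"
)

// bareColumn matches a % followed directly by a column name, such as %name.
// The long form %(name) is not matched, because "(" is not a name character.
// Which characters uni allows in a column name is an assumption here.
var bareColumn = regexp.MustCompile(`%([a-z0-9]+)`)

// expandShortColumns is a hypothetical helper that rewrites every %name in a
// format string to %(name l:auto) and leaves %(...) forms untouched.
func expandShortColumns(format string) string {
	return bareColumn.ReplaceAllString(format, "%(${1} l:auto)")
}

func main() {
	fmt.Println(expandShortColumns("%unicode %plane"))
	fmt.Println(expandShortColumns("%(name): %(refs)"))
}
```

The first call expands both columns; the second is returned unchanged because it already uses the long form.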
v2.6.0
- Update to Unicode 15.1.

- Add "script" property; also supported in the list and print commands:

      % uni identify -f '%(script l:auto) %(cpoint) %(name)' 'a Ω'
      Script  CPoint  Name
      Latin   U+0061  LATIN SMALL LETTER A
      Common  U+0020  SPACE
      Greek   U+03A9  GREEK CAPITAL LETTER OMEGA

      % uni list scripts
      Scripts:
      Name                   Assigned
      Adlam                  83
      Ahom                   54
      Anatolian Hieroglyphs  582
      …

      % uni print 'script:linear a'
      Showing script Linear A
           CPoint   Dec    UTF8         HTML  Name (Cat)
      '𐘀'  U+10600  67072  f0 90 98 80  𐘀     LINEAR A SIGN AB001 (Other_Letter)
      '𐘁'  U+10601  67073  f0 90 98 81  𐘁     LINEAR A SIGN AB002 (Other_Letter)
      '𐘂'  U+10602  67074  f0 90 98 82  𐘂     LINEAR A SIGN AB003 (Other_Letter)
      …

- Add "unicode" property, which tells you in which Unicode version a codepoint was introduced:

      % uni identify -f '%(unicode l:auto) %(cpoint l:auto) %(name)' a𐘂🫁
      Unicode  CPoint   Name
      1.1      U+0061   LATIN SMALL LETTER A
      7.0      U+10602  LINEAR A SIGN AB003
      13.0     U+1FAC1  LUNGS

- Show unprintable control characters as the open box (␣, U+2423) instead of the replacement character (�, U+FFFD). It already did that for C1 control characters, and U+FFFD looked more like a bug than intentional. The `-raw`/`-r` flag still overrides this.

- Always print Private Use characters as-is for `%(char)` instead of using the U+FFFD replacement character. It's usually safe to print this, and having to use `-raw` is confusing.

- The `ls` command is now an alias for `list`.
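The display rules for control and Private Use characters can be sketched in Go roughly as follows. The helper `displayChar` and its `raw` parameter are hypothetical, and the sketch uses the standard library's `unicode` tables for brevity, which uni itself avoids since v2.5.0.

```go
package main

import (
	"fmt"
	"unicode"
)

// displayChar is a hypothetical helper returning what to show for a rune:
// the open box U+2423 for control characters, the rune itself otherwise.
// With raw set, the rune is always returned unchanged.
func displayChar(r rune, raw bool) string {
	switch {
	case raw:
		return string(r)
	case unicode.IsControl(r): // C0 and C1 control characters
		return "\u2423"
	case unicode.Is(unicode.Co, r): // Private Use: printed as-is
		return string(r)
	default:
		return string(r)
	}
}

func main() {
	for _, r := range []rune{'a', '\a', 0x85, 0xE000} {
		fmt.Printf("U+%04X -> %q\n", r, displayChar(r, false))
	}
}
```

The explicit Private Use case is redundant with the default branch; it is spelled out only to mirror the two changelog entries.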
v2.5.1
v2.5.0
- Add support for properties; they can be displayed with `%(props)` in `-format`, and selected in `print` (e.g. `uni print dash`).

- Add `uni list` command, to list categories, blocks, and properties.

- Allow explicitly selecting a block, category, or property in `print` with `block:name` (`b:name`), `category:name` (`cat:name`, `c:name`), or `property:name` (`prop:name`, `p:name`).

  Also print an error if a string without prefix matches more than one group (e.g. `uni p dash` matches both the property `Dash` and the category `Dash_Punctuation`).

- Add table layout with `-as table`. Also change `-json`/`-j` to `-as json` or `-as j`. The `-json` flag is still accepted as an alias for compatibility.

- Change `-q`/`-quiet` to `-c`/`-compact`; `-as json` will print as minified if given, and `-as table` will include less padding. `-q` is still accepted as an alias for compatibility.

- Don't use the Go stdlib `unicode` package, since it is a Unicode 13 database and some operations would fail on codepoints added in Unicode 14 due to the mismatch.
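As an illustration of the explicit `block:`, `category:`, and `property:` prefixes, this Go sketch splits a `print` argument into a group kind and a name. The function `parseSelector` and the `prefixes` table are hypothetical names, and treating prefixes as case-sensitive is an assumption; uni's real parsing may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// prefixes maps every accepted prefix spelling to its group kind.
var prefixes = map[string]string{
	"block": "block", "b": "block",
	"category": "category", "cat": "category", "c": "category",
	"property": "property", "prop": "property", "p": "property",
}

// parseSelector is a hypothetical helper. It returns the group kind and the
// name for arguments such as "cat:dash_punctuation". The kind is empty when
// there is no known prefix, in which case the caller would have to search
// all groups and report an error on an ambiguous match.
func parseSelector(arg string) (kind, name string) {
	prefix, rest, found := strings.Cut(arg, ":")
	if !found {
		return "", arg
	}
	if k, ok := prefixes[prefix]; ok {
		return k, rest
	}
	return "", arg
}

func main() {
	for _, arg := range []string{"b:arrows", "cat:dash_punctuation", "p:dash", "dash"} {
		kind, name := parseSelector(arg)
		fmt.Printf("%-22s kind=%q name=%q\n", arg, kind, name)
	}
}
```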
v2.4.0
- Update import path to `zgo.at/uni/v2`.

- Add `oct` and `bin` flags for `-f` to print a codepoint as octal or binary.

- Add `f` format flag to change the fill character with alignment; e.g. `%(bin r:auto f:0)` will print zeros on the left.

- Allow using just `o123` for an octal number (instead of `0o123`). We can't do this for binary and decimal numbers (since `b` and `d` are valid hexadecimal digits), but there's no reason not to do it for `o`.
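The effect of right alignment with a fill character, as in `%(bin r:auto f:0)`, can be sketched in Go like this. The helper `padLeft` and the fixed width of 8 are assumptions for illustration; uni computes the `auto` width itself.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// padLeft is a hypothetical helper that right-aligns s in a field of the
// given width by prepending the fill character. Strings that are already
// wide enough are returned unchanged.
func padLeft(s string, width int, fill rune) string {
	n := width - utf8.RuneCountInString(s)
	if n <= 0 {
		return s
	}
	return strings.Repeat(string(fill), n) + s
}

func main() {
	r := 'a' // U+0061
	bin := strconv.FormatInt(int64(r), 2)
	oct := strconv.FormatInt(int64(r), 8)

	fmt.Println(padLeft(bin, 8, '0')) // binary, zero-filled on the left
	fmt.Println(padLeft(oct, 8, ' ')) // octal, space-filled on the left
}
```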
Release v2.3.0
- Update to Unicode 14.0.

- UTF-16 and JSON are printed as lower case, just like UTF-8 was. Upper case is used only for codepoints (e.g. U+00AC).

- `uni print` can now print from a UTF-8 byte sequence; for example, to print the € sign:

      uni p utf8:e282ac
      uni p 'utf8:e2 82 ac'
      uni p 'utf8:0xe2 0x82 0xac'

  Bytes can optionally be separated by any combination of `0x`, `-`, `_`, or spaces.
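A rough Go sketch of decoding such a byte sequence: strip the optional separators, hex-decode the rest, and check that the result is valid UTF-8. The helper `decodeUTF8Bytes` is hypothetical and expects the argument with the `utf8:` prefix already removed; it is not taken from uni's source.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
	"unicode/utf8"
)

// separators removes everything that may appear between bytes.
var separators = strings.NewReplacer("0x", "", "-", "", "_", "", " ", "")

// decodeUTF8Bytes is a hypothetical helper that turns a hex byte sequence
// such as "e2 82 ac" into the string those bytes encode.
func decodeUTF8Bytes(arg string) (string, error) {
	b, err := hex.DecodeString(separators.Replace(arg))
	if err != nil {
		return "", fmt.Errorf("invalid hex in %q: %w", arg, err)
	}
	if !utf8.Valid(b) {
		return "", fmt.Errorf("%q is not valid UTF-8", arg)
	}
	return string(b), nil
}

func main() {
	for _, arg := range []string{"utf8:e282ac", "utf8:e2 82 ac", "utf8:0xe2 0x82 0xac"} {
		s, err := decodeUTF8Bytes(strings.TrimPrefix(arg, "utf8:"))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-22s -> %s\n", arg, s)
	}
}
```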
v2.2.1
Only one small change:
You can now use `uni p 0d40` to get U+28 by decimal.
`uni print 40` interprets the 40 as hex instead of decimal, and there was no way to get a codepoint by decimal number. Since codepoints are much more common than decimals, leaving off the `U+` and `U` is a useful shortcut I'd like to keep. AFAIK there isn't really a standard(-ish) way to explicitly indicate that a number is decimal, so `0d` is probably the closest.
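The rule described here (bare numbers are hex, `0d` marks decimal) could look roughly like this in Go. The function `parseCodepoint` is a hypothetical name, and the sketch covers only the `U+`, `U`, and `0d` forms mentioned in this entry.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCodepoint is a hypothetical helper. It reads "0d40" as decimal 40 and
// everything else as hex, with an optional "U+" or "U" prefix. Note that
// "0d40" is also valid hex; the 0d prefix deliberately takes precedence.
func parseCodepoint(arg string) (rune, error) {
	s := strings.ToLower(arg)
	base := 16
	switch {
	case strings.HasPrefix(s, "0d"):
		s, base = s[2:], 10
	case strings.HasPrefix(s, "u+"):
		s = s[2:]
	case strings.HasPrefix(s, "u"):
		s = s[1:]
	}
	n, err := strconv.ParseUint(s, base, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid codepoint %q: %w", arg, err)
	}
	if n > 0x10FFFF {
		return 0, fmt.Errorf("codepoint %q is out of range", arg)
	}
	return rune(n), nil
}

func main() {
	for _, arg := range []string{"0d40", "40", "U+20AC"} {
		r, err := parseCodepoint(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-7s -> U+%04X\n", arg, r)
	}
}
```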
v2.2.0
v2.1.0
- Can now output as JSON with `-j` or `-json`.

- `-format all` is a special value to include all columns uni knows about. This is useful especially in combination with `-json`.

- Add `%(block)`, `%(plane)`, `%(width)`, `%(utf16be)`, `%(utf16le)`, and `%(json)` to `-f`.

- Refactor the `arp242.net/uni/unidata` package to be more useful for other use cases. This isn't really relevant for `uni` users as such, but if you want to get information about codepoints or emojis then this package is a nice addition to the standard library's `unicode` package.
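For a feel of what column-based JSON output involves, here is a small Go sketch using `encoding/json`. The `codepointInfo` type, its field names, and the `toJSON` helper are all assumptions made for illustration; they are not uni's actual schema or code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// codepointInfo is a hypothetical record with a few of the columns; the
// JSON key names are assumptions.
type codepointInfo struct {
	Char   string `json:"char"`
	CPoint string `json:"cpoint"`
	Name   string `json:"name"`
}

// toJSON is a hypothetical helper that renders the records either minified
// or indented with tabs.
func toJSON(infos []codepointInfo, compact bool) (string, error) {
	var (
		b   []byte
		err error
	)
	if compact {
		b, err = json.Marshal(infos)
	} else {
		b, err = json.MarshalIndent(infos, "", "\t")
	}
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	infos := []codepointInfo{{Char: "a", CPoint: "U+0061", Name: "LATIN SMALL LETTER A"}}
	for _, compact := range []bool{true, false} {
		out, err := toJSON(infos, compact)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(out)
	}
}
```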