UNIC: Unicode and Internationalization Crates for Rust
unic-ucd-name_aliases: Unicode Name Alias character properties.
unic-cli: Fall back to Name Alias for characters without a Name value.
ucd-ident: Use correct data table for PatternWhitespace property. [GH-254]
Use external git submodules for source data.
Migrate to Rust 2018 Edition.
unic-ucd-block: List of all Unicode Blocks and the property assigning a block to each character.
unic-ucd-hangul: Unicode Hangul Syllable detection and Composition/Decomposition algorithms.
unic-ucd-name: Complete implementation for Unicode Name Property, with addition of Hangul and CJK Han names, as defined by The Unicode Standard.
This is the last release of the project before migration to Rust 2018 Edition.
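For readers unfamiliar with what the unic-ucd-hangul entry refers to, the following is a minimal sketch of the arithmetic Hangul syllable decomposition and composition described in The Unicode Standard (Section 3.12). The function names `is_hangul_syllable`, `decompose_hangul`, and `compose_hangul` are hypothetical and chosen for illustration; they are not the crate's API.

```rust
// Constants from the Hangul syllable arithmetic in The Unicode Standard.
const S_BASE: u32 = 0xAC00; // first precomposed syllable
const L_BASE: u32 = 0x1100; // first leading consonant
const V_BASE: u32 = 0x1161; // first vowel
const T_BASE: u32 = 0x11A7; // one before the first trailing consonant
const L_COUNT: u32 = 19;
const V_COUNT: u32 = 21;
const T_COUNT: u32 = 28;
const N_COUNT: u32 = V_COUNT * T_COUNT; // 588
const S_COUNT: u32 = L_COUNT * N_COUNT; // 11172

/// Hypothetical helper: true if `ch` is a precomposed Hangul syllable.
fn is_hangul_syllable(ch: char) -> bool {
    (S_BASE..S_BASE + S_COUNT).contains(&(ch as u32))
}

/// Hypothetical helper: split a syllable into (leading, vowel, optional trailing).
fn decompose_hangul(ch: char) -> Option<(char, char, Option<char>)> {
    if !is_hangul_syllable(ch) {
        return None;
    }
    let s_index = ch as u32 - S_BASE;
    let l = char::from_u32(L_BASE + s_index / N_COUNT)?;
    let v = char::from_u32(V_BASE + (s_index % N_COUNT) / T_COUNT)?;
    let t_index = s_index % T_COUNT;
    // A trailing index of 0 means the syllable has no trailing consonant.
    let t = if t_index == 0 {
        None
    } else {
        char::from_u32(T_BASE + t_index)
    };
    Some((l, v, t))
}

/// Hypothetical helper: the inverse of `decompose_hangul`.
fn compose_hangul(l: char, v: char, t: Option<char>) -> Option<char> {
    let l_index = (l as u32).checked_sub(L_BASE).filter(|&i| i < L_COUNT)?;
    let v_index = (v as u32).checked_sub(V_BASE).filter(|&i| i < V_COUNT)?;
    let t_index = match t {
        None => 0,
        Some(t) => (t as u32)
            .checked_sub(T_BASE)
            .filter(|&i| i > 0 && i < T_COUNT)?,
    };
    char::from_u32(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)
}
```

Because the mapping is purely arithmetic, no data table is needed for these syllables, which is one reason the logic fits in a small standalone component.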
Special thanks to Yan Li (@eyeplum) for implementing most of the features in this release.
UNIC Applications are binary crates hosted in the same repository as the unic super-crate, under the apps/ directory. These crates are not internal parts of the unic library, but tools designed and developed for a general audience, also serving as a test bed for the UNIC API. We are starting with CLI applications, and may expand to GUI and web applications as well.
[unic-cli] The new UNIC CLI application provides command-line tools for working with Unicode characters and strings. In this release, first versions of the unic-echo and unic-inspector commands are implemented.
[unic-ucd-common] Common character properties (alphabetic, alphanumeric, control, numeric, and white_space).
[unic-ucd-ident] Unicode Identifier character properties.
[unic-ucd-segment] Unicode Segmentation character properties.
[unic-emoji-char] Unicode Emoji character properties.
[unic-segment] Implementation of Unicode Text Segmentation algorithms (Grapheme Cluster and Word boundaries).
This release was delayed for a couple of cycles because of problems with running tests in a workspace with a mix of std and no-std crates. The issue is resolved as of 1.22.0.
no_std for many of the existing components.
1.22.0.
[unic-char-range] Range and iterator types for characters, plus a chars!() macro. (Used as chars!('a'..'e'), chars!('a'..='e'), or chars!(..).)
[unic-char-property] New component based on the module previously in unic-utils, with new support for binary character properties.
[unic-ucd-name] New minimal implementation of Unicode character names (Name property).
[unic-ucd-case] New basic implementation of Unicode character case properties.
[unic-ucd-bidi] Add Bidi_Mirrored and Bidi_Control properties.
unic-utils's iter_all_chars() in favor of unic-char-range types and macros.
[unic-utils] Restructure tables into a dedicated type, rather than a mix of traits and "blessed" std types.
New component: [unic-ucd-category] Support General_Category Unicode (UCD) character property, implemented as enum GeneralCategory.
[unic-ucd-normal] Support Decomposition_Type Unicode (UCD) character property, implemented as enum DecompositionType.
[unic-ucd-normal] Update Canonical_Combining_Class implementation to a tuple struct and update the API accordingly.
[unic-ucd-age] Update Age property implementation to not cause API breakage on new Unicode versions.
[unic-utils] Rename from unic-ucd-utils, to contain all data-less utility functionality. (https://github.com/behnam/rust-unic/issues/50)
Expand character property API in implementations, in the process of defining trait-based contracts for all (UCD and other) character properties. (https://github.com/behnam/rust-unic/issues/66, https://github.com/behnam/rust-unic/issues/34)
Reorganize code structure to make room for dev packages, like the new unic-gen crate, which is going to replace the Python implementation for data table generation.
Drop data-dependent integration tests from packaging, allowing all tests to pass for downloaded packages. (https://github.com/behnam/rust-unic/issues/34)
[unic-ucd] Expand cross-component and conformance tests. (https://github.com/behnam/rust-unic/issues/18, https://github.com/behnam/rust-unic/issues/43)
Drop dependency on rustc_test in favor of the default integration test harness. (https://github.com/behnam/rust-unic/issues/76)
Create UnicodeVersion type and use it in all components for UNICODE_VERSION, and allow conversion to/from the Age character property.
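As an illustration of what a shared version type with conversions to and from an age value could look like, here is a minimal sketch. The field names, the `Age` wrapper shape, and the constant value are assumptions made for this example and are not taken from the UNIC source.

```rust
/// Hypothetical stand-in for a shared Unicode version type.
/// Deriving `Ord` compares major, then minor, then micro.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct UnicodeVersion {
    pub major: u16,
    pub minor: u16,
    pub micro: u16,
}

/// Hypothetical stand-in for an Age property value: the version in which
/// a character was first assigned.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Age(UnicodeVersion);

impl From<UnicodeVersion> for Age {
    fn from(version: UnicodeVersion) -> Age {
        Age(version)
    }
}

impl From<Age> for UnicodeVersion {
    fn from(age: Age) -> UnicodeVersion {
        age.0
    }
}

/// Illustrative constant; each component would expose the version of the
/// data it was generated from.
pub const UNICODE_VERSION: UnicodeVersion = UnicodeVersion {
    major: 10,
    minor: 0,
    micro: 0,
};
```

With a single comparable type, a caller can check whether a character's age is covered by a component's data version with an ordinary `<=` comparison.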
Split IDNA Mapping data into unic-idna-mapping and leave the processing algorithms in unic-idna.
[ucd] Create common pattern for UCD character properties: for a property called Prop, a static function Prop::of(ch: char) gets the value for a character, and ch.<prop>() does the same using the helper trait called CharProp. Also, move all property value helpers into impl Prop as methods.
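The pattern described in the [ucd] entry can be sketched with a toy property. `AsciiKind` and `CharAsciiKind` below are hypothetical names playing the roles of Prop and CharProp; the classification uses only std methods and is not a real UCD property.

```rust
/// Toy property type, standing in for `Prop`. Not a real UCD property.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AsciiKind {
    Letter,
    Digit,
    Other,
}

impl AsciiKind {
    /// Static lookup, the `Prop::of(ch)` half of the pattern.
    pub fn of(ch: char) -> AsciiKind {
        if ch.is_ascii_alphabetic() {
            AsciiKind::Letter
        } else if ch.is_ascii_digit() {
            AsciiKind::Digit
        } else {
            AsciiKind::Other
        }
    }

    /// Property value helper kept as a method on the property type.
    pub fn is_letter(&self) -> bool {
        *self == AsciiKind::Letter
    }
}

/// Helper trait, standing in for `CharProp`, enabling `ch.ascii_kind()`.
pub trait CharAsciiKind {
    fn ascii_kind(self) -> AsciiKind;
}

impl CharAsciiKind for char {
    fn ascii_kind(self) -> AsciiKind {
        AsciiKind::of(self)
    }
}
```

Keeping the lookup on the property type and the convenience method on a separate trait means callers opt in to the `char` extension by importing the trait, while `AsciiKind::of` stays available without it.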
[idna] Use standard binary_search_by().
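For context, a range table lookup with the standard slice method `binary_search_by` might look like the sketch below. The table contents and the `lookup` function are hypothetical and far smaller than real IDNA mapping data; only the std call itself is assumed to match what the entry refers to.

```rust
use std::cmp::Ordering;

// Hypothetical table of (low, high, label) ranges, sorted and non-overlapping.
const TABLE: &[(char, char, &str)] = &[
    ('0', '9', "digit"),
    ('A', 'Z', "upper"),
    ('a', 'z', "lower"),
];

/// Find the label of the range containing `ch`, if any.
fn lookup(ch: char) -> Option<&'static str> {
    TABLE
        .binary_search_by(|&(lo, hi, _)| {
            if hi < ch {
                Ordering::Less // range lies entirely below `ch`
            } else if lo > ch {
                Ordering::Greater // range lies entirely above `ch`
            } else {
                Ordering::Equal // `ch` falls inside this range
            }
        })
        .ok()
        .map(|index| TABLE[index].2)
}
```

The comparator reports how each range relates to the target character, so the search works on ranges rather than exact keys and returns `None` for characters in the gaps.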
Pass in bench_it feature to components supporting it. (Only unic-bidi at the moment.)
ucd::age component. (unic-ucd-age)
Update UCD and IDNA data to Unicode 10.0.0, as released on 2017-06-20.
Add missing documentation.
Add a script to publish all crates, in order of dependency.
Initial release with UCD, Bidi, IDNA, and Normalization components.