Nisaba

Finite-state script normalization and processing utilities

v0.3.0-beta

1 year ago

Synopsis

This is an interim pre-release of the data, compiled on an x86_64 Linux platform. The data consists of FST archives (FARs) in OpenFst format, which can be manipulated using Pynini.

Note: this pre-release does not contain the precompiled FSTs for natural romanization of Brahmic scripts. These will be included in the next release.
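As a minimal sketch of how one of the released FARs might be inspected with Pynini, the helper below lists the keys of the FSTs stored in an archive. The function name `far_keys` is hypothetical, Pynini is assumed to be installed separately, and the keys stored inside each FAR are not documented in these notes, so the sketch only enumerates whatever is there.

```python
def far_keys(far_path):
    """Returns the keys of all FSTs stored in the FAR at `far_path`."""
    # Pynini is a third-party package and is assumed to be installed;
    # it is imported here so the helper can be defined without it.
    import pynini

    keys = []
    far = pynini.Far(far_path, mode="r")  # Open the archive for reading.
    while not far.done():
        keys.append(far.get_key())  # Key of the FST at the current position.
        far.next()
    return keys
```

For example, after unpacking abjad_alphabet_x86_64.tar.gz, `far_keys("x86_64/abjad_alphabet/nfc.far")` would return the names of the FSTs in that archive; an individual FST can then be fetched with the reader's `get_fst()` method.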

Contents of the released tarballs

For each script family, the tarballs contain the FST archives (FARs) listed below, along with their sizes in bytes.

  • abjad_alphabet_x86_64.tar.gz: Perso-Arabic abjads:

    28409 x86_64/abjad_alphabet/nfc.far
    175758 x86_64/abjad_alphabet/reading_norm.far
    1532106 x86_64/abjad_alphabet/reversible_roman.far
    1530858718 x86_64/abjad_alphabet/visual_norm.far
    
  • brahmic_x86_64.tar.gz: Brahmic abugidas:

    60129 fixed.far
    2254777 iso.far
    668276 nfc.far
    244509 reading_norm.far
    16231686 visual_norm.far
    86528 wellformed.far
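To check a downloaded tarball against the listing above, the following sketch uses Python's standard `tarfile` module to collect the size of every `.far` member. The function name `far_sizes` is hypothetical, and the tarball is assumed to have been downloaded to a local path.

```python
import tarfile


def far_sizes(tarball_path):
    """Returns {member name: size in bytes} for each .far file in a tarball."""
    sizes = {}
    with tarfile.open(tarball_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Skip directories and any non-FAR files in the archive.
            if member.isfile() and member.name.endswith(".far"):
                sizes[member.name] = member.size
    return sizes
```

Calling `far_sizes("abjad_alphabet_x86_64.tar.gz")` on an intact download should report sizes that match the listing for that tarball; a mismatch would suggest a truncated or corrupted file.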