Simple and Performant Language detection library for NodeJS
Typing in JavaScript and require
Typing in TypeScript and import
Maintenance version with only small modifications
18.x
detect('これは日本語です.', { verbose: true })
"exports": {
  ".": {
    "require": "./dist/tinyld.normal.node.js",
    "import": "./dist/tinyld.normal.node.mjs",
    "browser": "./dist/tinyld.normal.browser.js"
  },
  "./light": {
    "require": "./dist/tinyld.light.node.js",
    "import": "./dist/tinyld.light.node.mjs",
    "browser": "./dist/tinyld.light.browser.js"
  }
},
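To make the effect of this exports map concrete, here is a minimal sketch of how a conditional-exports lookup picks a file. It is a simplified model, not Node's full resolver: `resolveExport` is a hypothetical helper, and the only rule it implements is that the first key of an entry (in declaration order) that appears in the active conditions wins.

```typescript
// Simplified model of conditional exports; not Node's complete resolution algorithm.
type ConditionMap = Record<string, string>;
type ExportsMap = Record<string, ConditionMap>;

// Mirrors the "exports" field from the release notes.
const exportsMap: ExportsMap = {
  ".": {
    require: "./dist/tinyld.normal.node.js",
    import: "./dist/tinyld.normal.node.mjs",
    browser: "./dist/tinyld.normal.browser.js",
  },
  "./light": {
    require: "./dist/tinyld.light.node.js",
    import: "./dist/tinyld.light.node.mjs",
    browser: "./dist/tinyld.light.browser.js",
  },
};

// Hypothetical helper: returns the target of the first condition key
// (in declaration order) that is present in the active conditions.
function resolveExport(
  map: ExportsMap,
  subpath: string,
  conditions: string[]
): string | undefined {
  const entry = map[subpath];
  if (!entry) return undefined; // subpath is not exported
  for (const key of Object.keys(entry)) {
    if (conditions.indexOf(key) !== -1) return entry[key];
  }
  return undefined; // no condition matched
}

// A CommonJS consumer asks for the package root.
const cjsTarget = resolveExport(exportsMap, ".", ["node", "require"]);
// An ES module consumer asks for the light build.
const esmTarget = resolveExport(exportsMap, "./light", ["node", "import"]);
```

Under these assumptions, `require('tinyld')` maps to the `.js` build and `import ... from 'tinyld/light'` maps to the light `.mjs` build.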
Small maintenance version
The npm package no longer contains the src/ folder; type definitions are now shipped directly in the dist folder.
tinyld-light which was returning the wrong supportedLanguage list
930KB -> 590KB
110KB -> 68KB
Full Changelog: https://github.com/komodojp/tinyld/compare/1.2.0...1.2.2
After a lot of unsuccessful experiments, I'm glad to have found a way to improve the accuracy and release it. I decided to focus on accuracy over quantity for the moment, making sure the algorithm works properly before trying to scale it up.
With this version 1.2.0:
- tinyld and tinyld-light are over 97% accurate on the 16 most common languages
- tinyld global accuracy across all languages (64) is over 95%, and each language has an accuracy > 80%
- A few new APIs to get the list of supported languages and their names
import { supportedLanguages, langName, langRegion } from 'tinyld'
// all supported languages (ISO3 format)
supportedLanguages // ['jpn', 'cmn', ...]
// and a few utilities for languages
langName('jpn') // Japanese
langRegion('jpn') // east-asia
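As a hedged sketch of how these helpers might be combined, the snippet below groups language codes by region. `groupByRegion` is a hypothetical helper, not part of tinyld, and it takes the lookup function as a parameter so it runs without the package installed. The stub lookup is for illustration only; the single mapping it knows (`jpn` to `east-asia`) comes from the example above.

```typescript
// Hypothetical helper (not a tinyld API): bucket language codes by region.
function groupByRegion(
  codes: string[],
  regionOf: (code: string) => string
): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const code of codes) {
    const region = regionOf(code);
    if (!groups[region]) groups[region] = [];
    groups[region].push(code);
  }
  return groups;
}

// Stand-in for langRegion; every code other than 'jpn' is deliberately "unknown".
const stubRegion = (code: string): string =>
  code === "jpn" ? "east-asia" : "unknown";

const grouped = groupByRegion(["jpn", "cmn"], stubRegion);
```

With tinyld installed, the same call would presumably be `groupByRegion(supportedLanguages, langRegion)`, using the two imports shown above.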
- Greek (ell) - 100%
- Hindi (hin) - 100%
- Bengali (ben) - 100%
- Thai (tha) - 100%
- Telugu (tel) - 100%
- Gujarati (guj) - 100%
- Tamil (tam) - 100%
- Amharic (amh) - 100%
- Kannada (kan) - 100%
- Burmese (mya) - 100%
- Armenian (hye) - 99.9555%
- Japanese (jpn) - 99.9333%
- Vietnamese (vie) - 99.9067%
- Korean (kor) - 99.8134%
- Khmer (khm) - 99.7354%
- Urdu (urd) - 99.2537%
- Hebrew (heb) - 99.1068%
- Berber (ber) - 99.0135%
- German (deu) - 98.9601%
- Toki Pona (toki) - 98.8801%
- Russian (rus) - 98.8268%
- Persian (pes) - 98.8135%
- Polish (pol) - 98.8002%
- Chinese (cmn) - 98.7602%
- French (fra) - 98.7068%
- Arabic (ara) - 98.4669%
- Finnish (fin) - 98.0936%
- English (eng) - 98.0136%
- Yiddish (yid) - 97.9869%
- Romanian (ron) - 97.9336%
- Mongolian (mon) - 97.8058%
- Lithuanian (lit) - 97.8003%
- Icelandic (isl) - 97.7203%
- Klingon (tlh) - 97.6803%
- Hungarian (hun) - 97.5603%
- Kazakh (kaz) - 97.4214%
- Indonesian (ind) - 97.267%
- Dutch (nld) - 96.8937%
- Tatar (tat) - 96.8271%
- Latvian (lvs) - 96.4734%
- Tagalog (tgl) - 95.8539%
- Ukrainian (ukr) - 95.4673%
- Turkish (tur) - 95.214%
- Portuguese (por) - 95.054%
- Kirundi (run) - 94.6058%
- Turkmen (tuk) - 94.5193%
- Italian (ita) - 94.4541%
- Belarusian (bel) - 94.2808%
- Esperanto (epo) - 93.9475%
- Spanish (spa) - 93.4009%
- Volapuk (vol) - 92.6978%
- Swedish (swe) - 91.9344%
- Irish (gle) - 89.6735%
- Latin (lat) - 89.0948%
- Estonian (est) - 88.6921%
- Czech (ces) - 88.5749%
- Catalan (cat) - 88.0949%
- Danish (dan) - 87.375%
- Afrikaans (afr) - 86.578%
- Bulgarian (bul) - 84.5754%
- Slovak (slk) - 83.4555%
- Serbian (srp) - 83.0823%
- Macedonian (mkd) - 82.709%
- Norwegian (nob) - 81.5358%
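The per-language figures above can be checked against the "each language has an accuracy > 80%" claim with a small parser. This is only a sketch: `parseAccuracyLine` is a hypothetical helper written for lines shaped like the list entries above, and the sample holds three entries copied from that list rather than the full set.

```typescript
interface LanguageAccuracy {
  name: string;
  code: string;
  accuracy: number; // percentage, e.g. 98.8801
}

// Hypothetical helper: parses "- Name (code) - 12.34%" and returns null otherwise.
function parseAccuracyLine(line: string): LanguageAccuracy | null {
  const match = /^-\s+(.+?)\s+\((\w+)\)\s+-\s+([\d.]+)%$/.exec(line.trim());
  if (!match) return null;
  return { name: match[1], code: match[2], accuracy: parseFloat(match[3]) };
}

// Three entries copied from the list above.
const sample = [
  "- Greek (ell) - 100%",
  "- Toki Pona (toki) - 98.8801%",
  "- Norwegian (nob) - 81.5358%",
];

const parsed = sample
  .map(parseAccuracyLine)
  .filter((x): x is LanguageAccuracy => x !== null);

// Entry with the lowest accuracy in the sample.
const lowest = parsed.reduce((a, b) => (b.accuracy < a.accuracy ? b : a));
// True when every parsed entry clears the 80% threshold.
const allAbove80 = parsed.every((p) => p.accuracy > 80);
```

Feeding the full list through the same helper would give the minimum across all 64 languages.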