OpenNMT-py Versions
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
v3.1.1
1 year ago
fix major bug introduced in 3.1.0 with LoRA (3.1.0 no longer available)
v3.1.0
1 year ago
updated docs with Sphinx 6.4
Restore source features to v3 (thanks @anderleich)
add inline tags transform (thanks @panosk)
add docify transform to allow doc-level training / inference
fix NLLB training (decoder_start_token)
New! LoRA adapters to fine-tune big models (e.g. NLLB 3.3B)
various bug fixes
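The LoRA adapters mentioned above follow the standard low-rank adaptation idea: a frozen weight matrix is augmented with a small trainable update scaled by alpha / r. Below is a minimal generic sketch in plain Python, not OpenNMT-py's implementation; all names and shapes are illustrative.

```python
def matmul(a, b):
    # naive matrix multiply over lists of rows
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def lora_forward(x, W, A, B, alpha, r):
    # frozen path: x @ W
    # adapter path: (x @ A) @ B, a rank-r update, scaled by alpha / r
    base = matmul(x, W)
    delta = matmul(matmul(x, A), B)
    scale = alpha / r
    return [[base[i][j] + scale * delta[i][j]
             for j in range(len(base[0]))]
            for i in range(len(base))]
```

Only A and B would be trained, which is why LoRA makes fine-tuning a 3.3B-parameter model like NLLB tractable on modest hardware.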
v3.0.4
1 year ago
override_opts to override a checkpoint's options when continuing training from it
normalize transform based on (Sacre)Moses scripts
uppercase transform for adhoc data augmentation
suffix transform
Fuzzy match transform
WMT17 detailed example
NLLB-200 (from Meta/FB) models support (after conversion)
various bug fixes
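The uppercase transform above does ad-hoc data augmentation by upper-casing a sampled fraction of training examples. A hypothetical sketch of the idea (function name and defaults are assumptions, not OpenNMT-py's API):

```python
import random

def uppercase_augment(examples, ratio=0.1, seed=0):
    # Upper-case a random fraction (ratio) of (src, tgt) pairs so the
    # model also learns to handle all-caps input.
    rng = random.Random(seed)  # seeded for reproducibility
    out = []
    for src, tgt in examples:
        if rng.random() < ratio:
            out.append((src.upper(), tgt.upper()))
        else:
            out.append((src, tgt))
    return out
```

The suffix and prefix transforms follow the same pattern: a cheap, deterministic rewrite applied on the fly while batches are built, rather than baked into the corpus on disk.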
v3.0.3
1 year ago
fix loss normalization when using gradient accumulation or more than one GPU
use native CrossEntropyLoss with label smoothing; reported loss/ppl are impacted by label smoothing
fix long-standing coverage loss bug (thanks Sanghyuk-Choi)
fix detokenization at scoring / fix tokenization with subword-nmt and SentencePiece
various small bugs fixed
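The note that reported loss/ppl are impacted by label smoothing is worth unpacking: smoothing spreads some target mass over non-target classes, so the reported cross-entropy rises even when the model's argmax is unchanged. A small pure-Python sketch of label-smoothed cross-entropy for one prediction (illustrative, not the PyTorch implementation):

```python
import math

def label_smoothed_nll(probs, target, smoothing=0.0):
    # Cross-entropy with label smoothing: the target class keeps
    # (1 - smoothing) probability mass, the remainder is spread
    # uniformly over the other classes.
    n = len(probs)
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1.0 - smoothing) if i == target else smoothing / (n - 1)
        loss -= q * math.log(p)
    return loss
```

For a confident model the smoothed loss is strictly higher than the unsmoothed one, which is why loss and perplexity curves shift when label smoothing is switched on.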
v3.0.2
1 year ago
(2022-12-07)
pyonmttok.Vocab is now picklable; the dataloader switched to spawn (MacOS/Windows compatible)
fix scoring with specific metrics (BLEU, TER)
fix tensorboard logging
fix dedup in batch iterator (only for TRAIN; it was also happening at inference)
Change: tgt_prefix renamed to tgt_file_prefix
New: tgt_prefix / src_prefix used for "prefix" Transform (onmt/transforms/misc.py)
New: process transforms of buckets in batches (vs per example) / faster
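The repurposed src_prefix / tgt_prefix options feed a "prefix" transform that prepends tokens to each example, e.g. language tags for multilingual models. A hypothetical sketch of the core operation (function name and signature are assumptions):

```python
def apply_prefix(src_tokens, tgt_tokens, src_prefix=None, tgt_prefix=None):
    # Prepend prefix tokens (e.g. "<en>" / "<de>" language tags)
    # to the source and/or target token sequences.
    if src_prefix:
        src_tokens = src_prefix.split() + src_tokens
    if tgt_prefix:
        tgt_tokens = tgt_prefix.split() + tgt_tokens
    return src_tokens, tgt_tokens
```

Renaming the old file-level option to tgt_file_prefix frees the tgt_prefix name for this per-example use.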
v3.0.1
1 year ago
fix dynamic scoring
reinstate apex.amp level O1/O2 for benchmarking
New: LM distillation for NMT training
New: bucket_size ramp-up to avoid slow start
fix special tokens order
remove Library and add link to Yasmin's Tuto
v3.0.0
1 year ago
v3.0!
Completely removed torchtext; use the pyonmttok Vocab object instead
Data loading changed accordingly, using the PyTorch DataLoader (num_workers)
queue_size / pool_factor no longer needed; optimal bucket_size value is > 64K
options renamed: rnn_size => hidden_size (enc/dec_rnn_size => enc/dec_hid_size)
new tools/convertv2_v3.py to upgrade v2 models.pt
inference with length_penalty=avg is now the default
add_qkvbias (default false, but true for old models)
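The option renaming above (rnn_size => hidden_size, enc/dec_rnn_size => enc/dec_hid_size) is what tools/convertv2_v3.py applies to v2 checkpoints. A hypothetical sketch of the mapping step (the real converter also handles the model weights; this helper is illustrative only):

```python
# v2 option name -> v3 option name, per the release notes above
RENAMED_OPTS = {
    "rnn_size": "hidden_size",
    "enc_rnn_size": "enc_hid_size",
    "dec_rnn_size": "dec_hid_size",
}

def upgrade_opts(opts):
    # Rewrite a dict of saved options, renaming v2 keys to their
    # v3 equivalents and passing everything else through unchanged.
    return {RENAMED_OPTS.get(k, k): v for k, v in opts.items()}
```

Options not in the mapping are left untouched, so a converted checkpoint keeps any settings that did not change between v2 and v3.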
2.3.0
1 year ago
New features
BLEU/TER (& custom) scoring during training and validation (#2198)
LM related tools (#2197)
Allow encoder/decoder freezing (#2176)
Dynamic data loading for inference (#2145)
Sentence-level scores at inference (#2196)
MBR and oracle reranking scoring tools (#2196)
Fixes and improvements
Updated beam exit condition (#2190)
Improve scores reporting (#2191)
Fix dropout scheduling (#2194)
Better catching of CUDA OOMs when training (#2195)
Fix source features support in inference and REST server (#2109)
Make REST server more flexible with dictionaries (#2104)
Fix target prefixing in LM decoding (#2099)
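The dropout scheduling fix (#2194) concerns step-based dropout: a list of dropout values paired with the steps at which each takes effect. A hypothetical sketch of the lookup (option names mirror the changelog; the values are illustrative):

```python
import bisect

def scheduled_dropout(step, dropout=(0.3, 0.1), dropout_steps=(0, 10_000)):
    # Pick the dropout value whose start step is the latest one <= step,
    # i.e. dropout[i] applies from dropout_steps[i] onward.
    i = bisect.bisect_right(dropout_steps, step) - 1
    return dropout[max(i, 0)]
```

Decaying dropout late in training is a common trick: heavy regularization early, then a lighter touch once the model has mostly converged.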
2.2.0
2 years ago
New features
Support source features (thanks @anderleich!)
Fixes and improvements
Adaptations to relax the torch version requirement
Customizable transform statistics (#2059)
Adapt release code for ctranslate2 2.0
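Source features attach extra annotations (e.g. POS tags or case markers) to each source token, conventionally joined with the "￨" separator in OpenNMT-style corpora. A hypothetical parsing sketch (the separator convention is an assumption about the corpus format, not this release's API):

```python
def split_source_features(line, sep="￨"):
    # Split "word￨feat1￨feat2 ..." lines into parallel lists of
    # tokens and their per-token feature lists.
    tokens, feats = [], []
    for tok in line.split():
        parts = tok.split(sep)
        tokens.append(parts[0])
        feats.append(parts[1:])
    return tokens, feats
```

Each feature column typically gets its own vocabulary and embedding, concatenated with the word embedding on the encoder side.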
2.1.2
3 years ago
Fixes and improvements
Fix update_vocab for LM (#2056)