Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
- Added `--end-of-prepending-tag` for training or data preparation, and `--transformer-block-prepended-cross-attention` for training.
- Ran bfloat16 integration and system testing on all platforms.
- Added `--dtype bfloat16` to `sockeye-translate`, `sockeye-score`, and `sockeye-quantize`.
- Fixed compatibility with `numpy==1.24.0` by using `pickle` instead of `numpy` to save/load `ParallelSampleIter` data permutations.
- `sockeye-evaluate` no longer applies text tokenization for TER (same behavior as other metrics).
- Enabled type checking for all `sockeye` modules except `test_utils` and addressed the resulting type issues.
- Added two commands for building a kNN index:
  - `sockeye-generate-decoder-states -m [model] --source [src] --target [tgt] --output-dir [output dir]` to generate decoder states, and
  - `sockeye-knn -i [input_dir] -o [output_dir] -t [faiss_index_signature]` to build the index, where `input_dir` is the same as `output_dir` from the `sockeye-generate-decoder-states` command.
- Added a kNN inference option: `sockeye-translate ... --knn-index [index_dir] --knn-lambda [interpolation_weight]`, where `index_dir` is the same as `output_dir` from the `sockeye-knn` command.
- Replaced `torch.testing.assert_allclose` with `torch.testing.assert_close` for PyTorch 1.14 compatibility.
- Added boolean option `--tf32 0|1` (controls `torch.backends.cuda.matmul.allow_tf32`) enabling transparent float32 acceleration at 10-bit mantissa precision (19 bits total). Defaults to true for backward compatibility with torch < 1.12.
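The 10-bit-mantissa (19-bit) precision mentioned above can be illustrated in plain Python. The helper below is a sketch for illustration only (it is not part of Sockeye): it truncates a float32 mantissa to the 10 bits that TF32 keeps.

```python
import struct

def tf32_round(x: float) -> float:
    """Simulate TF32 precision: keep float32's sign bit and 8-bit exponent,
    but only the top 10 of the 23 mantissa bits (19 significant bits total).
    Illustrative only; real TF32 rounding happens inside GPU tensor cores."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~((1 << 13) - 1)  # zero the low 13 mantissa bits (truncate)
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Values whose mantissa fits in 10 bits (e.g. 3.5) pass through unchanged; extra float32 precision is silently dropped, which is what makes TF32 a transparent speed/precision trade-off.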
- Allowed a different `--tf32` setting when continuing training.
- `device.init_device()` is now called by train, translate, and score.
- Added DeepSpeed support: install with `pip install deepspeed` and launch training with `deepspeed --no_python ... sockeye-train ...`. Run in FP16 mode with `--deepspeed-fp16` or BF16 mode with `--deepspeed-bf16`.
- Added option `--learning-rate-t-scale`.
- Added `sockeye-train` and `sockeye-translate` option `--clamp-to-dtype` that clamps the outputs of transformer attention, feed-forward networks, and process blocks to the min/max finite values for the current dtype. This can prevent inf/nan values from overflow when running large models in float16 mode. See: https://discuss.huggingface.co/t/t5-fp16-issue-is-fixed/3139
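A minimal sketch of the clamping idea in plain Python (not Sockeye's implementation): saturate values at float16's finite limits so a later float16 cast produces large finite numbers instead of inf. The constant 65504.0 is float16's largest finite value.

```python
# Largest finite float16 value; anything beyond it overflows to inf in fp16.
FP16_MAX = 65504.0

def clamp_to_fp16_range(values):
    """Clamp each value into [-FP16_MAX, FP16_MAX], mirroring what
    --clamp-to-dtype does for attention/feed-forward/process-block outputs."""
    return [max(-FP16_MAX, min(FP16_MAX, v)) for v in values]
```

In a real model this clamp would be applied right before activations are stored or computed in float16, turning a potential inf/nan cascade into a bounded saturation.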
- Added `sockeye-quantize`.
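As a general illustration of what offline quantization does (a hypothetical sketch, not Sockeye's actual quantization code), here is per-tensor int8 weight quantization with a single scale factor:

```python
def quantize_int8(weights):
    """Map float weights to int8 codes with one per-tensor scale factor.
    Hypothetical sketch; sockeye-quantize operates on real model files."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid a zero scale
    return [round(w / scale) for w in weights], scale

def dequantize_int8(codes, scale):
    """Recover approximate float weights from int8 codes."""
    return [c * scale for c in codes]
```

Pre-quantizing trades a small amount of weight precision for smaller files and faster loading, which is the usual motivation for a standalone quantization tool.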
- Added isometric translation evaluation metrics (`isometric-ratio`, `isometric-diff`, `isometric-lc`) when specifying `--metric`.
- Added `--output-best-non-blank` to output the non-blank best hypothesis from the nbest list.
- Added `--neural-vocab-selection` to `sockeye-train`. This trains a model with Neural Vocabulary Selection that is automatically used by `sockeye-translate`. To look at translations without vocabulary selection, specify `--skip-nvs` as an argument to `sockeye-translate`.
- Added `sockeye-train` argument `--no-reload-on-learning-rate-reduce` that disables reloading the best training checkpoint when reducing the learning rate. This currently only applies to the `plateau-reduce` learning rate scheduler, since other schedulers do not reload checkpoints.
- Clarified the usage of `batch_size` in the Translator code.
- Models are now set to inference mode when `inference_only` is set, including for the CheckpointDecoder during training.
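The `--output-best-non-blank` behavior described above can be sketched as follows (a hypothetical helper; the fallback to the 1-best hypothesis when every entry is blank is an assumption, not confirmed by the changelog):

```python
def best_non_blank(nbest):
    """Return the first (highest-scoring) non-blank hypothesis from an
    nbest list; fall back to the 1-best if every hypothesis is blank.
    Hypothetical sketch of the --output-best-non-blank behavior."""
    for hyp in nbest:
        if hyp.strip():
            return hyp
    return nbest[0] if nbest else ""
```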