hfxunlp/transformer Versions

Neutron: a PyTorch-based implementation of the Transformer and its variants.

v0.3.8

9 months ago

support pre-trained models (BERT, RoBERTa, BART, T5, MBART); add a regression loss, bucketed relative positional encoding, and self-dependency units; support compressed (gz, bz2, xz) and character-level text data processing; support Unicode normalization and Chinese desegmentation; configure BF16/FP16 and inference mode for PyTorch; fixes & enhancements.
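For the compressed-data support, transparent reading of gz/bz2/xz text files can be done with the standard library alone. A minimal sketch (open_text and its behavior are illustrative assumptions, not the repository's actual API):

    import bz2
    import gzip
    import lzma

    # Map file suffixes to their stdlib openers; fall back to plain open().
    _OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}

    def open_text(path, mode="rt", encoding="utf-8"):
        for suffix, opener in _OPENERS.items():
            if path.endswith(suffix):
                return opener(path, mode, encoding=encoding)
        return open(path, mode, encoding=encoding)

    # Usage: iterate lines the same way regardless of compression.
    # with open_text("train.en.xz") as f:
    #     for line in f:
    #         ...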

v0.3.7

2 years ago

add a constrained decoder; support GLU; fixes & enhancements.
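GLU (the gated linear unit) doubles the width of a projection and uses one half, passed through a sigmoid, to gate the other half. A minimal sketch of a GLU-gated feed-forward layer (class and parameter names are hypothetical, not the repository's):

    import torch
    from torch import nn

    class GLUFeedForward(nn.Module):
        def __init__(self, d_model, d_hidden):
            super().__init__()
            self.w_in = nn.Linear(d_model, d_hidden * 2)
            self.w_out = nn.Linear(d_hidden, d_model)

        def forward(self, x):
            # Project to twice the width, then gate one half with the other.
            value, gate = self.w_in(x).chunk(2, dim=-1)
            return self.w_out(value * torch.sigmoid(gate))

PyTorch also ships torch.nn.functional.glu, which applies the same split-and-gate along a chosen dimension.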

Hello 2022 :-)

v0.3.6

2 years ago

support hard retrieval attention; fixes & enhancements.
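Hard retrieval attention replaces the weighted sum over value vectors with a lookup of the single value at the highest-scoring position. The sketch below shows the retrieval step only (function name and shapes are illustrative; the hard argmax is non-differentiable, so the actual training recipe is more involved than this):

    import torch

    def hard_retrieval_attention(q, k, v):
        # q: (batch, tgt_len, d); k, v: (batch, src_len, d)
        scores = torch.matmul(q, k.transpose(-2, -1))  # (batch, tgt_len, src_len)
        index = scores.argmax(dim=-1)                  # hard selection: one key per query
        # Gather the selected value vectors instead of averaging all of them.
        return torch.gather(v, 1, index.unsqueeze(-1).expand(-1, -1, v.size(-1)))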

Bye 2021 :-)

v0.3.5

2 years ago

support multilingual NMT; support contiguous model parameters; add SentencePiece (spm) support; add a C backend for core modules (this saves resources but is slower than the Python backend); cleanups & enhancements (the class components of transformer.Encoder/Decoder(s) have changed, and model files from previous commits can no longer be loaded correctly).
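"Contiguous model parameters" typically means packing every parameter tensor into one flat buffer so whole-model operations (gradient clipping, weight decay, allreduce) touch a single tensor. A hypothetical sketch of the idea (make_contiguous is not the repository's API):

    import torch
    from torch import nn

    def make_contiguous(model: nn.Module) -> torch.Tensor:
        params = list(model.parameters())
        # One flat buffer holding every parameter value.
        flat = torch.cat([p.data.reshape(-1) for p in params])
        offset = 0
        for p in params:
            n = p.numel()
            # Re-point each parameter at its slice of the flat buffer.
            p.data = flat[offset:offset + n].view_as(p.data)
            offset += n
        return flat  # operations on `flat` now affect every parameter

Gradients can be packed the same way to speed up clipping and gradient communication.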

v0.3.4

2 years ago

support MultiGPUOptimizer; support the word translation probe and MHPLSTM; disable the FFN inside AAN by default; support cleaning data with many repeated tokens; cleanups & enhancements.
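For the repeated-token cleaning, one plausible criterion (an assumption; the repository's actual rule may differ) is to drop sentences dominated by a single token:

    from collections import Counter

    def too_repetitive(tokens, max_ratio=0.5):
        # Flag a sentence whose most frequent token exceeds max_ratio of its length.
        if not tokens:
            return False
        return Counter(tokens).most_common(1)[0][1] / len(tokens) > max_ratio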

v0.3.3

3 years ago

improve decoding efficiency by moving the decoding cache from attention inputs to attention hiddens; support shared-vocabulary pruning of trained models.
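Caching at the "hiddens" means storing the projected key/value tensors across decoding steps, so each step projects only the newest token instead of re-projecting the whole growing prefix from cached inputs. A simplified single-head sketch (names and shapes are illustrative, not the repository's code):

    import torch
    from torch import nn

    class CachedSelfAttention(nn.Module):
        def __init__(self, d_model):
            super().__init__()
            self.q_proj = nn.Linear(d_model, d_model)
            self.k_proj = nn.Linear(d_model, d_model)
            self.v_proj = nn.Linear(d_model, d_model)

        def step(self, x_t, cache=None):
            # x_t: (batch, 1, d_model) -- only the newest token gets projected.
            k_t, v_t = self.k_proj(x_t), self.v_proj(x_t)
            if cache is not None:
                # Extend the cached projections ("hiddens") by one position.
                k_t = torch.cat([cache[0], k_t], dim=1)
                v_t = torch.cat([cache[1], v_t], dim=1)
            scores = self.q_proj(x_t) @ k_t.transpose(-2, -1) / k_t.size(-1) ** 0.5
            return torch.softmax(scores, dim=-1) @ v_t, (k_t, v_t)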

v0.3.2

3 years ago

fix bugs; support a fast label smoothing loss.
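One common "fast" formulation (an assumption about what is meant here) avoids materializing the full smoothed target distribution, since the smoothed cross-entropy decomposes into an NLL term plus a uniform term:

    import torch.nn.functional as F

    def label_smoothing_loss(logits, target, eps=0.1, ignore_index=0):
        # logits: (N, vocab); target: (N,)
        logp = F.log_softmax(logits, dim=-1)
        nll = -logp.gather(1, target.unsqueeze(1)).squeeze(1)  # per-token NLL
        uniform = -logp.sum(dim=-1) / logits.size(-1)          # uniform-smoothing term
        loss = (1.0 - eps) * nll + eps * uniform
        return loss.masked_select(target.ne(ignore_index)).sum()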

v0.3.1

3 years ago

In this release, we: support the AdaBelief optimizer; accelerate zero_grad by enabling set_to_none; support RealFormer.
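The set_to_none option of zero_grad (added in PyTorch 1.7) frees the gradient tensors instead of zero-filling them, so each step skips a memset and the next backward pass allocates fresh gradients:

    import torch
    from torch import nn

    model = nn.Linear(4, 4)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    optimizer.zero_grad(set_to_none=True)

    # Manual equivalent on PyTorch versions without the flag:
    for p in model.parameters():
        p.grad = None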

v0.3.0

3 years ago

In this release, we: move AMP support from apex to torch.cuda.amp, introduced in PyTorch 1.6; support sampling during greedy decoding (for back-translation); accelerate the Average Attention Network by replacing the matrix multiplication with cumsum; add APE support; support the Mish activation function. (A typo in this release is fixed in commit ed5eb60.)
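The cumsum trick works because the Average Attention Network output at step t is just the running mean of the first t inputs, so a cumulative sum divided by the position index replaces a (seq_len × seq_len) lower-triangular matrix multiplication. A minimal illustration (not the repository's exact code):

    import torch

    def average_attention(x):
        # x: (batch, seq_len, d_model); output at step t is the mean of x[:, :t+1].
        steps = torch.arange(1, x.size(1) + 1, device=x.device, dtype=x.dtype)
        return x.cumsum(dim=1) / steps.view(1, -1, 1)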

v0.2.9

3 years ago

In this release, we: adapt to PyTorch 1.5; explicitly support Lipschitz-constrained parameter initialization; incorporate new features: n-gram dropout, dynamic batch sizes, and source phrase representation learning.

Sorry, we did not include the updated utils in this release; please find utils/comm.py (a.k.a. utils.comm) required by parallel/base.py (a.k.a. parallel.base), or use commit 2b6b22094b545e74b05c075f3daac9c14f16414d instead.
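Regarding the dynamic batch sizes mentioned above: the usual approach is to cap each batch by its padded token count rather than by sentence count, so short sentences form larger batches. A hypothetical sketch (dynamic_batches and its budget rule are assumptions, not the repository's implementation):

    def dynamic_batches(lengths, max_tokens=4096):
        # Greedily pack sentence indices so the padded size stays within budget.
        batch, width = [], 0
        for i, n in enumerate(lengths):
            width = max(width, n)
            if batch and width * (len(batch) + 1) > max_tokens:
                yield batch
                batch, width = [], n
            batch.append(i)
        if batch:
            yield batch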