Releases: awslabs/sockeye
[2.1.16]
Fixed
- Fixed batch sizing error introduced in version 2.1.12 (c00da52) that caused batch sizes to be multiplied by the number of devices. Batch sizing now works as documented (same as pre-2.1.12 versions).
- Fixed `max-word` batching to properly size batches to a multiple of both `--batch-sentences-multiple-of` and the number of devices.
[2.1.15]
Added
- Inference option `--mc-dropout` to use dropout during inference, leading to non-deterministic output. This option uses the same dropout parameters present in the model config file.
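A minimal usage sketch (file and directory names are placeholders; only `--mc-dropout` is the new flag here):

```bash
# Dropout stays active at inference time, so repeated runs over the same
# input may produce different translations.
python -m sockeye.translate --models model_dir \
    --input input.txt --output output.txt \
    --mc-dropout
```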
[2.1.14]
Added
- Added `sockeye.rerank` option `--output` to specify the output file.
- Added `sockeye.rerank` option `--output-reference-instead-of-blank` to output the reference line instead of the best hypothesis when the best hypothesis is blank.
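A hypothetical invocation combining both new options (the `--hypotheses` and `--reference` input flag names are assumptions; see the rerank CLI help for exact names):

```bash
# Rerank an N-best list, write results to a file rather than stdout, and
# fall back to the reference whenever the best hypothesis is blank.
python -m sockeye.rerank \
    --hypotheses hypotheses.nbest.json \
    --reference references.txt \
    --output reranked.txt \
    --output-reference-instead-of-blank
```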
[2.1.13]
Added
- Training option `--quiet-secondary-workers` that suppresses console output for secondary workers when training with Horovod/MPI (example below).
- Set version of isort to `<5.0.0` in requirements.dev.txt to avoid incompatibility between newer versions of isort and pylint.
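A sketch of a Horovod run using the new flag (data paths are placeholders; `horovodrun` and `--horovod` are described under 2.0.0 below):

```bash
# Four workers; only the primary worker writes training output to the console.
horovodrun -np 4 python -m sockeye.train \
    --source train.src --target train.trg \
    --validation-source dev.src --validation-target dev.trg \
    --output model_dir \
    --horovod --quiet-secondary-workers
```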
[2.1.12]
Added
- Batch type option `max-word` for max number of words including padding tokens (more predictable memory usage than `word`).
- Batching option `--batch-sentences-multiple-of` that is similar to `--round-batch-sizes-to-multiple-of` but always rounds down (more predictable memory usage; see the example after this entry).
Changed
- Default bucketing settings changed to width 8, max sequence length 95 (96 including BOS/EOS tokens), and no bucket scaling.
- Argument `--no-bucket-scaling` replaced with `--bucket-scaling`, which is False by default.
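A sketch combining the new batching options (assuming `max-word` is selected via the existing `--batch-type` flag; paths are placeholders):

```bash
# Size batches by words including padding, rounded down to a multiple of 8.
python -m sockeye.train \
    --source train.src --target train.trg \
    --validation-source dev.src --validation-target dev.trg \
    --output model_dir \
    --batch-type max-word --batch-size 4096 \
    --batch-sentences-multiple-of 8
```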
[2.1.11]
Changed
- Updated `sockeye.rerank` module to use "add-k" smoothing for sentence-level BLEU.
Fixed
- Updated `sockeye.rerank` module to use the current N-best format.
[2.1.10]
Changed
- Changed to a cross-entropy loss implementation that avoids the use of SoftmaxOutput.
[2.1.9]
Added
- Added training argument `--ignore-extra-params` to ignore extra parameters when loading models. The primary use case is continuing training with a model that has already been annotated with scaling factors (`sockeye.quantize`).
Fixed
- Properly pass `allow_missing` flag to `model.load_parameters()`.
[2.1.8]
Changed
- Updated to sacrebleu 1.4.10
[2.1.7]
Changed
- Optimized prepare_data by saving the shards in parallel. The prepare_data script accepts a new parameter `--max-processes` to control the level of parallelism with which shards are written to disk.
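A minimal sketch (paths are placeholders):

```bash
# Write prepared-data shards to disk with up to 8 parallel processes.
python -m sockeye.prepare_data \
    --source train.src --target train.trg \
    --output prepared_data \
    --max-processes 8
```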
[2.1.6]
Changed
- Updated Dockerfiles optimized for CPU (intgemm int8 inference, full MKL support) and GPU (distributed training with Horovod). See sockeye_contrib/docker.
Added
- Official support for int8 quantization with intgemm:
  - This requires the "intgemm" fork of MXNet (kpuatamazon/incubator-mxnet/intgemm). This is the version of MXNet used in the Sockeye CPU docker image (see sockeye_contrib/docker).
  - Use `sockeye.translate --dtype int8` to quantize a trained float32 model at runtime.
  - Use the `sockeye.quantize` CLI to annotate a float32 model with int8 scaling factors for fast runtime quantization.
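A sketch of both quantization paths (paths are placeholders; the `sockeye.quantize` argument name is an assumption, so consult its CLI help):

```bash
# Runtime quantization: load a float32 model and run inference in int8.
python -m sockeye.translate --models model_dir --dtype int8 \
    --input input.txt --output output.txt

# Offline annotation: add int8 scaling factors to a float32 model so that
# later runtime quantization is fast ('--model' is assumed; check --help).
python -m sockeye.quantize --model model_dir
```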
[2.1.5]
Changed
- Changed state caching for transformer models during beam search to cache states with attention heads already separated out. This avoids repeated transpose operations during decoding, leading to faster inference.
[2.1.4]
Added
- Added Dockerfiles that build an experimental CPU-optimized Sockeye image:
- Uses the latest versions of kpuatamazon/incubator-mxnet (supports intgemm and makes full use of Intel MKL) and kpuatamazon/sockeye (supports int8 quantization for inference).
- See sockeye_contrib/docker.
[2.1.3]
Changed
- Performance optimizations to beam search inference
- Remove unneeded take ops on encoder states
- Gathering input data before sending to GPU, rather than sending each batch element individually
- All of beam search can be done in fp16, if specified by the model
- Other small miscellaneous optimizations
- Model states are now a flat list in ensemble inference, structure of states provided by `state_structure()`
[2.1.2]
Changed
- Updated to MXNet 1.6.0
Added
- Added support for CUDA 10.2
Removed
- Removed support for CUDA < 9.1 / cuDNN < 7.5
[2.1.1]
Added
- Ability to set environment variables from training/translate CLIs before MXNet is imported. For example, users can configure MXNet as follows: `--env "OMP_NUM_THREADS=1;MXNET_ENGINE_TYPE=NaiveEngine"`
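The inline example above as a full command (paths are placeholders):

```bash
# Both variables are set by the CLI before MXNet is imported.
python -m sockeye.translate --models model_dir \
    --input input.txt --output output.txt \
    --env "OMP_NUM_THREADS=1;MXNET_ENGINE_TYPE=NaiveEngine"
```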
[2.1.0]
Changed
- Version bump, which should have been included in commit b0461b due to incompatible models.
[2.0.1]
Changed
- Inference defaults to using the max input length observed in training (versus scaling down based on mean length ratio and standard deviations).
Added
- Additional parameter fixing strategies (see the sketch after this list):
  - `all_except_feed_forward`: Only train feed forward layers.
  - `encoder_and_source_embeddings`: Only train the decoder (decoder layers, output layer, and target embeddings).
  - `encoder_half_and_source_embeddings`: Train the latter half of encoder layers and the decoder.
- Option to specify the number of CPU threads without using an environment variable (`--omp-num-threads`).
- More flexibility for combining source factors
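A sketch of continuing training with most parameters fixed (the `--fixed-param-strategy` flag is introduced in 1.18.86 below; paths and the use of `--params` to initialize from an existing model are illustrative):

```bash
# Fine-tune only the decoder side of an existing model.
python -m sockeye.train \
    --source train.src --target train.trg \
    --validation-source dev.src --validation-target dev.trg \
    --params model_dir/params.best \
    --output finetuned_dir \
    --fixed-param-strategy encoder_and_source_embeddings
```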
[2.0.0]
Changed
- Update to MXNet 1.5.0
- Moved `SockeyeModel` implementation and all layers to Gluon API
- Removed support for Python 3.4.
- Removed image captioning module
- Removed outdated Autopilot module
- Removed unused training options: Eve, Nadam, RMSProp, Nag, Adagrad, and Adadelta optimizers, `fixed-step` and `fixed-rate-inv-t` learning rate schedulers
- Updated and renamed learning rate scheduler `fixed-rate-inv-sqrt-t` -> `inv-sqrt-decay`
- Added script for plotting metrics files: sockeye_contrib/plot_metrics.py
- Removed option `--weight-tying`. Weight tying is enabled by default, disable with `--weight-tying-type none`.
Added
- Added distributed training support with Horovod/OpenMPI. Use `horovodrun` and the `--horovod` training flag.
- Added Dockerfiles that build a Sockeye image with all features enabled. See sockeye_contrib/docker.
- Added `none` learning rate scheduler (use a fixed rate throughout training)
- Added `linear-decay` learning rate scheduler
- Added training option `--learning-rate-t-scale` for time-based decay schedulers
- Added support for MXNet's Automatic Mixed Precision. Activate with the `--amp` training flag. For best results, make sure as many model dimensions as possible are multiples of 8.
- Added options for making various model dimensions multiples of a given value. For example, use `--pad-vocab-to-multiple-of 8`, `--bucket-width 8 --no-bucket-scaling`, and `--round-batch-sizes-to-multiple-of 8` with AMP training (see the sketch after this list).
- Added GluonNLP's BERTAdam optimizer, an implementation of the Adam variant used by Devlin et al. (2018). Use `--optimizer bertadam`.
- Added training option `--checkpoint-improvement-threshold` to set the amount of metric improvement required over the window of previous checkpoints to be considered actual model improvement (used with `--max-num-checkpoint-not-improved`).
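A sketch of AMP training with the multiple-of-8 options combined (paths are placeholders; note that `--no-bucket-scaling` is later replaced by `--bucket-scaling` in 2.1.12 above):

```bash
# Mixed-precision training; dimensions padded/rounded to multiples of 8
# to get the most out of AMP.
python -m sockeye.train \
    --source train.src --target train.trg \
    --validation-source dev.src --validation-target dev.trg \
    --output model_dir \
    --amp \
    --pad-vocab-to-multiple-of 8 \
    --bucket-width 8 --no-bucket-scaling \
    --round-batch-sizes-to-multiple-of 8
```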
[1.18.115]
Added
- Added requirements for MXNet compatible with CUDA 10.1.
[1.18.114]
Fixed
- Fixed a bug in prepare_train_data arguments.
[1.18.113]
Fixed
- Added logging arguments for prepare_data CLI.
[1.18.112]
Added
- Option to suppress creation of logfiles for CLIs (`--no-logfile`).
[1.18.111]
Added
- Added an optional checkpoint callback for the train function.
Changed
- Excluded gradients from pickled fields of TrainState
[1.18.110]
Changed
- We now guard against failures to run `nvidia-smi` for GPU memory monitoring.
[1.18.109]
Fixed
- Fixed the metric names by prefixing training metrics with 'train-' and validation metrics with 'val-'. Also restricted the custom logging function to accept only a dictionary and a compulsory global_step parameter.
[1.18.108]
Changed
- More verbose log messages about target token counts.
[1.18.107]
Changed
- Updated to MXNet 1.5.0
[1.18.106]
Added
- Added an optional time limit for stopping training. The training will stop at the next checkpoint after reaching the time limit.
[1.18.105]
Added
- Added support for a custom metrics logger, a function passed as an extra parameter. If supplied, the logger is called during training.
[1.18.104]
Changed
- Implemented an attention-based copy mechanism as described in Jia, Robin, and Percy Liang. "Data recombination for neural semantic parsing." (2016).
- Added a `<ptr\d+>` special symbol to explicitly point at an input token in the target sequence
- Changed the decoder interface to pass both the decoder data and the pointer data.
- Changed the AttentionState named tuple to add the raw attention scores.
[1.18.103]
Added
- Added ability to score image-sentence pairs by extending the scoring feature originally implemented for machine
translation to the image captioning module.
[1.18.102]
Fixed
- Fixed loading of more than 10 source vocabulary files so that they are loaded in the correct numerical order.
[1.18.101]
Changed
- Updated to sacrebleu 1.3.6
[1.18.100]
Fixed
- Always initializing the multiprocessing context. This should fix issues observed when running `sockeye-train`.
[1.18.99]
Changed
- Updated to MXNet 1.4.1
[1.18.98]
Changed
- Converted several transformer-related layer implementations to Gluon HybridBlocks. No functional change.
[1.18.97]
Changed
- Updated to PyYAML 5.1
[1.18.96]
Changed
- Extracted the prepare vocab functionality in the build vocab step into its own function. This matches the pattern in prepare data and train, where main() only does argument parsing and invokes a separate function to do the work. This allows modules that import this one to bypass the command line.
[1.18.95]
Changed
- Removed custom operators from transformer models and replaced them with symbolic operators. This improves performance.
[1.18.94]
Added
- Added ability to accumulate gradients over multiple batches (`--update-interval`). This allows simulation of large batch sizes on environments with limited memory. For example: training with `--batch-size 4096 --update-interval 2` should be close to training with `--batch-size 8192` at a smaller memory footprint.
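The bullet's example as a full command (paths, and word-based batching via `--batch-type`, are placeholders/assumptions):

```bash
# Accumulate gradients over 2 batches of 4096 to approximate batch size 8192.
python -m sockeye.train \
    --source train.src --target train.trg \
    --validation-source dev.src --validation-target dev.trg \
    --output model_dir \
    --batch-type word --batch-size 4096 \
    --update-interval 2
```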
[1.18.93]
Fixed
- Made `brevity_penalty` argument in `Translator` class optional to ensure backwards compatibility.
[1.18.92]
Added
- Added sentence length (and length ratio) prediction to be able to discourage hypotheses that are too short at inference time. Can be enabled for training with `--length-task` and with `--brevity-penalty-type` during inference.
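A sketch of the train/translate pair (paths are placeholders; the argument values `ratio` and `learned` are assumptions, so check each flag's help for valid choices):

```bash
# Train with an auxiliary length-ratio prediction task.
python -m sockeye.train \
    --source train.src --target train.trg \
    --validation-source dev.src --validation-target dev.trg \
    --output model_dir \
    --length-task ratio

# Apply the learned brevity penalty to discourage too-short hypotheses.
python -m sockeye.translate --models model_dir \
    --input input.txt --output output.txt \
    --brevity-penalty-type learned
```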
[1.18.91]
Changed
- Multiple lexicons can now be specified with the `--restrict-lexicon` option:
  - For a single lexicon: `--restrict-lexicon /path/to/lexicon`.
  - For multiple lexicons: `--restrict-lexicon key1:/path/to/lexicon1 key2:/path/to/lexicon2 ...`.
  - Use `--json-input` to specify the lexicon to use for each input, ex: `{"text": "some input string", "restrict_lexicon": "key1"}`.
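A sketch of per-input lexicon selection (lexicon paths and keys are placeholders):

```bash
# Register two lexicons under keys, then pick one per input line via JSON.
echo '{"text": "some input string", "restrict_lexicon": "key1"}' | \
python -m sockeye.translate --models model_dir \
    --restrict-lexicon key1:/path/to/lexicon1 key2:/path/to/lexicon2 \
    --json-input
```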
[1.18.90]
Changed
- Updated to MXNet 1.4.0
- Integration tests no longer check for equivalence of outputs with batch size 2
[1.18.89]
Fixed
- Made the change to per-bucket length ratios backwards compatible.
[1.18.88]
Changed
- Made sacrebleu a pip dependency and removed it from `sockeye_contrib`.
[1.18.87]
Added
- Data statistics at training time now compute mean and standard deviation of length ratios per bucket.
This information is stored in the model's config, but not used at the moment.
[1.18.86]
Added
- Added the `--fixed-param-strategy` option that allows fixing various model parameters during training via named strategies. These include some of the simpler combinations from Wuebker et al. (2018), such as fixing everything except the first and last layers of the encoder and decoder (`all_except_outer_layers`). See the help message for a full list of strategies.
[1.18.85]
Changed
- Disabled dynamic batching for `Translator.translate()` by default due to increased memory usage. The default is to fill up batches to `Translator.max_batch_size`. Dynamic batching can still be enabled if `fill_up_batches` is set to False.
Added
- Added parameter to force training to stop after a given number of checkpoints. Useful when forced to share limited GPU resources.
[1.18.84]
Fixed
- Fixed lexical constraints bugs that broke batching and caused large drop in BLEU.
These were introduced with sampling (1.18.64).
[1.18.83]
Changed
- The embedding size is automatically adjusted to the Transformer model size in case it is not specified on the command line.
[1.18.82]
Fixed
- Fixed type conversion in metrics file reading introduced in 1.18.79.
[1.18.81]
Fixed
- Made sure the pickled training state contains the checkpoint decoder's BLEU score of the last checkpoint.
[1.18.80]
Fixed
- Fixed a bug introduced in 1.18.77 where blank lines in the training data resulted in failure.
[1.18.79]
Added
- Added writing of the convergence/divergence status to the metrics file and guarded against numpy.histogram errors for NaNs during divergent behaviour.
[1.18.78]
Changed
- Dynamic batch sizes: `Translator.translate()` will adjust batch size in beam search to the actual number of inputs without using padding.
[1.18.77]
Added
- `sockeye.score` now loads data on demand and doesn't skip any input lines.
[1.18.76]
Changed
- Do not compare scores from translation and scoring in integration tests.
Added
- Added the flag `--stop-training-on-decoder-failure` to stop training in case the checkpoint decoder dies (e.g. because there is not enough memory). If this is turned on, a checkpoint decoder is launched right when training starts in order to fail as early as possible.
[1.18.75]
Changed
- Do not create dropout layers for inference models for performance reasons.
[1.18.74]
Changed
- Revert change in 1.18.72 as no memory saving could be observed.
[1.18.73]
Fixed
- Fixed a bug where `source-factors-num-embed` was not correctly adjusted to `num-embed` when using prepared data & `source-factor-combine` sum.