This is the original implementation of the EmBERT model from the paper "Multitask Learning Using BERT with Task-Embedded Attention". Our code is heavily based on the BERT and PALs implementation by Asa Cooper Stickland and Iain Murray.
Below one can find weights for the following models:

- BERT pretrained weights - we used the `uncased_L-12_H-768_A-12` model's weights shared by Google.
- EmBERT weights.

Moreover, we share a file with the EmBERT GLUE submission.
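As a quick sanity check after downloading, the shared PyTorch checkpoint can be inspected directly. The snippet below is a minimal sketch (not part of the repository); it only assumes the `uncased_L-12_H-768_A-12` directory is in the working directory and that PyTorch is installed.

```python
# Minimal sketch: inspect the downloaded BERT checkpoint before training.
# Assumes the uncased_L-12_H-768_A-12 directory mentioned above.
import torch

state_dict = torch.load(
    "uncased_L-12_H-768_A-12/pytorch_model.bin", map_location="cpu"
)

# Print a few parameter names and shapes to confirm the weights are readable.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```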
The config needed to train the EmBERT model can be found in `configs/embert_config.json`.
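The snippet below is a small sketch for peeking at that config; it only assumes the file is plain JSON in the usual BERT-config style (`hidden_size`, `num_attention_heads`, ...) plus whatever EmBERT-specific entries the repository adds.

```python
# Minimal sketch: print the EmBERT training config.
# Assumes configs/embert_config.json is plain JSON, like the standard BERT configs.
import json

with open("configs/embert_config.json") as f:
    config = json.load(f)

for key, value in sorted(config.items()):
    print(f"{key}: {value}")
```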
`run_multi_task.py` is a script that runs multitask model training. `run_test_multi_task` is a script that returns the model's predictions on the GLUE benchmark.
Below one can see how to run EmBERT training:
```bash
export BERT_BASE_DIR=uncased_L-12_H-768_A-12
export BERT_PYTORCH_DIR=uncased_L-12_H-768_A-12
export GLUE_DIR=glue/glue_data
export SAVE_DIR=save_dir

python run_multi_task.py \
  --seed 1 \
  --output_dir $SAVE_DIR/embert \
  --tasks all \
  --sample 'anneal' \
  --multi \
  --do_train \
  --do_eval \
  --do_lower_case \
  --data_dir $GLUE_DIR/ \
  --vocab_file $BERT_BASE_DIR/vocab.txt \
  --bert_config_file $BERT_BASE_DIR/embert_config.json \
  --init_checkpoint $BERT_PYTORCH_DIR/pytorch_model.bin \
  --max_seq_length 128 \
  --train_batch_size 32 \
  --learning_rate 2e-5 \
  --num_train_epochs 25.0 \
  --gradient_accumulation_steps 1
```
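The `--sample 'anneal'` flag selects annealed task sampling. In the PALs approach this repository builds on, each training batch is drawn from task *i* with probability proportional to N_i^alpha, where N_i is that task's training-set size and alpha is annealed from 1 (size-proportional sampling) towards 0 (uniform sampling) over the epochs. The sketch below only illustrates that idea; the exact schedule and task sizes used by `run_multi_task.py` may differ.

```python
# Illustrative sketch of annealed task sampling (not the repository's exact code).
import random

def sample_task(dataset_sizes, epoch, total_epochs):
    """Pick a task index with probability proportional to N_i ** alpha.

    alpha anneals from 1 (sample proportionally to dataset size) towards 0
    (sample tasks roughly uniformly) as training progresses.
    """
    alpha = 1.0 - 0.8 * (epoch - 1) / max(total_epochs - 1, 1)
    weights = [size ** alpha for size in dataset_sizes]
    return random.choices(range(len(dataset_sizes)), weights=weights, k=1)[0]

# Example with illustrative training-set sizes for eight GLUE tasks and 25 epochs.
sizes = [392702, 363849, 104743, 67349, 8551, 5749, 3668, 2490]
print(sample_task(sizes, epoch=1, total_epochs=25))   # favours large tasks
print(sample_task(sizes, epoch=25, total_epochs=25))  # closer to uniform
```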