BiEncoder-Experiments

Bi-Encoder training experiments based on various training techniques (e.g. Pre Batch, Passage-wise Loss, Gradient Caching, ...).
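All of these tricks build on the standard in-batch negative objective for bi-encoders. The snippet below is a minimal sketch of that baseline loss, assuming a PyTorch setup; the function name and tensor shapes are illustrative, not this repository's actual API.

```python
# Minimal sketch of the baseline in-batch negative loss (illustrative, not the repo's API).
import torch
import torch.nn.functional as F

def in_batch_negative_loss(q_emb: torch.Tensor, p_emb: torch.Tensor) -> torch.Tensor:
    # q_emb, p_emb: (batch, dim); row i of each side forms the positive pair,
    # so every other passage in the batch serves as a negative for query i.
    scores = q_emb @ p_emb.T                                   # (batch, batch) similarity matrix
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)                     # positives sit on the diagonal
```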

Todo List

  • Validation Dataset In-Batch Negative Accuracy Logging
  • Gradient Caching Implementation
  • Passage-Wise Loss Implementation
  • PreBatch After Model Warmup Implementation (see the sketch after this list)
  • Cross Batch Negatives for Multi-GPU Training
  • Multi-GPU Setup
  • Scheduler & Model Loading
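As a concrete reference for the PreBatch item above, here is a hedged sketch of pre-batch negatives: passage embeddings from the last few steps are kept in a FIFO queue and scored as extra negatives, typically only after the model has warmed up so the cached embeddings are not too stale. The class name, queue size, and detach policy are assumptions for illustration, not this repository's implementation.

```python
# Sketch of pre-batch negatives; queue size and detach policy are assumptions.
from collections import deque
import torch
import torch.nn.functional as F

class PreBatchLoss:
    def __init__(self, num_prev_batches: int = 2):
        # FIFO queue of passage embeddings from previous steps, reused as extra negatives.
        self.queue = deque(maxlen=num_prev_batches)

    def __call__(self, q_emb: torch.Tensor, p_emb: torch.Tensor) -> torch.Tensor:
        extra = list(self.queue)                               # stale, gradient-free negatives
        all_p = torch.cat([p_emb] + extra, dim=0)              # (B + B * n_prev, dim)
        scores = q_emb @ all_p.T
        labels = torch.arange(q_emb.size(0), device=q_emb.device)
        loss = F.cross_entropy(scores, labels)
        self.queue.append(p_emb.detach())                      # enqueue after computing the loss
        return loss
```

When the queue is empty (the first steps, or before warmup), this reduces exactly to the baseline in-batch negative loss above.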

Proposal Papers for Each Technique


Example of a Multi-GPU Setup

  1. First, create a configuration file for Hugging Face Accelerate:
     accelerate config
  2. Launch Accelerate for DDP training:
     accelerate launch src/trainer.py --config configs/train/base.yaml
  3. [CAUTION] The batch size does not behave as in a single-GPU setting: the actual batch size is train_batch_size * [your_total_gpu_for_ddp], as the sketch below illustrates.
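To make the caution in step 3 concrete, the snippet below derives the effective batch size at runtime from Accelerate's process count; the variable names are illustrative and not taken from src/trainer.py.

```python
# Illustrative check of the effective DDP batch size; not taken from src/trainer.py.
from accelerate import Accelerator

accelerator = Accelerator()
train_batch_size = 32                                    # per-GPU batch size from the config
effective = train_batch_size * accelerator.num_processes
accelerator.print(f"Effective batch size across {accelerator.num_processes} GPUs: {effective}")
```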
