Bi-Encoder Training Experiments based on Various Training Techniques (e.g., PreBatch, Passage-Wise Loss, Gradient Caching, ...)
- Validation Dataset In-Batch Negative Accuracy Logging
- Gradient Caching Implementation
- Passage-Wise Loss Implementation
- PreBatch After Model Warmup Implementation
- Cross Batch for Multi-GPU Training
- Multi-GPU Setting
- Loading Scheduler & Model
References (a minimal sketch of each technique follows this list):
- PreBatch: DensePhrases
- Passage-Wise Loss: PAIR
- Gradient Caching: Condenser & Gradient Cache
- Cross Batch: RocketQA
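For the validation accuracy logging, here is a minimal sketch of how in-batch negative accuracy is typically computed (the function and variable names are illustrative, not taken from `src/trainer.py`):

```python
import torch

@torch.no_grad()
def in_batch_negative_accuracy(q_emb: torch.Tensor, p_emb: torch.Tensor) -> float:
    """q_emb, p_emb: (batch, dim); p_emb[i] is the gold passage for q_emb[i]."""
    # Score every query against every passage in the batch.
    scores = q_emb @ p_emb.T                                   # (batch, batch)
    # A query counts as correct if its own passage (the diagonal) scores highest.
    preds = scores.argmax(dim=1)
    targets = torch.arange(q_emb.size(0), device=q_emb.device)
    return (preds == targets).float().mean().item()
```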
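Gradient caching (Gradient Cache) decouples the contrastive batch size from encoder memory: embeddings are first computed chunk by chunk without autograd graphs, the loss gradient w.r.t. the embeddings is cached, and each chunk is then re-encoded with graphs and back-propagated against its cached gradient slice. A minimal single-encoder sketch, assuming `loss_fn` maps the full embedding matrix to a scalar (a bi-encoder would repeat this for the query and passage encoders):

```python
import torch

def grad_cache_step(encoder, chunks, loss_fn):
    """chunks: list of input tensors that together form one large batch."""
    # Pass 1: embed every chunk without autograd graphs (low memory).
    with torch.no_grad():
        reps = torch.cat([encoder(c) for c in chunks])
    reps = reps.detach().requires_grad_()

    # Full-batch contrastive loss; cache d(loss)/d(reps).
    loss = loss_fn(reps)
    loss.backward()
    cache = reps.grad.split([len(c) for c in chunks])

    # Pass 2: re-encode chunk by chunk, pushing the cached grads into the encoder.
    for c, g in zip(chunks, cache):
        encoder(c).backward(gradient=g)
    return loss.detach()  # call optimizer.step() afterwards as usual
```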
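The passage-wise loss follows PAIR's passage-centric idea: in addition to the usual query-centric softmax over in-batch passages, each positive passage is scored against its own query versus the other in-batch passages. A minimal sketch (the `alpha` weighting and all names are assumptions; PAIR's full objective also involves knowledge distillation and a pre-training stage):

```python
import torch
import torch.nn.functional as F

def query_and_passage_centric_loss(q_emb, p_emb, alpha=0.8):
    """p_emb[i] is the positive passage for q_emb[i]; both are (batch, dim)."""
    batch = q_emb.size(0)
    labels = torch.arange(batch, device=q_emb.device)

    # Query-centric: each query vs. all in-batch passages.
    loss_q = F.cross_entropy(q_emb @ p_emb.T, labels)

    # Passage-centric: each positive passage vs. its query (column 0)
    # and the other in-batch passages (negatives).
    pos = (p_emb * q_emb).sum(-1, keepdim=True)                # sim(p_i, q_i)
    mask = ~torch.eye(batch, dtype=torch.bool, device=q_emb.device)
    negs = (p_emb @ p_emb.T)[mask].view(batch, batch - 1)      # sim(p_i, p_j), j != i
    logits = torch.cat([pos, negs], dim=1)
    loss_p = F.cross_entropy(logits, torch.zeros(batch, dtype=torch.long,
                                                 device=q_emb.device))
    return alpha * loss_q + (1 - alpha) * loss_p
```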
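PreBatch, as used in DensePhrases, keeps detached passage embeddings from the last few batches in a queue and appends them as extra negatives, switched on only after a warmup period so that the early, rapidly changing embeddings are not reused. A minimal sketch with illustrative names and defaults:

```python
from collections import deque

import torch
import torch.nn.functional as F

class PreBatchLoss:
    def __init__(self, num_prebatch=2, warmup_steps=1000):
        self.queue = deque(maxlen=num_prebatch)  # embeddings of previous batches
        self.warmup_steps = warmup_steps

    def __call__(self, q_emb, p_emb, step):
        labels = torch.arange(q_emb.size(0), device=q_emb.device)
        negatives = p_emb
        if step >= self.warmup_steps and self.queue:
            # Old embeddings are detached constants: extra negatives only.
            negatives = torch.cat([p_emb, *self.queue])
        scores = q_emb @ negatives.T
        self.queue.append(p_emb.detach())
        return F.cross_entropy(scores, labels)
```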
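Cross batch, as in RocketQA, shares in-batch negatives across all DDP processes so each rank sees the global batch. `torch.distributed.all_gather` does not propagate gradients, so the local shard is swapped back into the gathered list to keep this rank's autograd graph intact; a minimal sketch:

```python
import torch
import torch.distributed as dist

def gather_with_grad(t: torch.Tensor) -> torch.Tensor:
    """All-gather embeddings from every process, keeping local gradients."""
    gathered = [torch.empty_like(t) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, t)
    gathered[dist.get_rank()] = t  # restore the autograd-connected local tensor
    return torch.cat(gathered)
```

Each rank can then compute the contrastive loss over `gather_with_grad(q_emb) @ gather_with_grad(p_emb).T` with `torch.arange` labels over the global batch.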
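For "Loading Scheduler & Model" when resuming training, a minimal plain-PyTorch sketch; the checkpoint keys and paths actually used by `src/trainer.py` may differ (with Accelerate one could also use `accelerator.save_state`/`load_state`):

```python
import torch

def save_checkpoint(path, model, optimizer, scheduler, step):
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "scheduler": scheduler.state_dict(),
                "step": step}, path)

def load_checkpoint(path, model, optimizer, scheduler):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    scheduler.load_state_dict(ckpt["scheduler"])
    return ckpt["step"]  # resume from this step
```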
- First, create a configuration file for Hugging Face Accelerate:
  ```
  accelerate config
  ```
- Launch Accelerate for DDP training:
  ```
  accelerate launch src/trainer.py --config configs/train/base.yaml
  ```
- [CAUTION] The batch size does not behave as in a single-GPU setting: the actual batch size is `train_batch_size * [your_total_gpu_for_ddp]`. For example, `train_batch_size: 32` on 4 GPUs gives an effective batch size of 128.