CTranslate2 3.3.0

guillaumekln released this 02 Jan 12:21
· 345 commits to master since this release

New features

  • Support T5 models, including the variants T5v1.1 and mT5
  • Support loading the model files from memory:
    • Python: see the files argument in the constructor of classes loading models
    • C++: see the models::ModelMemoryReader class
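
As a rough sketch of the Python side of in-memory loading: the helper below reads every file of a converted model directory into a dictionary mapping file names to bytes, which can then be passed as the files argument. The names read_model_files, load_translator_from_memory and the identifier "in_memory_model" are hypothetical and only serve the illustration; the accepted value types and the role of the first constructor argument are assumptions to check against the constructor documentation.

```python
import os


def read_model_files(model_dir):
    """Read every regular file in model_dir into a {file name: bytes} dict."""
    files = {}
    for name in sorted(os.listdir(model_dir)):
        path = os.path.join(model_dir, name)
        if os.path.isfile(path):  # skip sub-directories
            with open(path, "rb") as f:
                files[name] = f.read()
    return files


def load_translator_from_memory(files, model_id="in_memory_model", device="cpu"):
    """Build a Translator from in-memory files (requires the ctranslate2 package).

    Assumption: when files is set, the first argument acts as an
    identifier for the model rather than as a path on disk.
    """
    import ctranslate2  # imported lazily so read_model_files works without it

    return ctranslate2.Translator(model_id, device=device, files=files)
```

The file contents could just as well come from a network download or an archive; reading a directory is only the simplest way to produce the mapping.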

Fixes and improvements

  • Improve the quantization accuracy of OPT models by applying the SmoothQuant technique during conversion (pass the pre-computed activation scales to the converter option --activation_scales)
  • Fix conversion of BART-like models from Hugging Face that use different numbers of encoder and decoder layers
  • Fix compilation when no BLAS CPU backend is selected
  • Remove the no-longer-relevant CMake warning shown when the project is compiled without oneDNN
  • Update oneDNN to 3.0
  • Update oneMKL to 2023.0
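
To illustrate the SmoothQuant idea mentioned above, here is a minimal sketch in plain Python. It is not the converter's code and does not reflect the format expected by --activation_scales; it only shows the general technique as usually described: each input channel j gets a factor s_j = max|X_j|^alpha / max|W_j|^(1 - alpha), activations are divided by s_j and the matching weight rows are multiplied by s_j, so the layer output is unchanged while activation outliers shrink. The function names, the toy values, and alpha = 0.5 are assumptions chosen for the example.

```python
def smooth_scales(act_absmax, weight, alpha=0.5, eps=1e-5):
    """Per-input-channel factors s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)."""
    scales = []
    for j, a in enumerate(act_absmax):
        # weight[j] is the weight row attached to input channel j
        w = max(abs(v) for v in weight[j])
        scales.append(max(a, eps) ** alpha / max(w, eps) ** (1.0 - alpha))
    return scales


def matmul(x, w):
    """Multiply x (tokens x in_channels) by w (in_channels x out_channels)."""
    return [
        [sum(row[k] * w[k][n] for k in range(len(w))) for n in range(len(w[0]))]
        for row in x
    ]


# Toy activations with one outlier channel, and a small weight matrix.
x = [[0.1, 20.0], [-0.2, -16.0]]
w = [[0.5, -1.0], [0.02, 0.01]]

# Activation statistics that would normally be pre-computed on calibration data.
act_absmax = [max(abs(row[j]) for row in x) for j in range(len(x[0]))]
s = smooth_scales(act_absmax, w)

# Move part of the activation range into the weights.
x_smooth = [[v / s[j] for j, v in enumerate(row)] for row in x]
w_smooth = [[v * s[j] for v in w[j]] for j in range(len(w))]

y = matmul(x, w)
y_smooth = matmul(x_smooth, w_smooth)  # same result up to rounding
```

After smoothing, the per-channel activation ranges are much closer to each other, which is what makes the activations easier to quantize.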