CTranslate2 4.0.0
This major version introduces a breaking change: the update to CUDA 12.
Breaking changes
Python
- Support CUDA 12
New features
- Add the to_device() method to the Python StorageView class to move data between host and device
Fixes and improvements
- Implement Conv1D with im2col and GEMM to improve performance
- Get tokens in the range of the vocab size for Llama models
- Fix loss of performance
- Update cibuildwheel to 2.16.5
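
For readers unfamiliar with the im2col and GEMM approach named in the Conv1D entry above, here is a minimal pure-Python sketch of the general idea. It is not the CTranslate2 implementation: the function names (im2col_1d, gemm, conv1d_im2col) and the nested-list data layout are assumptions made for illustration, and padding, dilation and bias are left out.

```python
def im2col_1d(x, kernel_size, stride=1):
    """Unfold x ([in_channels][length]) into a matrix of shape
    (in_channels * kernel_size, out_length). Hypothetical helper."""
    length = len(x[0])
    out_length = (length - kernel_size) // stride + 1
    cols = []
    for channel in x:
        for k in range(kernel_size):
            # Row (channel, k) holds the k-th element of every sliding window.
            cols.append([channel[t * stride + k] for t in range(out_length)])
    return cols


def gemm(a, b):
    """Plain matrix product a @ b on nested lists."""
    return [
        [sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


def conv1d_im2col(x, weight, stride=1):
    """Conv1D as a single matrix product.
    weight has shape [out_channels][in_channels][kernel_size]."""
    kernel_size = len(weight[0][0])
    # Flatten each filter to one row, in the same (channel, k) order
    # that im2col_1d uses for its rows.
    w_mat = [[w for chan in filt for w in chan] for filt in weight]
    return gemm(w_mat, im2col_1d(x, kernel_size, stride))


x = [[1, 2, 3, 4], [0, 1, 0, 1]]   # 2 input channels, length 4
weight = [[[1, 0], [0, 1]]]        # 1 output channel, kernel size 2
result = conv1d_im2col(x, weight)
print(result)  # [[2, 2, 4]]
```

The unfolding step duplicates input values, which costs memory, but it turns the convolution into one large matrix product that an optimized GEMM routine can execute efficiently.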