CTranslate2 3.17.1
Fixes and improvements
- Fix an error when running models with the new `int8_bfloat16` computation type (see the loading sketch below)
- Fix a vocabulary error when converting Llama 2 models with the Transformers converter (see the conversion sketch below)
- Update the Transformers converter to correctly convert Llama models using GQA
- Stop the decoding when the generator returned by the method `generate_tokens` is closed (see the sketch after this list)
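
For reference, a minimal conversion sketch using the Transformers converter's Python API; the model name and output directory are placeholders, not part of this release:

```python
from ctranslate2.converters import TransformersConverter

# Convert a Llama 2 checkpoint from the Hugging Face Hub to the CTranslate2 format.
# "meta-llama/Llama-2-7b-hf" and "llama-2-7b-ct2" are example values.
converter = TransformersConverter("meta-llama/Llama-2-7b-hf")
converter.convert("llama-2-7b-ct2", quantization="int8_bfloat16")
```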
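
The computation type is selected when the converted model is loaded. A minimal loading sketch, assuming the output directory from the conversion above:

```python
import ctranslate2

# Load the converted model with the int8_bfloat16 computation type on GPU.
generator = ctranslate2.Generator(
    "llama-2-7b-ct2",
    device="cuda",
    compute_type="int8_bfloat16",
)
```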
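
A short sketch of the `generate_tokens` behavior fixed in this release: closing the returned generator now also stops the decoding. The model path and prompt tokens are illustrative only:

```python
import ctranslate2

generator = ctranslate2.Generator("llama-2-7b-ct2", device="cuda")

# generate_tokens expects an already tokenized prompt.
prompt_tokens = ["<s>", "▁Hello", ",", "▁world"]

results = generator.generate_tokens(prompt_tokens, max_length=512)
for index, step in enumerate(results):
    print(step.token, end=" ", flush=True)
    if index == 9:
        # Closing the generator stops the underlying decoding as well.
        results.close()
        break
```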