Skip to content

Navigation Menu

Explore
By company size
By use case
By industry
View all solutions
Topics
- AI
- DevOps
- Security
- Software Development
- View all
Explore
- GitHub Sponsors
  Fund open source developers
- The ReadME Project
  GitHub community articles
Repositories
- Enterprise platform
  AI-powered developer platform
Available add-ons
Pricing

Search code, repositories, users, issues, pull requests...

Search

Clear

Search syntax tips

Provide feedback

We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Saved searches

Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our documentation.

You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session.

Dismiss alert

OpenNMT / CTranslate2 Public

Notifications You must be signed in to change notification settings
Fork 305
Star 3.4k

Code
Issues 166
Pull requests 27
Actions
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Actions
Security
Insights

Releases: OpenNMT/CTranslate2

Releases · OpenNMT/CTranslate2

CTranslate2 3.15.1

09 Jun 09:56

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.15.1

Fixes and improvements

Fix an error when using the new static_prompt argument in the methods generate_tokens and generate_batch
Improve the performance of models using ALiBi

Assets 2

Loading

totaltube reacted with thumbs up emoji

All reactions

👍 1 reaction

1 person reacted

CTranslate2 3.15.0

06 Jun 14:13

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.15.0

New features

Initial support of encoder-only Transformer model via a new class ctranslate2.Encoder
Update the Transformers converter to support the Falcon models
Add a generation argument static_prompt to optimize the execution for models using system prompts: the model state for this prompt is cached and reused in future calls
Support early stopping in greedy search when the callback function returns True
Make the layer norm epsilon value configurable in the model configuration file config.json
Add Tanh as a possible activation function

Fixes and improvements

Fix a performance issue when running models using ALiBi on the GPU
Fix application of the rotary embeddings when the multi-query attention is used
Fix conversion of Marian models using tied-embeddings-all: false
Remove use_fast argument when loading Hugging Face tokenizers to use the default tokenizer for the model

Assets 2

Loading

juliensalinas, NeonBohdan, daxaxelrod, chainyo, and hoonlight reacted with hooray emoji

All reactions

🎉 5 reactions

5 people reacted

CTranslate2 3.14.0

26 May 16:20

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.14.0

New features

Update the Transformers converter with new architectures:
- CodeGen
- GPTBigCode
- LLaMa
- MPT
Update the OpenNMT-py converter to support some recent options:
- layer_norm="rms"
- max_relative_positions=-1 (rotary embeddings)
- max_relative_positions=-2 (ALiBi)
- pos_ffn_activation_fn="silu"
Update the OpenNMT-tf converter to support models using different configurations for the encoder and decoder (e.g. post-norm in the encoder and pre-norm in the decoder)
Implement the multi-query attention (used by GPTBigCode)

Fixes and improvements

Support paths containing Unicode characters on Windows
Fix the generate_tokens method to properly raise the underlying exception instead of hanging indefinitely
Fix compilation error when using -DBUILD_SHARED_LIBS=OFF
Fix runtime errors when linking against libctranslate2.a without using the "whole archive" flags

Assets 2

Loading

NeonBohdan, raphaelmerx, limcheekin, and u-235 reacted with thumbs up emoji

All reactions

👍 4 reactions

4 people reacted

CTranslate2 3.13.0

26 Apr 09:37

guillaumekln

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.

GPG key ID: 4AEE18F83AFDEB23

Expired

Learn about vigilant mode.

Compare

Choose a tag to compare

Loading

CTranslate2 3.13.0

New features

Support conversion of GPT-NeoX models with the Transformers converter
Extend the end_token argument to also accept a list of tokens
Add option return_end_token to include the end token in the results of the methods generate_batch and translate_batch (by default the end token is removed)
Expose the callback argument for the methods generate_batch and translate_batch to get early results from the decoding loop
Fallback to a custom threading implementation when OpenMP is not used (which is currently the case for the macOS ARM64 Python wheels)
Define the CMake package CTranslate2::ctranslate2 to facilitate the library integration in other CMake projects

Fixes and improvements

Fix the vocabulary loading when some tokens end with the carriage return
Implement a fused kernel to apply the rotary embeddings
Update the Ruy library to commit 363f2522

Assets 2

Loading

juliensalinas and trihutama reacted with hooray emoji

All reactions

🎉 2 reactions

2 people reacted

CTranslate2 3.12.0

17 Apr 18:22

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.12.0

New features

Add methods Generator.generate_tokens and Translator.generate_tokens returning a generator that yields tokens as soon as they are generated by the model (not compatible with beam search)
Improve performance of rotary embeddings on CPU with an alternative implementation that is enabled when setting rotary_interleave=False in the model specification (may require to permute QK weights)
Support a variable number of input frames in method Whisper.align to improve batch support
Expose flag low_cpu_mem_usage in the Transformers converter to reduce the memory usage when loading large models (requires the package accelerate)

Fixes and improvements

Fix crash in Whisper.align when num_frames // 2 <= median_filter_width
Raise an error if arguments end_token or suppress_sequences contain tokens that are not in the vocabulary
Optimize the quantization of FP16 weights during the model conversion
In the Transformers converter, also load the model weights in FP16 when the selected quantization is int8_float16
Update the Whisper timestamp decoding rules to prevent the generation of segments with zero duration

Assets 2

Loading

All reactions

CTranslate2 3.11.0

06 Apr 16:19

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.11.0

Changes

The Python wheels for macOS ARM are now built with the Ruy backend to support INT8 computation. This will change the performance and results when loading an INT8 model and/or using the auto compute type. To keep the previous behavior, set compute_type="float32".

New features

Support conversion of the GPT-J architecture
Support conversion of models using rotary position embeddings
Apply the new OpenNMT-py option decoder_start_token
Add option revision in the Transformers converter to download a specific revision of the model from the Hugging Face Hub

Assets 2

Loading

All reactions

CTranslate2 3.10.3

30 Mar 15:52

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.10.3

Fixes and improvements

Fix a synchronization issue when the model input is a CUDA storage

Assets 2

Loading

All reactions

CTranslate2 3.10.2

27 Mar 15:13

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.10.2

Fixes and improvements

Select the correct device when copying a StorageView instance

Assets 2

Loading

All reactions

CTranslate2 3.10.1

27 Mar 15:12

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.10.1

Fixes and improvements

Add missing device setter in Whisper.encode

Assets 2

Loading

All reactions

CTranslate2 3.10.0

24 Mar 09:45

guillaumekln

Compare

Choose a tag to compare

Loading

CTranslate2 3.10.0

New features

Add Generator option include_prompt_in_result (True by default)
Add method Whisper.encode to only run the Whisper encoder
Add model properties Whisper.device and Whisper.device_index

Fixes and improvements

Update the methods Whisper.detect_language, Whisper.generate, and Whisper.align to accept the encoder output
Fix a crash when running Generator.forward on GPU and the generator object is destroyed before the forward output
Fix parsing of Marian YAML vocabulary files containing "complex key mappings" and escaped sequences such as "\x84"

Assets 2

Loading

All reactions

Previous 1 2 3 4 5 … 12 13 Next

Footer

© 2024 GitHub, Inc.

Footer navigation

Terms
Privacy
Security
Status
Docs
Contact

You can’t perform that action at this time.