Releases: illuin-tech/colpali
v0.3.8
v0.3.7
v0.3.6
Description
Loosen default dependencies, but keep stricter dependency ranges for the `train` dependency group.
Features
Added
- Add expected scores in ColPali E2E test
Changed
- Loosen package dependencies
Full Changelog: v0.3.5...v0.3.6
v0.3.5: SmolVLM
[0.3.5] - 2024-12-13
Added
- Added support for Idefics3 (and SmolVLM)
Fixed
- Fix typing for `processor.score_multi_vector` (allow for both list and tensor inputs). This does not change how the scores are computed.
- Fix `tear_down_torch` when used on a non-MPS machine
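For readers unfamiliar with multi-vector scoring, the sketch below shows the late-interaction (MaxSim) idea on plain nested lists: each query vector is matched to its best document vector, and the maxima are summed. It is an illustration only, not the library's implementation; the names `maxsim_score` and `score_multi_vector_sketch` are hypothetical, and plain Python lists stand in for the list or tensor inputs mentioned above.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query: Sequence[Vector], doc: Sequence[Vector]) -> float:
    """Late-interaction score for one (query, document) pair.

    For every query vector, keep the similarity of its best-matching
    document vector, then sum these maxima over the query vectors.
    """
    return sum(max(dot(q, d) for d in doc) for q in query)


def score_multi_vector_sketch(
    queries: Sequence[Sequence[Vector]],
    docs: Sequence[Sequence[Vector]],
) -> List[List[float]]:
    """Hypothetical stand-in: returns a len(queries) x len(docs) score matrix."""
    return [[maxsim_score(q, d) for d in docs] for q in queries]


# One query with two token vectors, two documents with two vectors each.
queries = [[[1.0, 0.0], [0.0, 1.0]]]
docs = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 0.0], [0.0, 2.0]],
]
scores = score_multi_vector_sketch(queries, docs)
print(scores)  # [[1.5, 2.0]]
```

In this toy example the second document scores higher because one of its vectors aligns strongly with the second query vector.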
v0.3.4
[0.3.4] - 2024-11-07
Added
- General `CorpusQueryCollator` for BEIR-style dataset training or hard negative training. This deprecates `HardNegCollator`, but all changes to the training loop are made for a seamless update.
Changed
- Updates BiPali config files
- Removed query augmentation tokens from BiQwen2Processor
- Modified XQwen2Processor to place the `<|endoftext|>` token at the end of the document prompt (non-breaking for ColQwen but helps BiQwen).
- Removed `add_suffix` in the VisualRetrieverCollator and let the `suffix` be added in the individual processors.
- Changed the incorrect `<pad>` token to `<|endoftext|>` for query augmentation in `ColQwen2Processor`. Note that previous models were trained with `<|endoftext|>`, so this is simply a non-breaking inference upgrade patch.
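As a rough illustration of what a query augmentation suffix looks like, the sketch below appends a run of augmentation tokens to a query string. Everything except the `<|endoftext|>` token itself is an assumption made for the example: the function name `build_query_text`, the `"Query: "` prefix, and the default count of 10 tokens are hypothetical and are not taken from these release notes.

```python
def build_query_text(
    query: str,
    augmentation_token: str = "<|endoftext|>",
    n_tokens: int = 10,          # assumed count, for illustration only
    prefix: str = "Query: ",     # assumed prefix, for illustration only
) -> str:
    """Return the query text followed by a suffix of augmentation tokens.

    The suffix is built inside this processor-like helper rather than in a
    collator, mirroring the idea that each processor adds its own suffix.
    """
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    suffix = augmentation_token * n_tokens
    return prefix + query + suffix


text = build_query_text("total revenue in 2023", n_tokens=2)
print(text)  # Query: total revenue in 2023<|endoftext|><|endoftext|>
```

Using the token the model saw during training for the suffix keeps inference inputs consistent with training inputs, which is the point of the patch described above.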
v0.3.3
[0.3.3] - 2024-10-29
Added
- Add BiQwen2 model
Changed
- Modified ColQwen and BiQwen to prevent the useless forward pass in the last layer of the original model (classification head)
- Bumped "breaking" dependencies on MTEB and Transformers version and made the corresponding changes in the code
- Cast Image dtype in ColPali due to the breaking Transformers 4.46 update
- Added a "num_image_tokens" kwarg to the `ColQwen2Processor` to allow for different image resolutions
Fixed
- Fix wrong variable name for `ColPaliProcessor`'s prefixes
Full Changelog: v0.3.2...v0.3.3
v0.3.2: The interpretability update
Description
✨ This release brings the interpretability module to `colpali-engine` and adds support for generating similarity maps with the ColQwen2 model.
🛠️ We’ve also made several code improvements and added tests for ColQwen2 to ensure better performance and reliability.
Features
Added
- Restore, refactor, and improve the `interpretability` module for generating similarity maps
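To make the notion of a similarity map concrete, here is a minimal sketch that scores one query token embedding against every image patch embedding and reshapes the result into a grid. It does not use the `interpretability` module; the function name `similarity_map` and the row-major patch ordering are assumptions made for the example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity_map(
    query_token: Vector,
    patch_embeddings: Sequence[Vector],
    n_rows: int,
    n_cols: int,
) -> List[List[float]]:
    """Return an n_rows x n_cols grid of query-token/patch similarities.

    Assumes the patches are listed in row-major order (left to right,
    then top to bottom).
    """
    if len(patch_embeddings) != n_rows * n_cols:
        raise ValueError("number of patches must equal n_rows * n_cols")
    flat = [dot(query_token, p) for p in patch_embeddings]
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


# A 2 x 2 grid of patches with 2-dimensional embeddings.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
grid = similarity_map([2.0, 1.0], patches, n_rows=2, n_cols=2)
print(grid)  # [[2.0, 1.0], [3.0, 0.0]]
```

The largest value (bottom-left patch here) marks the image region most similar to the chosen query token; overlaying such a grid on the image gives the heatmap-style view.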
Changed
- Remove dummy image from `ColPaliProcessor.process_queries`
Fixed
- Fix the `compute_hardnegs.py` script
Tests
- Add missing `model.eval()` in tests
- Add tests for ColQwen2
Full Changelog: v0.3.1...v0.3.2
v0.3.1: ColQwen2
[0.3.1] - 2024-09-27
Added
- Add module-level imports for collators
- Add sanity check in the run inference example script
- Add E2E test for ColPali
- Add Qwen2-VL support
Changed
- Improve code clarity in the run inference example script
- Subset the example dataset in the run inference example script
- Rename scorer test to `test_processing_utils`
- Greatly simplify routing logic in Trainer selection and when feeding arguments to the model forward pass (refactor)
- Removed class `ContrastiveNegativeTrainer`, which is now just integrated in ContrastiveTrainer. This should not affect the user-facing API.
- Bumped transformers version to 4.45.0 to get Qwen2-VL support
Fixed
- Import HardNegCollator at module-level if and only if datasets is available
- Remove the need for `typer` in the run inference example script
- Fix edge case when empty suffix `""` given to processor
- Fix bug in HardNegCollator since 0.3.0
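The "import at module-level if and only if datasets is available" fix follows a common optional-dependency pattern. The sketch below shows one way to express it with the standard library; the helper names `is_available` and `build_exports` are hypothetical, and the real package may implement the check differently.

```python
import importlib.util
from typing import Dict, List, Sequence


def is_available(package: str) -> bool:
    """True if a top-level package can be found in this environment."""
    return importlib.util.find_spec(package) is not None


def build_exports(always: Sequence[str], optional: Dict[str, str]) -> List[str]:
    """Return the names a module should export.

    `optional` maps an exported name to the package it depends on; the name
    is only exported when that package is installed.
    """
    exports = list(always)
    for name, required_package in optional.items():
        if is_available(required_package):
            exports.append(name)
    return exports


# HardNegCollator is exported only when `datasets` is installed.
exports = build_exports(
    always=["VisualRetrieverCollator"],
    optional={"HardNegCollator": "datasets"},
)
print(exports)
```

In a package `__init__.py`, the same check would typically guard the actual `import` statement, so that users without the optional dependency can still import everything else.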
Full Changelog: v0.3.0...v0.3.1
v0.3.0: Package restructure
Description
✨ This release is an extensive package restructure, making ColPali more modular and easier to use.
🚨 It is NOT backward-compatible with previous versions.
Features
Added
- Restructure the `utils` module
- Restructure the model training code
- Add custom `Processor` classes to easily process images and/or queries
- Enable module-level imports
- Add scoring to processor
- Add `CustomRetrievalEvaluator`
- Add missing typing
- Add tests for model, processor, scorer, and collator
- Lint `Changelog`
- Add missing docstrings
- Add "Ruff" and "Test" CI pipelines
Changed
- Restructure all modules to closely follow the `transformers` architecture
- Hugely simplify the collator implementation to make it model-agnostic
- `ColPaliProcessor`'s `process_queries` doesn't need a mock image input anymore
- Clean `pyproject.toml`
- Loosen the required dependencies
- Replace `black` with the `ruff` linter
Removed
- Remove `interpretability` and `eval_manager` modules
- Remove unused utils
- Remove `TextRetrieverCollator`
- Remove `HardNegDocmatixCollator`
Fixed
- Fix wrong PIL import
- Fix dependency issues
Full Changelog: v0.2.2...v0.3.0