A library for automatic data selection in active fine-tuning of large neural networks.
Please cite our work if you use this library in your research (bibtex below):
- Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
- Transductive Active Learning: Theory and Applications (Section 4)
pip install activeft
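import faiss
import numpy as np
from activeft.sift import Retriever

# Load embeddings (faiss works with float32 arrays)
d = 512  # embedding dimension
embeddings = np.random.rand(1000, d).astype(np.float32)
query_embeddings = np.random.rand(1, d).astype(np.float32)

# Build an exact inner-product index over the data embeddings
index = faiss.IndexFlatIP(d)
index.add(embeddings)

retriever = Retriever(index)
indices = retriever.search(query_embeddings, N=10)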
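For intuition about the index used in the example above: faiss.IndexFlatIP performs exact (brute-force) maximum inner-product search. The following is a minimal pure-Python sketch of that lookup only. It does not reproduce the data selection performed by Retriever, and the helper name top_k_inner_product is hypothetical, not part of activeft or faiss.

```python
import heapq
from typing import List, Sequence, Tuple


def top_k_inner_product(
    data: Sequence[Sequence[float]],
    query: Sequence[float],
    k: int,
) -> List[Tuple[int, float]]:
    """Return the k (index, score) pairs with the largest inner product with `query`.

    Hypothetical helper for illustration; a flat inner-product index computes
    the same scores, only much faster.
    """
    scores = [
        (i, sum(x * q for x, q in zip(row, query)))
        for i, row in enumerate(data)
    ]
    # nlargest returns the pairs ordered by score, highest first
    return heapq.nlargest(k, scores, key=lambda pair: pair[1])


# Toy data: four 2-dimensional embeddings and one query
data = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-1.0, 0.0]]
query = [0.6, 0.8]
result = top_k_inner_product(data, query, k=2)
print(result)
```

With unit-norm embeddings the inner product equals cosine similarity, so the toy query is closest to the identical vector at index 2, followed by index 1.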
- The code is auto-formatted using: black .
- Static type checks can be run using: pyright
- Tests can be run using: pytest test
To start a local server hosting the documentation, run: pdoc ./activeft --math
- update the version number in pyproject.toml and activeft/__init__.py
- build: poetry build
- publish: poetry publish
- push version update to GitHub
- create new release on GitHub
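Because the first release step touches two files, the version numbers can drift apart. The sketch below shows one way to check that they agree before building. It assumes that pyproject.toml contains a line of the form version = "..." and that activeft/__init__.py defines a __version__ string; both formats, and all function names, are assumptions made for illustration rather than something the library is known to provide.

```python
import re
from typing import Optional


def pyproject_version(text: str) -> Optional[str]:
    """Extract the first `version = "..."` line from pyproject.toml text (assumed format)."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', text, flags=re.MULTILINE)
    return match.group(1) if match else None


def init_version(text: str) -> Optional[str]:
    """Extract `__version__ = "..."` from __init__.py text (assumed format)."""
    match = re.search(
        r'^__version__\s*=\s*["\']([^"\']+)["\']', text, flags=re.MULTILINE
    )
    return match.group(1) if match else None


def versions_match(pyproject_text: str, init_text: str) -> bool:
    """True only if both versions are found and identical."""
    a = pyproject_version(pyproject_text)
    b = init_version(init_text)
    return a is not None and a == b


# Sample file contents (made up for this example)
sample_pyproject = '[tool.poetry]\nname = "activeft"\nversion = "1.2.3"\n'
sample_init = '__version__ = "1.2.3"\n'
print(versions_match(sample_pyproject, sample_init))
```

In a real checkout, the two strings would be read from the files with pathlib.Path(...).read_text() before calling versions_match.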
@article{hubotter2024efficiently,
title = {Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs},
author = {H{\"u}botter, Jonas and Bongni, Sascha and Hakimi, Ido and Krause, Andreas},
year = 2024,
journal = {arXiv preprint arXiv:2410.08020}
}
@inproceedings{hubotter2024transductive,
title = {Transductive Active Learning: Theory and Applications},
author = {H{\"u}botter, Jonas and Sukhija, Bhavya and Treven, Lenart and As, Yarden and Krause, Andreas},
year = 2024,
booktitle = {Advances in Neural Information Processing Systems}
}