Release 0.0.1 · eole-nlp/eole

More breaking changes are coming in the near future, so let's start proper versioning with a corresponding changelog.

What's Changed Since Initial Publication

mlp refact by @vince62s in #1
fix llama3 and parallel_residual by @vince62s in #4
fixed mismatch between mask and batch dimensions by @l-k-11235 in #6
simplify LayerNorm access as a constant by @vince62s in #7
Fix the checkpoint directory cleaning by @l-k-11235 in #10
Modify default model config behaviour by @francoishernandez in #8
rename num_kv remove multiquery by @vince62s in #12
fix mmlu config by @vince62s in #13
Fix the tokenizer saving in the HF converter by @l-k-11235 in #14
remove unused average attn by @vince62s in #15
MHA refac: rope without complex operations + query only as input of the forward by @vince62s in #20
Revert "MHA refac: rope without complex operations + query only as input of the forward" by @vince62s in #22
missing removal of average attn by @vince62s in #23
config.models.BaseModelConfig._override_values updates everything once by @francoishernandez in #24
[fix] Patch lora bin to dump json config by @francoishernandez in #28
review flash/sdpa arg by @vince62s in #25
fix missing layers names by @vince62s in #30
Split MHA by @vince62s in #29
Resize the key_pad_mask by @l-k-11235 in #36
[patch] upgrade docusaurus deps, fix build script by @francoishernandez in #37
Add gpt2 converter, hellaswag eval tool, misc fixes by @francoishernandez in #38
Forgot hellaswag.py tool in #38 by @francoishernandez in #39
estim lambda scheduler by @vince62s in #40
Add support for XLM-Roberta-XL (and XXL) conversion by @vince62s in #41
Some fixes, get rid of data_task, homogenize model_task to model_type by @francoishernandez in #43
Some improvements to config.json readability by @francoishernandez in #44
[docs] Github Actions workflow to facilitate docs deployment by @francoishernandez in #47
[fix] Allow to build_vocab with full train config, patch vocab validation by @francoishernandez in #49
Enable PyPI release workflow by @francoishernandez in #50
[fix] Fix paths in wiki_103 recipe, add pyarrow opt requirement by @francoishernandez in #51
Estim first token instead of average by @vince62s in #46
Add Recipe to train a cometkiwi-like encoder model (which can be used to score sentence pairs) by @vince62s in #53
Simplify init files, remove some unused code by @francoishernandez in #52

New Contributors

@l-k-11235 made their first contribution in #6

Full Changelog: https://github.com/eole-nlp/eole/commits/0.0.1
