0.0.1
There will be more breaking changes in the near future, so let's start proper versioning and a corresponding changelog.
What's Changed Since Initial Publication
- mlp refact by @vince62s in #1
- fix llama3 and parallel_residual by @vince62s in #4
- fixed mismatch between mask and batch dimensions by @l-k-11235 in #6
- simplify LayerNorm access as a constant by @vince62s in #7
- Fix the checkpoint directory cleaning by @l-k-11235 in #10
- Modify default model config behaviour by @francoishernandez in #8
- rename num_kv remove multiquery by @vince62s in #12
- fix mmlu config by @vince62s in #13
- Fix the tokenizer saving in the HF converter by @l-k-11235 in #14
- remove unused average attn by @vince62s in #15
- MHA refac: rope without complex operations + query only as input of the forward by @vince62s in #20
- Revert "MHA refac: rope without complex operations + query only as input of the forward" by @vince62s in #22
- missing removal of average attn by @vince62s in #23
- config.models.BaseModelConfig._override_values updates everything once by @francoishernandez in #24
- [fix] Patch lora bin to dump json config by @francoishernandez in #28
- review flash/sdpa arg by @vince62s in #25
- fix missing layers names by @vince62s in #30
- Split MHA by @vince62s in #29
- Resize the key_pad_mask by @l-k-11235 in #36
- [patch] upgrade docusaurus deps, fix build script by @francoishernandez in #37
- Add gpt2 converter, hellaswag eval tool, misc fixes by @francoishernandez in #38
- Forgot hellaswag.py tool in #38 by @francoishernandez in #39
- estim lambda scheduler by @vince62s in #40
- Add support for XLM-Roberta-XL (and XXL) conversion by @vince62s in #41
- Some fixes, get rid of data_task, homogenize model_task to model_type by @francoishernandez in #43
- Some improvements to config.json readability by @francoishernandez in #44
- [docs] Github Actions workflow to facilitate docs deployment by @francoishernandez in #47
- [fix] Allow to build_vocab with full train config, patch vocab validation by @francoishernandez in #49
- Enable PyPI release workflow by @francoishernandez in #50
- [fix] Fix paths in wiki_103 recipe, add pyarrow opt requirement by @francoishernandez in #51
- Estim first token instead of average by @vince62s in #46
- Add Recipe to train a cometkiwi-like encoder model (which can be used to score sentence pairs) by @vince62s in #53
- Simplify init files, remove some unused code by @francoishernandez in #52
New Contributors
- @l-k-11235 made their first contribution in #6
Full Changelog: https://github.com/eole-nlp/eole/commits/0.0.1