Releases: graphcore-research/unit-scaling
v0.3.4
What's Changed
- Bump setuptools from 68.2.2 to 70.0.0 by @dependabot in #81
New Contributors
- @dependabot made their first contribution in #81
Full Changelog: v0.3.3...v0.3.4
v0.3.3
What's Changed
- first draft of pypi action by @thecharlieblake in #79
- Switch to pyproject.toml by @thecharlieblake in #80
Full Changelog: v0.3.0...v0.3.3
v0.3.2
Full Changelog: v0.3.1...v0.3.2
v0.3.1
Full Changelog: v0.3.0...v0.3.1
v0.3.0
What's Changed
- Linear by @thecharlieblake in #1
- Add CI for public repo by @thecharlieblake in #2
- Docs by @thecharlieblake in #3
- Remove unneeded docker setup by @thecharlieblake in #4
- Add MLP and supporting fns by @thecharlieblake in #6
- Add analyse_module() utils function by @thecharlieblake in #7
- Add necessary IP requirements for open-source by @thecharlieblake in #8
- Implement self-attention and update utils by @thecharlieblake in #9
- Add transformer layer by @thecharlieblake in #10
- Full transformer by @thecharlieblake in #13
- User guide by @thecharlieblake in #14
- Remove poptorch/pea by @thecharlieblake in #15
- Fixes to run on python3.10 by @thecharlieblake in #17
- Alpha docs by @thecharlieblake in #18
- Add scaling fusion analysis and update docs by @thecharlieblake in #21
- Doc review by @JamesRandom in #20
- Add ops for demo by @thecharlieblake in #24
- Change constraints to strings by @thecharlieblake in #26
- Format simulation (ready for review) by @thecharlieblake in #25
- Add unit_scale transform by @thecharlieblake in #27
- Quantisation for IPU by @DouglasOrr in #28
- Fix setuptools packages by @thecharlieblake in #30
- Changes grad_w default scaling to be 0.75 by @thecharlieblake in #31
- Prepare for beta release by @thecharlieblake in #34
- Support sdist builds by @DouglasOrr in #33
- Fix a torch FX error on torch-2.0.1 'annotations must be set to a dict object' by @DouglasOrr in #35
- Update scaled_dot_product_attention scaling factor from our derivation. by @DouglasOrr in #36
- Add separate tau scaling by @thecharlieblake in #37
- Visualisation tool by @thecharlieblake in #32
- Fix setup dependencies by @thecharlieblake in #39
- Add compile transform by @thecharlieblake in #40
- Tidy plots by @thecharlieblake in #41
- Tidy docs for beta release by @thecharlieblake in #42
- Fix torch.nn functional issue in docs by @thecharlieblake in #43
- Update limitations and add link to notebook by @thecharlieblake in #44
- PyTorch 2.1 fixes by @thecharlieblake in #47
- Add blog draft for 'almost scaled dot product attention' by @DouglasOrr in #46
- [almost-scaled blog] Introduce terms {d_seq, t} by @DouglasOrr in #48
- Add dependencies and dataset download to almost_scaled_dot_product_attention blog by @DouglasOrr in #49
- Fix torch version at 2.1 for now by @thecharlieblake in #50
- Revert "Changes grad_w default scaling to be 0.75" by @thecharlieblake in #51
- Proposal to refactor a state-carrying closure to a class by @awf in #53
- Update to PyTorch 2.2 by @awf in #52
- Update to PyTorch 2.2 addendum: Fixing a doc typo and logic mismatch by @awf in #54
- Update license and numpy by @thecharlieblake in #55
- Parametrize the number of random bits used in stochastic rounding by @awf in #56
- Add a package version and unit_scaling.version (0.1) by @DouglasOrr in #59
- Fix recursion in torch_nn_modules_to_user_modules() by @DouglasOrr in #57
- Updates to support u-muP, as the new default behaviour by @DouglasOrr in #58
- Fix unit_scaling setup.py install (#60) by @DouglasOrr in #61
- Migrate blog post 'almost scaled dot product attention' to the graphcore blog by @DouglasOrr in #64
- Add initial how_to_scale_op notebook by @DouglasOrr in #65
- Conv1d by @thecharlieblake in #67
- Import Conv1d into top-level unit_scaling (fixes #69) by @DouglasOrr in #71
- Remove the constraint argument in the container modules uu.{MLP, MHSA} by @DouglasOrr in #72
- fix sgd bug and add demo notebook by @thecharlieblake in #74
- add umup slides by @thecharlieblake in #75
- fix paper link in docs by @thecharlieblake in #76
- upgrade transformers dependency (security) by @thecharlieblake in #77
- Add docker dev setup and update requirements by @thecharlieblake in #78
New Contributors
- @JamesRandom made their first contribution in #20
- @DouglasOrr made their first contribution in #28
- @awf made their first contribution in #53
Full Changelog: https://github.com/graphcore-research/unit-scaling/commits/v0.3.0