# HomebrewNLP

## Overview

"...Our goal is to open up the space by combining every form of efficient training we have. If we throw enough tradeoffs against it, a model of this size (GPT-3) should be trainable on commodity hardware (<1k if purchased as upgrades) ... Compute-memory tradeoffs (like MOE) aren't enough ... we want more efficient training using extragradient methods and better optimizers (Shampoo)" - Lucas Nestler

## Example Command

```bash
python3 main.py configs/small.yaml
```
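The single argument is a YAML configuration file; `configs/small.yaml` ships with the repository, which defines the real schema. A hypothetical config of this shape illustrates the pattern (every key below is an illustrative assumption, not HomebrewNLP's actual format):

```yaml
# Hypothetical sketch -- these keys are illustrative assumptions,
# not HomebrewNLP's actual configuration schema.
model:
  hidden_size: 256
  depth: 8
optimizer:
  name: shampoo      # the Overview quote names Shampoo as a target optimizer
  learning_rate: 1e-3
training:
  batch_size: 16
```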

DeepSource | Dataset | Discord