Organizations

@huggingface

xrsrke/README.md

Hi there 👋

For over 2 years, I studied consistently from 3:30 AM to 3:30 PM, then slept from 5:20 PM to 2:45 AM (a schedule optimized for studying, with around 45 minutes to fall asleep). AND REPEATED. I studied many other things before ML.

I stream daily on Twitch: twitch.tv/xrsrke

Currently, I'm building a library that enables training any 🤗 transformers model with 3D parallelism and ZeRO-1 out of the box, and I'm learning about mechanistic interpretability.

DMs open

The best way to reach me is Discord: neuralink, Twitter: @xariusrke, or [email protected]

Pinned

  1. pipegoose (Public)

    Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*

    Python · 80 stars · 17 forks

  2. huggingface/nanotron (Public)

    Minimalistic large language model 3D-parallelism training

    Python · 1.3k stars · 132 forks

  3. instructGOOSE (Public)

    Implementation of Reinforcement Learning from Human Feedback (RLHF)

    Jupyter Notebook · 171 stars · 21 forks

  4. toolformer (Public)

    Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools

    Jupyter Notebook · 136 stars · 14 forks

  5. reinforcement-learning (Public)

    Jupyter Notebook · 9 stars · 1 fork