awesome-human-in-the-loop

An awesome list of tools and resources for getting started with human-in-the-loop learning or Reinforcement Learning from Human Feedback (RLHF).

Awesome RLHF

Blog Posts + Academic Papers

**OpenAI - Aligning language models to follow instructions** | Internal blog post, how-to-use
**Cornell University - Scaling Language Models: Methods, Analysis & Insights from Training Gopher** | Academic paper
**Hugging Face - Illustrating Reinforcement Learning from Human Feedback (RLHF)** | Definition, blog post
**LessWrong - RLHF** | Blog post
**Unite.ai - What is Reinforcement Learning From Human Feedback (RLHF)** | Blog post
**Surge.ai - Introduction to Reinforcement Learning with Human Feedback** | Blog post

Tools and Resources

**Secrets of RLHF in Large Language Models** | Code and tutorials for RLHF in a nutshell
**Scale - RLHF for Large Language Models** | Landing page, tool
**GitHub - lucidrains/PaLM-rlhf-pytorch** | Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
**GitHub - anthropics/hh-rlhf** | Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
**GitHub - conceptofmind/LaMDA-rlhf-pytorch** | Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
**GitHub - opendilab/awesome-RLHF** | A curated list of reinforcement learning with human feedback resources (continually updated)
**GitHub - CarperAI/trlx** | A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
**GitHub - sunzeyeah/RLHF** | Implementation of Chinese ChatGPT
**GitHub - LAION-AI/Open-Assistant** | OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
**GitHub - xrsrke/instructGOOSE** | Implementation of Reinforcement Learning from Human Feedback (RLHF)
**GitHub - arunprsh/ChatGPT-Decoded-GPT2-FAQ-Bot-RLHF-PPO** | A Practical Guide to Developing a Reliable FAQ Chatbot with Reinforcement Learning and Human Feedback using GPT-2 on AWS
**GitHub - voidful/TextRL** | Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
**GitHub - cogment/Cogment-verse** | Library of Environments, Human Actor UIs and Agent implementation for Human In the Loop Learning & Reinforcement Learning
**GitHub - s-JoL/Open-Llama** | The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
**GitHub - jianzhnie/open-chatgpt** | The open-source implementation of ChatGPT and RLHF. Build a ChatGPT from scratch.
**GitHub - andy-yangz/Awesome-RLHF** | Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD
**GitHub - jordimas/awesome-RLHF-language-models** | Curated list of resources for Reinforcement Learning from Human Feedback and Language Models
**GitHub - RUCAIBox/LLMSurvey** | A collection of papers and resources related to Large Language Models.
**GitHub - mfarisadip/T5-rlhf-pytorch** | Implementation of RLHF (Reinforcement Learning with Human Feedback) and GAN (Generative Adversarial Network) on top of the T5 architecture.
**GitHub - CarperAI/Polygraph** | RLHF Mechanistic Interpretability and Deception
**GitHub - ayulockin/T5-RLHF-TF** | Implementation of Reinforcement Learning from Human Feedback for Summarization Task in TensorFlow
**GitHub - ckkissane/rlhf-shakespeare** | Shakespeare transformer fine-tuned to generate positive sentiment samples using RLHF
**GitHub - G-U-N/T2I-HumanFeedback** | Implementations of Baseline Methods for Aligning Text2Img Diffusion Models with Human Feedback
**GitHub - nazneenrajani/rlhf_langchain** | Langchain for RLHF
**GitHub - uSaiPrashanth/raithubot-training** | Training an RLHF-transformer architecture to answer farmers' queries
**GitHub - l294265421/alpaca-rlhf** | Finetuning alpaca with RLHF (Reinforcement Learning with Human Feedback)
**GitHub - DaehanKim/EasyRLHF** | EasyRLHF aims to provide an easy and minimal interface to train RLHF LMs, using off-the-shelf solutions and datasets
**GitHub - jeremy-collins/robot-rlhf** | Robot Learning through Human Feedback. Inspired by advancements in NLP, we train a robot policy via reinforcement learning using a reward function learned exclusively from human preferences.
**GitHub - Sugoto/GPT-Model-with-RLHF** | This is a GPT 📜 model built from scratch that uses Reinforcement Learning with Human Feedback (RLHF) 🤖 to generate positive 👍 or negative 👎 recreations of Shakespeare's writing style 🎭.
**GitHub - vincentmin/transformer_rlhf_eli5** | We train a transformer model using Reinforcement Learning from Human Feedback on the Reddit ELI5 dataset

Demos and Tutorials

**GitHub - ojus1/MyMusicTransformer** | RLHF + MusicTransformer = Generate the music YOU love
**GitHub - AmirMotefaker/Create-your-own-ChatGPT** | Create your own ChatGPT with Python
