Commit fdd9761
Apply automatic changes
mdingemanse authored and github-actions[bot] committed Apr 19, 2024
1 parent 0d372a1 commit fdd9761
Showing 3 changed files with 46 additions and 46 deletions.
docs/df.csv: 2 changes (1 addition & 1 deletion)
@@ -5,6 +5,7 @@ https://huggingface.co/LLM360/AmberChat,,Amber,ShareGPT + Evol-Instruct (synthet
https://open-assistant.io/,,Pythia 12B,OpenAssistant Conversations,Apache 2.0,LAION-AI,https://open-assistant.io/,,open,https://github.com/LAION-AI/Open-Assistant,Code includes guide for developers,open,https://github.com/LAION-AI/Open-Assistant/tree/main/data/datasets,Datasets documented in detail and recipes for cleaning up and downloading provided in code notebooks.,open,https://huggingface.co/OpenAssistant,Model weights in several variants downloadable through HuggingFace,open,https://huggingface.co/datasets/OpenAssistant/oasst1,"OpenAssistant Conversations is 'a human-generated, human-annotated assistant-style conversation corpus consisting of 161443 messages distributed across 66497 conversation trees, in 35 different languages, annotated with 461292 quality ratings' (preprint)",closed,,RLHF weights not separately released,open,https://projects.laion.ai/Open-Assistant/docs/faq#what-license-does-open-assistant-use,Apache 2.0,open,https://projects.laion.ai/Open-Assistant/docs/intro,Separate website provides entry point to comprehensive documentation,open,https://github.com/LAION-AI/Open-Assistant/tree/main/model,Instructions to tune the pipeline on training data,partial,https://arxiv.org/abs/2304.07327,"Preprint describes creation of OpenAssistant Conversations corpus for instruction tuning, but not the base LLM, hence partial.",closed,,No peer-reviewed paper or published data audit found,closed,,,closed,,,open,,,open,https://projects.laion.ai/Open-Assistant/api,,/projects/Open-Assistant.yaml,9.5
https://github.com/imoneoi/openchat,,Mistral 7B,ShareGPT with C-RLFT,Apache 2.0,Tsinghua University,https://github.com/imoneoi,OpenChat notes 'We are a student team from Tsinghua University',open,https://github.com/imoneoi/openchat/tree/master/ochat,Repository offers a large amount of fairly well-organized code for data curation and model training,closed,,Pretraining data for Mistral is nowhere disclosed or documented,open,https://github.com/mistralai/mistral-src#download-the-model,Mistral 7B weights available via Mistral repository,closed,,Preprint says ShareGPT dataset 'collected from sharegpt.com' but not disclosed or made available by this project,open,https://huggingface.co/openchat/openchat_3.5/tree/main,Instruction-tuned model weights made available via HuggingFace,open,https://github.com/imoneoi/openchat/blob/master/LICENSE,Code and model released under Apache 2.0,partial,https://github.com/imoneoi/openchat/tree/master/ochat,There is plenty of code in the GitHub repository but only some of it is documented,open,https://arxiv.org/abs/2309.11235,Architecture quite well described in preprint,open,https://arxiv.org/abs/2309.11235,"Preprint describes the model architecture and instruction tuning approach, though it is hampered by building on the notoriously closed Llama2",open,https://openreview.net/forum?id=AOJyfhWYHf,Paper reviewed and accepted for ICLR 2024,partial,https://huggingface.co/openchat/openchat_v3.2,There is a model card that provides some details on architecture and evaluation,closed,,Datasheet not provided.,open,https://github.com/imoneoi/openchat/tree/master#installation,Python package 'ochat' provided through pip,partial,,"Model too large to load onto HuggingFace free inference API, so only available through Inference Endpoints or package",/projects/OpenChat.yaml,9.5
https://huggingface.co/togethercomputer/Pythia-Chat-Base-7B,,EleutherAI pythia,OIG,Apache 2.0 license,togethercomputer,https://github.com/togethercomputer,,open,,,open,https://github.com/togethercomputer/OpenDataHub,Training data curated and shared in separate repository,open,https://huggingface.co/togethercomputer/Pythia-Chat-Base-7B/tree/main,Model weights available via HuggingFace,open,https://huggingface.co/datasets/laion/OIG,From the documentation 'This is our attempt to create a large instruction dataset of medium quality along with a smaller high quality instruction dataset (OIG-small-chip2).',closed,,RL weights not separately made available,open,https://huggingface.co/togethercomputer/Pythia-Chat-Base-7B#model-details,Apache 2.0,open,https://github.com/togethercomputer/OpenChatKit,Actively maintained repository,open,https://github.com/togethercomputer/OpenChatKit#reproducing-pythia-chat-base-7b,Architecture and recipe for reproducing model provided,partial,https://arxiv.org/abs/2304.01373,Preprint describes LM base (Pythia) but not instruction tuning details,closed,,No peer-reviewed paper or data audit found,partial,https://huggingface.co/togethercomputer/Pythia-Chat-Base-7B,Model card partially available but fairly minimally specified,partial,https://huggingface.co/datasets/laion/OIG,OIG instruction dataset documented,open,,,closed,,,/projects/pythia-chat-base-7B.yaml,9.5
https://huggingface.co/SebastianSchramm/Cerebras-GPT-111M-instruction,,Cerebras,Alpaca (synthetic),Apache 2.0,Cerebras + Schramm,https://github.com/Cerebras,,partial,https://github.com/Cerebras/gigaGPT,"Some of the training code is available in GigaGPT, but fine-tuning code is not included",open,https://huggingface.co/datasets/EleutherAI/pile,EleutherAI's The Pile,open,https://huggingface.co/cerebras/Cerebras-GPT-111M,Base model available via Cerebras,open,https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/blob/main/data/alpaca_gpt4_data.json,Alpaca GPT4,open,https://huggingface.co/SebastianSchramm/Cerebras-GPT-111M-instruction/tree/main,Finetuned model weights available,partial,https://huggingface.co/SebastianSchramm/Cerebras-GPT-111M-instruction/,Licensing situation unclear as model page mentions no license (base model is licensed Apache 2.0),closed,,Code only sparsely documented,open,,Described in preprint,partial,https://arxiv.org/abs/2304.03208,,closed,,,closed,https://huggingface.co/cerebras/Cerebras-GPT-111M,Only serves as advertising for the model,open,https://arxiv.org/abs/2201.07311,Datasheet available for The Pile,closed,,No package found,open,,Available through HuggingFace inference API,/projects/cerebras-gpt-111m-instruction.yaml,8.5
https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct,,RedPajama-INCITE-7B-Base,various (GPT-JT recipe),Apache 2.0,TogetherComputer,https://together.ai/,,partial,https://github.com/togethercomputer/redpajama.cpp/tree/master/examples/redpajama,Code for datasets made available in exemplary ways; code for training and tuning harder to find,open,https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T,RedPajama-Data-1T made available on HuggingFace,open,https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Base,Base is RedPajama-INCITE-7B-Base,open,https://huggingface.co/datasets/togethercomputer/RedPajama-Data-Instruct,"The model was trained on a large collection of diverse data, including the Chain-of-Thought (CoT), Public Pool of Prompts (P3), and Natural-Instructions (NI) datasets.",open,https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct,Instruction-tuned version made available in parallel with the base version,partial,https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct/blob/main/README.md,"Models licensed under Apache 2.0, but note that the data itself is variably licensed and so imposes some limitations.",partial,,Code for base LLM and instruction tuning datasets beautifully documented; code specifying training and fine-tuning sparsely documented.,partial,https://together.ai/blog/redpajama,"Architecture detailed on model card, but crucial parts appear to be forked from GPT-NeoX",closed,,No preprint found,closed,,No paper found,open,https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct,Model card and readme provide details on datasets and training,open,https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T,Data sheet includes links to data and recipes to create it from scratch,closed,,No separate package found,partial,https://huggingface.co/togethercomputer/RedPajama-INCITE-7B-Instruct,Hosted inference API available through HuggingFace,/projects/RedPajama-INCITE-Instruct-7B.yaml,8.5
https://github.com/databrickslabs/dolly,,EleutherAI pythia,databricks-dolly-15k,,databricks,https://www.databricks.com,,open,,,open,,,open,,,open,,,closed,,,open,,,open,,,open,,,partial,https://arxiv.org/abs/2304.01373,,closed,,,closed,,,closed,,,open,,,closed,,,/projects/dolly.yaml,8.5
https://huggingface.co/allenai/tulu-2-dpo-70b,A new DPO model from AllenAI called Tülu,Llama2,"Tulu SFT, Ultrafeedback",AI2 ImpACT license,AllenAI,https://allenai.org/,,open,https://github.com/allenai/open-instruct,Important effort to make fine-tuning procedure and source code available. No repository for base model training found yet.,closed,,Based on Llama2 so nothing is known about training data,partial,https://github.com/meta-llama/,Base model made available via Meta (requires privacy-defying signup),open,https://github.com/allenai/open-instruct,Codebase used for fine-tuning this model is made available for others,open,https://huggingface.co/allenai/tulu-2-dpo-70b/tree/main,Fine-tuned weights available,partial,https://allenai.org/impact-license,Meta Community License and AI2 ImpACT license,partial,https://github.com/allenai/open-instruct,OpenInstruct code well-documented,partial,https://huggingface.co/HuggingFaceH4/zephyr-7b-beta,"Post-Llama2 aspects quite well described in papers, e.g. DPO based on Zephyr Beta, but base model is irrevocably closed",open,https://arxiv.org/abs/2311.10702,Preprint covers the training of Tulu 2,closed,,No peer-reviewed work found,partial,https://huggingface.co/allenai/tulu-2-70b,Model card offers few details,partial,https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture,"Some datasheets available (for instruction tuning), but base model entirely undocumented.",closed,,No separate package found,open,,Available via various APIs,/projects/tulu-2-70b.yaml,8.0
@@ -13,7 +14,6 @@ https://huggingface.co/mosaicml/mpt-7b-instruct,,MosaicML,"dolly, anthropic",CC-
https://github.com/CarperAI/trlx,,"various (pythia, flan, OPT)",various,MIT license,carperai,https://github.com/CarperAI/trlx,,open,,,open,,,open,,,partial,,,closed,,,open,,,open,,,partial,,,closed,,,closed,,,closed,,,closed,,,partial,,,open,,,/projects/trlx.yaml,7.5
https://huggingface.co/lmsys/vicuna-13b-v1.3,Vicuna is a chat assistant trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.,LLaMA,ShareGPT,Non-commercial license,LMSYS,https://lmsys.org/,"According to its website, 'The Large Model Systems Organisation develops large models and systems that are open, accessible and scalable'",open,https://github.com/lm-sys/FastChat,Actively maintained repository,partial,https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md#training-dataset,"Vicuna is fine-tuned LLaMA, and LLaMA in turn is based on 'publicly available datasets' that are not all specified or easily downloadable.",open,https://github.com/lm-sys/FastChat#vicuna-weights,"Unlike Vicuna 13B v0, these weights do not require applying delta weights",closed,https://github.com/lm-sys/FastChat#fine-tuning,From the documentation 'We will not release the ShareGPT dataset'. Also 'Vicuna v1.3 is fine-tuned from LLaMA with supervised instruction fine-tuning. The training data is around 140K conversations collected from ShareGPT.com.',closed,https://github.com/lm-sys/FastChat#fine-tuning,No model weights are shared for the instruction tuning,partial,https://github.com/lm-sys/FastChat#vicuna-weights,From the documentation 'Vicuna is based on LLaMA and should be used under LLaMA's model license.',open,https://github.com/lm-sys/FastChat,Code is quite well-documented and released as part of the FastChat framework.,closed,,,open,https://arxiv.org/pdf/2306.05685.pdf,Preprint covers training of the Vicuna model.,closed,,No peer-reviewed paper.,partial,https://huggingface.co/lmsys/vicuna-13b-v1.3,"Minimal model card, but many details are not provided or have to be pieced together from elsewhere.",closed,,No datasheet provided.,open,https://pypi.org/project/fschat/0.1.2/,Available via pip,partial,https://github.com/lm-sys/FastChat#api,"Support provided for several APIs: OpenAI RESTful, HuggingFace, LangChain",/projects/vicuna13B-lmsys.yaml,7.0
https://github.com/ethanyanjiali/minChatGPT,,GPT2,anthropic,GNU General Public License v3.0,ethanyanjiali,https://github.com/ethanyanjiali/minChatGPT,,open,,,open,,,open,,,partial,,,closed,,,open,,,open,,,partial,,,closed,,,closed,,,closed,,,closed,,,closed,,,open,,,/projects/minChatGPT.yaml,7.0
https://github.com/Cerebras,,,Alpaca (synthetic),Apache 2.0,Cerebras + Schramm,https://github.com/Cerebras,,open,,,open,,,open,,,open,,,closed,,,open,,,closed,,,open,,,partial,https://arxiv.org/abs/2304.03208,,closed,,,closed,,,closed,,,closed,,,closed,,,/projects/Cerebras-GPT-111m.yaml,6.5
https://github.com/BlinkDL/ChatRWKV,,RWKV-LM,"alpaca, shareGPT (synthetic)",,BlinkDL/RWKV,https://www.rwkv.com/,,open,https://github.com/BlinkDL/ChatRWKV,Various community-contributed enhancements available,partial,https://pile.eleuther.ai/,Trained on The Pile. Recent versions also build on Red Pajama (https://huggingface.co/datasets/togethercomputer/RedPajama-Data-1T),open,https://huggingface.co/BlinkDL/rwkv-4-world/tree/main,Model weights released across different HuggingFace spaces,closed,,"Instruction tuning data not separately available. Documentation 'These are RWKV-4-Pile 1.5/3/7/14B models finetuned on Alpaca, CodeAlpaca, Guanaco, GPT4All, ShareGPT and more'",closed,,Weights not separately available.,open,https://github.com/BlinkDL/ChatRWKV/blob/main/LICENSE,Apache 2.0,partial,,Code documentation scattered across GitHub repo and HuggingFace spaces,partial,,Architecture described in preprint (LM part) but not all details clearly documented.,partial,https://arxiv.org/abs/2305.13048,"Preprint covers only the LLM (RNN-based), not instruction fine-tuning, so partial.",closed,,No peer-reviewed paper or published data audit known,closed,https://huggingface.co/BlinkDL/rwkv-4-raven,"No model card, HuggingFace spaces only used to share files",closed,https://huggingface.co/BlinkDL/rwkv-4-raven,"No data sheet, HuggingFace spaces only used to share files",open,https://pypi.org/project/rwkv/,Available through pip install rwkv,partial,,API via HuggingFace,/projects/ChatRWKV.yaml,6.5
https://github.com/LianjiaTech/BELLE,,LLaMA & BLOOMZ,"alpaca, shareGPT, Belle (synthetic)",Apache License 2.0,KE Technologies,http://www.ke.com,,open,https://github.com/LianjiaTech/BELLE,Repository contains a fair bit of code,partial,,"Open for variants based on BLOOMZ. Closed for variants based on LLaMA, whose pretraining data is nowhere disclosed or documented.",partial,,LLaMA-based but copyright status unclear,partial,https://github.com/LianjiaTech/BELLE/tree/main/data/1.5M,Synthetic BELLE training data in Chinese released in batches,partial,https://github.com/LianjiaTech/BELLE/tree/main/models,"Some models available, most only as delta weights requiring separate access to LLaMA",closed,,Lowest common denominator is the non-OSI-approved LLaMA license agreement,partial,https://github.com/LianjiaTech/BELLE/blob/main/README_en.md,"A fair amount of documentation on GitHub, though not all of it well organized",open,https://github.com/LianjiaTech/BELLE/blob/main/README_en.md,Specified in a fair bit of detail on GitHub,open,https://arxiv.org/abs/2303.14742,,closed,,No peer-reviewed paper found,closed,,No model card found,partial,,No data sheet found,closed,,No dedicated package available,closed,,No API found,/projects/BELLE.yaml,6.0
https://huggingface.co/WizardLM/WizardLM-13B-V1.2,Empowering Large Pre-Trained Language Models to Follow Complex Instructions,LLaMA2-13B,Evol-Instruct (synthetic),CC-BY-NC-4.0,Microsoft & Peking University,https://github.com/nlpxucan,,partial,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM,Fast-evolving repository contains WizardLM code,closed,https://github.com/opening-up-chatgpt/opening-up-chatgpt.github.io/blob/main/projects/llama-2-chat.yaml,"Based on LLaMA2, which is claimed to be public but nowhere exactly documented.",partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,"Based on LLaMA2 weights, which are made conditionally available by Meta.",open,https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k,The Evol-Instruct V2 dataset contains 196k instruction-following sequences generated with Evol-Instruct,open,https://huggingface.co/WizardLM/WizardLM-13B-V1.2,Model weights offered in HuggingFace repository,partial,https://github.com/nlpxucan/WizardLM/blob/main/WizardLM/MODEL_DIFF_LICENSE,"Restricted to academic research purposes only. Code and model diff released under CC-BY-NC-4.0, software code under Apache 2.0",partial,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM,"Code is only partially documented, not clearly versioned, and appears to be in flux.",open,https://arxiv.org/abs/2304.12244,Architecture described in preprint and partly accessible in code repository,open,https://arxiv.org/abs/2304.12244,Preprint describes method for creating large amounts of LLM-based synthetic RLHF data and fine-tuning WizardLM based on it,closed,,No peer-reviewed paper or data audit found,closed,https://huggingface.co/WizardLM/WizardLM-13B-V1.2,Model card is only a placeholder and generates an error (missing yaml metadata),closed,https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k,Dataset card for Evol-Instruct generates an error,closed,,No package available,closed,,No API available,/projects/wizardlm-13B.yaml,6.0
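Note on the row format: each line of df.csv follows the same layout, namely a project link, a free-text note, base model, fine-tuning data, license, organization, organization link, and organization note, followed by fourteen (status, link, note) triples for the openness dimensions, closing with the project's yaml path and a cumulative openness score. The recorded scores are consistent with weighting open = 1, partial = 0.5, closed = 0 and summing over the fourteen triples; Open Assistant's nine open and one partial ratings, for example, yield 9.5. The sketch below recomputes the scores under those assumptions. It is not part of the repository's own tooling; the headerless layout, field positions, and weighting are inferred from the rows above.

```python
import csv

# Weighting inferred from the recorded scores: open = 1, partial = 0.5, closed = 0.
WEIGHTS = {"open": 1.0, "partial": 0.5, "closed": 0.0}

with open("docs/df.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f):
        try:
            recorded = float(row[-1])  # final field: cumulative openness score
        except (ValueError, IndexError):
            continue  # skip a header row or any malformed line
        # Assumed layout: 8 leading metadata fields, then (status, link, note)
        # triples for the 14 openness dimensions, then yaml path and score.
        statuses = row[8:-2][0::3]
        computed = sum(WEIGHTS.get(s.strip().lower(), 0.0) for s in statuses)
        print(f"{row[0]}  recorded={recorded}  computed={computed}")
```

Run over the rows shown here, the computed totals match the recorded final column (9.5 for Open Assistant and OpenChat, 8.5 for dolly, 8.0 for Tulu 2, and so on).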