
Commit 0a6e82d

Apply automatic changes

mdingemanse authored and github-actions[bot] committed Jun 26, 2024
1 parent ff37291 commit 0a6e82d

Showing 3 changed files with 6 additions and 5 deletions.
docs/df.csv (2 changes: 1 addition & 1 deletion)
@@ -21,7 +21,7 @@ https://huggingface.co/BramVanroy/GEITje-7B-ultra,Dutch instruction-tuned model
https://huggingface.co/microsoft/Phi-3-mini-128k-instruct,,Phi3,Unspecified,MIT License,Microsoft,https://huggingface.co/microsoft/Phi-3-mini-128k-instruct,,closed,,"No source code found for pretraining, posttraining, or evaluation",closed,,No datasets made available and no information on datasets disclosed except very generic claims about filtering for high quality.,closed,,No base model of the instruction-tuned Phi-3 was released,closed,,No post-training datasets made available and no information on datasets disclosed except very generic claims about filtering for high quality.,open,https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/tree/main,Instruction-tuned model weights shared through HuggingFace,open,https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/LICENSE,MIT License,closed,,"No source code, so no documentation of source code found",open,https://arxiv.org/abs/2404.14219,Architecture described in model card and preprint,partial,https://arxiv.org/abs/2404.14219,"Preprint describes model architecture but not training data, focusing mostly on benchmarks and evaluation",closed,,No paper found,open,https://huggingface.co/microsoft/Phi-3-mini-128k-instruct/blob/main/LICENSE,Model card provides reasonable level of detail,closed,,No datasheet made available,partial,,Available through development version of transformers,open,,Available through HuggingFace API,/projects/phi-3-instruct.yaml,6.0
https://huggingface.co/WizardLM/WizardLM-13B-V1.2,Empowering Large Pre-Trained Language Models to Follow Complex Instructions,LLaMA2-13B,Evol-Instruct (synthetic),CC-BY-NC-4.0,Microsoft & Peking University,https://github.com/nlpxucan,,partial,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM,Fast-evolving repository contains WizardLM code,closed,https://github.com/opening-up-chatgpt/opening-up-chatgpt.github.io/blob/main/projects/llama-2-chat.yaml,"Based on LLaMA2, which is claimed to be public but nowhere exactly documented.",partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,"Based on LLaMA2 weights, which are made conditionally available by Meta.",open,https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k,The Evol-Instruct V2 dataset contains 196k instruction-following sequences generated from Evol-Instruct,open,https://huggingface.co/WizardLM/WizardLM-13B-V1.2,Model weights offered in HuggingFace repository,partial,https://github.com/nlpxucan/WizardLM/blob/main/WizardLM/MODEL_DIFF_LICENSE,"Restricted for academic research purposes only. Code and Model diff release under CC-BY-NC-4.0, software code under Apache 2.0",partial,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM,"Code is only partially documented, not clearly versioned, and appears to be in flux.",open,https://arxiv.org/abs/2304.12244,Architecture described in preprint and partly accessible in code repository,open,https://arxiv.org/abs/2304.12244,Preprint describes method for creating large amounts of LLM-based synthetic RLHF data and fine-tuning WizardLM based on it,closed,,No peer-reviewed paper or data audit found,closed,https://huggingface.co/WizardLM/WizardLM-13B-V1.2,Model card is only a placeholder and generates an error (missing yaml metadata),closed,https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k,Dataset card for Evol-Instruct generates an error,closed,,No package available,closed,,No API available,/projects/wizardlm-13B.yaml,6.0
https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1,,Llama2,Airoboros (synthetic),Purposely left ambiguous,Jon Durbin,https://github.com/jondurbin,Only active on GitHub since May 2023,partial,https://gist.github.com/jondurbin/87fc040b92a3073125ed516b04bc6e19,Repo exists for RL data but only a gist exists for model training and architecture,closed,,Llama2 training data is nowhere documented or disclosed,partial,,"Llama2, made conditionally available by Meta",open,https://github.com/jondurbin/airoboros,"Airoboros, an implementation of the Self-Instruct paper",open,https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1/tree/main,Made available through HuggingFace,partial,https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1#licence-and-usage-restrictions,Licensing left ambiguous because of murky status of OpenAI-derived Self-Instruct data,partial,,What little code is available is not very systematically documented,partial,https://huggingface.co/jondurbin/airoboros-l2-70b-gpt4-1.4.1/discussions/2#64c29e4c617b36543dedac9a,Some info can be gleaned at link but most remains undocumented,closed,,No preprint found,closed,,No peer-reviewed paper found,partial,https://huggingface.co/jondurbin/airoboros-65b-gpt4-1.4,Instructs reader to look up model card for prior 65B Llama1 version,partial,https://huggingface.co/datasets/jondurbin/airoboros-gpt4-1.4.1,Datasheet for RL data only,closed,,No package found,closed,,No API found,/projects/airoboros.yaml,5.5
-https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md,"From the readme, ""ChatGLM-6B uses technology similar to ChatGPT, optimized for Chinese QA and dialogue. The model is trained for about 1T tokens of Chinese and English corpus, supplemented by supervised fine-tuning, feedback bootstrap, and reinforcement learning with human feedback. With only about 6.2 billion parameters, the model is able to generate answers that are in line with human preference.""",GLM (own),Unspecified,Apache 2.0,THUDM,https://github.com/THUDM,Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University,partial,https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md#deployment,Some code made available on Github,partial,http://doi.org/10.18653/v1/2022.acl-long.26,"Training data not centrally made available, but described in 2022 ACL paper, appears to be mostly public datasets",open,https://huggingface.co/THUDM/chatglm-6b/tree/main,Model made available through HuggingFace,closed,,"docs mention ""supervised fine-tuning, feedback bootstrap, and reinforcement learning with human feedback"", but none of the datasets used are clearly specified.",closed,,No weights or checkpoints corresponding to the delta of the LLM vs RLHF provided,open,https://github.com/THUDM/ChatGLM-6B/blob/main/LICENSE,Apache 2.0,partial,https://github.com/THUDM/ChatGLM-6B/blob/main/ptuning/README_en.md,"Some documentation available, but a lot of code is not commented or explained.",partial,,Full details architecture not specified in a single place,closed,,,partial,https://aclanthology.org/2022.acl-long.26/,"ACL 2022 paper describes the training of the GLM base model, but the RLHF portion is more recent (there is also a related ICLR paper for a newer generation https://openreview.net/forum?id=-Aw0rrrPUF)",closed,https://huggingface.co/THUDM/chatglm-6b,No modelcard; the HuggingFace modelcard spot is used just as the homepage for the model.,closed,,No datasheet,closed,,No package,open,https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md#api-deployment,API provided through fastapi uvicorn,/projects/ChatGLM-6B.yaml,5.5
+https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md,"From the readme, ""ChatGLM-6B uses technology similar to ChatGPT, optimized for Chinese QA and dialogue. The model is trained for about 1T tokens of Chinese and English corpus, supplemented by supervised fine-tuning, feedback bootstrap, and reinforcement learning with human feedback. With only about 6.2 billion parameters, the model is able to generate answers that are in line with human preference.""",GLM (own),Unspecified,Apache 2.0,THUDM,https://github.com/THUDM,Knowledge Engineering Group (KEG) & Data Mining at Tsinghua University,partial,https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md#deployment,Some code made available on Github,partial,http://doi.org/10.18653/v1/2022.acl-long.26,"Training data not centrally made available, but described in 2022 ACL paper, appears to be mostly public datasets",open,https://huggingface.co/THUDM/chatglm-6b/tree/main,Model made available through HuggingFace,closed,,"docs mention ""supervised fine-tuning, feedback bootstrap, and reinforcement learning with human feedback"", but none of the datasets used are clearly specified.",closed,,No weights or checkpoints corresponding to the delta of the LLM vs RLHF provided,open,https://github.com/THUDM/ChatGLM-6B/blob/main/LICENSE,Apache 2.0,partial,https://github.com/THUDM/ChatGLM-6B/blob/main/ptuning/README_en.md,"Some documentation available, but a lot of code is not commented or explained.",partial,,Full details of architecture not specified in a single place,closed,,,partial,https://aclanthology.org/2022.acl-long.26/,"ACL 2022 paper describes the training of the GLM base model, but the RLHF portion is more recent (there is also a related ICLR paper for a newer generation https://openreview.net/forum?id=-Aw0rrrPUF)",closed,https://huggingface.co/THUDM/chatglm-6b,No modelcard; the HuggingFace modelcard spot is used just as the homepage for the model.,closed,,No datasheet,closed,,No package,open,https://github.com/THUDM/ChatGLM-6B/blob/main/README_en.md#api-deployment,API provided through fastapi uvicorn,/projects/ChatGLM-6B.yaml,5.5
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1,,unclear,unspecified,Apache 2.0,Mistral AI,https://mistral.ai/,,partial,https://github.com/mistralai/mistral-src,repository provides 'minimal code to run our 7B model',closed,,No information provided on pretraining data,open,https://github.com/mistralai/mistral-src#download-the-model,Base LLM model made available for download,closed,https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1,No information provided except that instruction tuning is done using an unspecified 'variety of publicly available conversation datasets',partial,https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1/tree/main,Instruct version of the model made available but no information on fine-tuning procedure provided,open,https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1/blob/main/README.md,Apache 2.0,closed,https://github.com/mistralai/mistral-src,the little code that is available is uncommented and undocumented,partial,https://github.com/mistralai/mistral-src,Some information on architecture provided in github repo,partial,http://arxiv.org/abs/2310.06825,"Preprint rehashes marketing blurbs also given in blog and provides no details about pretraining datasets, instruction tuning datasets, or fine-tuning process, hence partial.",closed,,No peer reviewed paper available,closed,,"No model card available, HuggingFace modelcard just points to a corporate blog post",closed,,No datasheet available,partial,https://docs.mistral.ai/quickstart/,Docker image shared on github,open,https://docs.mistral.ai/api,API specification provided by vLLM,/projects/mistral-7B.yaml,5.5
https://github.com/nlpxucan/WizardLM,Empowering Large Pre-Trained Language Models to Follow Complex Instructions,LLaMA-7B,Evol-Instruct (synthetic),CC-BY-NC-4.0,Microsoft & Peking University,https://github.com/nlpxucan,,partial,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM,Fast-evolving repository contains WizardLM code,partial,,"Based on LLaMA, which is claimed to be public but nowhere exactly documented.",closed,,"Based on LLaMA weights, which are not openly available though a leaked version is in wide circulation.",open,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM#training-data,The Evol-Instruct dataset contains 70k instruction-following sequences generated from Evol-Instruct,partial,https://huggingface.co/WizardLM/WizardLM-7B-V1.0/tree/main,Model weights offered as a delta to LLaMA,partial,https://github.com/nlpxucan/WizardLM/blob/main/WizardLM/MODEL_DIFF_LICENSE,"Restricted for academic research purposes only. Code and Model diff release under CC-BY-NC-4.0, software code under Apache 2.0",partial,https://github.com/nlpxucan/WizardLM/tree/main/WizardLM,"Code is only partially documented, not clearly versioned, and appears to be in flux.",open,https://arxiv.org/abs/2304.12244,Architecture described in preprint and partly accessible in code repository,open,https://arxiv.org/abs/2304.12244,Preprint describes method for creating large amounts of LLM-based synthetic RLHF data and fine-tuning WizardLM based on it,closed,,No peer-reviewed paper or data audit found,closed,https://huggingface.co/WizardLM/WizardLM-7B-V1.0,Model card is only a placeholder and generates an error (missing yaml metadata),closed,https://huggingface.co/datasets/WizardLM/WizardLM_evol_instruct_V2_196k,Dataset card for Evol-Instruct generates an error,closed,,No package available,closed,,No API available,/projects/wizardlm-7B-V1.yaml,5.5
https://qwenlm.github.io/blog/qwen1.5/,"This is based on the 72B version, the largest of 8 available model sizes.",QwenLM,Unspecified,Qianwen License,Alibaba Cloud,,Qwen (abbr. for Tongyi Qianwen 通义千问) refers to the large language model family built by Alibaba Cloud,partial,https://github.com/QwenLM/Qwen1.5/,Repository provides sparse source code and some examples for SFT,closed,,Pretraining data not specified or documented.,open,https://huggingface.co/Qwen/Qwen1.5-72B/tree/main,Also available in smaller model sizes,closed,https://qwen.readthedocs.io/en/latest/training/SFT/llama_factory.html,Data not specified or documented. Some example code in repo provides directions but no details.,open,https://huggingface.co/Qwen/Qwen1.5-72B-Chat/tree/main,Also available in smaller model sizes,closed,,Qianwen License,partial,,Repository is fairly well-documented.,partial,,No clear description of architecture found.,closed,,No preprint found.,closed,,No peer-reviewed paper found.,closed,,"Model card on HF only serves as a pointer to the model, no actual info provided.",closed,,No datasheet.,partial,,No specific package provided but integrates well with many widely used packages,open,,Available through various APIs,/projects/qwen-1.5-chat.yaml,5.0
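
Each row above ends in a cumulative openness score (6.0, 5.5, 5.0, ...). As a reading aid, here is a minimal Python sketch of how such a score can be reproduced from the 14 open/partial/closed judgments in a row, assuming the weighting open = 1, partial = 0.5, closed = 0; this formula and the cell layout are inferred from the rows shown here, not taken from the repository's own tooling.

```python
# Minimal sketch (not the repository's own code) for recomputing the
# trailing openness score in each df.csv row. Assumption inferred from
# the rows above: 14 cells hold a verdict "open"/"partial"/"closed",
# weighted 1 / 0.5 / 0 respectively.
import csv

WEIGHTS = {"open": 1.0, "partial": 0.5, "closed": 0.0}

def openness_score(row: list[str]) -> float:
    # Count every cell whose entire value is a verdict keyword; link and
    # note cells never consist of a bare verdict, so this is safe here.
    return sum(WEIGHTS[cell] for cell in row if cell in WEIGHTS)

with open("docs/df.csv", newline="") as f:
    for row in csv.reader(f):  # csv.reader handles the quoted commas
        if row:
            print(row[0], openness_score(row))
```

As a check against the data above: the airoboros row carries 2 open and 7 partial verdicts, i.e. 2 × 1 + 7 × 0.5 = 5.5, which matches its recorded score.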
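
Separately, one judgment above notes that ChatGLM-6B serves its model as an API "through fastapi uvicorn". For readers unfamiliar with that deployment style, this is a minimal self-contained sketch of such an endpoint; the route name, request shape, and the placeholder generate() call are illustrative assumptions, not the ChatGLM-6B repository's actual server code.

```python
# Illustrative FastAPI + uvicorn serving sketch; not ChatGLM-6B's own code.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    prompt: str

def generate(prompt: str) -> str:
    # Placeholder standing in for a real model call such as model.chat(...)
    return f"(model response to: {prompt!r})"

@app.post("/chat")
def chat(req: ChatRequest) -> dict:
    return {"response": generate(req.prompt)}

# Run with: uvicorn server:app --host 0.0.0.0 --port 8000
```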
