Commit 1eec44e

Apply automatic changes
mdingemanse authored and github-actions[bot] committed Jun 17, 2024
1 parent ad9bd7f commit 1eec44e
Showing 3 changed files with 8 additions and 8 deletions.
docs/df.csv: 2 changes (1 addition & 1 deletion)
@@ -34,9 +34,9 @@ https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1,,Mistral,Unspecified
https://huggingface.co/stabilityai/StableBeluga2,,LLaMA2,Orca-style (synthetic),Stable Beluga Non-Commercial Community License,Stability AI,https://stability.ai/,,closed,,"No repository with open code related to training, fine-tuning or evaluation found",closed,,"Pretraining data nowhere disclosed or documented, and described only in the vaguest terms in a corporate preprint released by Meta",partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,Download only after requesting access; requires signing a consent form,closed,,"Says: 'Stable Beluga 2 is trained on our internal Orca-style dataset', 'created synthetically using high-quality instructions' from COT Submix Original, NIV2 Submix Original, FLAN 2021 Submix Original, T0 Submix Original.",open,https://huggingface.co/stabilityai/StableBeluga2/tree/main,Instruction-tuned model weights available,partial,https://huggingface.co/stabilityai/StableBeluga2/blob/main/LICENSE.txt,"Usage requires signing StabilityAI's bespoke 'Stable Beluga Non-Commercial Community License', not an OSI recognised open license",closed,,Code on HuggingFace only covers minimal examples; no documentation available.,partial,https://huggingface.co/stabilityai/StableBeluga2#model-details,Says 'Stable Beluga 2 is a Llama2 70B model finetuned on an Orca style Dataset'.,partial,https://huggingface.co/papers/2306.02707,"Preprint from Microsoft describes Orca method of finetuning using GPT4-derived synthetic data, but no details of this particular architecture",closed,,No peer-reviewed paper found,partial,https://huggingface.co/stabilityai/StableBeluga2,"There is a model card, but it provides only a minimum of detail",closed,,Datasheet not provided.,closed,,Package not provided,partial,,model too large to run on HuggingFace free inference API,/projects/StableBeluga2.yaml,4.0
https://crfm.stanford.edu/2023/03/13/alpaca.html,project_notes,LLaMA,Self-Instruct (synthetic),LLaMA license agreement,Stanford University CRFM,https://crfm.stanford.edu/,,open,,,closed,,"Based on LLaMA, whose pretraining data is nowhere disclosed or documented.",partial,,"LLaMA based, copyright status unclear",partial,https://github.com/tatsu-lab/stanford_alpaca#data-release,alpaca_data.json contains 52K instruction-following data we used for fine-tuning the Alpaca model.,partial,https://github.com/tatsu-lab/stanford_alpaca#data-release,LLaMA based,closed,https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform,Pegged to LLaMA licence agreement,partial,https://github.com/tatsu-lab/stanford_alpaca,Insofar as code is made available it is fairly well documented,open,https://github.com/tatsu-lab/stanford_alpaca#fine-tuning,Fair bit of documentation available on github repository,closed,,No preprint found; uses the release-by-blogpost playbook,closed,,No peer-reviewed work found.,closed,,No model card found,closed,,No data sheet found,closed,,,closed,,,/projects/alpaca.yaml,4.0
https://huggingface.co/tiiuae/falcon-180B-chat,"Falcon-180B-Chat is a 180B-parameter causal decoder-only model built by TII based on Falcon-180B and finetuned on a mixture of Ultrachat, Platypus and Airoboros.",Falcon 180B,"OpenPlatypus, Ultrachat, Airoboros (synthetic)",FALCON 180B TII LICENSE VERSION 1.0,Technology Innovation Institute,https://falconllm.tii.ae,,closed,https://huggingface.co/tiiuae/falcon-180B-chat,No source code shared anywhere. The over 200 GitHub repositories of TII appear to include no LLM or Falcon related code.,partial,https://huggingface.co/datasets/tiiuae/falcon-refinedweb,"From the documentation 'The key ingredient for the high quality of the Falcon models is their training data, predominantly based (>80%) on RefinedWeb — a novel massive web dataset based on CommonCrawl' (https://huggingface.co/blog/falcon). However, only a small sample is made available.",partial,https://huggingface.co/tiiuae/falcon-180B,"requires signing up and accepting ""acceptable use policy""",partial,https://github.com/project-baize/baize-chatbot,"No details provided beyond ""finetuned on a mixture of Ultrachat, Platypus and Airoboros"".",partial,https://huggingface.co/tiiuae/falcon-180B-chat/tree/main,"No RL weights or checkpoints made available; fine-tuned 'chat' model only available after signing ""acceptable use policy"".",closed,https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/LICENSE.txt,"Released under Falcon 180B TII license (not OSI approved) and a separate ""acceptable use policy"".",closed,,"No code available, so no code documented.",partial,https://huggingface.co/tiiuae/falcon-180B-chat#model-architecture-and-objective,"Architecture described as ""causal decoder-only"" and ""optimized for inference"", but very few details available.",partial,https://arxiv.org/abs/2306.01116,"Preprint covers the creation and curation of RefinedWeb dataset, but not other aspects of the model, hence partial.",closed,,No peer-reviewed paper known.,partial,https://huggingface.co/tiiuae/falcon-180B-chat,"Model card on HuggingFace is mostly used to advertise the model, not to document its training and evaluation details.",closed,,There is no datasheet available.,closed,,There is no package.,closed,https://huggingface.co/tiiuae/falcon-180B-chat,"There is no API, and HuggingFace inference API is disabled for this model.",/projects/Falcon-180B-chat.yaml,3.5
+https://ai.google.dev/gemma/docs,Model weights and developer tools,Gemma,Unspecified,,Google DeepMind,https://ai.google.dev/gemma/docs,,partial,https://github.com/google-deepmind/gemma,No pre-training or instruction-tuning code made available. Some developer tools available.,closed,,No details provided on pre-training data.,partial,https://www.kaggle.com/models/keras/gemma/frameworks/keras/variations/gemma_7b_en,"Base model weights shared via Kaggle, requires privacy-defying access request.",closed,,"Documentation says 'These versions of the model are trained with human language interactions and can respond to conversational input, similar to a chat bot.' ",partial,,"Instruction-tuned model weights shared via Kaggle, requires privacy-defying access request",closed,https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335?pli=1,"Bespoke Gemma Community License Agreement and restrictive Terms of Use, neither open in the sense of OSI. Only inference code shared under Apache 2.0.",closed,,No pretraining or finetuning code found. No code documentation except for deployment of the open weights.,partial,https://www.kaggle.com/models/google/gemma,Architecture described in very general terms in model card,partial,https://arxiv.org/abs/2403.08295,"Preprint released Apr 2024 details architecture and evaluation, but provides no information on pre-training, instruction tuning and SFT datasets",closed,,No paper found,open,https://www.kaggle.com/models/google/gemma,Model card on Kaggle provides some detail on model internals and evaluation,closed,,"No datasheet found, pre-training and instruction-tuning data nowhere specified.",closed,,"No package provided, access is gated through Kaggle, Vertex Model Garden, Google Cloud",closed,,"No API provided, access gated through Kaggle, Vertex Model Garden, Google Cloud",/projects/gemma-instruct.yaml,3.5
https://www.microsoft.com/en-us/research/project/orca/,"This file applies to Orca 2 7B and 13B, both fine-tunes of corresponding Llama2 base models.",LLaMA2,"FLAN, Math, undisclosed (synthetic)","Microsoft Research License, Llama Community License",Microsoft Research,https://www.microsoft.com/en-us/research/project/orca/,,closed,,"No source code made available anywhere for data curation, training, fine-tuning, or evaluation",closed,,"Llama pretraining data nowhere disclosed or documented, and described only in the vaguest terms in a corporate preprint released by Meta",partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,Download only after requesting access; requires signing a consent form,closed,,"Instruction-tuning and explanation-tuning data described in preprint, but not disclosed or made available",open,https://huggingface.co/microsoft/Orca-2-7b,7B and 13B finetunes of Llama2 made available through HuggingFace,closed,https://github.com/facebookresearch/llama/blob/main/LICENSE,"Licensed under Microsoft Research License and Meta's bespoke community license, neither OSI recognised",closed,,Code only covers minimal examples; no documentation available.,partial,https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/,"Architecture sketched in corporate preprints, though many details missing.",partial,https://arxiv.org/pdf/2311.11045.pdf,"Corporate preprint from Microsoft Research has some detail on instruction-tuning, explanation-tuning and evaluation",closed,,No peer-reviewed paper found,partial,https://huggingface.co/microsoft/Orca-2-7b,Model card serves as landing page more than careful model documentation,closed,,Datasheet not provided.,closed,,Package not provided,partial,,Model too large to load on HuggingFace inference library,/projects/orca_2.yaml,3.5
https://huggingface.co/CohereForAI/c4ai-command-r-v01,,,Aya Collection,CC-BY-NC and C4AI acceptable use policy,Cohere AI,https://cohere.com,,closed,,"No codebase available to study or adjust model architecture, training, or inner workings.",closed,https://docs.cohere.com/docs/data-statement,"No documentation, listing or audit of pre-training data available. Cohere itself identifies it as coheretext-filtered and gives the size as 200GB.",closed,,No checkpoint or model prior to SFT and instruction-tuning made available,open,https://huggingface.co/collections/CohereForAI/aya-datasets-660415741bd4852f01c81c77,Aya Collection (Aya Open Science initiative) is a multilingual collection of 513 million instances of prompts and completions including instruction-style templates.,open,https://huggingface.co/CohereForAI/c4ai-command-r-v01/tree/main,Fine-tuned model weights made available for download,partial,https://docs.cohere.com/docs/c4ai-acceptable-use-policy,Licensed under CC-BY-NC and requires agreeing to C4AI acceptable use policy,closed,,"No source code available, so no documentation of code.",closed,,Architecture only sparsely documented.,closed,,No preprint appears to be made available at this time.,closed,,No paper known to document the Cohere Command R model or architecture.,partial,https://huggingface.co/CohereForAI/c4ai-command-r-v01-4bit,"Model card on HF documents some aspects but provides no information on training data or instruction-tuning methods",closed,,Datasheet not available.,closed,,No separate package available.,closed,,API access available only when signing up.,/projects/command-r.yaml,3.0
-https://ai.google.dev/gemma/docs,Model weights and developer tools,Gemma,Unspecified,,Google DeepMind,https://ai.google.dev/gemma/docs,,partial,https://github.com/google-deepmind/gemma,No pre-training or instruction-tuning code made available. Some developer tools available.,closed,,No details provided on pre-training data.,partial,https://www.kaggle.com/models/keras/gemma/frameworks/keras/variations/gemma_7b_en,"Base model weights shared via Kaggle, requires privacy-defying access request.",closed,,"Documentation says 'These versions of the model are trained with human language interactions and can respond to conversational input, similar to a chat bot.' ",partial,,"Instruction-tuned model weights shared via Kaggle, requires privacy-defying access request",closed,https://console.cloud.google.com/vertex-ai/publishers/google/model-garden/335?pli=1,"Bespoke Gemma Community License Agreement and restrictive Terms of Use, neither open in the sense of OSI. Only inference code shared under Apache 2.0.",closed,,No pretraining or finetuning code found. No code documentation except for deployment of the open weights.,partial,https://www.kaggle.com/models/google/gemma,Architecture described in very general terms in model card,closed,,No preprint found,closed,,No paper found,open,https://www.kaggle.com/models/google/gemma,Model card on Kaggle provides some detail on model internals and evaluation,closed,,"No datasheet found, pre-training and instruction-tuning data nowhere specified.",closed,,"No package provided, access is gated through Kaggle, Vertex Model Garden, Google Cloud",closed,,"No API provided, access gated through Kaggle, Vertex Model Garden, Google Cloud",/projects/gemma-instruct.yaml,3.0
https://ai.meta.com/resources/models-and-libraries/llama/,,LLaMA2,"Meta, StackExchange, Anthropic",Unclear,Facebook Research,https://github.com/facebookresearch,,closed,https://github.com/facebookresearch/llama/tree/main,"Repository only offers 'a minimal example to load Llama 2 models and run inference'; no training, fine-tuning, evaluation code made available",closed,,"Data nowhere disclosed or documented, and described only in the vaguest terms in a corporate preprint released by Meta",partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,Download only after requesting access; requires signing a consent form,closed,,RLHF data including 1 million Meta-specific tuning prompts not made available (even as it incorporates some open RLHF datasets),partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,Download only after requesting access; requires signing a consent form,closed,https://github.com/facebookresearch/llama/blob/main/LICENSE,"Usage requires signing Meta's bespoke 'community license', not an OSI recognised open license",closed,,Code only covers minimal examples; no documentation available.,partial,https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/,"Architecture sketched in preprint, though many details missing.",partial,https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/,"Corporate preprint provides quite some detail on pretraining, RLHF, and safety measures, but none on datasets.",closed,,No peer-reviewed paper found,partial,https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md,"There is a model card, but it provides the absolute minimum of detail, and none whatsoever on training data.",closed,,Datasheet not provided.,closed,,Package not provided,partial,,API only available behind a privacy-defying signup form,/projects/llama-2-chat.yaml,3.0
https://huggingface.co/Nanbeige/Nanbeige2-8B-Chat,Comes in 8B and 16B versions,Unknown,Unknown,Apache 2.0 and bespoke community license,Nanbeige LLM lab,https://huggingface.co/Nanbeige,,open,https://github.com/Nanbeige/Nanbeige,"github repo contains sparse but clear code for training, tuning, and inference",closed,,No information on pre-training datasets except a claim of 4.5T tokens. Request for information on HF community was closed without comment.,closed,,Base model not shared,closed,https://huggingface.co/Nanbeige/Nanbeige2-8B-Chat/discussions/2#6621e15a4d17641cf788cbd5,"No information on finetuning and DPO datasets. Some information provided on request (see link), but official documentation not updated.",open,https://huggingface.co/Nanbeige/Nanbeige2-8B-Chat/tree/main,Model weights for finetuned model shared,partial,,Apache 2.0 but commercial use requires signup and an additional community license,closed,,No documentation of the codebase,closed,,Architecture not clearly specified,closed,,No preprint found,closed,,No paper found,closed,,No model card found,closed,,No datasheet found,closed,,No package found,partial,https://huggingface.co/spaces/Nanbeige/Nanbeige-Plus-Chat-v0.1,"No API, but HuggingFace space available",/projects/nanbeige-chat.yaml,3.0
https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6,,Meta Llama 3,"Meta, undocumented",Meta Llama 3 Community License,Facebook Research,https://github.com/facebookresearch,,closed,https://github.com/meta-llama/llama3,Repository only offers minimal code,closed,,"Data nowhere disclosed or documented, and described only in the vaguest terms in a release blog post",partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,Download only after requesting access; requires signing a consent form,closed,,No information available on instruction-tuning.,partial,https://ai.meta.com/resources/models-and-libraries/llama-downloads/,Download only after requesting access; requires signing a consent form,closed,https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct/tree/main,"Even inspecting the model requires signing Meta Llama 3's bespoke 'community license', not an OSI recognised open license",closed,,Code only covers minimal examples; no documentation available.,partial,https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/,Architecture sketched in glossy blog post.,closed,,No preprint or any other scientific documentation available.,closed,,No peer-reviewed paper available.,partial,https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md,"There is a model card, but it provides the absolute minimum of detail, and none whatsoever on training data.",closed,,Datasheet not provided.,closed,,Package not provided,partial,,API only available behind a privacy-defying signup form,/projects/llama-3-instruct.yaml,2.5
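
For readers who want to work with these rows directly: each record appears to consist of eight leading metadata columns (project URL, notes, base model, tuning data, license, organisation, organisation URL, plus one empty field), followed by fourteen repeating (status, link, notes) triples covering the openness dimensions, and closed by the project YAML path and a composite openness score. The sketch below reads docs/df.csv under those assumptions, which are inferred from the rows in this diff rather than from project documentation, and recomputes the score with the weighting open = 1, partial = 0.5, closed = 0.

```python
# Minimal sketch for reading docs/df.csv rows like the ones in this diff.
# Assumptions (inferred from the rows above, not from project docs):
#   - 8 leading metadata columns, then 14 (status, link, notes) triples,
#   - the last two columns are the project YAML path and a composite score,
#   - the score weights open = 1.0, partial = 0.5, closed = 0.0.
import csv

WEIGHTS = {"open": 1.0, "partial": 0.5, "closed": 0.0}

with open("docs/df.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f):
        if not row or not row[0].startswith("http"):
            continue  # skip a possible header row or malformed lines
        statuses = row[8:8 + 14 * 3:3]  # every third field is a status
        recomputed = sum(WEIGHTS.get(s.strip(), 0.0) for s in statuses)
        print(f"{row[-1]:>4}  (recomputed {recomputed:.1f})  {row[0]}")
```

On the rows shown above the recomputed value matches the final column exactly; Gemma's move from 3.0 to 3.5, for instance, tracks its preprint field changing from closed to partial in this commit.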
