
Missing model card / data sheet with info on pretraining and RLHF datasets #9

Open
mdingemanse opened this issue Sep 28, 2023 · 4 comments

Comments

@mdingemanse

At opening-up-chatgpt.github.io we're documenting data sources and degrees of openness along several dimensions for instruction-tuned LLMs. I am looking for information about (1) the pretraining dataset and (2) the RLHF datasets, but have not found any details. The HuggingFace model card says

For full details of this model please read our release blog post

The release blog post provides no information on this at present.

@aakosm

aakosm commented Sep 28, 2023

Information on the language composition of the pretraining dataset would also be welcome, as there is no mention of the model's multilingual capabilities in the linked blog post.

@149189

149189 commented Sep 29, 2023

I would like to work on this project!

@AlexWortega

Upvote thread

@mdingemanse
Author

FWIW Mistral currently sits in the bottom 5 of the live tracker of LLM openness:

[screenshot: live tracker of LLM openness]

diegolascasas added a commit that referenced this issue Dec 12, 2023
* Add MoE and Pipelining support

* Update readme

* Update requirements

* Add faster loading 

* Make sliding window optional and add rope_theta with smart default

---------

Co-authored-by: devendrachaplot <[email protected]>
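For reference, the commit's "make sliding window optional and add rope_theta with smart default" item could look roughly like the config sketch below. This is a minimal sketch, not the repo's actual code: only the `sliding_window` and `rope_theta` names come from the commit message; the dataclass name `ModelArgs` and all other fields and values are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelArgs:
    # Illustrative transformer dimensions (not taken from the commit).
    dim: int = 4096
    n_layers: int = 32
    n_heads: int = 32

    # Per the commit: sliding-window attention is optional.
    # None disables the window, i.e. full-context attention.
    sliding_window: Optional[int] = None

    # Per the commit: rope_theta gets a "smart default". The value
    # 10000.0 is the common RoPE base and is assumed here.
    rope_theta: float = 10000.0
```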