Model characteristics dataset #3
Comments
I can work on this issue. Is there anyone else who would like to work with me? Is there any information already available so that I can better understand what is expected here?
A first thing would be to define a file format, to store information about models. We can start simple with that (example only):
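A minimal sketch of what such a file could look like, assuming JSON with illustrative field names (parameter counts in billions; the exact format and fields are still to be decided):

```json
{
  "models": [
    {
      "provider": "mistralai",
      "name": "open-mixtral-8x7b",
      "total_parameters": 46.7,
      "active_parameters": 12.9,
      "source": "official model announcement"
    },
    {
      "provider": "openai",
      "name": "gpt-4",
      "total_parameters": null,
      "active_parameters": null,
      "source": "unknown, to be estimated (see issue #1)"
    }
  ]
}
```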
We need to take into account that we will probably include other data in the future, like min/max parameter values for proprietary models. We should also have a clear scope of how we can collect this data automatically. Probably someone else on the internet has already worked on that?
Some clarifications after a quick meeting with @AndreaLeylavergne. We will start with the main LLM providers (aligned with what we have already implemented in the package), so we will focus on OpenAI, Mistral AI and Anthropic first. We need to report LLMs in the following spreadsheet: model repository. Column description:
The model name should be the same as the one defined in the provider's API documentation:
To find popular models and some assessments of their architecture, we can use this database.
Description
To compute the impacts of a query, we need some characteristics of the model that was used. In particular, for LLMs we need the total parameter count.
Solution
A CSV or JSON file to store all known models and metadata such as the total parameter count.
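As an illustration of how the package could consume such a file, here is a minimal Python sketch assuming the CSV variant; the file name, column names, and function are hypothetical, not the package's actual API:

```python
import csv
from pathlib import Path


def load_models(path: str = "models.csv") -> dict:
    """Load model metadata keyed by (provider, model name)."""
    models = {}
    with Path(path).open(newline="") as f:
        for row in csv.DictReader(f):
            models[(row["provider"], row["name"])] = {
                "total_parameters": float(row["total_parameters"]),
                # Fall back to the total count when no active count is given
                "active_parameters": float(row["active_parameters"] or row["total_parameters"]),
                "source": row.get("source", ""),
            }
    return models


# Usage (model names must match the provider API documentation):
# models = load_models()
# params = models[("mistralai", "open-mixtral-8x7b")]["active_parameters"]
```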
Considerations
Proprietary models
In many cases, we don't know the underlying architecture of proprietary models, so we will need to guesstimate it (see issue #1 for OpenAI). The estimation can be based on the performance achieved by these models on various leaderboards compared to open-weight models. It is crucial to keep the source of each assessment because it strongly influences the computed impacts.
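As a purely illustrative example of this approach: if a proprietary model scores close to an open-weight ~70B model on the benchmarks we choose, we could record ~70B as a guesstimated parameter count, along with the benchmark, comparison model and date as the source of that assessment.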
Total parameters vs active parameters
In the case of mixture-of-experts models, we can define the active parameter count as the sum of all parameters actually used to run the computation (example with Mixtral).
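For instance, Mixtral 8x7B is reported to have roughly 46.7B total parameters, but since only 2 of its 8 experts are used per token, only about 12.9B parameters are active for each forward pass; the active count is the relevant figure for the compute cost of a query.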