Let's cut to the chase: a lot of people contribute to this file, and it's community-maintained on a best-effort basis. There's no documentation for it other than the code, nor does there seem to be any prevailing convention in how entries are structured.
The issue is the multi-dimensionality and permutations of the top-level key, which has resulted in 4+ conventions, making the file hard to work with programmatically (a quick illustration follows the list below) and confusing to newcomers such as myself. Additionally, I think we could consolidate more details into this dataset and perhaps restructure it completely after the first steps have been taken.
The top-level key variants are:
a) `{model_name}`: the original, I take it, e.g. `gpt-4`
b) `{litellm_provider}/{model_name}`: e.g. `azure/gpt-4`
c) `{model_variant}/{model_name}`: e.g. `standard/1024-x-1792/dall-e-3` and `hd/1792-x-1024/dall-e-3`
d) `{litellm_provider}/{model_variant}/{model_name}`: e.g. `azure/standard/1024-x-1024/dall-e-3`
e) `{model_name}` with mode encoded in `litellm_provider`, e.g. `gpt-3.5-turbo-instruct` has `litellm_provider: text-completion-openai` and `mode: completion`
f) `{mode}_{model_name}`: e.g. `text-bison` with `litellm_provider: vertex_ai-text-models` and `mode: completion`. Here maybe `text-bison` is the API name, but why categorise the provider by mode as well? There's also `palm/text-bison-safety-recitation-off` with `litellm_provider: palm`, which follows a different convention yet again.
g) From Bedrock we have `cohere.embed-multilingual-v3` and `bedrock/*/6-month-commitment/cohere.command-light-text-v14`. Where did the `bedrock/` prefix go on the first one, or was it left like that because it's the API name?
h) And this: `cloudflare/@cf/meta/llama-2-7b-chat-int8` with `litellm_provider: cloudflare`. I get it, but why the sudden `@`? The API name again?
j) And then we have Cohere's missing models. The Cohere chat API isn't included, maybe because it's not supported? But the thing is, the completion and chat APIs use the same model names, so here we'd end up with `{litellm_provider}/{mode}/{model_name}`, yet another key scheme.
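Just to make the pain concrete: the file name below is a placeholder for whatever the repo actually calls it, and this is only a throwaway sketch, but it shows how the mixed key schemes surface the moment you touch the keys programmatically:

```python
import json
from collections import Counter

# Placeholder path: substitute the actual cost/context-window file from the repo.
with open("model_prices_and_context_window.json") as f:
    data = json.load(f)

# A consistent key scheme would put (nearly) all keys at the same depth;
# today they span everything from one to four-plus '/'-separated segments.
depth_counts = Counter(key.count("/") + 1 for key in data)
print(depth_counts)

# Worse, the first segment is sometimes a provider, sometimes a variant or mode,
# so you can't tell what a key means without inspecting the entry itself.
first_segments = Counter(key.split("/")[0] for key in data if "/" in key)
print(first_segments.most_common(10))
```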
And there's more. I think the goal has been to key by API name, but that breaks with Cohere at least. The other thing is that some entries encode model attributes in the `litellm_provider` field, which is why OpenAI models are split over at least two providers. Since there is a single provider, the data should reflect that.
The variety and dimensionality of the data has reached the limits of JSON, I think, at least with a single key. Considering this file is used in various ways, a step-by-step approach would be best to untangle this mess. How about we nail down an interim format and document it, and then consider what the data actually is?
E.g., it contains almost all the details about embedding models except output dimensions, which are a direct cost factor driving vector store memory consumption and matter when sizing environments and comparing products for a use case. So why not include dimensions? And why stop at the cost domain? It would be useful to have various model details in a structured format governed by a schema. E.g., there are many deprecated models in there; why not have a `status` field or some such to say `deprecated`? And a date to go with it?
Start adding data and keeping it clean, and this ball of mud could turn into a really valuable lookup/reference resource, a source of truth. But as said, this all takes time, so for now I'd be happy if we could agree on how to encode keys and providers. Data format and databases can come a bit later, along with a process for keeping it all up to date.
So, I think the current data should be discoverable, sortable in a meaningful way, updated in a consistent manner from here on out, and cleaned up when time permits. And of course it should be programmatically accessible, and as long as it's a file, it should be in a format that supports comments and schemas. Being able to deserialize it to Pydantic models should be a given.
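To illustrate the deserialization point, and to pick up the dimensions/status/deprecation-date idea from a couple of paragraphs back, here's a minimal Pydantic sketch. The field names are illustrative assumptions, not a finished proposal:

```python
from datetime import date
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class Status(str, Enum):
    active = "active"
    deprecated = "deprecated"


class ModelEntry(BaseModel):
    # Fields along the lines of what the file holds today (names assumed).
    litellm_provider: str
    mode: Optional[str] = None
    max_tokens: Optional[int] = None
    input_cost_per_token: Optional[float] = None
    output_cost_per_token: Optional[float] = None
    # Proposed additions from this post.
    model_api_name: Optional[str] = None   # the provider's own name for the model
    dimensions: Optional[int] = None       # embedding output dimensions
    status: Status = Status.active
    deprecation_date: Optional[date] = None


# Usage sketch: parse the whole file into keyed entries.
# entries = {key: ModelEntry(**value) for key, value in raw_json.items()}
```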
I don't know what the best way to key this is, but it's presumably some permutation of the elements above. A reasonable taxonomy to me would be, e.g., `{litellm_provider}/{mode}/{model_name}/{model_variant}` (sketched in code after the examples), yielding from the previous examples:
`hd/1792-x-1024/dall-e-3` -> `openai/img/gen/dall-e-3/1792-x-1024` (there's no `hd` in the API, and we can generate, edit, or create variants), or
`cohere.embed-multilingual-v3` -> `bedrock/cohere/embedding/cohere.embed-multilingual-v3`, plus `cohere/completion/command` and `cohere/chat/command`.
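To make the taxonomy concrete, here's a throwaway helper for composing such keys. The mode names (`img/gen`, `chat`, `completion`, ...) are my assumptions, and the Bedrock example above also nests the upstream vendor, which this sketch doesn't try to settle:

```python
from typing import Optional


def build_key(litellm_provider: str, mode: str, model_name: str,
              model_variant: Optional[str] = None) -> str:
    """Compose a {litellm_provider}/{mode}/{model_name}/{model_variant} key."""
    parts = [litellm_provider, mode, model_name]
    if model_variant:
        parts.append(model_variant)
    return "/".join(parts)


build_key("openai", "img/gen", "dall-e-3", "1792-x-1024")
# -> 'openai/img/gen/dall-e-3/1792-x-1024'
build_key("cohere", "chat", "command")
# -> 'cohere/chat/command'
build_key("cohere", "completion", "command")
# -> 'cohere/completion/command'
```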
So, in summary: stop using API names for the index key; we can add a `model_api_name` field or some such to retain that information.
And let's add dimensions and whatever useful details can be scraped.
Deprecation dates could drive automatic warnings to users, not just as log messages but more prominently. And let's start thinking about what this data could be tomorrow, and what opportunities we're missing out on.
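As a rough idea of what a more prominent warning could look like on the consumer side (not an existing feature, just a sketch building on the hypothetical `ModelEntry` fields above):

```python
import warnings
from datetime import date


def warn_if_deprecated(key: str, entry: "ModelEntry") -> None:
    """Raise a visible DeprecationWarning instead of burying the fact in logs."""
    if entry.status == Status.deprecated or (
        entry.deprecation_date and entry.deprecation_date <= date.today()
    ):
        warnings.warn(
            f"{key} is deprecated (since {entry.deprecation_date}); "
            "consider switching models.",
            DeprecationWarning,
            stacklevel=2,
        )
```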
I think in the long term the whole thing could be kept up to date by LLMs equipped with the right tools. E.g., give `gpt-4-vision-preview` a sandboxed Puppeteer or Playwright instance over a websocket, and it could navigate complex sites, interact with JS, take screenshots and analyze them if needed, and much more. Prompt it right (they're trained to be wary of free access to anything, poor things), let it run test suites, use git and GitHub, code in peace, and CI the whole thing, and we'd never have to worry about this file again. Might as well ask the model to keep an eye out for news while it's perusing sites and write survey reports back home. Why not?