Let's cut to the chase: a lot of people contribute to this file, and it's community-maintained on a best-effort basis. There's no documentation for it other than the code, nor does there seem to be any prevailing convention in how entries are structured.
The issue is the multi-dimensionality and permutations of the top-level key, which has resulted in 4+ conventions, making the file hard to work with programmatically (a quick illustration follows the list below) and confusing to newcomers such as myself. Additionally, I think we could consolidate more details into this dataset and perhaps restructure it completely after the first steps have been taken.
The top-level key variants are:
a) `{model_name}`: the original, I take it, e.g. `gpt-4`
b) `{litellm_provider}/{model_name}`: e.g. `azure/gpt-4`
c) `{model_variant}/{model_name}`: e.g. `standard/1024-x-1792/dall-e-3` and `hd/1792-x-1024/dall-e-3`
d) `{litellm_provider}/{model_variant}/{model_name}`: e.g. `azure/standard/1024-x-1024/dall-e-3`
e) `{model_name}` with mode encoded in `litellm_provider`, e.g. `gpt-3.5-turbo-instruct` has `litellm_provider: text-completion-openai` and `mode: completion`
f) `{mode}_{model_name}`: e.g. `text-bison` with `litellm_provider: vertex_ai-text-models` and `mode: completion`. Here maybe `text-bison` is the API name, but why categorise the provider by mode as well? There's also `palm/text-bison-safety-recitation-off` with `litellm_provider: palm`, which follows a different convention yet again.
g) From Bedrock we have `cohere.embed-multilingual-v3` and `bedrock/*/6-month-commitment/cohere.command-light-text-v14`. Where did the `bedrock/` prefix go on the first one, or was it left like that because it's the API name?
h) And this: `cloudflare/@cf/meta/llama-2-7b-chat-int8` with `litellm_provider: cloudflare`. I get it, but why the sudden `@`? The API name again?
j) And then we have Cohere's missing models. The Cohere chat API isn't included, maybe because it's not supported? But the thing is, the completion and chat APIs use the same model names, so here we'd end up with `{litellm_provider}/{mode}/{model_name}`, yet another key scheme.
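Just to make the pain concrete: the file name below is a placeholder for whatever the repo actually calls it, and this is only a throwaway sketch, but it shows how the mixed key schemes surface the moment you touch the keys programmatically:

```python
import json
from collections import Counter

# Placeholder path: substitute the actual cost/context-window file from the repo.
with open("model_prices_and_context_window.json") as f:
    data = json.load(f)

# A consistent key scheme would put (nearly) all keys at the same depth;
# today they span everything from one to four-plus '/'-separated segments.
depth_counts = Counter(key.count("/") + 1 for key in data)
print(depth_counts)

# Worse, the first segment is sometimes a provider, sometimes a variant or mode,
# so you can't tell what a key means without inspecting the entry itself.
first_segments = Counter(key.split("/")[0] for key in data if "/" in key)
print(first_segments.most_common(10))
```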
And there's more. I think the goal has been to key by API name, but that breaks with Cohere at least. The other thing is that some entries encode model attributes in the `litellm_provider` field, which is why OpenAI models are split over at least two providers. Since there is a single provider, the data should reflect that.
The variety and dimensionality of the data has reached the limits of JSON, I think, at least with a single key. Considering this file is used in various ways, a step-by-step approach would be best to untangle this mess. How about we nail down an interim format and document it, and then consider what the data actually is?
E.g., it contains almost all the details about embedding models except output dimensions, which are a direct cost factor driving vector store memory consumption and matter when sizing environments and comparing products for a use case. So why not include dimensions? And why stop at the cost domain? It would be useful to have various model details in a structured format governed by a schema. E.g., there are many deprecated models in there; why not have a `status` field or some such to say `deprecated`? And a date to go with it?
Start adding data and keeping it clean, and this ball of mud could turn into a really valuable lookup/reference resource, a source of truth. But as said, this all takes time, so for now I'd be happy if we could agree on how to encode keys and providers. Data format and databases can come a bit later, along with a process for keeping it all up to date.
So, I think the current data should be discoverable, sortable in a meaningful way, updated in a consistent manner from here on out, and cleaned up when time permits. And of course it should be programmatically accessible, and as long as it's a file, it should be in a format that supports comments and schemas. Being able to deserialize it to Pydantic models should be a given.
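To illustrate the deserialization point, and to pick up the dimensions/status/deprecation-date idea from a couple of paragraphs back, here's a minimal Pydantic sketch. The field names are illustrative assumptions, not a finished proposal:

```python
from datetime import date
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class Status(str, Enum):
    active = "active"
    deprecated = "deprecated"


class ModelEntry(BaseModel):
    # Fields along the lines of what the file holds today (names assumed).
    litellm_provider: str
    mode: Optional[str] = None
    max_tokens: Optional[int] = None
    input_cost_per_token: Optional[float] = None
    output_cost_per_token: Optional[float] = None
    # Proposed additions from this post.
    model_api_name: Optional[str] = None   # the provider's own name for the model
    dimensions: Optional[int] = None       # embedding output dimensions
    status: Status = Status.active
    deprecation_date: Optional[date] = None


# Usage sketch: parse the whole file into keyed entries.
# entries = {key: ModelEntry(**value) for key, value in raw_json.items()}
```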
I don't know what the best way to key this is, but it's presumably some permutation of the elements above. A reasonable taxonomy to me would be, e.g., `{litellm_provider}/{mode}/{model_name}/{model_variant}` (sketched in code after the examples), yielding from the previous examples:
`hd/1792-x-1024/dall-e-3` -> `openai/img/gen/dall-e-3/1792-x-1024` (there's no `hd` in the API, and we can generate, edit, or create variants), or
`cohere.embed-multilingual-v3` -> `bedrock/cohere/embedding/cohere.embed-multilingual-v3`, plus `cohere/completion/command` and `cohere/chat/command`.
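To make the taxonomy concrete, here's a throwaway helper for composing such keys. The mode names (`img/gen`, `chat`, `completion`, ...) are my assumptions, and the Bedrock example above also nests the upstream vendor, which this sketch doesn't try to settle:

```python
from typing import Optional


def build_key(litellm_provider: str, mode: str, model_name: str,
              model_variant: Optional[str] = None) -> str:
    """Compose a {litellm_provider}/{mode}/{model_name}/{model_variant} key."""
    parts = [litellm_provider, mode, model_name]
    if model_variant:
        parts.append(model_variant)
    return "/".join(parts)


build_key("openai", "img/gen", "dall-e-3", "1792-x-1024")
# -> 'openai/img/gen/dall-e-3/1792-x-1024'
build_key("cohere", "chat", "command")
# -> 'cohere/chat/command'
build_key("cohere", "completion", "command")
# -> 'cohere/completion/command'
```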
So, in summary: stop using API names for the index key; we can add a `model_api_name` field or some such to retain that information.
And let's add dimensions and whatever useful details can be scraped.
Deprecation dates could drive automatic warnings to users, not just as log messages but more prominently. And let's start thinking about what this data could be tomorrow, and what opportunities we're missing out on.
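As a rough idea of what a more prominent warning could look like on the consumer side (not an existing feature, just a sketch building on the hypothetical `ModelEntry` fields above):

```python
import warnings
from datetime import date


def warn_if_deprecated(key: str, entry: "ModelEntry") -> None:
    """Raise a visible DeprecationWarning instead of burying the fact in logs."""
    if entry.status == Status.deprecated or (
        entry.deprecation_date and entry.deprecation_date <= date.today()
    ):
        warnings.warn(
            f"{key} is deprecated (since {entry.deprecation_date}); "
            "consider switching models.",
            DeprecationWarning,
            stacklevel=2,
        )
```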
I think in the long term the whole thing could be kept up to date by LLMs equipped with the right tools. E.g., give `gpt-4-vision-preview` a sandboxed Puppeteer or Playwright instance over a websocket, and it could navigate complex sites, interact with JS, take screenshots and analyze them if needed, and much more. Prompt it right (they're trained to be wary of free access to anything, poor things), let it run test suites, use git and GitHub, code in peace, and CI the whole thing, and we'd never have to worry about this file again. Might as well ask the model to keep an eye out for news while it's perusing sites and write survey reports back home. Why not?