Add a FIM pipeline to Providers #102
Conversation
model_in_request = completion_request['model']
if not model_in_request.startswith('anthropic/'):
    completion_request['model'] = f'anthropic/{model_in_request}'
why do we need this block? I don't think any anthropic models start with anthropic. Either way, why do we change the model to have the anthropic prefix?
According to the LiteLLM docs, if we prepend `anthropic/` to the model name it will always force the request to Anthropic. Otherwise, it tries to map the model name to one in its registry. It accepts most names, but when I tried with `claude-3-5-sonnet-latest` it was not accepted as a valid Anthropic model. Hence, I made this change. But I agree, maybe this is being too careful and can be removed.
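For illustration, a minimal sketch of how the prefixed model name ends up being passed to LiteLLM. The request contents here are placeholders, not the actual provider code:

```python
import litellm

# Placeholder request; in the provider this comes from the client.
completion_request = {
    "model": "claude-3-5-sonnet-latest",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Prefixing with "anthropic/" forces LiteLLM to route to the Anthropic provider
# instead of trying to resolve the bare model name from its registry.
model_in_request = completion_request["model"]
if not model_in_request.startswith("anthropic/"):
    completion_request["model"] = f"anthropic/{model_in_request}"

response = litellm.completion(**completion_request)
```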
Drafting because, as it stands, FIM is working for Anthropic but will not work with the other providers that we have: OpenAI and llama.cpp. I will wait to rebase the changes by Jakub to get things working for all providers.
Looks good!
This adds a pipeline that processes the request before the completion is run, where the request can either be changed or short-circuited. The pipeline consists of steps; for now we implement a single step, `CodegateVersion`, that responds with the codegate version if the verbatim `codegate-version` string is found in the input. The pipeline also passes along a context. For now that is unused, but I thought this would be where we store extracted code snippets etc. To avoid import loops, we also move the `BaseCompletionHandler` class to a new `completion` package. Since the shortcut replies are more or less simple strings, we add yet another package, `providers/formatting`, whose responsibility is to convert the string returned by the shortcut response to the format expected by the client, meaning either a reply or a stream of replies in the LLM-specific format. We use the `BaseCompletionHandler` as a way to convert to the LLM-specific format.
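A rough sketch of what such a step could look like. The class and attribute names below are illustrative only, not the actual codegate interfaces:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical version constant for the sketch.
CODEGATE_VERSION = "0.1.0"


@dataclass
class PipelineContext:
    # Shared state passed between steps, e.g. extracted code snippets later on.
    metadata: dict = field(default_factory=dict)


@dataclass
class PipelineResult:
    # If shortcut_response is set, the pipeline stops and the provider formats
    # this string instead of forwarding the request to the LLM.
    request: dict
    shortcut_response: Optional[str] = None


class CodegateVersionStep:
    """Respond with the codegate version if 'codegate-version' appears in the input."""

    def process(self, request: dict, context: PipelineContext) -> PipelineResult:
        user_text = " ".join(
            m.get("content", "")
            for m in request.get("messages", [])
            if isinstance(m.get("content"), str)
        )
        if "codegate-version" in user_text:
            return PipelineResult(
                request, shortcut_response=f"codegate version: {CODEGATE_VERSION}"
            )
        return PipelineResult(request)
```

In this sketch the provider's `BaseCompletionHandler` would then be responsible for turning the shortcut string into either a single reply or a stream of replies in the LLM-specific format.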
Related: #87, #43

The PR adds a FIM pipeline independent from the chat completion pipeline. It could still be faulty since we need:
- Message normalizer. We now expect all messages to have the key `messages`; however, there are incoming messages with `prompt` (see the sketch below).
- Secrets detector. There's the skeleton of a class called `SecretAnalyzer` that is meant to analyze the messages and return a warning if it detects a secret.
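A possible shape for such a normalizer, assuming FIM requests arrive with a top-level `prompt` key. This is a sketch, not the actual codegate implementation:

```python
def normalize_request(request: dict) -> dict:
    """Ensure the request uses the 'messages' key, converting FIM-style 'prompt' requests."""
    if "messages" in request:
        return request
    normalized = dict(request)
    prompt = normalized.pop("prompt", "")
    # Wrap the bare prompt as a single user message so downstream steps
    # can always iterate over request["messages"].
    normalized["messages"] = [{"role": "user", "content": prompt}]
    return normalized
```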
Don't merge yet. You should use the Go port that uses the large signatures list we have. I will have this up in a minute.