/research
using perplexity api
#866
Comments
/start |
Skipping |
@pavlovcik I'll get this done this afternoon |
These are the models available through the API; the replit one is the only one that isn't a chat model. What model do you want to run? Their UI lets you choose between claude, gpt-4 and perplexity, but it's not clear from the API reference docs which of the listed models is "perplexity". Perhaps the perplexity model isn't available through the API? Or is it a white-labeled llama 70b? It defaults to Mistral 7B, so I'm assuming it's that and we should just run with the default? |
Should have looked into this first, but their context limit isn't capable of handling our needs at least not until they increase the limits. Where possible, we try to match the Hugging Face implementation. We are open to adjusting the API, so please reach out with feedback regarding these details.
|
What if on issue view, we make On the pull request review, we should use GPT-4 (with code interpreter?) so that we can pass in the diff, the conversation, and it can suggest direct code adjustments. Perplexity Pros:
GPT-4 Pros:
I'm using the free version of perplexity so I only have used the |
I hear what you are saying Pav, and I think until they up the context limit our hands are tied.
Review currently doesn't care about linked context, the conversation, etc. Ask does consider all of the linked context as well as the current issue context, which in my demos with minuscule issues, convos and PRs was eating up 4k tokens like it was nothing. The original scope of ask was that it would take as much context as possible to be able to provide better responses for research/issue brainstorming/planning.
What I tried was replacing the askGPT core API call with perplexity and also swapping out the gptContextCall for perp, but I couldn't get any decent responses due to the context window and formatting. I think GPT3.5 will perform better with the additional context than perp with the reduced context window but an improved model.
As soon as that 16k window hits I think switching it out would be the best idea, although pricing adds a matter of perplexity to the AI feature suite. We'd need to allow for a switch of sorts so that if no perp API key is provided but an OpenAI one is, then we use the right model.
I took the 7 day free trial for the annual plan and did a bit of playing around myself; pretty good, I must say.
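A minimal sketch of the "switch of sorts" mentioned above, assuming the keys arrive as environment-style settings. The setting names `PERPLEXITY_API_KEY` and `OPENAI_API_KEY` and the function `choose_backend` are hypothetical; the model ids are the ones named elsewhere in this thread.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical setting names, used for illustration only.
PERPLEXITY_KEY = "PERPLEXITY_API_KEY"
OPENAI_KEY = "OPENAI_API_KEY"


def choose_backend(env: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (provider, model) for the first provider with a non-empty key."""
    # Prefer perplexity when its key is present.
    if env.get(PERPLEXITY_KEY):
        return ("perplexity", "mistral-7b-instruct")
    # Fall back to OpenAI when only that key is configured.
    if env.get(OPENAI_KEY):
        return ("openai", "gpt-3.5-turbo")
    # No key at all: the caller decides whether to skip the command.
    return None
```

Returning `None` instead of raising leaves it to the caller to decide what the command should do when neither key is set.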
Isn't code interpreter just a python plugin custom text splitter/parser?
From what I gathered, I think they are using Mistral as their main model |
Add the spec and if the token counter is too high, then perhaps just the sender comment. I'll only know for sure how valuable the feature is when testing with real issues. But intuitively the more context we provide, the more relevant results I would expect. |
I'll give it a try and open the draft. What I'm troubled with is:
I'll add the spec, count the links in the body, and determine the token count of the spec and question. If it's 1/3 or more, then that'll do? If it's less than 1/3, grab whatever the body is of the linked context and fire? Telegrammed you my API key for perp |
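The one-third rule proposed above could be sketched roughly as follows. This reads "1/3" as a third of the model's context limit, which is an assumption. The names `build_context` and `rough_token_count` are hypothetical, the word-count estimator is only a placeholder for a real tokenizer, and stopping once the limit would be exceeded is an added assumption that the comment does not spell out.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Placeholder estimator (whitespace-separated words), not a real tokenizer."""
    return len(text.split())


def build_context(
    spec: str,
    question: str,
    linked_bodies: List[str],
    context_limit: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Always include spec and question; add linked bodies only if there is room."""
    parts = [spec, question]
    used = count_tokens(spec) + count_tokens(question)
    # Spec + question already use a third or more of the window: send as-is.
    if used * 3 >= context_limit:
        return parts
    # Otherwise append linked bodies in order until the next one would not fit.
    for body in linked_bodies:
        cost = count_tokens(body)
        if used + cost > context_limit:
            break
        parts.append(body)
        used += cost
    return parts
```

Passing the tokenizer in as `count_tokens` keeps the budget logic independent of whichever model ends up being used.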
Hi @pavlovcik @Keyrxng - I believe we can make this easier. If this looks useful (we're used in production), please let me know how we can help.
Usage
Perplexity request
curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "perplexity/mistral-7b-instruct",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }'
gpt-3.5-turbo request
curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }'
claude-2 request
curl http://0.0.0.0:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-2",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }' |
Available commands
|
I appreciate you taking the time Ishaan, and while I'm but a lowly grunt, I do think that it's more than what we need at the moment, although if there is need for more than a couple of models then it may be considered at that point. For me personally, I'll likely make use of it in personal projects, so again, appreciate the shout. I'm working on improving tokenization before the call on our end, as Mistral's way is unique and not provided by TikToken by default. It was overestimating by nearly double in most cases or underestimating by half in the others. |
Agreed with @Keyrxng but thanks for letting us know about your product. I'm also curious to know how you found this issue @ishaan-jaff
Not sure if you're using the code I shared in the other thread under github-agents, but that is specifically for gpt tokenization. Different models I guess have different encoders. |
I am, yes, or at least drew from that initially. I'm hoping I can just string the entire convo as exampled by the perp and mistral docs, using the special characters it's been trained with; then it's just a case of either
|
I presume that all these commercial models have solutions for token counting, like OpenAI's tiktoken. |
Well, yeah, it tends to vary from model to model depending on how that model was trained, what special characters were used, etc. For instance in the context of Mistral instruct:
|
The above was taken from the Mistral docs whereas the example below is from perplexity, and there are clear differences between the two. I'm inclined to believe the Mistral docs above the Perp docs, but it still leaves me wondering slightly
|
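As a rough illustration of stringing a conversation with those special characters, here is a sketch based on the instruct template published for Mistral 7B Instruct, where user turns are wrapped in `[INST] ... [/INST]` and each assistant reply is closed with `</s>`. The function name and the message shape are hypothetical, and exact whitespace around the tags differs between references, so treat the output as an approximation and not as what either set of docs prescribes.

```python
from typing import Dict, List


def to_mistral_instruct(messages: List[Dict[str, str]]) -> str:
    """Flatten chat messages into a single Mistral-instruct style prompt string."""
    prompt = "<s>"  # beginning-of-sequence marker, emitted once
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            # User turns are wrapped in instruction tags.
            prompt += f"[INST] {content} [/INST]"
        elif role == "assistant":
            # Assistant turns follow the closing tag and end with end-of-sequence.
            prompt += f" {content}</s>"
        else:
            raise ValueError(f"unsupported role: {role}")
    return prompt
```

For a user/assistant/user exchange this yields `<s>[INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]`.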
Reading your comment again, I may have misunderstood at first. Perp uses the same API structure as OpenAI, so it returns the tokens used for input, output and both, but it's after the fact obviously. some shit QA: |
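Since the usage numbers only arrive after the call, one way to use them is to check a local pre-call estimate against what the API reported. A minimal sketch, assuming the response carries an OpenAI-style `usage` object with `prompt_tokens`, `completion_tokens` and `total_tokens`; the function name is hypothetical.

```python
from typing import Any, Dict


def estimate_ratio(response: Dict[str, Any], estimated_prompt_tokens: int) -> float:
    """Ratio of the local estimate to the reported prompt tokens (1.0 means exact)."""
    reported = response["usage"]["prompt_tokens"]
    if reported <= 0:
        raise ValueError("response reported no prompt tokens")
    # Above 1.0 the local tokenizer overestimates, below 1.0 it underestimates.
    return estimated_prompt_tokens / reported
```

A ratio near 2.0 or 0.5 would correspond to the "near double" and "by half" errors described earlier in the thread.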
So the underlying tokenizer isn't TikToken, it's Google's SentencePieceProcessor. I tried to get something close with TikToken but no joy. I've had to get the JS wrapper for SPP, but the prompt tokenization is just about spot on. some more shit QA: |
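For comparison with the JS wrapper mentioned above, a hedged Python sketch of counting prompt tokens with SentencePiece. It uses the `sentencepiece` package's `SentencePieceProcessor(model_file=...)` and `encode`; the helper names are hypothetical, and the tokenizer model file has to be supplied by the caller since none is referenced in this thread.

```python
from typing import Callable, List

Encoder = Callable[[str], List[int]]


def load_sentencepiece_encoder(model_path: str) -> Encoder:
    """Build an encoder from a SentencePiece model file supplied by the caller."""
    # Imported lazily so the counting helper works without the dependency.
    import sentencepiece as spm  # third-party: pip install sentencepiece

    processor = spm.SentencePieceProcessor(model_file=model_path)
    return lambda text: processor.encode(text, out_type=int)


def count_prompt_tokens(prompt: str, encode: Encoder) -> int:
    """Count tokens by encoding the fully formatted prompt string."""
    return len(encode(prompt))
```

Keeping the encoder injectable means the same counting call works for a TikToken-style encoder on the OpenAI path.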
https://blog.perplexity.ai/blog/introducing-pplx-api
Perplexity is optimized for q&a and live web research so perhaps it's a better backend for the ask command.
I use their consumer facing product and it's very effective.