Support for read/write prompts and/or responses to a file (+ OpenAI Batching Support?) #255
Comments
Would something like this work?

```python
import json

from mirascope.openai import OpenAICall


class MyCall(OpenAICall):
    prompt_template = "Template {var}"
    var: str = ""


my_call = MyCall()
values = ["one", "two", "three"]
for value in values:
    # Set the template variable before rendering the messages.
    my_call.var = value
    messages = my_call.messages()
    with open(f"filename_{value}", "w") as f:
        # `messages()` returns a list of message dicts, so serialize before writing.
        f.write(json.dumps(messages))
```

I know there are details missing here, but I'm wondering whether this is sufficient or whether you want this baked directly into the class? Something like passing a list of tuples where each tuple contains all the template variables, and then you can call something like …
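For illustration, here is a rough sketch of what that "list of template variables" flow might look like if it were baked in (using dicts of template variables rather than tuples for clarity). `write_batch` and its signature are hypothetical, not an existing Mirascope API, and the sketch assumes `messages()` returns a JSON-serializable list of message dicts:

```python
import json

from mirascope.openai import OpenAICall


def write_batch(call_cls: type[OpenAICall], inputs: list[dict], path: str) -> None:
    """Render one message list per set of template variables and write them as JSONL."""
    with open(path, "w") as f:
        for template_vars in inputs:
            call = call_cls(**template_vars)
            # Assumes `messages()` returns JSON-serializable message dicts.
            f.write(json.dumps(call.messages()) + "\n")


class MyCall(OpenAICall):
    prompt_template = "Template {var}"
    var: str = ""


write_batch(MyCall, [{"var": "one"}, {"var": "two"}, {"var": "three"}], "requests.jsonl")
```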
I was thinking more of a way that was baked into the class, or a way to take in a series of …
So there's one path, I think, where we figure out how to specifically handle OpenAI batch functionality and add support for it directly to the call classes. The issue I see is that trying to generalize this further may result in not being able to make it a single line without overcomplicating that line. Could you provide a minimal, real example of the output you would expect? I'd love to better understand the use case, particularly with respect to one vs. multiple calls and how those outputs are similar or differ.
Now that the Anthropic API has batch support as well, we should make sure to design this to support both Anthropic and OpenAI.
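For reference, a minimal sketch of the per-request shapes the two batch APIs expect, based on the public OpenAI Batch API and Anthropic Message Batches docs; the model names and file name are placeholders:

```python
import json

messages = [{"role": "user", "content": "Template one"}]

# OpenAI Batch API: one JSON object per line in an uploaded .jsonl file.
openai_line = {
    "custom_id": "request-1",
    "method": "POST",
    "url": "/v1/chat/completions",
    "body": {"model": "gpt-4o-mini", "messages": messages},
}

# Anthropic Message Batches: a list of request objects, each with a custom_id
# and the usual Messages API params.
anthropic_request = {
    "custom_id": "request-1",
    "params": {"model": "claude-3-5-sonnet-latest", "max_tokens": 1024, "messages": messages},
}

with open("openai_batch.jsonl", "w") as f:
    f.write(json.dumps(openai_line) + "\n")
```

Both formats key on `custom_id`, so whatever gets written to disk would need a stable per-call identifier to join responses back to their prompts.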
Description
I would like to be able to write Mirascope objects directly to a file and load them back for analysis and processing in a very simple way. Here are a few reasons:
Inference optimizations:
OpenAI has a batch API for inference that is 50% cheaper across all of its models. vLLM and many other inference servers achieve better throughput under higher load and support offline inference. While you can wrap Mirascope in async calls inside or outside of the library, you might not want the client to handle this in all use cases. The context needed for better performance is usually held by the inference server, not the client (current load, priority of other calls, GPU configuration, recommended number of simultaneous calls, etc.).
Prompt/prefix caching can also be further optimized ahead of time instead of just in time.
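As a concrete example of the offline-inference case, here is a minimal sketch of feeding a file of rendered prompts (like the hypothetical `requests.jsonl` above) through vLLM's offline `LLM.generate` API. The file name, model, and the naive flattening of chat messages into a single prompt string are all assumptions for illustration:

```python
import json

from vllm import LLM, SamplingParams

# One rendered message list per line, as written by the earlier sketch.
prompts = []
with open("requests.jsonl") as f:
    for line in f:
        messages = json.loads(line)
        # Naively flatten the chat messages into one prompt string for generate().
        prompts.append("\n".join(m["content"] for m in messages))

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder model
outputs = llm.generate(prompts, SamplingParams(max_tokens=256))
for output in outputs:
    print(output.outputs[0].text)
```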
Timing:
I might want these calls to be part of a cron job anyway.
Compute Re-use:
I might want to re-evaluate the series of prompts on different models, or I might want to resume with a different model inside a chain of calls.
Analysis:
These files can be easily imported to other systems for analysis and processing without official integrations.
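To make the analysis and re-use side concrete, here is a sketch of reading a batch output file back and joining responses to their prompts via `custom_id`, assuming the OpenAI Batch output format (one JSON object per line with `custom_id` and `response`); the file name is a placeholder:

```python
import json

results = {}
with open("openai_batch_output.jsonl") as f:
    for line in f:
        record = json.loads(line)
        body = record["response"]["body"]
        # Map each response back to the prompt it came from via custom_id.
        results[record["custom_id"]] = body["choices"][0]["message"]["content"]

print(results.get("request-1"))
```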