Multimodal-in and multimodal-out #18
Comments
We provide a script for multimodal inference; you can follow the instructions to run it.
Thanks for your good work! I tried the multimodal-in and multimodal-out script, but it generates nothing when prompted to generate images. What could be the reason?
@Mr-Loevan Hi, can you give us more details? For example, your input.json and the output of your model. I just tried the following input.json for inference:
[
  {
    "type": "text",
    "content": "Draw a picture showing a serene lakeside view at sunrise with mist rising from the water, surrounded by dense pine forests and mountains in the background."
  }
]
The output of the model is as follows:
It is a picturesque scene that reflects the beauty of nature in all its glory. The image captures the early morning hours when the sun rises over the horizon, casting a warm glow over the landscape. The lake surface is mirror-like, creating a reflection of the surrounding trees and mountains. There is a sense of tranquility and peace in the air, as if the area is protected from the hustle and bustle of everyday life.
<img: ./outputs/inference/1.png>
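For anyone debugging a similar case, here is a minimal Python sketch for building an input file in the shape shown above and checking whether the model output contains an image marker. It is not part of the repository's script: the helper names `build_text_input`, `write_input`, and `extract_image_paths` are hypothetical, and the sketch assumes generated images are always reported with an `<img: path>` marker like the one in the output above.

```python
import json
import re

# Assumption: image outputs appear as "<img: some/path.png>", as in the output above.
IMG_MARKER = re.compile(r"<img:\s*([^>]+?)\s*>")


def build_text_input(prompt):
    """Return a single-segment input list in the shape used in this thread."""
    return [{"type": "text", "content": prompt}]


def write_input(path, prompt):
    """Write the input list to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_text_input(prompt), f, indent=2, ensure_ascii=False)


def extract_image_paths(model_output):
    """Return every image path reported in the model's text output."""
    return IMG_MARKER.findall(model_output)


if __name__ == "__main__":
    sample_output = "A calm lake at sunrise.\n<img: ./outputs/inference/1.png>"
    paths = extract_image_paths(sample_output)
    if paths:
        print("Image markers found:", paths)
    else:
        print("No image marker found; the model returned text only.")
```

An empty list from `extract_image_paths` means the model answered with text only, which helps separate a prompting problem from a file-saving problem.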
I found that if I remove the format requirement from the prompt, the model generates more output.
We will update the script so that the model can take images as input.