Passing image data from custom function as UserMessage media #1321
-
Hello! I hope this helps. When working with image data and OpenAI's models, there are a few strategies to handle the process effectively. Here's a structured approach:

1. Directly Passing Image Data via Prompting
If you want to pass image data to an OpenAI model to generate metadata or perform image analysis, you'll need to handle it in a way the model can interpret. However, as of my last update, OpenAI's GPT-3.5 and GPT-4 models primarily handle text input and don't directly process binary image data.

2. Custom Function to Handle Image Data
Since OpenAI models generally don't process raw image data directly, you need a few intermediate steps to handle it:
3. Creating a Custom Function for Integration
If you're integrating this process into a broader application:
Here's a simple workflow example:

    from PIL import Image
    import openai

    def extract_image_description(image_path):
        # Placeholder for actual image processing logic,
        # e.g. OCR or an image analysis API
        return "Description of the image."

    def generate_metadata(description):
        openai.api_key = 'YOUR_OPENAI_API_KEY'
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=f"Generate metadata for the following image description: {description}",
            max_tokens=150
        )
        return response.choices[0].text.strip()

    def main(image_path):
        description = extract_image_description(image_path)
        metadata = generate_metadata(description)
        print(metadata)

    if __name__ == "__main__":
        main("path_to_your_image.jpg")

4. Alternative Approaches
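One alternative worth noting: some image-capable APIs accept inline image input as base64-encoded bytes rather than a URL or a file upload. A minimal, stdlib-only sketch of that encoding step (the helper name and the surrounding request format are my assumptions, not something from this thread):

```python
import base64
from pathlib import Path

def encode_image_base64(image_path):
    """Read raw image bytes and return them as a base64 ASCII string,
    the form commonly embedded in JSON request bodies."""
    data = Path(image_path).read_bytes()
    return base64.b64encode(data).decode("ascii")
```

The resulting string can then be placed wherever the target API expects inline image content.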
In summary, since direct image processing is not supported, the recommended approach is to extract descriptive text or metadata from the image using other tools or services, then pass that information to OpenAI's text-based models. I really hope this helps!
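Since the goal here is metadata, note also that some basic technical metadata never needs a model at all: image dimensions, for instance, sit right in the file header. A stdlib-only sketch for PNG files (assumes a well-formed file; the helper is mine, not from this thread):

```python
import struct

def png_dimensions(png_bytes):
    """Return (width, height) from a PNG byte stream.

    The 8-byte PNG signature is followed by the IHDR chunk:
    a 4-byte length, the 4-byte type "IHDR", then width and
    height as big-endian unsigned 32-bit integers (bytes 16..24).
    """
    if png_bytes[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", png_bytes[16:24])
    return width, height
```

Metadata like this can be merged with whatever the model generates from the textual description.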
-
So I ended up with a single "GenerateImageMetadataTool" function that does the following in its generateMetadata method:
    private Map<String, Object> generateMetadata(ImageData imageData, GenerateImageMetadataTool.Request request) {
        ChatClient chatClient = ChatClient.builder(chatModel).build();
        ChatClient.ChatClientRequest chatClientRequest = chatClient.prompt();
        Resource image = new InputStreamResource(imageData.getInputStream());
        Message userMessage = new UserMessage(request.userPrompt,
                List.of(new Media(MimeTypeUtils.parseMimeType(imageData.getMimeType().toString()), image)));
        Message systemMessage = new SystemMessage(request.systemPrompt);
        chatClientRequest.messages(List.of(systemMessage, userMessage));
        Map<String, Object> result = chatClientRequest.call().entity(new ParameterizedTypeReference<Map<String, Object>>() {
        });
        LOG.info("Successfully generated image metadata for content item with id {} and property {}: {}",
                request.id, request.property, result);
        return result;
    }

The prompt for the tool specifies the userPrompt and systemPrompt to be used for the metadata generation request, e.g.:
Not sure if this is good practice, but it works...
-
Hi,
I have a custom function that is retrieving image data from a 3rd party system.
The next step in the prompt would be to have the LLM/OpenAI generate metadata for it.
I understand the binary data could be passed as Media with the UserMessage.
I am wondering how this next step could be implemented:
I hope this makes some sense. Any feedback is much appreciated!
--Henning