Enabling GPT-4 Turbo with Vision

This repository now includes an example of integrating GPT-4 Turbo with Vision with Azure AI Search. This feature enables indexing and searching images and graphs, such as those found in financial documents, in addition to text-based content.

Feature Overview

Document Handling: Source documents are split into pages and saved as PNG files in blob storage. Each file's name and page number are embedded for reference.
Data Extraction: Text data is extracted using OCR.
Data Indexing: Text and image embeddings are generated using Azure AI Vision and indexed in Azure AI Search along with the raw text.
Search and Response: Searches can be conducted using vectors or hybrid methods. Responses are generated by GPT-4 Turbo with Vision based on the retrieved content.

Getting Started

Prerequisites

Create a Computer Vision account in Azure Portal first, so that you can agree to the Responsible AI terms for that resource. You can delete that account after agreeing.
You need the ability to deploy a GPT-4 Turbo with Vision model in one of the supported regions. If you're not sure, try to create a deployment from your Azure OpenAI deployments page. You should be able to select:

| Model | Version |
| --- | --- |
| gpt-4 | vision-preview |

Ensure that you can deploy the Azure OpenAI resource group in a region where all required components are available:
- Azure OpenAI models
  - gpt-35-turbo
  - text-embedding-ada-002
  - gpt-4v
- Azure AI Vision

Setup and Usage

Update repository: Pull the latest changes.
Enable GPT-4 Turbo with Vision: Set the environment variable with azd env set USE_GPT4V true. This flag is used to deploy the components needed for vision functionality and to toggle the related UI components.
Clean old deployments (optional): Run azd down --purge for a fresh setup.
Start the application: Execute azd up to build, provision, deploy, and initiate document preparation.
Web Application Usage:
- Access the developer options in the web app and select "Use GPT-4 Turbo with Vision".
- Sample questions will be updated for testing.
- Interact with the questions to view responses.
- The 'Thought Process' tab shows the retrieved data and its processing by GPT-4 Turbo with Vision.
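The command-line steps above can be collected into a small shell helper. This is only a sketch: the commands themselves (`git pull`, `azd env set USE_GPT4V true`, `azd down --purge`, `azd up`) come from the steps above, while the function names `run` and `enable_gpt4v` and the `DRY_RUN` and `PURGE` variables are hypothetical conveniences that are not part of this repository or of `azd`.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the setup steps, in the order listed above.
# It stops at the first command that fails.
enable_gpt4v() {
  run git pull || return 1                    # update the repository
  run azd env set USE_GPT4V true || return 1  # enable GPT-4 Turbo with Vision
  if [ "${PURGE:-0}" = "1" ]; then
    run azd down --purge || return 1          # optional: clean old deployments
  fi
  run azd up                                  # build, provision, deploy, prepare documents
}
```

Running `DRY_RUN=1 PURGE=1 enable_gpt4v` prints the four commands without executing them, which is a cheap way to check the sequence before `azd down --purge` removes an existing deployment. Calling `enable_gpt4v` with neither variable set runs the three required steps and skips the optional clean-up.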

Feel free to explore and contribute to enhancing this feature. For questions or feedback, use the repository's issue tracker.
