added updated docs (#503)
* added updated docs

* added backend readme

* edited OCR readme

* minor edits

* formatting edits

* edited azure link

* added UI with mermaind.js

* added UI with mermaid(2)

* added UI with mermaid(3)

* add link to user guide

* deleted diagram

* added note about swagger docs

* added gradle update for swagger docs
arinkulshi-skylight authored Jan 14, 2025
1 parent 0d8b79f commit 626cce4
Showing 6 changed files with 443 additions and 46 deletions.
172 changes: 138 additions & 34 deletions OCR/README.md
@@ -1,44 +1,63 @@
## OCR
# OCR Layer - ReportVision

The **OCR Layer** in the ReportVision project processes document images, performs segmentation and optical character recognition (OCR), and computes accuracy metrics by comparing OCR outputs to ground truth data.

---

## Table of Contents
1. [Introduction](#introduction)
2. [Installation](#installation)
3. [Running the Application](#running-the-application)
4. [Development Tools](#development-tools)
5. [Testing](#testing)
6. [End-to-End Benchmarking](#end-to-end-benchmarking)
7. [Dockerized Development](#dockerized-development)
8. [Project Architecture](#project-architecture)
9. [API Endpoints](#api-endpoints)


---

## Introduction

The OCR layer uses **Poetry** for dependency management and virtual environment setup. It provides:
- An API for performing OCR operations.
- Support for benchmarking OCR accuracy.
- Configuration for different OCR models and segmentation templates.

### Installation

### Prerequisites
- Python 3.9 or later
- [Poetry](https://python-poetry.org/) for dependency management
- Docker (optional for containerized development)

```shell
pipx install poetry
```

### Running The Application
Activate the virtual environment and install dependencies. All subsequent commands assume you are in the virtual environment.

```shell
poetry shell
poetry install
```

Run unit tests

```shell
poetry run pytest
```

Run benchmark tests

```shell
cd tests
poetry run pytest benchmark_test.py -v
fastapi dev ocr/api.py
```

poetry run pytest bench_test.py -v
### Testing

Run main (we hope to convert this to a CLI at some point)
Run unit tests

```shell
poetry run main
```shell
poetry run pytest
```

To build the OCR service into an executable artifact

```shell
poetry run build
```
### Development Tools

Adding new dependencies

@@ -82,12 +101,37 @@ To run the API in prod mode
poetry run api
```

### Test Data Sets

You can also run the script `pytest run reportvision-dataset-1/medical_report_import.py` to pull in all relevant data.
To build the OCR service into an executable artifact

```shell
poetry run build
```

### Dockerized Development

It is also possible to run the project in a collection of docker containers. This is useful for development and testing purposes as it doesn't require any additional dependencies to be installed.

### Run end-to-end benchmarking
To start the containers, run the following command:

```shell
docker compose -f dev-env.yaml up
```

This will start the following containers:

- ocr: The OCR service container
- frontend: The frontend container

The frontend container will automatically reload when changes are made to the frontend. To access the frontend, navigate to http://localhost:5173 in your browser.

The OCR service container will restart automatically when changes are made to the OCR code. To access the API, navigate to http://localhost:8000/ in your browser.


### End-to-End Benchmarking

#### Overview
End-to-end benchmarking evaluates OCR accuracy by:

End-to-end benchmarking scripts can:

@@ -117,21 +161,81 @@ Run notes:
* Benchmark takes one second per segment for OCR using the default `trocr` model. Please be patient or set a counter to limit the number of files processed.
* Only one segment can be input at a time
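
The note about limiting the number of files can be sketched as below. This is a minimal illustration, not the project's benchmarking script: the helper names `limit_files` and `estimated_seconds` and their parameters are hypothetical, and the run-time estimate only reuses the one-second-per-segment figure from the notes above.

```python
def limit_files(paths, max_files=None):
    """Return the paths in sorted order, cut off after max_files entries.

    max_files=None means "no limit". Hypothetical helper for illustration.
    """
    ordered = sorted(paths)
    return ordered if max_files is None else ordered[:max_files]


def estimated_seconds(file_count, segments_per_file, seconds_per_segment=1.0):
    """Rough run time, assuming a fixed cost per segment."""
    return file_count * segments_per_file * seconds_per_segment


if __name__ == "__main__":
    files = limit_files(["c.png", "a.png", "b.png"], max_files=2)
    print(files, estimated_seconds(len(files), segments_per_file=10))
```

Capping the file list before the OCR loop starts keeps a trial run short while still exercising segmentation, OCR, and metrics.
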

### Dockerized Development

It is also possible to run the entire project in a collection of docker containers. This is useful for development and testing purposes as it doesn't require any additional dependencies to be installed on your local machine.
### Test Data Sets

To start the containers, run the following command:
You can run the script `pytest run reportvision-dataset-1/medical_report_import.py` to pull in all relevant data.

```shell
docker compose -f dev-env.yaml up
```

This will start the following containers:

- ocr: The OCR service container
- frontend: The frontend container
## Project Architecture

The frontend container will automatically reload when changes are made to the frontend code. To access the frontend, navigate to http://localhost:5173 in your browser.
The OCR Layer is organized as follows:

The OCR service container will restart automatically when changes are made to the OCR code. To access the API, navigate to http://localhost:8000/ in your browser.
- **`ocr/`**:
- **`api.py`**: Defines the API for the OCR service.
- **`main.py`**: Entry point script to run the OCR service.
- **`segmenter.py`**: Handles image segmentation based on templates and labels.
- **`ocr_engine.py`**: OCR logic using the specified OCR models.
- **`metrics.py`**: Computes metrics (e.g., confidence, Levenshtein distance) by comparing OCR results with ground truth.
- **`config.py`**: Contains configuration for paths, environment variables, and model settings.

- **`tests/`**: Contains unit tests, integration tests, and benchmarking scripts.
- **`benchmark_test.py`**: Tests benchmarking logic for OCR and metrics.
- **`unit_test.py`**: Includes unit tests for individual components of the OCR service.
- **`benchmark_main.py`**: Main script for running end-to-end benchmarking, including segmentation, OCR, and metrics computation.

- **`data/`**: Location of segmentation templates, labels, ground truth, and test datasets (not included in the repository by default).

- **`reportvision-dataset-1/`**: Example dataset folder for running benchmarks and tests.
- **`medical_report_import.py`**: Script to import and prepare medical reports for testing.

- **`Dockerfile`**: Defines the container for running the OCR service in a Dockerized environment.

- **`dev-env.yaml`**: Docker Compose file for setting up a development environment with containers for the OCR service and frontend.

- **`pyproject.toml`**: Poetry configuration file specifying project dependencies and settings.

- **`poetry.lock`**: Lock file generated by Poetry to ensure dependency consistency.
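
To make the Levenshtein metric mentioned for `metrics.py` concrete, here is a small standalone sketch of the standard dynamic-programming edit distance. It is not the code in `metrics.py`; the function names and the normalised `similarity` score are assumptions added for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(ocr_text: str, ground_truth: str) -> float:
    """Hypothetical normalised score in [0, 1]; 1.0 means identical."""
    longest = max(len(ocr_text), len(ground_truth))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(ocr_text, ground_truth) / longest
```

A distance of 0 means the OCR output matches the ground truth exactly; larger values mean more character-level corrections would be needed.
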

## API Endpoints

### Swagger Docs

Start the API and go to the `/docs` endpoint, for example: http://localhost:8000/docs

The OCR service exposes the following API endpoints:

#### Health Check
- **`GET /`**
- **Description**: Returns the status of the OCR service.
- **Response**: Status message indicating the service's health.

#### Image Alignment
- **`POST /image_alignment/`**
- **Description**: Aligns a source image with a segmentation template.
- **Request Body**:
- `source_image` (Base64-encoded string): The source image to align.
- `segmentation_template` (Base64-encoded string): The segmentation template to align with.
- **Response**:
- Base64-encoded string of the aligned image.

#### Image File to Text
- **`POST /image_file_to_text/`**
- **Description**: Processes an image file and a segmentation template to extract text based on labeled regions.
- **Request Body**:
- `source_image` (file): The uploaded source image file.
- `segmentation_template` (file): The uploaded segmentation template file.
- `labels` (JSON string): Defines labeled regions in the segmentation template.
- **Response**:
- JSON object containing text extracted from labeled regions.

#### Image to Text
- **`POST /image_to_text`**
- **Description**: Processes Base64-encoded images and extracts text from labeled regions.
- **Request Body**:
- `source_image` (Base64-encoded string): The source image.
- `segmentation_template` (Base64-encoded string): The segmentation template.
- `labels` (JSON string): Defines labeled regions in the segmentation template.
- **Response**:
- JSON object containing text extracted from labeled regions.
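
As an illustration of the Base64-based `POST /image_to_text` endpoint, the sketch below builds a request with the three documented fields using only the Python standard library. Several details are assumptions, not confirmed by this README: that the body is sent as JSON, that plain Base64 (without a data-URI prefix) is accepted, and the shape of the `labels` value. The helper name `build_image_to_text_request` is hypothetical. The Swagger page at `/docs` is the authoritative description of the request schema.

```python
import base64
import json
import urllib.request

OCR_BASE_URL = "http://localhost:8000"  # address used elsewhere in this README


def build_image_to_text_request(source_bytes, template_bytes, labels,
                                base_url=OCR_BASE_URL):
    """Build (but do not send) a POST request for /image_to_text.

    Assumption: the service accepts a JSON body with these three fields.
    """
    payload = {
        "source_image": base64.b64encode(source_bytes).decode("ascii"),
        "segmentation_template": base64.b64encode(template_bytes).decode("ascii"),
        "labels": json.dumps(labels),  # documented as a JSON string
    }
    return urllib.request.Request(
        f"{base_url}/image_to_text",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the service running, the request could be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.loads`.
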
5 changes: 3 additions & 2 deletions README.md
@@ -19,8 +19,9 @@

## Overview

Describe the purpose of your project. Add additional sections as necessary to help collaborators and potential collaborators understand and use your project.

Please see the [UserGuide](./user_guide.md) to get an overview of this project.


## Public Domain Standard Notice
This repository constitutes a work of the United States Government and is not
subject to domestic copyright protection under 17 USC § 105. This repository is in
158 changes: 158 additions & 0 deletions backend/README.md
@@ -0,0 +1,158 @@
# Backend Middleware - Spring Boot Application

This document provides a guide for the **Backend Middleware** of the ReportVision project. This middleware bridges the **frontend app** with the **OCR backend**.

---

## Table of Contents
1. [Introduction](#introduction)
2. [Installation](#installation)
3. [Testing](#testing)
4. [Project Architecture](#project-architecture)
5. [Key Features](#key-features)
6. [API Endpoints](#api-endpoints)
7. [Troubleshooting](#troubleshooting)


## Introduction

The backend of ReportVision is a **Spring Boot** application designed to:
- Serve as middleware connecting the frontend with OCR.
- Manage storage of templates in the database.
- Act as a middle layer to pass data for OCR extraction.


## Installation

### To run the project, please ensure you have Docker set up
1. Clone the repository:
```bash
git clone https://github.com/CDCgov/ReportVision.git
cd ReportVision/backend
```
2. Run the app.
Make sure you are in the project root.

```shell
docker-compose -f backend.yaml up --build
```

3. Verify the app is running by visiting http://localhost:8081/api/health
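
The verification step can also be scripted. The sketch below is a minimal example that assumes the health endpoint answers a plain GET with HTTP 200 when the app is up; the helper name `is_healthy` and its `opener` parameter (used only to allow testing without a running server) are hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print(is_healthy("http://localhost:8081/api/health"))
```
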

## Testing

You can run the Gradle tests by opening a Bash shell inside the container.

```shell
docker ps
```
Get the container ID from the output, then open a shell in the container and run the tests:

```shell
docker exec -it <CONTAINER_ID> /bin/bash
```

```shell
./gradlew test
```

## Project Architecture

The backend is organized into the following directories and files:

- **`src/main/java/gov/cdc/reportvision/`**:
- **`controllers/`**: Handle API requests from the frontend.
- **`services/`**: Service layer for managing templates, data extraction, and interactions with the OCR backend.
- **`models/`**: Data models representing application entities.
- **`repositories/`**: Interfaces for database operations.
- **`config/`**: Configuration files for security, database connections, and CORS policies.
- **`utils/`**: Utility classes for validation, logging, and file manipulation.
- **`src/test/`**: Includes unit and integration tests for the backend.
- **`Dockerfile`**: Docker configuration file for containerizing the application.

- **`README.md`**: Documentation for the backend application.


## Key Features

#### Template Management
- **Upload, retrieve, and delete templates**:
- Allows users to upload new templates for document segmentation.
- Retrieve a list of all saved templates.
- Delete templates by their unique ID.

#### Data Extraction
- **Document Processing**:
- Connects to the OCR backend to process documents using predefined templates.
- Extracts data based on segmented areas defined in the templates.
- Returns structured extracted data.

#### Validation and Error Handling
- **Data Integrity Checks**:
- Validates user inputs and template configurations.
- Provides error messages for invalid requests or processing failures.

#### Secure Integration
- **Authentication**:
- Implements JWT-based authentication.
- Configurable CORS policies to control frontend and third-party access.


## API Endpoints

The backend middleware exposes the following RESTful API endpoints:

### Swagger Docs

Start the API and go to the `/swagger-ui.html` endpoint, for example: http://localhost:8080/swagger-ui.html

#### Health Check
- **`GET /api/health`**
- **Description**: Returns the status of the backend server.
- **Response**: Status message indicating the server's health.
#### Template Management
- **`POST /api/templates`**
- **Description**: Upload a new template for document segmentation.
- **Request Body**: JSON containing template details.
- **Response**: Confirmation of the uploaded template.
- **`GET /api/templates`**
- **Description**: Retrieve a list of all available templates.
- **Response**: JSON array of template metadata.
- **`DELETE /api/templates/{id}`**
- **Description**: Delete a specific template by its unique ID.
- **Response**: Confirmation of deletion.
#### Data Extraction
- **`POST /api/extract`**
- **Description**: Process a document using a selected template and return extracted data from OCR.
- **Request Body**: JSON containing the document and selected template ID.
- **Response**: JSON object with extracted data.
#### Configuration Management
- **`GET /api/config`**
- **Description**: Retrieve the current configuration settings of the application.
- **Response**: JSON object with configuration details.
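
A client for the endpoints above might look like the following sketch, which only builds the requests using the Python standard library. The endpoint paths and HTTP methods come from this README; the JSON field names `document` and `templateId` are hypothetical placeholders because the request schema is not spelled out here, and the base URL uses the port from the Swagger example (the health check earlier in this README uses 8081, so adjust as needed). Any authentication headers are omitted.

```python
import json
import urllib.request

BACKEND_BASE_URL = "http://localhost:8080"  # assumption: adjust to your setup


def list_templates_request(base_url=BACKEND_BASE_URL):
    """GET /api/templates"""
    return urllib.request.Request(f"{base_url}/api/templates", method="GET")


def delete_template_request(template_id, base_url=BACKEND_BASE_URL):
    """DELETE /api/templates/{id}"""
    return urllib.request.Request(
        f"{base_url}/api/templates/{template_id}", method="DELETE"
    )


def extract_request(document_b64, template_id, base_url=BACKEND_BASE_URL):
    """POST /api/extract with a JSON body.

    The keys "document" and "templateId" are hypothetical; check the
    Swagger UI for the real schema.
    """
    body = {"document": document_b64, "templateId": template_id}
    return urllib.request.Request(
        f"{base_url}/api/extract",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Each request can be sent with `urllib.request.urlopen(request)` once the backend is running.
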
## Troubleshooting
### Common Issues
#### Database Connection Fails
- **Cause**: The backend is unable to connect to the database.
- **Solution**:
- Ensure the database server is running.
- Verify that the `DB_URL`, `DB_USERNAME`, and `DB_PASSWORD` environment variables are correctly configured.
#### CORS Errors
- **Cause**: Frontend requests are being blocked due to Cross-Origin Resource Sharing (CORS) policies.
- **Solution**:
- Update the `CorsConfig` class in the `config/` directory.
- Add the necessary origins to the allowed list.
#### OCR Service Not Responding
- **Cause**: The backend is unable to communicate with the OCR service.
- **Solution**:
- Verify that the `OCR_SERVICE_URL` is correctly set.
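
Before digging deeper into any of the issues above, it can help to confirm that the settings are present at all. The sketch below checks for the four names mentioned in this section. It assumes `OCR_SERVICE_URL` is supplied as an environment variable like the `DB_*` settings, which this README does not state explicitly; the helper name `missing_variables` is hypothetical.

```python
import os

REQUIRED_VARIABLES = ("DB_URL", "DB_USERNAME", "DB_PASSWORD", "OCR_SERVICE_URL")


def missing_variables(environ=None, required=REQUIRED_VARIABLES):
    """Return the required names that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing settings:", ", ".join(missing))
    else:
        print("All required settings are present.")
```

This only checks that the values exist; it does not verify that the database or OCR service is reachable.
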
1 change: 1 addition & 0 deletions backend/build.gradle.kts
@@ -38,6 +38,7 @@ dependencies {
developmentOnly("org.springframework.boot:spring-boot-devtools")
implementation("org.springframework.boot:spring-boot-starter-validation")
implementation("com.h2database:h2")
implementation("org.springdoc:springdoc-openapi-starter-webmvc-ui:2.3.0")
}

tasks.withType<Test> {