ScannerAI is a Python application that processes retail receipts using computer vision and AI to extract, classify and analyze receipt data. It features a graphical user interface for viewing and editing receipt information, with support for COICOP (Classification of Individual Consumption According to Purpose) code classification.
- Receipt Processing: Extract text and structured data from receipt images and PDFs
- Item Classification: Automatically classify items using COICOP codes
- Interactive GUI: View, edit, and manage receipt data
- Multiple OCR Options: Support for multiple OCR models including Gemini and OpenAI
- Batch Processing: Process multiple receipts from a folder
- Export Options: Save processed data in JSON or CSV formats
- Progress Tracking: Visual progress tracking for batch operations
1. Clone the repository
git clone https://github.com/datasciencecampus/receipt_scanner.git
cd receipt_scanner
2. Set up virtual environment
We strongly recommend installing the package into a virtual environment.
python -m venv scanner-venv
source scanner-venv/bin/activate # source scanner-venv/Scripts/activate on Windows
python -m pip install --upgrade pip
python -m pip install .
> [!NOTE]
> If you intend to do any development work, please install the package as editable and with the `dev` optional dependencies:
>
> ```
> python -m pip install -e ".[dev]"
> ```
Moreover, once you have installed the package, please install the pre-commit hooks. These hooks help us to ensure repository security and a consistent code style.
3. Install the required dependencies
pip install -r requirements.txt
4. Set up configuration
Create a configuration file (e.g., `config.txt`) in the folder 'scannerai/_config/' with the following parameters:
DEBUG_MODE=False
ENABLE_PREPROCESSING=False
SAVE_PROCESSED_IMAGE=False
ENABLE_PRICE_COUNT=False
OCR_MODEL=3
CLASSIFIER_MODEL_PATH=/path/to/your/trained/model
LABEL_ENCODER_PATH=/path/to/your/label/encoder
GEMINI_API_KEY_PATH=/path/to/your/gemini/api/key
OPENAI_API_KEY_PATH=/path/to/your/openai/api/key
GOOGLE_CREDENTIALS_PATH=/path/to/your/google/credentials
Remember to update the path in config = Config("/path/to/your/config.txt") in 'scannerai/_config/config.py'.
An example file, 'scannerai_config.txt', is provided for reference.
Here is a description of the configuration parameters:
- DEBUG_MODE: Enable/disable debug logging
- ENABLE_PREPROCESSING: Enable/disable image preprocessing (not implemented at the moment; please set to False)
- SAVE_PROCESSED_IMAGE: Save processed images
- ENABLE_PRICE_COUNT: Enable token counting for API pricing
- OCR_MODEL: Select OCR model
- CLASSIFIER_MODEL_PATH: Path to trained classification model (if no trained model is available, please remove this parameter; the COICOP and confidence columns will display None)
- LABEL_ENCODER_PATH: Path to label encoder (this should be generated together with the trained model; if not available, please remove this parameter)
- GEMINI_API_KEY_PATH: Path to Gemini API key file (only required if OCR_MODEL is set to 3)
- OPENAI_API_KEY_PATH: Path to OpenAI API key file (only required if OCR_MODEL is set to 1 or 2)
- GOOGLE_CREDENTIALS_PATH: Path to Google Cloud credentials (only required if OCR_MODEL is set to 3)
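As an illustration of the key=value format above, a minimal parsing sketch is shown below. This is only an assumption of how such a file can be read; the actual parsing is done by the Config class in 'scannerai/_config/config.py'.

```python
# Minimal sketch of parsing the KEY=VALUE config format shown above.
# Illustrative only; the real logic lives in scannerai/_config/config.py.
def parse_config_text(text):
    """Return a dict of settings, skipping blank lines and # comments."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        params[key.strip()] = value.strip()
    return params

sample = """\
DEBUG_MODE=False
OCR_MODEL=3
GEMINI_API_KEY_PATH=/path/to/your/gemini/api/key
"""
settings = parse_config_text(sample)
```

Note that every value is read as a string; whatever loads the file is responsible for interpreting booleans and integers.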
The application requires API keys for the OCR services you use:
- Gemini API key (required only if Google's Gemini is selected as OCR_MODEL)
- OpenAI API key (required only if an OpenAI model is selected as OCR_MODEL)
- Google Cloud credentials (required only if Google's Gemini is selected as OCR_MODEL)
Store API keys in separate files and update paths in the configuration.
An example trained model is provided in receipt_scanner/src/scannerai/classifiers/trainedModels/: set CLASSIFIER_MODEL_PATH to LRCountVectorizer.sav and LABEL_ENCODER_PATH to encoder.pkl.
This model is a Logistic Regression (LR) classifier trained on features extracted with CountVectorizer, a popular feature extraction method implemented in the scikit-learn Python package.
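The training recipe can be sketched as follows. This is not the script that produced the bundled model; the toy item descriptions and COICOP-style labels below are invented for illustration, but the CountVectorizer + Logistic Regression pipeline and the pickled model/encoder pair mirror what the configuration expects.

```python
# Sketch: train a CountVectorizer + Logistic Regression classifier and
# save the model and label encoder as pickle files. Toy data only.
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder

# Invented item descriptions with COICOP-like labels.
items = ["whole milk 2l", "white bread loaf", "bus ticket", "milk semi skimmed"]
labels = ["01.1.4", "01.1.1", "07.3.2", "01.1.4"]

# Encode string labels as integers for the classifier.
encoder = LabelEncoder()
y = encoder.fit_transform(labels)

# Bag-of-words features feeding a Logistic Regression classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
model.fit(items, y)

# Save in the same spirit as LRCountVectorizer.sav / encoder.pkl.
with open("LRCountVectorizer.sav", "wb") as f:
    pickle.dump(model, f)
with open("encoder.pkl", "wb") as f:
    pickle.dump(encoder, f)

# Classify a new item and map the integer prediction back to a label.
pred = encoder.inverse_transform(model.predict(["skimmed milk 1l"]))[0]
```

Saving the encoder alongside the model is what makes the LABEL_ENCODER_PATH parameter necessary: without it, integer predictions cannot be mapped back to COICOP codes.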
Run the GUI with:
streamlit run scripts/lcf_receipt_entry_streamlit.py
- Click "Browse files" to select receipt images/PDFs
- Click "Process Uploaded Files" to process the files
- Edit shop name, total amount, payment mode and items including item name, price, or COICOP if needed
- Add/delete items to the receipt if needed
- Navigate between receipts using Previous/Next buttons
- Save processed data in JSON or CSV format
Here is an example using Google's Gemini model to take an image or PDF as input and output a dictionary of shop name, items and their prices, total amount, and payment method.
```
import json

from scannerai.ocr.lcf_receipt_process_gemini import LCFReceiptProcessGemini

processor = LCFReceiptProcessGemini()
image_pathfile = '/path/to/your/image.jpg'
result = processor.process_receipt(image_pathfile)
print(json.dumps(result, indent=2))
```
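The Batch Processing feature listed above works along the same lines. This is a sketch built around the process_receipt call: the process_folder helper and its file-extension filter are our own illustration, not part of the package API.

```python
# Sketch: process every receipt image/PDF in a folder and write the
# combined results to one JSON file. process_folder is illustrative only.
import json
import os

def process_folder(folder, processor, out_path):
    """Run processor.process_receipt on each receipt file in folder."""
    results = []
    for name in sorted(os.listdir(folder)):
        if name.lower().endswith((".jpg", ".jpeg", ".png", ".pdf")):
            results.append(processor.process_receipt(os.path.join(folder, name)))
    with open(out_path, "w") as f:
        json.dump(results, f, indent=2)
    return results
```

Used with the Gemini processor, this would look like process_folder('/path/to/receipts', LCFReceiptProcessGemini(), 'receipts.json'); any object with a process_receipt method will do.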
- lcf_receipt_entry_streamlit.py: Main GUI application using Streamlit
- lcf_receipt_process_gemini.py: Receipt processing using the Gemini Vision API
- lcf_receipt_process_gpt4vision.py: Receipt processing using the OpenAI model gpt-4o-mini
- lcf_receipt_process_openai.py: Receipt processing using Tesseract to extract text from an image, which is then passed to the OpenAI model gpt-3.5-turbo
- lcf_classify.py: COICOP classification implementation
- scanner_utils.py: Utility functions
- config.py: Configuration management
See requirements.txt for detailed dependencies.
This repository contains a configuration of pre-commit hooks. These are language agnostic and focussed on repository security (such as detection of passwords and API keys). If approaching this project as a developer, you are encouraged to install and enable pre-commits by running the following in your shell:

1. Install pre-commit:

```
pip install pre-commit
```

2. Enable pre-commit:

```
pre-commit install
```
Once pre-commits are activated, whenever you commit to this repository a series of checks will be executed. The pre-commits include checking for security keys, large files and unresolved merge conflict headers. The use of active pre-commits is highly encouraged, and the given hooks can be expanded with Python- or R-specific hooks that automate code style and linting. For example, the flake8 and black hooks are useful for maintaining consistent Python code formatting.
NOTE: Pre-commit hooks execute Python, so they expect a working Python build.
At the Data Science Campus we apply data science, and build skills, for public good across the UK and internationally. Get in touch with the Campus at [email protected].
The code, unless otherwise stated, is released under the MIT Licence.
The documentation for this work is subject to © Crown copyright and is available under the terms of the Open Government Licence v3.0.