Containerize Sitelen Pona Writing Practice app
- Create Dockerfile to build the application image
- Add entrypoint script for Docker container
- Add script to ensure model is downloaded
- Add .dockerignore to exclude unnecessary files from Docker image
- Add Streamlit configuration file
- Update README with Docker setup instructions

#28
dr-rompecabezas committed Feb 25, 2025
1 parent 9431a45 commit 2131bce
Showing 6 changed files with 171 additions and 19 deletions.
51 changes: 51 additions & 0 deletions writing-app/.dockerignore
@@ -0,0 +1,51 @@
# Git
.git
.gitignore

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
ENV/
.venv/

# Environment Variables
.env

# IDE and Development Tools
.idea/
.vscode/
*.swp
*.swo
.ruff_cache/
.pytest_cache/
.python-version
.DS_Store
uv.lock

# Docker
Dockerfile
.dockerignore

# Tests
tests/
8 changes: 8 additions & 0 deletions writing-app/.streamlit/config.toml
@@ -0,0 +1,8 @@
[global]
disableWidgetStateDuplicationWarning = true

[server]
maxUploadSize = 20

[browser]
gatherUsageStats = false
36 changes: 36 additions & 0 deletions writing-app/Dockerfile
@@ -0,0 +1,36 @@
# Use Python 3.11 as base image
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies required for OpenCV and other packages
RUN apt-get update && apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements file
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and other necessary files
COPY . .

# Make the entrypoint script executable
RUN chmod +x scripts/docker_entrypoint.sh

# Create directories for models and templates if they don't exist
RUN mkdir -p models templates

# Expose the port Streamlit runs on
EXPOSE 8501

# Set environment variables for Streamlit
ENV STREAMLIT_SERVER_PORT=8501
ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0

# Use the entrypoint script
ENTRYPOINT ["./scripts/docker_entrypoint.sh"]
61 changes: 42 additions & 19 deletions writing-app/README.md
@@ -9,10 +9,12 @@ A Streamlit-based web application for recognizing hand-drawn Sitelen Pona charac
## Table of Contents

- [Features](#features)
- [Technical Approach](#technical-approach)
- [Requirements](#requirements)
- [Setup and Running](#setup-and-running)
- [Option 1: Using Docker (Recommended)](#option-1-using-docker-recommended)
- [Option 2: Local Setup](#option-2-local-setup)
- [Testing](#testing)
- [Technical Approach](#technical-approach)
- [Project Structure](#project-structure)
- [Notes](#notes)
- [License](#license)
@@ -28,9 +30,7 @@ A Streamlit-based web application for recognizing hand-drawn Sitelen Pona charac
- [4. MediaPipe with EfficientNet (`mediapipe.app.py`)](#4-mediapipe-with-efficientnet-mediapipeapppy)
- [Results and Limitations](#results-and-limitations)
- [Conclusions and Future Directions](#conclusions-and-future-directions)
- [Screenshots](#screenshots)
- [Acceptable Solution Using MediaPipe with MobileNet](#acceptable-solution-using-mediapipe-with-mobilenet)
- [Unacceptable Solutions Using OpenCV and MediaPipe with EfficientNet](#unacceptable-solutions-using-opencv-and-mediapipe-with-efficientnet)
- [Screenshots of Failed Experiments Using OpenCV and MediaPipe with EfficientNet](#screenshots-of-failed-experiments-using-opencv-and-mediapipe-with-efficientnet)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

@@ -49,28 +49,35 @@ A Streamlit-based web application for recognizing hand-drawn Sitelen Pona charac

<img width="1728" alt="writing-app-screenshot" src="https://github.com/user-attachments/assets/3a0a485d-9d09-4a7c-9caa-d801c7fdbaeb" />

## Technical Approach
## Requirements

Pre-Processing Pipeline:
- Python 3.11 (recommended)
- Dependencies listed in `pyproject.toml`

1. Image Loading & Color Space: OpenCV (cv2)
2. Resizing & Canvas Centering: OpenCV (cv2)
3. Feature Extraction: MobileNetV3 (via MediaPipe Tasks)
4. Embedding Comparison: NumPy (cosine similarity)
## Setup and Running

Neural Network Details:
### Option 1: Using Docker (Recommended)

* **Model**: MobileNetV3-Small (Quantized)
* **Input Size**: 224x224 RGB
* **Output**: 1x1024 L2-normalized embedding
* **Framework**: MediaPipe Tasks Vision
The easiest way to run the application is using Docker:

## Requirements
1. Build the Docker image:
```bash
docker build -t writing-app .
```

- Python 3.11 or higher
- Dependencies listed in `pyproject.toml`
2. Run the container:
```bash
docker run -p 8501:8501 writing-app
```

## Setup and Running
3. Open your browser and navigate to:
```
http://localhost:8501
```

The Docker container will automatically download any required models on first run.
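
Because the model is fetched before Streamlit starts, the first `docker run` may take a while before the page responds. A minimal sketch of a readiness poll is shown here, assuming the Streamlit version in the image exposes the `/_stcore/health` endpoint used in Streamlit's deployment guides. The names `http_ok` and `wait_until_ready` are hypothetical helpers, not part of this commit.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error: not ready yet.
        return False


def wait_until_ready(
    url: str = "http://localhost:8501/_stcore/health",  # assumed health endpoint
    timeout_s: float = 120.0,
    interval_s: float = 1.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll url until probe succeeds or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_until_ready()` after starting the container returns `True` once the app answers, or `False` if the timeout passes first.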

### Option 2: Local Setup

1. Create and activate a Python virtual environment:

Expand Down Expand Up @@ -149,6 +156,22 @@ To run the tests locally:

The tests are also automatically run via GitHub Actions whenever changes are made to the `writing-app` directory.

## Technical Approach

Pre-Processing Pipeline:

1. Image Loading & Color Space: OpenCV (cv2)
2. Resizing & Canvas Centering: OpenCV (cv2)
3. Feature Extraction: MobileNetV3 (via MediaPipe Tasks)
4. Embedding Comparison: NumPy (cosine similarity)

Neural Network Details:

* **Model**: MobileNetV3-Small (Quantized)
* **Input Size**: 224x224 RGB
* **Output**: 1x1024 L2-normalized embedding
* **Framework**: MediaPipe Tasks Vision
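
Step 4 of the pipeline compares embeddings by cosine similarity. A minimal sketch of that comparison follows. The README says the app uses NumPy; this version uses only the standard library so that it runs without dependencies. The names `cosine_similarity` and `best_match`, and the idea of a dictionary of template embeddings, are assumptions for illustration and are not taken from the app's code.

```python
import math
from typing import Mapping, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float],
    templates: Mapping[str, Sequence[float]],
) -> Tuple[str, float]:
    """Return (name, score) of the template most similar to the query."""
    return max(
        ((name, cosine_similarity(query, emb)) for name, emb in templates.items()),
        key=lambda pair: pair[1],
    )
```

For L2-normalized embeddings such as the ones described above, both norms equal 1, so the similarity reduces to a plain dot product.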

## Project Structure

```text
Expand Down
8 changes: 8 additions & 0 deletions writing-app/scripts/docker_entrypoint.sh
@@ -0,0 +1,8 @@
#!/bin/bash
set -e

# First ensure the model exists
python scripts/ensure_model.py

# Then run the Streamlit app
exec streamlit run app.py
26 changes: 26 additions & 0 deletions writing-app/scripts/ensure_model.py
@@ -0,0 +1,26 @@
#!/usr/bin/env python3
import os
from pathlib import Path
import urllib.request

def download_model(model_path: str, model_url: str):
    """Download the model if it doesn't exist."""
    if not os.path.exists(model_path):
        print(f"Downloading model to {model_path}...")
        os.makedirs(os.path.dirname(model_path), exist_ok=True)
        urllib.request.urlretrieve(model_url, model_path)
        print("Model downloaded successfully!")
    else:
        print(f"Model already exists at {model_path}")

def main():
    # MediaPipe model URLs and paths
    models = {
        "models/mobilenet_v3_small.tflite": "https://storage.googleapis.com/mediapipe-models/image_embedder/mobilenet_v3_small/float32/1/mobilenet_v3_small.tflite",
    }

    for model_path, model_url in models.items():
        download_model(model_path, model_url)

if __name__ == "__main__":
    main()
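
One caveat with the existence check in `download_model`: if a download is interrupted, a partial file may remain at `model_path` and be treated as present on the next run. One possible way to avoid that, sketched below, is to download to a temporary file in the same directory and rename it into place only on success. The name `download_atomically` and the injectable `fetch` parameter are assumptions for illustration and are not part of this commit.

```python
import os
import tempfile
import urllib.request
from typing import Callable


def download_atomically(
    model_path: str,
    model_url: str,
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> bool:
    """Download model_url to model_path; return True if a download happened.

    The file appears at model_path only after the download has finished, so
    an interrupted run does not leave a truncated model at that path.
    """
    if os.path.exists(model_path):
        return False

    # Fall back to the current directory when the path has no directory part.
    target_dir = os.path.dirname(model_path) or "."
    os.makedirs(target_dir, exist_ok=True)

    # Temporary file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".part")
    os.close(fd)
    try:
        fetch(model_url, tmp_path)
        os.replace(tmp_path, model_path)  # atomic on POSIX when it succeeds
    except BaseException:
        # Remove the partial file, then re-raise the original error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

The `fetch` parameter defaults to `urllib.request.urlretrieve`, the same call the script uses, and can be swapped for a stub when testing without network access.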
