Live Desktop Translator

Proof of concept for a desktop application that translates on-screen elements based on optical character recognition (OCR).

🔍 Overview

While web pages can be easily translated using browser extensions, the text in some applications cannot be easily extracted (e.g. video games, images, scanned documents). The goal of this application is to 1) capture a portion of the screen, 2) recognize the text it contains, 3) translate it, and finally 4) paint it back on the screen, all in real time. Currently, the application takes the form of an overlay and uses SuryaOCR for layout understanding, EasyOCR for optical character recognition, and either Argos Translate (local) or MyMemory (online) for translation.
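The four steps above can be sketched as a single pass over pluggable stages. This is only an illustration of the data flow, not the project's actual code: `TextRegion`, `translate_frame`, and the four callables are hypothetical names, and the real stages would wrap a screen grabber, EasyOCR, and Argos Translate or MyMemory.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# (left, top, width, height) in screen pixels; a hypothetical convention.
Box = Tuple[int, int, int, int]


@dataclass
class TextRegion:
    """A piece of recognized text and where it sits on screen."""
    box: Box
    text: str


def translate_frame(
    capture: Callable[[Box], object],
    recognize: Callable[[object], List[TextRegion]],
    translate: Callable[[str], str],
    paint: Callable[[List[TextRegion]], None],
    area: Box,
) -> List[TextRegion]:
    """Run one capture -> recognize -> translate -> paint pass."""
    image = capture(area)          # 1) grab the pixels under the overlay
    regions = recognize(image)     # 2) OCR: text plus bounding boxes
    translated = [                 # 3) translate each region's text
        TextRegion(region.box, translate(region.text)) for region in regions
    ]
    paint(translated)              # 4) draw the result over the source
    return translated


if __name__ == "__main__":
    # Stub stages stand in for the real capture, OCR and translation back ends.
    result = translate_frame(
        capture=lambda area: "fake-image",
        recognize=lambda image: [TextRegion((0, 0, 80, 20), "hello")],
        translate=lambda text: {"hello": "bonjour"}.get(text, text),
        paint=lambda regions: None,
        area=(0, 0, 800, 600),
    )
    print(result)
```

Keeping each stage behind a plain callable makes it easy to swap the OCR or translation back end without touching the loop that calls this function for every frame.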

Below is an example of translation from English to French by overlaying the application on top of a PDF:

The main limitation of this application is layout understanding, i.e. deciding whether recognized words should be concatenated into a sentence or not. SuryaOCR, the layout understanding model currently in use, was trained on structured documents (mainly PDFs and newspapers) and does not work well on content such as algorithms, tables, game footage, or captions.
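To make the layout problem concrete, here is a minimal sketch of a purely geometric grouping rule that joins words sitting on the same visual line. It is not what SuryaOCR does and is not part of this project; the `Word` tuple layout, `group_into_lines`, and the `tolerance` parameter are assumptions made for the example.

```python
from typing import List, Tuple

# (text, x, y_top, height) in pixels; a hypothetical, simplified OCR result.
Word = Tuple[str, float, float, float]


def group_into_lines(words: List[Word], tolerance: float = 0.5) -> List[str]:
    """Join words whose vertical centers are close into one line of text."""
    lines: List[List[Word]] = []
    # Visit words top to bottom, then left to right.
    for word in sorted(words, key=lambda w: (w[2], w[1])):
        _, _, y, height = word
        center = y + height / 2
        for line in lines:
            _, _, line_y, line_height = line[0]
            line_center = line_y + line_height / 2
            # Same line if the centers differ by less than a fraction of the text height.
            if abs(center - line_center) <= tolerance * max(height, line_height):
                line.append(word)
                break
        else:
            lines.append([word])
    # Order each line left to right before joining.
    return [" ".join(w[0] for w in sorted(line, key=lambda w: w[1])) for line in lines]


if __name__ == "__main__":
    paragraph = [("world", 60, 10, 12), ("Hello", 0, 11, 12), ("Score", 0, 100, 12)]
    print(group_into_lines(paragraph))  # ['Hello world', 'Score']

    # Two table cells on the same row are wrongly merged into one "sentence".
    table_row = [("Name", 0, 0, 10), ("Price", 200, 0, 10)]
    print(group_into_lines(table_row))  # ['Name Price']
```

The second call shows why geometry alone is not enough: the rule has no notion of columns, so table cells, HUD elements, or captions that happen to share a baseline get concatenated.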

🛠️ Development

Modal for adding/removing other input/output languages
Application packaging
Parameters save button
Set updateIgnoreMouseEvents using coordinates instead of the alpha value
Identify sub-tasks for specialized layout detection models (e.g. video games, outdoor scenes, PDFs)

⚙️ Starting the app (for development)

1. Electron

cd electron_gui

# Install the required packages
npm install

# Launch the Electron app (the extra "--" makes npm forward the flag to the start script)
npm start -- --enable-logging

2. Python

Option 1 (recommended) - With Anaconda

Download Anaconda.

For Windows users, if conda is not recognized as a command by the terminal, add C:\ProgramData\anaconda3\Scripts to the user's Path environment variable.

cd python_server

# Create the virtual environment and install the packages with conda
conda env create --file environment.yml --prefix ./ldtvenv

# Activate the virtual environment (on Linux/macOS, use ./ldtvenv)
conda activate .\ldtvenv

Option 2 (untested) - With pip

Download Python 3.12.7 (don't forget to add it to the PATH during install).

cd python_server

# Create the empty virtual environment
# (py is the Windows launcher; on Linux, use: python3.12 -m venv ldtvenv)
py -3.12 -m venv ldtvenv

# Activate the virtual environment
# On Windows:
.\ldtvenv\Scripts\activate
# On Linux:
source ldtvenv/bin/activate

# Install PyTorch (CUDA 11.8 build)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118

# Install PaddlePaddle, the framework PaddleOCR runs on (CUDA 11.8 build)
pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# Install the rest of the packages
pip install -r requirements.txt
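Since the pip route is untested, it can help to confirm that the key packages are visible to the interpreter before starting the server. The following is a small sketch using only the standard library; `check_packages` is a hypothetical helper, and the module names (`torch`, `torchvision`, `paddle`) are assumed import names for the packages installed above.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def check_packages(names: Iterable[str]) -> Dict[str, bool]:
    """Report which top-level modules can be found in the active environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names; extend the list with whatever requirements.txt installs.
    for name, found in check_packages(["torch", "torchvision", "paddle"]).items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the virtual environment activated. A package reported as MISSING usually means the install ran in a different interpreter than the one currently active.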

💾 Packaging the application

Start by bundling the Python application and its dependencies into a single executable that can be run by the user without installing Python. We'll use PyInstaller:

cd python_server
pyinstaller --onefile server.py

Currently, you need to copy the generated file python_server\dist\server.exe into electron_gui\assets\.
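The manual copy step could be scripted. Below is a minimal sketch, assuming the folder layout described in this README; `copy_server_binary` is a hypothetical helper that does not exist in the repository.

```python
import shutil
from pathlib import Path


def copy_server_binary(repo_root: Path, name: str = "server.exe") -> Path:
    """Copy the PyInstaller output into the Electron assets folder."""
    source = repo_root / "python_server" / "dist" / name
    target_dir = repo_root / "electron_gui" / "assets"
    if not source.is_file():
        raise FileNotFoundError(f"Run PyInstaller first: {source} not found")
    # Create electron_gui/assets if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as timestamps.
    return Path(shutil.copy2(source, target_dir / name))
```

Calling `copy_server_binary(Path.cwd())` from the repository root after the PyInstaller step returns the path of the copied file, or raises `FileNotFoundError` if the executable has not been built yet.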

Then, we'll create the Electron executable using electron-forge:

cd electron_gui
npm run make

You'll find the resulting application in a path similar to electron_gui\out\live_desktop_translator-win32-x64 (the last folder depends on your system's architecture).

⚖️ License

This code is released under the MIT license. See the LICENSE file for more information.
