Restoring damaged Sinhala handwritten documents using image processing, machine learning (ML), and natural language processing (NLP) techniques.

yesitha/Sinhala-Document-Restoration-FYP

Context-Aware Damage Detection and Text Restoration in Sinhala Handwritten Documents

This repository contains the implementation of a research project that restores damaged Sinhala handwritten documents using image processing, machine learning (ML), and natural language processing (NLP) techniques. The project is divided into four modules, each addressing a different aspect of document damage detection and restoration.

Project Overview

Handwritten Sinhala documents, often preserved in libraries, offices, and archives, are subject to various kinds of damage, such as ink blotches, wear and tear, insect attacks, and missing words. Compared with English text, Sinhala script poses additional restoration challenges because of its curved characters and complex structures. This project detects, classifies, and restores damaged handwritten content, aiming to keep the text readable while preserving the original writing style.

Project Modules

🔹 Module 1 - Damage Detection & Classification

  • Detects damaged areas in handwritten Sinhala documents.
  • Classifies each damaged area as an ink blotch, wear and tear, blurred text, an insect attack, or a peeled-off area.
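
The five damage classes listed above could be represented with a small data model. The following is a minimal sketch, not the repository's actual code: the names `DamageType`, `DamageRegion`, and `summarize`, the bounding-box fields, and the `min_confidence` threshold are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class DamageType(Enum):
    """The five damage classes named in Module 1."""
    INK_BLOTCH = "ink_blotch"
    WEAR_AND_TEAR = "wear_and_tear"
    BLURRED_TEXT = "blurred_text"
    INSECT_ATTACK = "insect_attack"
    PEELED_OFF = "peeled_off"


@dataclass(frozen=True)
class DamageRegion:
    """Hypothetical detector output: a bounding box plus a class label."""
    x: int
    y: int
    width: int
    height: int
    damage_type: DamageType
    confidence: float  # assumed to lie in [0, 1]

    @property
    def area(self) -> int:
        return self.width * self.height


def summarize(regions, min_confidence=0.5):
    """Count regions per damage type, ignoring low-confidence detections."""
    return Counter(
        r.damage_type for r in regions if r.confidence >= min_confidence
    )
```

A structure like this would let the later modules filter regions by type, for example routing only `BLURRED_TEXT` regions to a deblurring step.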

🔹 Module 2 - Damage Categorization & Missing Content Estimation

  • Categorizes detected damage as either minor (affecting parts of letters) or contextual (missing full words or phrases).
  • Estimates missing content based on word spacing, character structure, and paragraph context.
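
As a rough illustration of the spacing-based idea above, the sketch below categorizes a damaged gap and estimates how many characters it may have held. It is an assumption-laden toy, not the project's method: the function `estimate_gap`, the rule that a gap narrower than one average character counts as minor, and the pixel-based parameters are all hypothetical. It ignores character structure and paragraph context.

```python
from dataclasses import dataclass


@dataclass
class GapEstimate:
    category: str   # "minor" or "contextual"
    est_chars: int  # estimated number of missing characters


def estimate_gap(gap_width_px, avg_char_width_px, avg_word_gap_px=0.0):
    """Estimate missing content from the width of a damaged gap.

    Assumption: a gap narrower than one average character only damages
    part of a letter (minor); anything wider is treated as contextual.
    """
    if avg_char_width_px <= 0:
        raise ValueError("avg_char_width_px must be positive")
    if gap_width_px < avg_char_width_px:
        return GapEstimate("minor", 0)
    # Subtract one inter-word space before dividing by character width.
    usable = max(gap_width_px - avg_word_gap_px, 0.0)
    est_chars = max(1, round(usable / avg_char_width_px))
    return GapEstimate("contextual", est_chars)
```

In practice the average widths would have to be measured from the undamaged parts of the same page, since handwriting size varies between writers.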

🔹 Module 3 - Blurred Text Restoration & Letter-Level Damage Reconstruction

  • Restores blurred or bleed-through text to improve readability.
  • Reconstructs minor letter-level damages while preserving original handwriting style.
  • Ensures restored text blends seamlessly into the document.
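
The README does not say which technique Module 3 uses. As a stand-in, the sketch below shows classical unsharp masking, a simple way to increase edge contrast in a blurred grayscale image. It operates on plain nested lists so that it runs without dependencies; the function names `box_blur` and `unsharp_mask` and the 3x3 box kernel are choices made for this example only.

```python
def box_blur(img):
    """3x3 box blur; border pixels average only the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out


def unsharp_mask(img, amount=1.0):
    """Sharpen by adding back the difference between image and its blur.

    Each output pixel is p + amount * (p - blurred_p), clamped to [0, 255].
    """
    blurred = box_blur(img)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, brow)]
        for row, brow in zip(img, blurred)
    ]
```

Uniform areas are left unchanged, while pixels next to an intensity edge are pushed further apart. Preserving handwriting style, as the module requires, would need considerably more than this filter.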

🔹 Module 4 - Contextual Missing Text Reconstruction

  • Restores larger missing sections, including entire letters, words, and phrases.
  • Ensures the restored text maintains semantic coherence with the surrounding content.
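
One simple way to picture "semantic coherence with the surrounding content" is to rank candidate words by how often they co-occur with their neighbours. The sketch below uses bigram counts for this. It is only an illustration under stated assumptions: the project's actual NLP model is not described here, and the functions `train_bigrams` and `fill_gap` as well as the placeholder tokens are hypothetical.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count adjacent token pairs over a corpus of tokenized sentences."""
    counts = Counter()
    for tokens in sentences:
        counts.update(zip(tokens, tokens[1:]))
    return counts


def fill_gap(left, right, candidates, bigrams):
    """Pick the candidate that best fits between its two neighbours.

    Score = count(left, word) + count(word, right); unseen pairs count 0.
    """
    def score(word):
        return bigrams[(left, word)] + bigrams[(word, right)]

    return max(candidates, key=score)


# Toy corpus with placeholder tokens standing in for Sinhala words.
corpus = [["a", "b", "c"], ["a", "b", "d"], ["a", "x", "c"]]
bigrams = train_bigrams(corpus)
best = fill_gap("a", "c", ["b", "x"], bigrams)  # "b" scores 3, "x" scores 2
```

A real system would likely combine such a language score with visual evidence, for example the estimated number of missing characters from Module 2.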
