A system that automatically detects handwritten Nepali text, recognizes it, and stores it digitally in a .txt file.

yubraaj11/handwritten-document-conversion

Handwritten Document Conversion

Introduction

This project builds an end-to-end system that detects handwritten text in Devanagari script and converts it to digital text. Using TrOCR (Transformer-based OCR), the system extracts text from scanned documents that contain printed and handwritten content, with a focus on the Nepali language. The model uses a Vision Transformer (ViT) as the encoder to extract image features and NepBERT, a variant of RoBERTa, as the decoder to generate text.
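
The encoder-decoder pairing described above could be assembled roughly as follows with the Hugging Face transformers library. This is a minimal sketch, not the project's actual training code: the function name `build_vit_roberta_ocr` is hypothetical, and both checkpoint names are left as parameters because the README does not say which ViT or NepBERT checkpoints were used.

```python
from typing import Any, Tuple


def build_vit_roberta_ocr(encoder_checkpoint: str, decoder_checkpoint: str) -> Tuple[Any, Any]:
    """Pair a ViT-style image encoder with a RoBERTa-style text decoder.

    Both arguments are Hugging Face checkpoint names or local paths.
    They are assumptions of this sketch; the README does not name them.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoTokenizer, VisionEncoderDecoderModel

    # Loads both halves and adds cross-attention layers to the decoder.
    model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
        encoder_checkpoint, decoder_checkpoint
    )
    tokenizer = AutoTokenizer.from_pretrained(decoder_checkpoint)

    # Generation needs to know which tokens start, pad, and end a sequence.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    return model, tokenizer
```

Calling the function downloads or loads both checkpoints, so it needs transformers, a PyTorch install, and access to the chosen checkpoints.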

Goals

  • Develop a high-accuracy OCR model for handwritten Nepali text recognition.
  • Automate the conversion of scanned handwritten documents into digital text.

Contributors

  • Aayush Puri
  • Anil Paudel
  • Yubraj Sigdel

Project Architecture

[Figure: project architecture diagram (screenshot dated 2024-10-07)]


Status

  • Current phase: Model Deployment

Known Issues

  • Minor inaccuracies in detecting certain handwriting styles.
  • Overfitting on specific types of Devanagari words. The model still lacks the robustness to generalize across handwritten Nepali text.

High-Level Next Steps

  • Fine-tune the model to handle additional handwriting styles.
  • Expand the system to support batch inference of documents.

Usage

Creating Virtual Environment

This project requires Python 3.10. To ensure compatibility, we recommend creating a virtual environment.

conda create -n handwritten python=3.10
conda activate handwritten
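
Once the environment is active, a short check like the one below can confirm that the interpreter matches the required version. It is only an illustrative sketch; the helper name `is_supported` is hypothetical and not part of the repository.

```python
import sys

# Version stated in this README as required.
REQUIRED = (3, 10)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True when the major and minor version match REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    if is_supported():
        print("Python version OK")
    else:
        found = ".".join(str(part) for part in sys.version_info[:3])
        print(f"Expected Python 3.10, found {found}")
```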

Pulling Repository

For Linux (over SSH)

git clone git@github.com:fuseai-fellowship/hand-written-document-conversion.git

For Windows (over HTTPS)

git clone https://github.com/fuseai-fellowship/hand-written-document-conversion.git

Installing Requirements

pip install -r requirements.txt

To sync and clean unused dependencies:

make deps-sync

Sample UI: screenshot to be added.


Usage Instructions

Follow the steps below to run the system and test it on your own documents:

  1. Upload a scanned handwritten document.
  2. Run the system to extract the handwritten text.
  3. View the results in digital format displayed beside the image input.
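
The steps above can be pictured as a detect-then-recognize loop. The sketch below is an assumption-laden outline, not the repository's code: `detect` and `recognize` are hypothetical callables standing in for the YOLO text detector and the TrOCR recognizer, and the box format and line-grouping tolerance are illustrative choices.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def reading_order(boxes: Sequence[Box], line_tolerance: int = 10) -> List[List[Box]]:
    """Group boxes into text lines (top to bottom), each sorted left to right."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b[1]):
        # A box joins the current line if its top edge is close to the line's first box.
        if lines and abs(box[1] - lines[-1][0][1]) <= line_tolerance:
            lines[-1].append(box)
        else:
            lines.append([box])
    return [sorted(line, key=lambda b: b[0]) for line in lines]


def convert_document(
    image: Any,
    detect: Callable[[Any], Sequence[Box]],
    recognize: Callable[[Any, Box], str],
    line_tolerance: int = 10,
) -> str:
    """Detect text regions, recognize each one, and join the results as plain text."""
    lines = reading_order(detect(image), line_tolerance)
    return "\n".join(
        " ".join(recognize(image, box) for box in line) for line in lines
    )
```

Passing the detector and recognizer in as functions keeps the ordering logic independent of any particular model, so it can be tried with simple stand-ins before the real models are wired in.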

Data Source

  • The system uses a custom annotated dataset of Nepali text that includes both handwritten and printed samples.
  • Source data includes documents from various sectors such as education and government.

Code Structure

  • /src: Contains the core processing scripts.
  • /notebook: Contains the notebook used while fine-tuning the TrOCR model.
  • /models: Includes the pre-trained YOLO model for text detection.
  • /data: Houses training and test datasets.

Artifacts Location

  • Output files and extracted texts are stored in the /output directory.
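
As an illustration of how extracted text might be written to that directory, here is a minimal sketch. The helper name `save_extracted_text` and the one-.txt-file-per-image naming scheme are assumptions, not confirmed behaviour of the repository. UTF-8 is set explicitly so Devanagari characters are written correctly on every platform.

```python
from pathlib import Path


def save_extracted_text(text: str, image_path: str, output_dir: str = "output") -> Path:
    """Write recognized text to <output_dir>/<image name>.txt and return the path."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumed naming scheme: reuse the image file name with a .txt extension.
    out_path = out_dir / (Path(image_path).stem + ".txt")
    out_path.write_text(text, encoding="utf-8")
    return out_path
```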

Results

Metrics Used

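  • Character Error Rate (CER): The character-level edit distance (insertions, deletions, and substitutions) between the predicted text and the reference text, divided by the length of the reference. Lower is better.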
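
For reference, CER can be computed with a standard Levenshtein distance, as in the sketch below. This is a generic implementation rather than the project's evaluation script, and it counts Unicode code points, so a Devanagari vowel sign counts as its own character. The project may count characters differently.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = previous[j] + 1      # drop a reference character
            insertion = current[j - 1] + 1  # add a hypothesis character
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, comparing the reference "नेपाल" (five code points) with the prediction "नेपाली" requires one insertion, giving a CER of 1/5 = 20%.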

Evaluation Results

  • The system achieved a CER of 9.05% on the test set.
  • This result suggests that the model generalizes across the handwriting styles in the test set, subject to the limitations listed under Known Issues.
