Handwritten Document Conversion

Introduction

This project builds an end-to-end system that detects handwritten text in scanned documents and converts it to digital text, with a focus on the Devanagari script and the Nepali language. The system is built on TrOCR (Transformer-based OCR) and extracts both printed and handwritten content. The model uses a Vision Transformer (ViT) as the encoder to extract image features and NepBERT, a RoBERTa variant, as the decoder to generate text.

Goals

Develop a high-accuracy OCR model for handwritten Nepali text recognition.
Automate the conversion of scanned handwritten documents into digital text.

Contributors

Aayush Puri
Anil Paudel
Yubraj Sigdel

Project Architecture

Status

Current phase: Model Deployment

Known Issues

Minor inaccuracies in detecting certain handwriting styles.
Overfitting on specific types of Devanagari words. The model is not yet robust enough to generalize across handwritten Nepali text.

High-Level Next Steps

Fine-tune the model to handle additional handwritten styles.
Expand the system to support batch inference of documents.
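
The batch inference step above could be structured roughly as follows. This is a minimal sketch, not the project's implementation: `recognize` is a hypothetical callable standing in for the detection and recognition pipeline, and the set of image suffixes and the default batch size are assumptions.

```python
from pathlib import Path
from typing import Callable, Iterator

# Assumed set of supported scan formats; adjust to the project's real inputs.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_images(folder: Path) -> list[Path]:
    """Return the image files in `folder`, sorted by name for a stable order."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def batched(items: list, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def run_batch_inference(
    folder: Path,
    recognize: Callable[[list[Path]], list[str]],  # hypothetical OCR entry point
    batch_size: int = 8,
) -> dict[str, str]:
    """Map each image file name in `folder` to the text returned for it."""
    results: dict[str, str] = {}
    for batch in batched(list_images(folder), batch_size):
        texts = recognize(batch)  # expected: one string per image, same order
        if len(texts) != len(batch):
            raise ValueError("recognize must return one text per input image")
        for path, text in zip(batch, texts):
            results[path.name] = text
    return results
```

Passing the recognizer in as a callable keeps the batching logic independent of the model, so it can be exercised with a stub before the real pipeline is wired in.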

Usage

Creating Virtual Environment

This project requires Python 3.10. To ensure compatibility, we recommend creating a dedicated virtual environment with conda.

conda create -n handwritten python=3.10
conda activate handwritten

Pulling Repository

For Linux (SSH)

git clone git@github.com:fuseai-fellowship/hand-written-document-conversion.git

For Windows (HTTPS)

git clone https://github.com/fuseai-fellowship/hand-written-document-conversion.git

Installing Requirements

pip install -r requirements.txt

To sync and clean unused dependencies:

make deps-sync

Usage Instructions

Follow the steps below to run the system and test it on your own documents:

Upload a scanned handwritten document.
Run the system to extract the handwritten text.
View the results in digital format displayed beside the image input.

Data Source

The system uses a custom, annotated dataset of Nepali text that includes both printed and handwritten samples.
Source data includes documents from various sectors such as education and government.

Code Structure

/src: Contains the core processing scripts.
/notebook: Contains the notebook used for fine-tuning the TrOCR model.
/models: Includes the pre-trained YOLO model for text detection.
/data: Houses training and test datasets.

Artifacts Location

Output files and extracted texts are stored in the /output directory.
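
As an illustration of how extracted text might be written to the output directory, here is a minimal sketch. The function name `save_extracted_text`, the one-file-per-image layout, and the `.txt` naming are assumptions made for this example, not the project's actual behavior.

```python
from pathlib import Path


def save_extracted_text(
    image_path: Path,
    text: str,
    output_dir: Path = Path("output"),  # assumed to be relative to the repo root
) -> Path:
    """Write `text` to `<output_dir>/<image name>.txt` and return that path."""
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / f"{image_path.stem}.txt"
    # UTF-8 is set explicitly so Devanagari text is stored correctly
    # regardless of the platform's default encoding.
    target.write_text(text, encoding="utf-8")
    return target
```

Setting the encoding explicitly matters most on Windows, where the default text encoding is often not UTF-8.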

Results

Metrics Used

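Character Error Rate (CER): The number of character-level edits (insertions, deletions, and substitutions) needed to turn the predicted text into the reference text, divided by the number of characters in the reference. Lower is better.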
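
For reference, CER can be computed with a standard Levenshtein edit distance. The sketch below is not taken from the project's evaluation code; it counts characters as Unicode code points, which is an assumption, since a Devanagari evaluation could also count grapheme clusters and would then report different values.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance between two strings, counted in code points."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One missing vowel sign out of five reference code points.
print(character_error_rate("नेपाल", "नेपल"))  # 0.2
```

Because the denominator is the reference length, CER can exceed 1.0 when the prediction is much longer than the reference.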

Evaluation Results

The system achieved a CER of 9.05% on the test set.
This result suggests the model can handle a range of handwriting styles, although robustness to unseen styles remains limited (see Known Issues).
