Ivanchevski spelling converter web app

A Django web app whose main functionality is converting text from the modern Bulgarian spelling into the historic Ivanchevski spelling.

Spelling converter logic

The main spelling conversion logic is in converter/converter.py, with some helper logic in converter/process_vocabs.py and converter/pos_tagger.py.

POS tagging using BERT

In some cases the correct spelling of a word depends on its part of speech, so we need part-of-speech (POS) tagging. For this purpose we use a multilingual BERT model fine-tuned for POS tagging on Bulgarian. The training dataset is the one provided by Bulgarian Tree Bank.

The trained weights are not included in the repo, since they're a little too big. For now I've uploaded them here. The weights file should be placed at converter/static/converter/bert_model/bert_last_epoch.h5. If it is missing, the app will still work, but it will skip POS tagging.
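
The optional-weights behaviour described above can be sketched roughly as follows. This is only an illustration, not the code in converter/pos_tagger.py: the function names `weights_available` and `tag_words`, the `tagger` callable, and resolving the path relative to the project root are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, List, Optional, Sequence

# Location named in this README; treating it as relative to the project root
# is an assumption of this sketch.
WEIGHTS_PATH = Path("converter/static/converter/bert_model/bert_last_epoch.h5")


def weights_available(path: Path = WEIGHTS_PATH) -> bool:
    """Return True if a weights file exists at the given location."""
    return path.is_file()


def tag_words(
    words: Sequence[str],
    tagger: Optional[Callable[[Sequence[str]], Sequence[str]]] = None,
    path: Path = WEIGHTS_PATH,
) -> List[Optional[str]]:
    """Return one POS tag per word, or None per word when tagging is unavailable.

    `tagger` is a hypothetical stand-in for whatever wraps the fine-tuned
    BERT model; it is not the app's real API.
    """
    if tagger is None or not weights_available(path):
        # No weights (or no tagger): fall back to conversion without POS information.
        return [None] * len(words)
    return list(tagger(words))
```

With this shape, the converter can treat a `None` tag as "part of speech unknown" and apply only the rules that do not depend on POS.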
