English letter-to-sound tool
This project aims to implement an algorithm proposed in a paper. The repository contains:
- g2p_data_pre_process.py - a Python script that pre-processes grapheme-to-phoneme (g2p) data using rules.
- g2p_aligner.py - a Python script that trains the model with the EM and DTW algorithms.
- assets/ - stores the training data sets.
- log/ - holds the log files.
Every word-phones pair is matched against every rule, based on the positions of the graphemes and phonemes:
- Match a word-phones pair to a rule based on the positions of its graphemes and phonemes.
- If a rule matches, modify the phones and rematch the word-phones pair against every rule, until no rule matches.
- Set an admissible error to tolerate some error caused by preceding silent graphemes (letters).
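
The matching loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in g2p_data_pre_process.py: the `Rule` type, the `match_rule` and `apply_rules` functions, the `admissible_error` and `max_passes` parameters, and the example rule that merges `K S` into `K_S` for the letter `x` are all hypothetical names chosen for this sketch. It assumes a rule matches when its grapheme and its phoneme sequence occur at positions that differ by at most the admissible error.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple


class Rule(NamedTuple):
    """Hypothetical rule: rewrite `phonemes` when `grapheme` is nearby."""
    grapheme: str                 # letter(s) to look for in the word
    phonemes: Tuple[str, ...]     # phoneme sequence to look for in the phones
    replacement: Tuple[str, ...]  # what the matched phonemes become


def match_rule(word: str, phones: Sequence[str], rule: Rule,
               admissible_error: int = 1) -> Optional[int]:
    """Return the phone index where the rule matches, or None."""
    n = len(rule.phonemes)
    g_pos = word.find(rule.grapheme)
    while g_pos != -1:
        for p_pos in range(len(phones) - n + 1):
            same = tuple(phones[p_pos:p_pos + n]) == rule.phonemes
            # Earlier silent letters shift the grapheme index away from the
            # phoneme index, so allow a small offset between the two.
            if same and abs(g_pos - p_pos) <= admissible_error:
                return p_pos
        g_pos = word.find(rule.grapheme, g_pos + 1)
    return None


def apply_rules(word: str, phones: Sequence[str], rules: Sequence[Rule],
                admissible_error: int = 1, max_passes: int = 100) -> List[str]:
    """Rewrite phones and rematch against every rule until none matches."""
    result = list(phones)
    for _ in range(max_passes):  # guard against rules that never stop matching
        for rule in rules:
            pos = match_rule(word, result, rule, admissible_error)
            if pos is not None:
                result[pos:pos + len(rule.phonemes)] = rule.replacement
                break  # restart matching from the first rule
        else:
            break  # no rule matched in this pass
    return result


# Assumed example rule: the single letter "x" covers the two phones K S.
RULES = [Rule("x", ("K", "S"), ("K_S",))]

print(apply_rules("box", ["B", "AA", "K", "S"], RULES))
print(apply_rules("knox", ["N", "AA", "K", "S"], RULES, admissible_error=1))
print(apply_rules("knox", ["N", "AA", "K", "S"], RULES, admissible_error=0))
```

In "knox" the silent "k" puts "x" at letter index 3 while `K S` starts at phone index 2, so the rule only fires when the admissible error is at least 1.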