This is the project for the elective module "Python" at DHBW Stuttgart.
Running the application with Python 3.9 is recommended. All required libraries can be installed with pip:
pip install -r requirements.txt
You also need the file model.pth in the same folder from which you start the application. This file contains the pre-trained model, which the program loads to generate its results.
This pre-trained model has an accuracy of 98.39%.
After that you only need to execute the following command:
python main.py
To retrain the neural network, or to change the hyperparameters for example, run the TrainModel.ipynb notebook. At the end, the notebook saves the trained model as model.pth.
This program uses the balanced EMNIST dataset, which has only 47 classes instead of the full 62. This makes the neural net easier to train: every class has the same number of images, so the net is not biased towards a character or digit that is overrepresented. The diagram below shows which characters are treated as the same class (e.g. i=I, j=J).
Source:
Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373
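The reduction from 62 to 47 classes can be sketched in a few lines of Python. This is only an illustration: the names MERGED_LETTERS, CLASSES and class_for_character are hypothetical and not taken from this project, and the label order (digits, uppercase letters, then the remaining lowercase letters) is an assumption based on the mapping described for the balanced split in the paper cited above.

```python
import string

# Letters whose lowercase form is merged into the uppercase class in the
# balanced split (15 letters, as described by Cohen et al., 2017).
MERGED_LETTERS = "cijklmopsuvwxyz"

# Lowercase letters that keep their own class.
UNMERGED_LOWERCASE = [c for c in string.ascii_lowercase if c not in MERGED_LETTERS]

# Assumed label order: digits, uppercase letters, remaining lowercase letters.
CLASSES = list(string.digits) + list(string.ascii_uppercase) + UNMERGED_LOWERCASE


def class_for_character(ch: str) -> int:
    """Return the class index of a character, folding merged lowercase letters."""
    if ch in MERGED_LETTERS:
        ch = ch.upper()
    return CLASSES.index(ch)


print(len(CLASSES))  # 10 digits + 26 uppercase + 11 lowercase
print(class_for_character("i") == class_for_character("I"))
```

With this mapping, "i" and "I" end up in the same class, while "a" and "A" stay separate.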
The neural network is implemented with the PyTorch library. It is a simple convolutional neural network with 2 convolutional layers, each followed by the LeakyReLU activation function and a max-pooling layer with a kernel_size of 2. After the last convolutional layer, the data has a size of 5x5 with 72 channels. This tensor is reshaped so that it fits into the final Linear layers, which produce the output. The output tensor is an array of length output_size (here 47 classes). The app turns this output into probabilities with a softmax function.
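The shape arithmetic and the softmax step described above can be traced in plain Python. This is a rough sketch, not the project's code: the convolution kernel sizes (5 and 3, no padding) are assumptions chosen so that a 28x28 EMNIST image ends at 5x5, and the helpers conv_out and softmax are hypothetical names. Only the pooling kernel of 2, the 72 channels and the 47 outputs come from the description.

```python
import math


def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed kernel sizes for the two convolutions: 5 and 3, without padding.
size = 28                                   # EMNIST images are 28x28
size = conv_out(size, kernel=5)             # convolution 1: 28 -> 24
size = conv_out(size, kernel=2, stride=2)   # max-pooling:   24 -> 12
size = conv_out(size, kernel=3)             # convolution 2: 12 -> 10
size = conv_out(size, kernel=2, stride=2)   # max-pooling:   10 -> 5

# Number of values after reshaping the 72 x 5 x 5 tensor into one vector.
flat_features = 72 * size * size


def softmax(logits):
    """Turn raw network outputs into probabilities that sum to 1."""
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([0.0] * 47)  # 47 equal outputs give a uniform result
print(size, flat_features)
```

Under these assumptions, the first Linear layer would receive 1800 input features per image.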
This program is licensed under the MIT License.