This was the final project for the course Deep Learning System that I took in Spring 2018.

akterTaslima/Speech_Recognition

SPEECH RECOGNITION SYSTEM USING DEEP NEURAL NETWORK

We present an automatic speech recognition system developed using end-to-end deep learning. Traditional speech systems usually rely on laboriously engineered processing pipelines and tend to perform poorly in noisy environments. Our architecture is much simpler than these pipelines and directly learns a function that is robust to background noise, reverberation, and speaker variation. As a result, we do not need to hand-design these components, nor do we need a phoneme dictionary. The deep learning models CNNs, RNNs, and DNNs are complementary in their modeling capabilities: CNNs are good at reducing frequency variations, RNNs are good at modeling temporal dependencies, and DNNs are appropriate for mapping features to a more separable space. In this project, we take advantage of this complementarity by combining CNNs, RNNs, and DNNs into one unified CRNN architecture. Our system provides state-of-the-art results on the widely studied TIMIT corpus, including in noisy environments.
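To make the division of labor between the three stages concrete, here is a minimal NumPy sketch of a forward pass through a CNN, RNN, and DNN stack. It is an illustration under assumptions, not the project's implementation: the function names, layer sizes, the plain tanh recurrence, the random weights, and the 30 output classes are all hypothetical choices made only to show how the data flows.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv_over_frequency(x, kernels):
    """CNN stage: slide each kernel along the frequency axis of every frame.

    x: (time, freq), kernels: (n_filters, width).
    Returns (time, n_filters * (freq - width + 1)) after a ReLU.
    """
    n_filters, width = kernels.shape
    n_time, n_freq = x.shape
    out_freq = n_freq - width + 1
    # Gather sliding windows: (time, out_freq, width)
    windows = np.stack([x[:, i:i + width] for i in range(out_freq)], axis=1)
    feats = np.maximum(windows @ kernels.T, 0.0)  # (time, out_freq, n_filters)
    return feats.reshape(n_time, -1)


def recurrent_over_time(x, w_in, w_rec):
    """RNN stage: a plain tanh recurrence that carries context across frames."""
    h = np.zeros(w_rec.shape[0])
    states = []
    for frame in x:
        h = np.tanh(frame @ w_in + h @ w_rec)
        states.append(h)
    return np.stack(states)  # (time, hidden)


def dense_softmax(x, w_out):
    """DNN stage: map each hidden state to a distribution over output classes."""
    logits = x @ w_out
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def init_params(n_freq=40, n_filters=4, width=5, hidden=16, n_classes=30):
    """Random weights with hypothetical sizes (not taken from the project)."""
    conv_dim = n_filters * (n_freq - width + 1)
    return {
        "kernels": rng.standard_normal((n_filters, width)) * 0.1,
        "w_in": rng.standard_normal((conv_dim, hidden)) * 0.1,
        "w_rec": rng.standard_normal((hidden, hidden)) * 0.1,
        "w_out": rng.standard_normal((hidden, n_classes)) * 0.1,
    }


def crnn_forward(spectrogram, params):
    """Chain the three stages: frequency conv, temporal recurrence, dense output."""
    conv = conv_over_frequency(spectrogram, params["kernels"])
    rec = recurrent_over_time(conv, params["w_in"], params["w_rec"])
    return dense_softmax(rec, params["w_out"])


params = init_params()
spectrogram = rng.standard_normal((50, 40))  # 50 frames, 40 frequency bins
frame_probs = crnn_forward(spectrogram, params)
print(frame_probs.shape)
```

Each row of `frame_probs` is a per-frame distribution over the hypothetical output classes. A trained system would learn these weights from data and would typically use gated recurrent units and a sequence-level loss rather than the untrained plain recurrence shown here.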

Final report: [LINK].

Authors:

Taslima Akter (takter@iu.edu)

Khandokar Md. Nayem (knayem@iu.edu)

Indiana University, Bloomington
