Audio emotion analysis extracts features from audio clips and classifies the emotion of the speaker. The dataset chosen is extracted from The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). This project uses several different approaches to classify speech snippets into eight emotions (Neutral, Calm, Happy, Sad, Angry, Fearful, Disgust, Surprised) after pre-processing the extracted data.
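As an illustration of the feature-extraction step, the sketch below shows one common way to pull a fixed-length MFCC feature vector from a clip with librosa. This is a hedged example rather than the project's exact pipeline; the file path and the `n_mfcc` value are assumptions.

```python
# A minimal sketch (assumed, not the project's exact pipeline) of
# extracting a fixed-length MFCC feature vector with librosa.
import librosa
import numpy as np

def extract_mfcc(path, n_mfcc=40):
    # Load the clip at its native sampling rate
    signal, sr = librosa.load(path, sr=None)
    # Compute per-frame MFCCs, then average over time for a fixed-length vector
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return np.mean(mfcc, axis=1)

# "example.wav" is a placeholder path, not a file shipped with the repo
features = extract_mfcc("dataset/ravdess/example.wav")
print(features.shape)  # -> (40,)
```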
NOTE: To replicate the entire project, please ensure that the dataset is downloaded from RAVDESS and placed in the dataset/ravdess folder, as done with the sample files that are already present in that folder.
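For reference, RAVDESS encodes its metadata directly in the filenames as seven dash-separated fields, the third of which is the emotion code (01-08). The sketch below shows how a label could be recovered from a filename; the example filename is illustrative.

```python
# Sketch of recovering the emotion label from a RAVDESS filename.
# The third dash-separated field is the emotion code.
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}

def emotion_from_filename(filename):
    # e.g. "03-01-06-01-02-01-12.wav" -> code "06" -> "fearful"
    code = filename.split("-")[2]
    return EMOTIONS[code]

print(emotion_from_filename("03-01-06-01-02-01-12.wav"))  # fearful
```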
The demo of the proposed model is provided in DEMO.py.
NOTE: Just run the Demo.ipynb file.
The implementation of the final model is present in the code/finalModel folder.
The project report is also included for further details.
A sample of the dataset is provided in the dataset/ravdess folder.
The papers referenced in the literature survey are provided in the litSurvey folder.
The folder named code contains all the code pertaining to the project. It contains the following folders:
- dataExtraction : Code that extracts the useful features of the audio data (lang: Python)
- visualisations : Visualisations drawn from the extracted features (lang: R)
- baselineModels :
    a. decisionTrees : Implementation of Decision Trees on the extracted features (lang: Python)
    b. KNN : Implementation of KNNs on the extracted features (lang: R)
    c. SVM : Implementation of SVMs on the extracted features (lang: R)
    d. ANN : Implementation of ANNs on the extracted features (lang: R)
- finalModel :
    a. crossValidation : Cross validation (lang: Python) (see the sketch after this list)
    b. allTrials : Other models (lang: Python)
    c. finalModels : The final model (lang: Python and HTML)
- movingAverages : Implementation of Moving Averages (lang: Python)
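As a hedged sketch of what the finalModel/crossValidation step might look like, the snippet below runs k-fold cross-validation over extracted features with scikit-learn. The feature matrix `X`, the label vector `y`, and the choice of an SVM classifier are stand-in assumptions, not the project's actual data or model.

```python
# Assumed sketch of k-fold cross-validation over extracted features;
# X, y, and the classifier are illustrative stand-ins.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))     # stand-in for the extracted feature matrix
y = rng.integers(0, 8, size=200)   # stand-in for the 8 emotion labels

clf = SVC(kernel="rbf")
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validation
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```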
The intermediate extracted data is stored in the extractedData folder.
TEAM: #LRHC💥