This repository contains a Python script that demonstrates music genre classification using machine learning techniques. The script processes music track data and performs various tasks, including data preprocessing, dimensionality reduction, model training, and evaluation. It uses the scikit-learn library for machine learning and pandas for data manipulation.
The script in this repository demonstrates the following machine learning tasks:
- Data loading and preprocessing.
- Principal Component Analysis (PCA) for dimensionality reduction.
- Training and evaluating a decision tree classifier and a logistic regression classifier.
- Balancing the dataset.
- K-fold cross-validation to assess model performance.
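The modelling steps listed above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: it runs on synthetic data from `make_classification` rather than the project dataset, and the number of PCA components, the number of folds, the random seed, and the helper name `build_pipeline` are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the track features and the two genre labels
# (assumption: the real script reads these from the project dataset).
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, random_state=10
)


def build_pipeline(model, n_components=6):
    """Scale the features, project them with PCA, then fit the classifier.

    Hypothetical helper; n_components=6 is an illustrative choice.
    """
    return Pipeline([
        ("scale", StandardScaler()),
        ("pca", PCA(n_components=n_components, random_state=10)),
        ("model", model),
    ])


models = {
    "decision_tree": DecisionTreeClassifier(random_state=10),
    "logistic_regression": LogisticRegression(random_state=10),
}

# 10-fold cross-validation; each fold refits scaling and PCA on its own
# training split, so no information leaks from the held-out fold.
kf = KFold(n_splits=10, shuffle=True, random_state=10)
scores = {
    name: cross_val_score(build_pipeline(model), X, y, cv=kf)
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: mean accuracy {fold_scores.mean():.3f}")
```

Wrapping the scaler and PCA in a `Pipeline` is a design choice for this sketch: it keeps preprocessing inside each cross-validation fold instead of fitting it once on the full dataset.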
The primary goal is to classify music tracks into two genres: Hip-Hop and Rock, based on track metrics and features.
To get started, you need Python and the required dependencies installed on your system. You can install the dependencies using pip and the requirements.txt file.
The script performs the following machine learning tasks:
- Data loading and preprocessing
- Principal Component Analysis (PCA) for dimensionality reduction
- Decision tree and logistic regression model training and evaluation
- Dataset balancing for equal representation of Hip-Hop and Rock genres
- K-fold cross-validation for model assessment
Dataset
- The dataset used in this project consists of track metadata and track metrics with genre labels.
- The dataset is included in the datasets folder as fma-rock-vs-hiphop.csv and echonest-metrics.json.
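Loading the two files and balancing the genres might look like the sketch below. The column names `track_id` (join key) and `genre_top` (genre label) are assumptions, as is the function name `load_and_balance`; check the actual files for the real column names before using it.

```python
import pandas as pd


def load_and_balance(csv_path, json_path, key="track_id",
                     label="genre_top", seed=10):
    """Merge track metadata with track metrics and balance the genres.

    The default column names are assumptions, not confirmed by the
    dataset description; pass the real ones if they differ.
    """
    tracks = pd.read_csv(csv_path)      # track metadata with genre labels
    metrics = pd.read_json(json_path)   # per-track metrics

    # Keep only the join key and the label from the metadata.
    merged = metrics.merge(tracks[[key, label]], on=key)

    # Downsample every genre to the size of the smallest one so that
    # both genres are equally represented.
    smallest = merged[label].value_counts().min()
    return merged.groupby(label).sample(n=smallest, random_state=seed)
```

With the files described above, a call would take the form `load_and_balance("datasets/fma-rock-vs-hiphop.csv", "datasets/echonest-metrics.json")`, assuming it is run from the repository root.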
The script relies on the following Python libraries:
- pandas
- scikit-learn
- numpy
- matplotlib
You can install these dependencies from the requirements.txt file by running `pip install -r requirements.txt`.