Welcome to the Diamond Price Prediction repository! This Jupyter Notebook predicts diamond prices using machine learning algorithms. The project is written in Python and evaluates its models with four error calculation methods, both with and without a pipeline. It covers data cleaning, analysis, visualization, normalization, encoding, training, and testing. The best result was achieved using the Decision Tree algorithm, with an accuracy of 97%.
This project aims to predict the price of diamonds based on various features such as carat weight, cut, clarity, color, and depth. It employs machine learning algorithms to train models and make predictions.
- Machine Learning Algorithms: Utilizes various machine learning algorithms for prediction, including regression models.
- Error Calculation: Implements four methods for error calculation to evaluate model performance.
- Pipeline: Demonstrates the use of pipelines for preprocessing data and building machine learning models.
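As an illustration of the error calculation feature listed above, the sketch below computes four common regression error measures with scikit-learn. The README does not name the four methods used in the notebook, so the choice of MAE, MSE, RMSE, and R² is an assumption, and the helper `regression_errors` and the sample prices are hypothetical.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)


def regression_errors(y_true, y_pred):
    """Return four common regression error measures as a dict.

    Assumption: these four metrics stand in for the notebook's
    "four methods"; the README does not say which ones it uses.
    """
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mae": mean_absolute_error(y_true, y_pred),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),  # root of the MSE, in price units
        "r2": r2_score(y_true, y_pred),
    }


# Hypothetical prices, not taken from the diamonds dataset.
actual = np.array([500.0, 1500.0, 3000.0, 7000.0])
predicted = np.array([450.0, 1600.0, 2900.0, 7200.0])

errors = regression_errors(actual, predicted)
for name, value in errors.items():
    print(f"{name}: {value:.4f}")
```

MAE and RMSE are expressed in the same units as the price, which makes them easier to interpret than MSE, while R² is unitless and convenient for comparing models.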
- Linear Regression
- Decision Tree Regression
- Random Forest Regression
- Support Vector Regression
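The four algorithms listed above all exist as scikit-learn estimators with the same `fit`/`score` interface, so they can be compared in a single loop. The following is a minimal sketch on synthetic data, not the notebook's code: the carat-like feature, the noise level, and the hyperparameters are assumptions chosen only to keep the example small.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: price grows non-linearly with a carat-like feature.
rng = np.random.default_rng(0)
carat = rng.uniform(0.2, 2.5, size=300)
price = 4000.0 * carat**2 + rng.normal(0.0, 200.0, size=300)
X = carat.reshape(-1, 1)

X_train, X_test, y_train, y_test = train_test_split(
    X, price, test_size=0.25, random_state=0
)

# Hyperparameters are illustrative defaults, not the notebook's settings.
models = {
    "Linear Regression": LinearRegression(),
    "Decision Tree Regression": DecisionTreeRegressor(random_state=0),
    "Random Forest Regression": RandomForestRegressor(
        n_estimators=50, random_state=0
    ),
    "Support Vector Regression": SVR(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # For scikit-learn regressors, .score() returns R^2 on the given data.
    scores[name] = model.score(X_test, y_test)

for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Note that `SVR` is sensitive to the scale of both features and target, so with unscaled prices its default settings may score poorly. This is one reason normalization is usually applied before training.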
- Data Cleaning: Remove any null values or outliers from the dataset.
- Data Analysis: Analyze the dataset to understand the relationships between features and the target variable.
- Data Visualization: Visualize the data using charts and graphs to gain insights.
- Normalization and Encoding: Normalize numerical features and encode categorical features for model training.
- Training: Train machine learning models using various algorithms.
- Testing: Evaluate the performance of trained models using testing data.
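The steps above can be sketched as one scikit-learn `Pipeline`. This is an illustrative sketch under stated assumptions rather than the notebook's implementation: the tiny DataFrame is made up, cleaning is reduced to dropping rows with missing values (outlier removal is not shown), and `StandardScaler` and `OneHotEncoder` are assumed choices for normalization and encoding.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Tiny hypothetical sample; the notebook's real dataset is not reproduced here.
df = pd.DataFrame({
    "carat": [0.3, 0.5, 0.7, 1.0, 1.2, 1.5, None, 2.0],
    "depth": [61.5, 62.0, 60.8, 63.1, 61.9, 62.4, 61.0, 60.5],
    "cut": ["Ideal", "Good", "Premium", "Ideal",
            "Good", "Premium", "Ideal", "Good"],
    "price": [400, 1200, 2500, 5000, 6500, 9000, 3000, 15000],
})

# Data cleaning: drop rows that contain missing values.
df = df.dropna().reset_index(drop=True)

numeric_cols = ["carat", "depth"]
categorical_cols = ["cut"]

# Normalization and encoding, applied per column type.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

# Preprocessing and the regressor are fitted together as one estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", DecisionTreeRegressor(random_state=0)),
])

X = df[numeric_cols + categorical_cols]
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=2, random_state=0
)

# Training and testing.
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions)
```

Wrapping preprocessing in the pipeline means the scaler and encoder are fitted on the training split only, which avoids leaking information from the test data.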
The Decision Tree algorithm achieved the best result, with an accuracy of 97%.
To use this project, follow these steps:
- Clone the repository: `git clone https://github.com/o2sa/diamond_price_prediction.git`
- Open the Jupyter Notebook: `jupyter notebook diamond_price_prediction.ipynb`
- Follow the instructions in the notebook to run the code and make predictions.
This project requires the following dependencies:
- Python (version)
- Jupyter Notebook
- scikit-learn
- pandas
- numpy
Install the dependencies using the following command:
pip install notebook scikit-learn pandas numpy
Contributions to this project are welcome! If you'd like to contribute, please follow the standard GitHub workflow:
- Fork the repository.
- Create a new branch (`git checkout -b feature/new-feature`).
- Make your changes.
- Commit your changes (`git commit -am 'Add new feature'`).
- Push to the branch (`git push origin feature/new-feature`).
- Create a new Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
Special thanks to acknowledged_name for inspiration and guidance.