Live Demo: movie-recommendation-all.streamlit.app
Your go-to tool for finding your next favorite movie!
The Movie Recommendation System is a content-based machine learning project that suggests movies similar to the one you select. Built with Python and Streamlit, it provides an interactive web app for exploring movie recommendations.
- 📊 Data Collection:
  - Datasets: tmdb_5000_movies.csv and tmdb_5000_credits.csv from Kaggle.
  - Data includes genres, keywords, cast, crew, and more!
- 🛠️ Data Preprocessing:
  - Merged datasets and selected relevant columns.
  - Handled missing values and duplicates, and converted data into usable formats.
  - Created a tags column combining genres, cast, crew, and keywords.
- 🤖 Model Creation:
  - Used CountVectorizer to vectorize the tags and computed cosine similarity between the resulting vectors.
  - Pickle files save the model and data for quick access.
- 🌐 Website Development:
  - Built with Streamlit to deliver an interactive experience.
  - Displays recommended movies with posters fetched from the TMDB API.
- 🚀 Deployment:
  - The app is deployed at movie-recommendation-all.streamlit.app.
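
The preprocessing and model steps above can be sketched end to end. This is a dependency-free illustration, not the project's actual code: it swaps in `json`, `collections.Counter`, and `math` for Pandas and scikit-learn's CountVectorizer and cosine similarity so it runs anywhere. The sample rows, the helper names (`names`, `build_tags`, `cosine`), and the assumption that metadata columns hold JSON-encoded lists of objects with a `name` field are all illustrative.

```python
import json
import math
import pickle
from collections import Counter

# Hypothetical rows; the titles and values are invented for illustration.
MOVIES = [
    {"title": "Space Quest",
     "genres": '[{"name": "Science Fiction"}, {"name": "Adventure"}]',
     "keywords": '[{"name": "space travel"}]'},
    {"title": "Star Drift",
     "genres": '[{"name": "Science Fiction"}]',
     "keywords": '[{"name": "space travel"}, {"name": "alien"}]'},
    {"title": "Wedding Bells",
     "genres": '[{"name": "Romance"}, {"name": "Comedy"}]',
     "keywords": '[{"name": "wedding"}]'},
]


def names(json_text):
    """Return the "name" of each object, lowercased with spaces removed."""
    return [item["name"].replace(" ", "").lower() for item in json.loads(json_text)]


def build_tags(movie):
    """Combine the metadata fields into one whitespace-separated tags string."""
    return " ".join(names(movie["genres"]) + names(movie["keywords"]))


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


titles = [movie["title"] for movie in MOVIES]
vectors = [Counter(build_tags(movie).split()) for movie in MOVIES]
similarity = [[cosine(u, v) for v in vectors] for u in vectors]

# Serialize the artifacts, then load them back as a web app might at startup.
blob = pickle.dumps({"titles": titles, "similarity": similarity})
restored = pickle.loads(blob)
```

In this toy data the two science fiction titles share two of their three tags and score about 0.67 against each other, while neither shares any tag with the romance title, so those pairs score 0. Collapsing multi-word names into single tokens (for example `sciencefiction`) is a design choice assumed here so that a phrase counts as one feature rather than two unrelated words.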
- Clone the repository:
git clone https://github.com/parth-jatav/movie-recommendation-system.git
- Navigate to the project directory:
cd movie-recommendation-system
- Install the required dependencies:
pip install -r requirements.txt
- Run the Streamlit application:
streamlit run app.py
- Select a movie from the dropdown menu.
- Click the "Get Recommendations" button.
- View the recommended movies along with their posters.
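
Behind the button, a lookup along these lines could turn a selected title into a ranked list. This is a minimal sketch under stated assumptions: the function name `recommend`, the `top_n` parameter, the sample titles, and the hand-written similarity matrix are all hypothetical, and the real app may rank or filter differently.

```python
def recommend(title, titles, similarity, top_n=5):
    """Return up to top_n titles most similar to the given title."""
    index = titles.index(title)  # raises ValueError for an unknown title
    scores = similarity[index]
    # Rank every other movie by its similarity to the selected one.
    candidates = [i for i in range(len(titles)) if i != index]
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    return [titles[i] for i in ranked[:top_n]]


# Hypothetical titles and precomputed similarity scores for illustration.
sample_titles = ["Space Quest", "Wedding Bells", "Star Drift", "Quiet Harbor"]
sample_similarity = [
    [1.0, 0.4, 0.9, 0.1],
    [0.4, 1.0, 0.2, 0.3],
    [0.9, 0.2, 1.0, 0.0],
    [0.1, 0.3, 0.0, 1.0],
]

picks = recommend("Space Quest", sample_titles, sample_similarity, top_n=2)
```

The selected movie is excluded before sorting because it always has the highest similarity to itself.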
- Source: The datasets (tmdb_5000_movies.csv and tmdb_5000_credits.csv) come from Kaggle.
- Content-Based Filtering: Recommends movies based on metadata similarity.
- Interactive UI: Built with Streamlit for a smooth user experience.
- Dynamic Posters: Fetches movie posters from the TMDB API.
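
As a rough illustration of the poster lookup, the sketch below queries the TMDB movie details endpoint and builds an image URL from the `poster_path` field in the response. It is not the app's actual code: it uses the standard library's `urllib` instead of Requests to stay dependency-free, and the function names, the `w500` image size, and the `timeout` value are assumptions. You need your own TMDB API key to call `fetch_poster`.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.themoviedb.org/3/movie/{movie_id}"
IMAGE_BASE = "https://image.tmdb.org/t/p/w500"  # w500 is an assumed size


def poster_url(payload):
    """Build a full poster URL from a movie details payload, or None."""
    path = payload.get("poster_path")
    return f"{IMAGE_BASE}{path}" if path else None


def fetch_poster(movie_id, api_key, timeout=10):
    """Fetch movie details from TMDB and return the poster URL, or None."""
    query = urllib.parse.urlencode({"api_key": api_key})
    url = f"{API_URL.format(movie_id=movie_id)}?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return poster_url(json.load(response))
```

Keeping `poster_url` separate from the network call makes the missing-poster case easy to handle: a movie without a `poster_path` yields `None`, and the UI can show a placeholder instead.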
- Pandas 🐼
- NumPy 🔢
- NLTK 🧠
- Scikit-learn 📈
- Streamlit 🌐
- Requests 🌍
- Pickle 🥒
Contributions are welcome! Please feel free to submit a pull request or open an issue if you have suggestions or bug reports.
- Thanks to TMDB for the movie data.
- Kudos to Kaggle for hosting the datasets.