Source-Recommendation-System takes an article from the user as input and outputs any relevant article from a dataset of 8.5 million articles.

OvroAbir/Source-Recommendation-System

Source-Recommendation-System

Source Recommendation System takes an article from the user as input and returns relevant articles from a dataset of 8.5 million articles. It uses Apache Spark to process a dataset of this size.

Prerequisites

This project uses the rake-nltk library to extract keywords. Install it with pip:

pip install rake-nltk

FakeNewsCorpus (27 GB) was used as the dataset of news articles. Apache Spark is used to process this dataset, so it needs to be correctly installed and configured. The Spark configuration files can be found in the spark-2.4.4-bin-hadoop2.7 folder. Hadoop was used as the underlying distributed file system; its configuration can be found in the hadoop-conf folder. Both need to be changed to match your own setup.

Source Code

The source code can be found in the /src folder.

Algorithm & Implementation Details

This idea was implemented as a course project for the Distributed Systems course at Colorado State University. A detailed description of the algorithm can be found here -

Authors
