Project work completed in the graduate courses

asaito333/Projects

 
 


Analytics Projects

Selected project work completed in graduate courses.

In this project, I analyzed "Inside Airbnb" data to identify which factors affect a property's rating, with the goal of helping users choose better properties and helping property owners raise their rating scores. To do so, I loaded the large dataset into HDFS, prepared the data with Hive, and built a regression model in Scala with Spark MLlib. The following chart shows which factors are most significant for ratings.
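As a rough illustration of how a regression can rank rating factors, here is a minimal plain-Python sketch of ordinary least squares on standardized columns. It is not the Spark MLlib pipeline used in the project: the feature names (`cleanliness`, `response_rate`) and all values are invented for illustration, and the small solver stands in for a real linear algebra library.

```python
from statistics import mean, pstdev

def standardize(column):
    """Rescale a column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(features, target):
    """Return standardized OLS coefficients keyed by feature name."""
    names = list(features)
    cols = [standardize(features[name]) for name in names]
    y = standardize(target)
    # Normal equations (X^T X) beta = X^T y; centering removes the intercept.
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return dict(zip(names, solve(xtx, xty)))

# Hypothetical toy data, not taken from Inside Airbnb.
features = {
    "cleanliness": [1, 2, 3, 4, 5, 6],
    "response_rate": [3, 1, 4, 1, 5, 9],
}
rating = [3 * c + r for c, r in zip(features["cleanliness"], features["response_rate"])]

coefs = fit_ols(features, rating)
# Larger absolute standardized coefficient = stronger association with the rating.
ranked = sorted(coefs.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, beta in ranked:
    print(f"{name}: {beta:.3f}")
```

Standardizing the columns first puts the coefficients on a common scale, which is what makes comparing their magnitudes across factors meaningful in this sketch.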

In this project, I analyzed TripAdvisor user reviews and ratings to identify how users rate and comment on hotels depending on location, with the goal of helping users choose better hotels. To do so, I performed topic modeling on user comments grouped by hotel location and identified which factors users do and do not appreciate. The following image shows part of the result.

In this project, we created a dataset from IMDb plain text data files using the IMDbPY API. I then used Python to analyze how user ratings of movies differ across series numbers. The original data is spread across multiple tables, including a title table, a movie type table, a rating table, and a box-office table, so I joined those tables in this code. All of the work, from data pre-processing to data visualization, is done in this code. The following image shows the result.
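The join-then-aggregate step described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table and column names (`title`, `rating`, `movie_id`, `series_number`) and the rows are hypothetical, not the actual IMDb schema or data, and the project's own code may join the tables differently.

```python
import sqlite3

# In-memory database holding two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (movie_id INTEGER PRIMARY KEY, name TEXT, series_number INTEGER);
CREATE TABLE rating (movie_id INTEGER, rating REAL);
INSERT INTO title VALUES
    (1, 'Film A', 1), (2, 'Film A 2', 2),
    (3, 'Film B', 1), (4, 'Film B 2', 2);
INSERT INTO rating VALUES (1, 8.0), (2, 6.0), (3, 7.0), (4, 5.0);
""")

# Join titles to ratings on the shared key, then average per series number.
rows = conn.execute("""
SELECT t.series_number, AVG(r.rating)
FROM title AS t
JOIN rating AS r ON r.movie_id = t.movie_id
GROUP BY t.series_number
ORDER BY t.series_number
""").fetchall()

for series_number, avg_rating in rows:
    print(series_number, avg_rating)

conn.close()
```

An inner join keeps only movies that appear in both tables, so titles without a rating drop out of the averages in this sketch.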


Languages

  • Jupyter Notebook 100.0%