Exploratory Data Analysis in modern Movie Dataset

phu0n9/BigData_movie

This project classifies the ROI rate of movies and predicts their revenue, using online IMDb and TMDb movie data from 1980 to 2017, by means of visualization and machine learning algorithms.

RMIT University in Vietnam

Course: EEET2574 Big Data in Engineering

Semester: 2020B

Assessment: Big Data Project

Date: 27/09/2020


Link for the whole project: https://drive.google.com/drive/folders/1a_Ot_xloD-2mZQhzNvMSPQK6MGsnqbvb?usp=sharing


Required datasets:

  • name.tsv
  • title.tsv
  • IMDb movies.csv
  • movies_metadata.csv
  • cast.csv
  • IMDb ratings.csv
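
Before running the notebooks, it can help to confirm that every dataset listed above is present. The following is a minimal sketch, not part of the original project: the helper name `missing_datasets` and the assumption that all files sit in a single directory (the current directory by default) are illustrative only, since the README does not say where the notebooks expect the files.

```python
from pathlib import Path

# File names copied from the "Required datasets" list above.
REQUIRED_DATASETS = [
    "name.tsv",
    "title.tsv",
    "IMDb movies.csv",
    "movies_metadata.csv",
    "cast.csv",
    "IMDb ratings.csv",
]


def missing_datasets(data_dir="."):
    """Return the required dataset files that are not found in data_dir.

    data_dir is an assumption: the README does not state where the
    notebooks look for the files.
    """
    base = Path(data_dir)
    return [name for name in REQUIRED_DATASETS if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_datasets(".")
    if missing:
        print("Missing datasets:", ", ".join(missing))
    else:
        print("All required datasets found.")
```

If the datasets are stored elsewhere, pass that directory to `missing_datasets` instead of the default.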

Please run the Jupyter Notebook files in this exact order:

  1. extract.ipynb
  2. data.ipynb
  3. statistic.ipynb
  4. modeling.ipynb
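
The order above can also be enforced from a script rather than by opening each notebook by hand. This is a hedged sketch, assuming Jupyter and `nbconvert` are installed and that the notebooks are in the current directory; the helper names `build_commands` and `run_all` are hypothetical and do not come from the project. It relies on the standard `jupyter nbconvert --to notebook --execute --inplace` command, which executes a notebook and saves the outputs back into the same file.

```python
import subprocess

# Notebook names in the order given in the list above.
NOTEBOOKS = ["extract.ipynb", "data.ipynb", "statistic.ipynb", "modeling.ipynb"]


def build_commands(notebooks=NOTEBOOKS):
    """Build one nbconvert command per notebook, preserving the given order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_all(notebooks=NOTEBOOKS):
    """Execute the notebooks one after another, stopping at the first failure."""
    for cmd in build_commands(notebooks):
        # check=True raises CalledProcessError if a notebook fails,
        # so later notebooks never run on incomplete intermediate data.
        subprocess.run(cmd, check=True)
```

Calling `run_all()` from the directory that holds the notebooks runs all four steps in sequence. Stopping at the first failure matters here because each notebook presumably depends on the output of the previous one.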

Please see the PDF file for the report.
