ImeshaDilshani/CineMatch-Intelligent-Movie-Recommender-System

🎬 Content-Based Movie Recommender System

This repository contains the code for a content-based movie recommender system built on cosine similarity. The project uses movie metadata to suggest movies whose content is similar to a given title. This type of recommender is particularly useful for recommending items with similar attributes and for giving users personalized suggestions.

Built with: Python, NumPy, Pandas, scikit-learn, NLTK

Data Collection 📥

The data used in this project comes from the TMDB Movie Dataset available on Kaggle. The dataset consists of two files: tmdb_5000_credits.csv and tmdb_5000_movies.csv.

  • Dataset from Kaggle: TMDB Movie Dataset
  • Merged the two datasets on the movie title. Together, these files contain detailed information about close to 5,000 movies, including their cast, crew, plot keywords, genres, and more.
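
The merge step can be sketched as follows. This is a minimal illustration, not the repository's exact code: the helper name merge_on_title is hypothetical, and the two small DataFrames are made-up stand-ins for the CSV files so the snippet runs without downloading the dataset.

```python
import pandas as pd


def merge_on_title(movies: pd.DataFrame, credits: pd.DataFrame) -> pd.DataFrame:
    """Inner-join the movies and credits tables on their shared `title` column."""
    return movies.merge(credits, on="title")


# Toy stand-ins for the two CSV files (made-up rows, not real dataset content).
movies = pd.DataFrame(
    {"id": [1, 2], "title": ["Movie A", "Movie B"], "overview": ["...", "..."]}
)
credits = pd.DataFrame(
    {"title": ["Movie A", "Movie B"], "cast": ["[]", "[]"], "crew": ["[]", "[]"]}
)
merged = merge_on_title(movies, credits)

# With the real files, the inputs would instead be loaded like this:
# movies = pd.read_csv("tmdb_5000_movies.csv")
# credits = pd.read_csv("tmdb_5000_credits.csv")
```

An inner join keeps only titles present in both files. Note that joining on title can produce extra rows if two different movies share the same title.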

Important Columns Used 📝

  • id
  • title
  • overview
  • genres
  • keywords
  • cast
  • crew
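
Keeping only these columns might look like the sketch below. The helper name select_columns and the toy row are assumptions made for illustration.

```python
import pandas as pd

# The columns listed above.
KEEP = ["id", "title", "overview", "genres", "keywords", "cast", "crew"]


def select_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of the merged table restricted to the columns used later."""
    return df[KEEP].copy()


# Toy merged table with one extra column (`budget`) that gets dropped.
toy = pd.DataFrame(
    {
        "id": [1],
        "title": ["Movie A"],
        "overview": ["..."],
        "genres": ["[]"],
        "keywords": ["[]"],
        "cast": ["[]"],
        "crew": ["[]"],
        "budget": [0],
    }
)
selected = select_columns(toy)
```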

Data Preprocessing 🧹

Preprocessing is a crucial step in any machine learning project. In this stage, I:

  • Removed missing values and duplicates.
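  • Converted the list-like columns, which are stored as strings (genres, keywords, cast, crew), into Python lists.
  • Removed spaces within names, so that a multi-word name is treated as a single token, and concatenated the columns to create a tags column.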
  • Converted tags to lowercase.
  • Applied stemming to reduce words to their root form (e.g., "loved", "loving", and "love" become "love").
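
The steps above can be sketched roughly as follows. The helper names (drop_incomplete, parse_names, stem_text) are hypothetical, and the sketch assumes each list-like column holds a string such as '[{"id": 28, "name": "Action"}]' that ast.literal_eval can parse. Stemming uses NLTK's PorterStemmer; whether the repository uses this exact stemmer is an assumption.

```python
import ast

import pandas as pd
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()


def drop_incomplete(df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows with missing values, then remove exact duplicate rows."""
    return df.dropna().drop_duplicates()


def parse_names(text: str) -> list:
    """Parse a stringified list of dicts and return each `name` with spaces removed."""
    return [item["name"].replace(" ", "") for item in ast.literal_eval(text)]


def stem_text(text: str) -> str:
    """Lowercase the text and reduce every whitespace-separated word to its stem."""
    return " ".join(stemmer.stem(word) for word in text.lower().split())


# Toy inputs (made up for illustration).
raw = pd.DataFrame({"title": ["A", "A", "B"], "overview": ["x", "x", None]})
cleaned = drop_incomplete(raw)
names = parse_names('[{"id": 1, "name": "Science Fiction"}]')
stemmed = stem_text("Loved loving love")
```

Removing the space in "Science Fiction" turns it into the single token "ScienceFiction", which keeps the vectorizer from treating "Science" and "Fiction" as unrelated words.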

Feature Engineering 🛠️

I extracted the most informative features from the dataset and combined them into a single tags column. This column joins textual data such as cast, crew, genres, and keywords into one field, which serves as the input to the model.
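
One way to build such a column is sketched below. It assumes every source column already holds a list of tokens (with the overview split into words), and it includes overview alongside the other fields, which is an assumption based on the column list above. The name build_tags is hypothetical.

```python
import pandas as pd

# Columns assumed to feed the tags field; each cell is a list of string tokens.
TAG_COLUMNS = ["overview", "genres", "keywords", "cast", "crew"]


def build_tags(df: pd.DataFrame) -> pd.DataFrame:
    """Return id, title, and a lowercase `tags` string joined from TAG_COLUMNS."""

    def join_row(row: pd.Series) -> str:
        tokens = [token for column in TAG_COLUMNS for token in row[column]]
        return " ".join(tokens).lower()

    result = df[["id", "title"]].copy()
    result["tags"] = df.apply(join_row, axis=1)
    return result


# Toy row (made up for illustration).
toy = pd.DataFrame(
    {
        "id": [1],
        "title": ["Movie A"],
        "overview": [["A", "hero", "rises"]],
        "genres": [["Action"]],
        "keywords": [["hero"]],
        "cast": [["ActorOne"]],
        "crew": [["DirectorX"]],
    }
)
tagged = build_tags(toy)
```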

Vectorization and Similarity Calculation

  • Used CountVectorizer from scikit-learn to convert text data into vectors.
  • Calculated cosine similarity between movie vectors.
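
A minimal sketch of these two steps is shown below. Cosine similarity between two count vectors A and B is their dot product divided by the product of their lengths, so it depends on the mix of words rather than on how long the text is. The max_features=5000 and stop_words="english" settings are assumptions, not values confirmed by this README, and similarity_matrix is a hypothetical helper name.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def similarity_matrix(tags, max_features=5000):
    """Vectorize tag strings as word counts and return pairwise cosine similarities."""
    # Assumed settings: cap the vocabulary size and ignore common English words.
    vectorizer = CountVectorizer(max_features=max_features, stop_words="english")
    vectors = vectorizer.fit_transform(tags)  # sparse matrix, one row per movie
    return cosine_similarity(vectors)  # dense array of shape (n_movies, n_movies)


# Toy tags (made up for illustration).
toy_tags = ["space hero alien", "space hero robot", "romance paris love"]
sim = similarity_matrix(toy_tags)
```

In this toy example, the first two entries share two of their three words and score higher with each other than with the third entry, which shares none.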

Recommendation Function 🎥

The recommend function takes a movie title and returns a list of the most similar movies, ranked by cosine similarity score. This helps users discover new content based on a movie they already like.
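
Such a function could be sketched as follows. The signature is an assumption: the example in this README calls recommend with only a title, whereas this sketch passes the movie table and similarity matrix explicitly to stay self-contained. It also assumes that row i of the similarity matrix corresponds to row i of the movie table.

```python
import numpy as np
import pandas as pd


def recommend(title, movies, similarity, top_n=5):
    """Return the titles of the `top_n` movies most similar to `title`."""
    titles = movies["title"].to_numpy()
    positions = np.flatnonzero(titles == title)
    if positions.size == 0:
        return []  # unknown title: nothing to recommend
    index = positions[0]
    # Sort all movies by similarity to the query, highest first.
    ranked = np.argsort(similarity[index])[::-1]
    # Skip the query movie itself and keep the best `top_n` matches.
    best = [i for i in ranked if i != index][:top_n]
    return [str(titles[i]) for i in best]


# Toy data (made up for illustration).
toy_movies = pd.DataFrame({"title": ["Movie A", "Movie B", "Movie C"]})
toy_similarity = np.array(
    [
        [1.0, 0.8, 0.1],
        [0.8, 1.0, 0.3],
        [0.1, 0.3, 1.0],
    ]
)
result = recommend("Movie A", toy_movies, toy_similarity, top_n=2)
```

Returning an empty list for an unknown title is one possible design choice; raising an error or suggesting close title matches would be reasonable alternatives.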

💡 Example

recommend('Spider-Man')
Output:
- Spider-Man 3
- Spider-Man 2
- The Amazing Spider-Man 2
- Arachnophobia
- Kick-Ass
