New York Airbnb Accommodation Recommender

This project provides a recommendation system for New York Airbnb accommodations using combined features such as text descriptions, images, price, and location data.

The dataset used for this project was collected from Inside Airbnb.

📥 Dataset

The full prepared dataset can be accessed using this link.

🚀 How to Use

The Streamlit app depends on a saved KNN model and saved combined features. Generate both before the first run by running this notebook:

save_compute_similarities.ipynb

To launch the Streamlit app, execute the following command:

streamlit run airbnb_knn.py

🛠️ Implementation Pipeline

The project implementation follows the pipeline below:

🎯 Project Goals

- Build a robust recommender system combining multiple features (price, description, images, location).
- Utilize advanced text analysis (TF-IDF, Sentiment Analysis).
- Compute similarity between listings using Cosine Similarity and KNN.

Key Methods

TF-IDF (Term Frequency-Inverse Document Frequency)
- Weights a term by how often it appears in a document and how rare it is across the whole collection of documents.
  - Term Frequency (TF): Relative frequency of a term in a document.
  - Inverse Document Frequency (IDF): Measures how rare a term is across all documents.
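The weighting described above can be sketched in a few lines. This is a minimal illustration of the textbook form tf * log(N / df), not the project's own code: the function name `tf_idf` and the sample descriptions are hypothetical, and library vectorizers such as scikit-learn's `TfidfVectorizer` apply a smoothed IDF and vector normalization by default, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Relative term frequency times the log-scaled rarity of the term.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical listing descriptions, for illustration only.
descriptions = [
    "cozy loft in brooklyn",
    "sunny loft in manhattan",
    "quiet room in queens",
]
scores = tf_idf(descriptions)
```

In this toy corpus "in" appears in every description, so its weight is zero, while "cozy" (one document) outweighs "loft" (two documents).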
Sentiment Analysis
- Sentiment analysis is performed using VADER (Valence Aware Dictionary and sEntiment Reasoner).
- Reference: Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text.
- VADER generates a Compound Score: a normalized sum of lexicon ratings ranging from -1 (most negative) to +1 (most positive).
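To make the "normalized sum" concrete, here is a sketch of the normalization step only. It assumes the formula used in the reference VADER implementation, x / sqrt(x^2 + alpha) with alpha = 15. The function name `normalize_compound` is hypothetical, and the raw sums passed in are made-up inputs; in VADER the sum comes from lexicon ratings adjusted by its rules for negation, punctuation, and intensifiers.

```python
import math


def normalize_compound(raw_sum, alpha=15.0):
    """Squash an unbounded sum of valence ratings into the range [-1, 1]."""
    score = raw_sum / math.sqrt(raw_sum * raw_sum + alpha)
    # Guard against floating-point drift outside the documented range.
    return max(-1.0, min(1.0, score))


# Hypothetical raw valence sums, for illustration only.
neutral = normalize_compound(0.0)
mildly_positive = normalize_compound(1.0)
strongly_negative = normalize_compound(-10.0)
```

A raw sum of 0 stays at 0, and larger magnitudes approach -1 or +1 without ever exceeding them.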
Image Features
- Image Histograms (HSV): Histograms of Hue, Saturation, and Value capture the pixel value distribution in images.
- Similarity: Image similarity is computed by comparing HSV histograms.
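The histogram feature can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the project's implementation: it takes an image as a plain list of (R, G, B) tuples with values in 0-255, the function name `hsv_histogram` and the bin count of 8 are hypothetical, and a real pipeline would more likely use an imaging library to load and convert the pixels.

```python
import colorsys


def hsv_histogram(rgb_pixels, bins=8):
    """Return concatenated, normalized H, S, and V histograms for RGB pixels."""
    hist = [0.0] * (3 * bins)
    for r, g, b in rgb_pixels:
        # colorsys expects and returns channel values in [0, 1].
        hsv = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        for channel, value in enumerate(hsv):
            # Clamp so that a value of exactly 1.0 lands in the last bin.
            index = min(int(value * bins), bins - 1)
            hist[channel * bins + index] += 1.0
    n_pixels = len(rgb_pixels)
    # Normalize so that images of different sizes are comparable.
    return [count / n_pixels for count in hist] if n_pixels else hist


# A hypothetical two-pixel "image": one pure red and one pure blue pixel.
histogram = hsv_histogram([(255, 0, 0), (0, 0, 255)])
```

The result is a fixed-length vector (3 * bins values) per image, which is what makes two images directly comparable regardless of their resolution.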
Cosine Similarity
- Cosine similarity is calculated between listings' text description vectors and, separately, between their image HSV vectors, giving one similarity score per feature type.
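For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below is a plain-Python illustration; the function name `cosine_similarity` and the choice to return 0.0 for an all-zero vector are assumptions made here, not something the project specifies.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: treat an all-zero vector as similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors, for illustration only.
same_direction = cosine_similarity([1, 2, 3], [2, 4, 6])
orthogonal = cosine_similarity([1, 0], [0, 1])
```

Because the measure depends only on direction, a short description and a long one that use the same words in the same proportions still score close to 1.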
K-Nearest Neighbors (KNN)
- Recommendations are generated using the KNN algorithm with:
  - k = 10
  - Combined features: price, location, image similarity, description similarity, and polarity.

