jotalis/GreanTeam

GreanTeam

Inspiration

Finding an Airbnb is difficult, especially one that's just right. Guests looking to stay in Dublin seem to share the same sentiment! In an effort to improve the guest experience on Airbnb, we analysed two datasets from Strata: Airbnb guests' searches in Dublin and their subsequent inquiries to hosts.

What it does

Our project consists of three main components: data visualization and analysis, machine learning, and a chatbot. In this Datathon, we used data analysis as the foundation for the other parts of our project. Once we found which variables were significant to whether an Airbnb user would have their booking accepted, we built our machine learning model. The model takes its input variables from the dataset provided to us by Strata. We didn't stop there: we also made a chatbot that provides recommendations to users based on the data we analyzed.

Exploration and Preprocessing

  • We were initially given two separate datasets
    • Each user had multiple entries, making it difficult to merge the datasets
      • Combined into one set by user ID
  • Outliers
    • Remove outliers with a modified version of z-scores
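The README does not say which modification of the z-score was used. As a minimal sketch, assuming the common median/MAD variant (the 0.6745 scaling constant and the 3.5 cutoff are conventional choices, not values taken from our code), outlier removal could look like this. The `nights` column and its values are made up for illustration.

```python
from statistics import median


def modified_z_scores(values):
    """Return the median/MAD-based modified z-score of each value."""
    med = median(values)
    # Median absolute deviation: a spread estimate that outliers barely move.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Simplification for this sketch: with no spread to scale by,
        # treat every value as a non-outlier.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def remove_outliers(values, threshold=3.5):
    """Keep only values whose modified z-score is within the threshold."""
    scores = modified_z_scores(values)
    return [v for v, z in zip(values, scores) if abs(z) <= threshold]


# Hypothetical column: number of nights requested per search.
nights = [2, 3, 3, 4, 2, 5, 3, 120]
cleaned = remove_outliers(nights)
```

Because the median and MAD are hardly affected by a single extreme value, the 120-night entry is flagged here, whereas a mean/standard-deviation z-score would be distorted by the very outlier it is trying to detect.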

Visualization / Analysis

We used R's built-in statistical functions and p-value tests to run preliminary analysis on the composite data file. This let us discover which variables influence the acceptance rate for Airbnb bookings. Using this information, we generated informative plots with Seaborn and trained our machine learning model.
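To illustrate the kind of p-value test described above, here is a small Python sketch of a chi-square test of independence on a 2x2 table. It is not our R code: the function name, the grouping into two guest groups, and the counts are all hypothetical, and no continuity correction is applied.

```python
import math


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With one degree of freedom the
    p-value has the closed form erfc(sqrt(statistic / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are two guest groups,
# columns are (accepted, not accepted).
counts = [[30, 70], [60, 40]]
statistic, p_value = chi_square_2x2(counts)
```

A small p-value suggests that acceptance is not independent of the grouping variable, which is the sort of signal that would flag a variable as worth feeding to the model.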

Machine Learning

Random Forest Classifier Model

  • Compiled dataset
    • Converted qualitative data to numerical using a scikit-learn LabelEncoder
    • Output: Will a guest be accepted by a host? (1 for true, 0 for false)
    • Inputs: Guest message time, host message time, check-in time, origin country
    • Origin country had the biggest impact (accuracy jumped from 78% to 98%)

Overall accuracy: 98.18% // Training time: 41.61ms
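The pipeline above can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions, not our training script: the rows are synthetic, the acceptance rule is a toy one, and the hour-based features, split size, and hyperparameters are placeholders, so the accuracy and timing it produces say nothing about the real results.

```python
import random
import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for the compiled dataset (all values are made up).
rng = random.Random(0)
countries = ["IE", "GB", "US", "FR"]
guest_hours, host_hours, checkin_hours, origins, accepted = [], [], [], [], []
for _ in range(200):
    country = rng.choice(countries)
    guest_hours.append(rng.randint(0, 23))    # guest message time (hour)
    host_hours.append(rng.randint(0, 23))     # host message time (hour)
    checkin_hours.append(rng.randint(12, 22)) # check-in time (hour)
    origins.append(country)
    # Toy rule so the sketch has something to learn; not a real finding.
    accepted.append(1 if country in ("IE", "GB") else 0)

# Convert the qualitative column to integer codes, as described above.
encoder = LabelEncoder()
origin_codes = encoder.fit_transform(origins)

X = [
    [g, h, c, o]
    for g, h, c, o in zip(guest_hours, host_hours, checkin_hours, origin_codes)
]
y = accepted

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
start = time.perf_counter()
model.fit(X_train, y_train)
training_ms = (time.perf_counter() - start) * 1000

accuracy = model.score(X_test, y_test)
importances = dict(
    zip(["guest_time", "host_time", "checkin_time", "origin_country"],
        model.feature_importances_)
)
```

Inspecting `importances` is one way to see which input dominates the prediction; comparing accuracy with and without a column is another.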

Machine Learning Model

Chatbot

Meet Bobby, an advanced AI tool built with OpenAI's latest assistants technology and designed to analyze and visualize data trends directly from datasets.

  • Deep Data Analysis: Excels in extracting meaningful insights and patterns.
  • Actionable Visualizations: Converts raw data into clear, actionable visual reports for strategic decision-making.

Bobby in action!

Challenges

Preprocessing

  • Finding a way to merge the datasets was difficult
    • Users had multiple entries in both datasets, so each user's rows were merged into one
      • Numerical entries were averaged
      • Categorical entries were appended to sets
    • Preparing the data for the machine learning model was also a challenge
      • All categorical data had to be converted to numerical representations
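The merge rules listed above (average the numerical entries, collect the categorical entries into sets, then combine by user ID) can be sketched in plain Python. The column names, the sample rows, and the choice to keep only users present in both datasets are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def collapse_by_user(rows, id_key="user_id"):
    """Merge each user's rows into a single record keyed by user ID."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[id_key]].append(row)

    merged = {}
    for user, user_rows in grouped.items():
        record = {}
        columns = {key for row in user_rows for key in row if key != id_key}
        for col in columns:
            values = [row[col] for row in user_rows if row.get(col) is not None]
            if not values:
                record[col] = None
            elif all(isinstance(v, (int, float)) and not isinstance(v, bool)
                     for v in values):
                record[col] = mean(values)   # numerical entries are averaged
            else:
                record[col] = set(values)    # categorical entries go into a set
        merged[user] = record
    return merged


def join_on_user(left, right):
    """Combine two collapsed datasets, keeping users found in both."""
    return {user: {**left[user], **right[user]}
            for user in left.keys() & right.keys()}


# Hypothetical rows standing in for the searches and inquiries datasets.
searches = [
    {"user_id": "u1", "n_nights": 2, "origin_country": "IE"},
    {"user_id": "u1", "n_nights": 4, "origin_country": "GB"},
    {"user_id": "u2", "n_nights": 3, "origin_country": "US"},
]
inquiries = [
    {"user_id": "u1", "n_messages": 3},
    {"user_id": "u1", "n_messages": 5},
]
combined = join_on_user(collapse_by_user(searches), collapse_by_user(inquiries))
```

Collapsing each dataset to one row per user first is what makes the final join one-to-one; joining the raw rows directly would multiply entries for users who appear several times in both files.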

What we built our project with

What We Learned

  • Throughout the course of this Datathon, we've developed several skills:
    • We now have a deeper understanding of machine learning techniques
    • We understand how data analysis can be used to highlight problems and inspire solutions
    • Most importantly, we learned to delegate work and collaborate effectively in a team setting

What's next for Dublin Dive

  • We plan to use an expanded range of data
    • The data we used was from 2014 (10 years ago!) so more recent data would be more relevant
  • We will also further refine the ML model
    • Potentially using a Neural Network
