Wine Enthusiast dataset analysis

🙍 Manuele Nolli

🏫 SUPSI

📆 2022/2023

Description

This document is an analysis of a public dataset found on Kaggle.com.

The dataset contains 80k wine reviews with variety, location, winery, price, points, taster name and description. Each row represents a review of a wine.

My analysis will focus on the following questions:

  • Where are the wines produced?
  • What is the distribution of the points?
  • What is the distribution of the prices, and is it related to the points?
  • What is the distribution of the variety of wines?
  • How many tasters are there, and how many reviews has each of them written?
    • Are some tasters more reliable than others?
    • Do the tasters have a preference for a specific continent or country?
  • What are the most common words in the description of the wines?
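
As a rough illustration of how several of these questions map onto Pandas operations, here is a minimal sketch. It is not the code from the notebook: the column names (`country`, `points`, `price`, `variety`, `taster_name`) are assumptions based on the fields listed above, the helper names `summarize` and `top_words` are hypothetical, and the rows are toy values made up for the example.

```python
import re
from collections import Counter

import pandas as pd

# Toy rows for illustration only; they are NOT taken from the real dataset.
# Column names are assumptions based on the fields named in the description.
reviews = pd.DataFrame({
    "country": ["Italy", "France", "Italy", "US", "France", "US"],
    "points": [87, 92, 90, 85, 94, 88],
    "price": [15.0, 60.0, 35.0, 12.0, 120.0, None],
    "variety": ["Sangiovese", "Pinot Noir", "Nebbiolo",
                "Zinfandel", "Pinot Noir", "Chardonnay"],
    "taster_name": ["A", "B", "A", "C", "B", "C"],
})


def summarize(df):
    """Hypothetical helper: one summary per question in the list above."""
    return {
        # Where are the wines produced?
        "reviews_per_country": df["country"].value_counts(),
        # What is the distribution of the points?
        "points_stats": df["points"].describe(),
        # Is the price related to the points? (rows without a price are skipped)
        "price_points_corr": df["price"].corr(df["points"]),
        # What is the distribution of the varieties?
        "top_varieties": df["variety"].value_counts(),
        # How many reviews has each taster written, and with what mean score?
        "reviews_per_taster": df.groupby("taster_name")["points"].agg(["count", "mean"]),
    }


def top_words(descriptions, n=10):
    """Hypothetical helper: most frequent lowercase words across descriptions."""
    words = re.findall(r"[a-z']+", " ".join(descriptions).lower())
    return Counter(words).most_common(n)


summary = summarize(reviews)
```

On the real data, `reviews` would come from loading the Kaggle CSV instead of the toy frame, and a word count over descriptions would normally filter out common stop words first.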

💻 Notebook

Used libraries

  • Plotly
    • Express
    • Graph Objects
    • Subplots
  • NumPy
  • Pandas
  • Matplotlib

🍷 Cheers!