
This repository contains the exploratory data analysis (EDA) done using Python with NumPy, pandas, Matplotlib, seaborn and pandasql.


Twinklesahni23/EDA-using-python-and-SQL


EDA-Ecommerce-Python-and-SQL

This repository contains an exploratory data analysis (EDA) of the summer clothing sale on the e-commerce website 'Wish', done using Python with NumPy, pandas, Matplotlib, seaborn and pandasql.


The in-depth analysis was done on a dataset collected from Kaggle. The data contains many useful features that one can dig into to generate important insights for both the firm and the customers buying from Wish. The features include:

  • Price
  • User Ratings
  • Merchant Ratings
  • Shipping Costs
  • Product Sizes
  • Product Colors

The insights from the analysis are as follows:

  • Discount: The higher the discount, the higher the number of clothes sold during the sale. Products that sold over 100k units have a discount of nearly 65%, while products that sold between 1k and 50k units have discounts in the range of 20%-30%. This confirms the price-sensitive behaviour of buyers.

  • User Ratings: The number of user ratings is strongly correlated with the number of items sold. This means that products with a higher number of reviews seem to be favoured by customers. Hence, the company should put emphasis on receiving feedback in the form of ratings/reviews from customers after they make a purchase.

  • Top Sellers: An SQL query was used to find the top 5 best-selling products during the Wish summer sale (the result is shown as a screenshot in the repository).

  • Ad Boosts: Ad boosts have no significant impact on the sale of products. This means that a seller's advertising spend on an item does not really influence the buying patterns of customers. This is a signal for the marketing team to work on better and more compelling campaigns.

  • Gender: Roughly 94% of all the products are clothes in the women's category.

  • Size: During the summer sale of clothes, S is the most sold size, followed by XS and M. This can help in maintaining buffer inventory for certain sizes to increase revenue in upcoming sales.

  • Color: From the sorted data, Black is the most in-demand color (19.7%), followed by White (16.4%). The 'Other' category includes colors like Pink, Maroon, Gold, Brown etc.

  • Merchant Rating: The higher the number of ratings of a merchant, the higher the units sold for that merchant. Merchant 'simplevalueltd' is the highest rated merchant, with a rating of 4.35 and more than 2 million rating counts. Their top selling product is a cat imprint shirt for women. It is equally important for merchants to seek feedback on their services, as the products sold come from merchants with a rating of 4 or more. This is an indication of buyer awareness, implying that customers tend to buy only from higher-rated merchants.
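
The kind of calculations behind the insights above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual notebook: the column names (`price`, `retail_price`, `units_sold`, `rating_count`, `uses_ad_boosts`, `title`) and the toy rows are assumptions, and the SQL step uses the standard-library `sqlite3` module as a stand-in for pandasql so the example has no extra dependencies.

```python
import sqlite3

import pandas as pd

# Toy rows standing in for the Kaggle data.
# Column names and values are assumptions made for illustration only.
df = pd.DataFrame(
    {
        "title": ["Shirt A", "Dress B", "Top C", "Shorts D", "Skirt E", "Tank F"],
        "price": [7.0, 8.0, 5.0, 12.0, 9.0, 6.0],
        "retail_price": [20.0, 10.0, 6.0, 15.0, 30.0, 8.0],
        "units_sold": [100000, 20000, 5000, 1000, 50000, 10000],
        "rating_count": [20000, 3500, 900, 150, 11000, 1800],
        "uses_ad_boosts": [1, 0, 1, 0, 0, 1],
    }
)

# Discount expressed as a percentage of the retail (list) price.
df["discount_pct"] = (1 - df["price"] / df["retail_price"]) * 100

# Pearson correlation between the number of ratings and units sold.
rating_sales_corr = df["rating_count"].corr(df["units_sold"])

# Mean units sold for products with (1) and without (0) ad boosts.
ad_boost_means = df.groupby("uses_ad_boosts")["units_sold"].mean()

# Top 5 products by units sold, written as an SQL query.
# sqlite3 is used here instead of pandasql (an assumption for portability).
conn = sqlite3.connect(":memory:")
df.to_sql("products", conn, index=False)
top5 = pd.read_sql_query(
    "SELECT title, units_sold FROM products ORDER BY units_sold DESC LIMIT 5",
    conn,
)
conn.close()

print(df[["title", "discount_pct"]])
print(rating_sales_corr)
print(ad_boost_means)
print(top5)
```

On the real dataset, the same steps would be applied to the DataFrame loaded from the Kaggle CSV, with the column names adjusted to match the file.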
