- Vamshidhar Reddy Parupally - 016001427
- Tirupati Venkata Sri Sai Rama Raju Penmatsa - 016037047
-
Dataset : https://www.kaggle.com/competitions/walmart-recruiting-store-sales-forecasting
-
Colab notebook for Vamshidhar Reddy's contribution (file also included in the GitHub repository): https://colab.research.google.com/drive/1PZpK60do4ptjMqOB4-XqOD75bDw2ygNl?usp=sharing
-
Colab notebook for Tirupati Venkata Sri Sai Rama Raju's Contribution : https://colab.research.google.com/drive/1fGAk9V3JTfaEljtP4_vEmfswIwGbq0gA?usp=sharing
-
Video Demo : https://drive.google.com/drive/folders/1Y_5N2be5BYA1S2c6qlq-oiWD9VdRFNs7?usp=sharing
- This project comprises a thorough data analysis, as well as time series analysis and sales forecasting using multiple models.
- The data was collected between 2010 and 2012, and 45 Walmart locations across the country were examined.
- Stores:
- Store: store number, ranging from 1 to 45.
- Type: store type, one of 'A', 'B' or 'C'.
- Size: number of products available in the store, ranging from 34,000 to 210,000.
- Sales:
- Date: The date of the week.
- WeeklySales: sales during that week.
- Store: store number, 1–45.
- Dept: department number, one of 1–99.
- IsHoliday: Boolean value representing a holiday week or not.
- Features:
- Temperature: Temperature of the region during that week.
- FuelPrice: Fuel Price in that region during that week.
- MarkDown1 to MarkDown5: the type of markdown and the quantity available during that week.
- CPI: Consumer Price Index during that week.
- Unemployment: unemployment rate in the store's region during that week.
-
Merged the three CSV files (stores, sales and features) into a single dataset to simplify training.
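
The merge step could look roughly like the sketch below. It is not taken from the project notebooks: the small in-memory frames stand in for the three CSV files, and the join keys (`Store`, `Date`, `IsHoliday`) are an assumption based on the columns listed above.

```python
import pandas as pd

# Tiny stand-in frames (made-up values); the real project would read the
# three Kaggle CSV files with pd.read_csv instead.
stores = pd.DataFrame({"Store": [1, 2], "Type": ["A", "B"], "Size": [150000, 200000]})
sales = pd.DataFrame({
    "Store": [1, 1, 2],
    "Dept": [1, 2, 1],
    "Date": ["2010-02-05", "2010-02-05", "2010-02-05"],
    "WeeklySales": [25000.0, 50000.0, 35000.0],
    "IsHoliday": [False, False, False],
})
features = pd.DataFrame({
    "Store": [1, 2],
    "Date": ["2010-02-05", "2010-02-05"],
    "Temperature": [42.3, 40.2],
    "IsHoliday": [False, False],
})

# Assumed join keys: weekly features attach per store and week,
# store metadata attaches per store.
merged = (
    sales.merge(features, on=["Store", "Date", "IsHoliday"], how="left")
         .merge(stores, on="Store", how="left")
)
print(merged.head())
```

A left join keeps every sales row even if a matching feature row is missing, which makes gaps visible later during the null value analysis.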
-
Correlation analysis
-
Null value analysis
-
Outlier analysis
-
Label encoding
- Marking the NA values as 0 for the "MarkDown" columns and dropping "CPI" and "Unemployment"
- Removing anomalies
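
The cleaning steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebooks' code: the sample values are made up, label encoding is assumed to apply to `Type` and `IsHoliday`, and "anomalies" is assumed to mean rows with non-positive `WeeklySales` (the README does not say which anomalies were removed).

```python
import numpy as np
import pandas as pd

# Made-up sample rows standing in for the merged dataset.
df = pd.DataFrame({
    "Store": [1, 1, 2, 3],
    "Type": ["A", "A", "B", "C"],
    "IsHoliday": [False, True, False, False],
    "WeeklySales": [24924.5, -50.0, 35034.06, 1200.0],
    "MarkDown1": [np.nan, 100.0, np.nan, 5.0],
    "MarkDown2": [3.0, np.nan, np.nan, 1.0],
    "CPI": [211.1, 211.2, np.nan, 214.0],
    "Unemployment": [8.1, 8.1, np.nan, 7.3],
})

# Null value analysis: count missing values per column.
print(df.isna().sum())

# Mark missing MarkDown values as 0.
markdown_cols = [c for c in df.columns if c.startswith("MarkDown")]
df[markdown_cols] = df[markdown_cols].fillna(0)

# Drop CPI and Unemployment.
df = df.drop(columns=["CPI", "Unemployment"])

# Label encoding (assumed columns): categories map to integer codes
# in sorted order, so 'A' -> 0, 'B' -> 1, 'C' -> 2.
df["Type"] = df["Type"].astype("category").cat.codes
df["IsHoliday"] = df["IsHoliday"].astype(int)

# Anomaly removal (assumed rule): keep only rows with positive sales.
clean = df[df["WeeklySales"] > 0].reset_index(drop=True)
print(clean)
```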
- [1] https://www.kaggle.com/code/bhatnagardaksh/walmart-sales-predictionRandom-Forest
- [2] https://medium.datadriveninvestor.com/walmart-sales-data-analysis-sales-prediction-using-multiple-linear-regression-in-r-programming-adb14afd56fb
- [3] https://www.kaggle.com/competitions/walmart-recruiting-store-sales-forecasting/code
- [4] https://www.kaggle.com/competitions/walmart-recruiting-store-sales-forecasting
- [5] https://www.kaggle.com/code/datamany/random-forest-rnn-walmart-sales-forecastMachine-Learning
- [6] https://machinelearningmastery.com/xgboost-for-regression/
- [7] https://www.analyticsvidhya.com/blog/2021/03/introduction-to-long-short-term-memory-lstm/
- [8] https://www.superdatascience.com/blogs/recurrent-neural-networks-rnn-the-vanishing-gradient-problem
- [9] https://www.analyticsvidhya.com/blog/2017/06/which-algorithm-takes-the-crown-light-gbm-vs-xgboost/
- [10] https://www.geeksforgeeks.org/ml-label-encoding-of-datasets-in-python/
- [11] https://towardsdatascience.com/nlp-101-word2vec-skip-gram-and-cbow-93512ee24314
- [12] https://towardsdatascience.com/time-series-data-analysis-resample-1ff2224edec9
- [13] https://towardsdatascience.com/the-complete-guide-to-time-series-analysis-and-forecasting-70d476bfe775/
- [14] https://towardsdatascience.com/efficient-time-series-using-pythons-pmdarima-library-f6825407b7f0/
- [15] https://www.analyticsvidhya.com/blog/2018/08/auto-arima-time-series-modeling-python-r/