Predicting demand for an ad, based on its full description (Russian Text Data, Ad Images), its context (Location, Similar Ads) and historical demand.
Dataset: https://www.kaggle.com/c/avito-demand-prediction
- This dataset contains multiple types of data - numeric, textual, images and time series. It will be interesting to learn how to train a model to utilize features from such different types of data.
- The textual data is in Russian (a language foreign to me), which poses a unique challenge during data exploration as well as modeling.
A few images from the training set, with their text translated from Russian, are shown below:
- Web Scraping City Populations from Wikipedia
- Data Visualization
- Data Cleaning, especially the description column
- Feature Engineering (Work in Progress)
- Extract Image Features using CNN architecture(s)
- Build the neural network architecture
- Evaluate on the Test Dataset
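
As an illustration of the data cleaning step for the description column, here is a minimal sketch using only the Python standard library. The function name `clean_description` and the specific choices (Unicode normalization, lowercasing, replacing punctuation with spaces, collapsing whitespace) are assumptions made for this example, not necessarily what the project does.

```python
import re
import unicodedata

# Hypothetical helper for illustration; not taken from the project code.
_NON_WORD = re.compile(r"[^\w\s]")   # anything that is not a word character or whitespace
_WHITESPACE = re.compile(r"\s+")     # runs of spaces, tabs and newlines


def clean_description(text):
    """Normalize one ad description; None becomes an empty string."""
    if text is None:
        return ""
    # NFKC folds compatibility forms (e.g. full-width characters) into canonical ones.
    text = unicodedata.normalize("NFKC", str(text)).lower()
    # Replace punctuation and symbols with spaces.
    # In Python 3, \w is Unicode-aware for str patterns, so Cyrillic letters,
    # digits and underscores are kept.
    text = _NON_WORD.sub(" ", text)
    # Collapse whitespace runs into single spaces and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()


print(clean_description("Продам  ВЕЛОСИПЕД!!!\nСрочно, недорого."))
```

Replacing punctuation with spaces rather than deleting it keeps words separated when a description lacks spacing after commas or periods. Missing values loaded by pandas arrive as `NaN` rather than `None`, so they would need to be filled before applying a function like this.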