Online-Learning-Application-2023

Repository of the Online Learning Application project (with focus in online pricing and advertising). The proposed work, which is divided in seven main steps, aims to study and give a practical example of the main concepts related to bandit algorithms and online optimization applied in different scenarios.

Team

Name	ID
Paolo Pertino	10729600
Leonardo Pesce	10659489
Alberto Sandri	10698469
Enrico Simionato	10698193
Michael Vitali	10730463

Professor & Tutors	Role
Nicola Gatti	Online Learning Applications Professor
Matteo Castiglioni	Online Learning Applications Professor
Martino Bernasconi de Luca	Teaching Assistant

N.B.: the repository has been made public on the 22th of September at 12.30.

Project structure

As previously mentioned the project is divided in seven main steps, each one of them is contained in a single python file. The steps are the following:

Step 0: environment design: the first step is a simple introduction to the general context we considered. In this step we gave a brief introduction to the Pear corporation and its e-commerce platform.
Step 1: pricing problem: In this step the pricing setting is unknown, instead the advertising one is known. The objective is to maximize the reward while estimating the purchase conversation rates. We use TS and UCB1 to minimize the regret while estimating the conversion probabilities associated to each price, so that it is possible to understand which is the best one.
Step 2: advertising problem: In the advertising setting, the pricing curve is fully known, thus the price is considered to be fixed and set to be the best one in terms of conversion rate times margin. The objective is to maximize the reward while estimating the click-through rates and cumulative daily costs curves as the bid varies.
Step 3: Learning for joint pricing and advertising: the pricing and advertising settings are both unknown. The objective is to maximize the reward while estimating the purchase conversion rates, the number of clicks and cost.
Step 4: Context generation: in the setting of this step, and very often in real scenarios, users have different characteristics. Users interact with the e-commerce website in a variety of ways, which includes being more inclined to purchase the product at a different price and being more or less influenced by advertising campaigns. Considering this scenario and based on the characteristic of the users it is usually possible to identify a set of classes within which users have the same features. From a mathematical point of view each class can have a different demand curve and different functions of the clicks and the costs. There are two way to tackle the problem:
- considering an aggregated model: the e-commerce optimizes prices and bids considering the data as generated by a single class reaching sub-optimal results, in general;
- considering a disaggregated model: the e-commerce optimizes prices and bids considering the data as generated by a many classes reaching better results with respect to the previous case.
Step 5: Non-stationary environment: in this step the pricing part is unknown, instead the advertising one is known. In this case, the pricing curves are subject to two abrupt changes. The objective is to maximize the reward across all the phases while estimating the purchase conversation rates, using variations of UCB1 (like SW-UCB1 or CUSUM-UCB1) to deal with non-stationary settings. A sensitivity analysis on the effects of the hyperparameters of the algorithms is also performed.
Step 6: Non-stationary environments with many abrupt changes: in this step the pricing part is unknown, instead the advertising one is known. In this case, the pricing curves are subject to many abrupt changes. The objective is to maximize the reward across all the phases while estimating the purchase conversation rates, using algorithms to deal with non-stationary settings (SW-UCB1, CUSUM-UCB1, EXP3).

Requirements & Installation

The project has been developed using Python 3.x, the required packages are listed in the requirements.txt file. To install them, run the following command:

pip install -r requirements.txt

To run our experiments:

python stepX.py

where X is the step number (step1.py, step2.py, step3.py, step4_scenario1.py, step4_scenario2.py, step4_scenario3.py, step5.py, step6_1.py, step6_2.py).

Folder structure

├── Assets                                  # contains the assets used in this file
├── Deliverables
│   └── Slides                              # contains the slides of the final presentation
├── img                                     # contains the images used in the presentation
├── README.md
├── requirements.txt
└── src
    ├── Environments
    │   ├── Environment.py
    │   ├── __init__.py
    │   ├── MultiContextEnvironment.py
    │   └── NonStationaryEnvironment.py
    ├── Learners
    │   ├── Clairvoyant.py
    │   ├── ContextGeneratorLearner.py
    │   ├── ContextLearner.py
    │   ├── CUSUM.py
    │   ├── CUSUMUCBLearner.py
    │   ├── EXP3.py
    │   ├── GPTS_Learner.py
    │   ├── GPUCB_Learner.py
    │   ├── __init__.py
    │   ├── LearnerFactory.py
    │   ├── Learner.py
    │   └── PricingAdvertisingLearner.py
    │   ├── SWUCB.py
    │   ├── TSContextGeneratorLearner.py
    │   ├── TSPricingAdvertising.py
    │   ├── TSReward.py
    │   ├── UCBContextGeneratorLearner.py
    │   ├── UCBPricingAdvertising.py
    │   └── UCB.py
    ├── sensitivity_analysis.py
    ├── settings.py
    ├── step1.py
    ├── step2.py
    ├── step3.py
    ├── step4_scenario1.py
    ├── step4_scenario2.py
    ├── step4_scenario3.py
    ├── step5.py
    ├── step6_1.py
    ├── step6_2.py
    └── Utilities
        ├── ContextTree
        │   ├── ContextNode.py
        │   ├── ContextTree.py
        │   └── __init__.py
        ├── GPs
        │   ├── BaseGaussianProcess.py
        │   └── __init__.py
        ├── __init__.py
        └── plots.py

References

[1] Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design, Srinivas et al. (2010), https://arxiv.org/pdf/0912.3995.pdf

[2] On Upper-Confidence Bound Policies for Non-Stationary Bandit Problems,Garivier, A., & Moulines, E. (2008), https://arxiv.org/pdf/0805.3415.pdf

[3] A Change-Detection based Framework for Piecewise-stationary Multi-Armed Bandit Problem, Liu et al., 2017, https://arxiv.org/pdf/1711.03539.pdf

[4] https://jeremykun.com/2013/11/08/adversarial-bandits-and-the-exp3-algorithm/

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Online-Learning-Application-2023

Team

Project structure

Requirements & Installation

Folder structure

References

About

Releases

Packages

Contributors 5

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 148 Commits
Assets		Assets
Deliverables/Slides		Deliverables/Slides
img		img
src		src
.gitignore		.gitignore
README.md		README.md
requirements.txt		requirements.txt

leonardopesce/Online-Learning-Application-2023

Folders and files

Latest commit

History

Repository files navigation

Online-Learning-Application-2023

Team

Project structure

Requirements & Installation

Folder structure

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 5

Languages

Packages