By: Maria Inês Arruda Gonçalves
To keep the real company name confidential, we'll call it The XX Company.
This notebook is a part of the technical test for the Junior Data Analyst Role.
The aim of this notebook is to perform an Exploratory Data Analysis (EDA) of the metrics provided by The XX Company. I analyze the available data and write up my insights about the behavior of:
- Volume (impressions, clicks, overall website traffic…)
- CPC: Cost per click
- Total number of conversions
- XX’s share of the website
- Overall traffic of the advertiser
- Is it possible to see any kind of seasonality? Did the values increase or decrease? By how much?
- Are XX’s results and delivery stable?
- CTR: click-through rate
- CPC: cost per click
- CR: conversion rate
- ROAS: return on advertising spend
- Share: percentage of how many conversions are from XX
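- CTR: click-through rate: the proportion of people who clicked on an advertisement out of those who saw it.
$\text{CTR} = \frac{\text{Clicks}}{\text{Impressions}}$
- CPC: cost per click: the average cost of each click.
$\text{CPC} = \frac{\text{Cost}}{\text{Clicks}}$
- CR: conversion rate: the proportion of conversions from XX relative to the number of clicks.
$\text{CR} = \frac{\text{XX conversions}}{\text{Clicks}}$
- ROAS: return on advertising spend: the revenue obtained relative to the amount spent on advertising.
$\text{ROAS} = \frac{\text{Revenue}}{\text{Cost}}$
- Share: the percentage of all conversions that come from XX.
$\text{Share} = \frac{\text{XX conversions}}{\text{Total conversions}} \times 100\%$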
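As a minimal sketch, the five metrics listed above could be derived in pandas using their usual ratio definitions. The column names (`impressions`, `clicks`, `cost`, `xx_conversions`, `total_conversions`, `revenue`) and the sample values are hypothetical and are not taken from The XX Company's dataset.

```python
import pandas as pd

# Hypothetical sample data: column names and values are illustrative only.
df = pd.DataFrame({
    "impressions": [1000, 2000],
    "clicks": [50, 80],
    "cost": [25.0, 32.0],
    "xx_conversions": [5, 4],
    "total_conversions": [20, 16],
    "revenue": [100.0, 160.0],
})


def add_metrics(data: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `data` with CTR, CPC, CR, ROAS and Share columns."""
    out = data.copy()
    out["ctr"] = out["clicks"] / out["impressions"]      # clicks per impression
    out["cpc"] = out["cost"] / out["clicks"]             # cost per click
    out["cr"] = out["xx_conversions"] / out["clicks"]    # conversions per click
    out["roas"] = out["revenue"] / out["cost"]           # revenue per unit spent
    out["share"] = out["xx_conversions"] / out["total_conversions"]  # XX's fraction
    return out


metrics = add_metrics(df)
print(metrics[["ctr", "cpc", "cr", "roas", "share"]])
```

Share is kept here as a fraction between 0 and 1; multiplying by 100 gives the percentage.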
We can see date is not in the appropriate format and needs to be transformed.
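One way to do this transformation is with `pd.to_datetime`. The sketch below assumes a column named `date` holding strings in `YYYY-MM-DD` form; the real column name, date format and year may differ, so the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data: the real column name, format and year may differ.
df = pd.DataFrame({
    "date": ["2021-11-01", "2021-11-02"],
    "clicks": [50, 80],
})

# Parse the text column into proper datetime values.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

print(df.dtypes)
```

Passing an explicit `format` makes the parsing stricter, so malformed dates raise an error instead of being silently misread.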
One of the first analyses we can do is to identify how the metrics behave depending on the day of the week.
To get the day of the week from the date, we can use 'dt.dayofweek', which returns a number from 0 (Monday) to 6 (Sunday). Then we can build a dictionary that maps each number to the name of the day and store the result in a new column:
Then we can turn some of the analysis into bar plots.
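A minimal sketch of this step is shown here. The `date` and `day_of_week` column names and the sample dates are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical dates; the `date` column must already be in datetime format.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-11-26", "2021-11-28", "2021-11-29"]),
})

# dt.dayofweek numbers the days from Monday=0 to Sunday=6.
day_names = {
    0: "Monday",
    1: "Tuesday",
    2: "Wednesday",
    3: "Thursday",
    4: "Friday",
    5: "Saturday",
    6: "Sunday",
}

df["day_of_week"] = df["date"].dt.dayofweek.map(day_names)
print(df)
```

pandas also offers `dt.day_name()`, which returns the names directly; the dictionary approach is useful when custom labels are wanted.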
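Before drawing a bar plot, the data has to be aggregated per weekday. The sketch below shows one possible aggregation (mean clicks per day of the week); the column names and values are hypothetical, and the choice of the mean as the statistic is an assumption.

```python
import pandas as pd

# Hypothetical data; values are illustrative only.
df = pd.DataFrame({
    "day_of_week": ["Friday", "Friday", "Monday", "Sunday"],
    "clicks": [100, 120, 40, 50],
})

# Calendar order, so the bars are not sorted alphabetically.
order = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]

mean_clicks = (
    df.groupby("day_of_week")["clicks"]
    .mean()
    .reindex(order)   # put the days in calendar order
    .dropna()         # drop days that are absent from the sample
)
print(mean_clicks)
```

With matplotlib installed, `mean_clicks.plot.bar()` would then draw the bar plot from this series.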
The heatmap showed a very strong correlation between impressions and clicks. We can also plot both over the whole period, that is, day by day throughout November, to see how they evolve.
We can see that Monday and Sunday show greater dispersion, and they are also the worst days when it comes to clicks and impressions.
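The correlation read off a heatmap can also be checked numerically. The sketch below computes a Pearson correlation matrix with `DataFrame.corr()`, which is the kind of table a heatmap is drawn from. The column names and values are hypothetical and only illustrate the calculation.

```python
import pandas as pd

# Hypothetical daily values; not the real dataset.
df = pd.DataFrame({
    "impressions": [1000, 1500, 2000, 2600],
    "clicks": [48, 77, 101, 128],
    "cost": [30.0, 22.0, 41.0, 35.0],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))

# Correlation between the two columns of interest.
imp_clicks = corr.loc["impressions", "clicks"]
```

A value close to 1 means the two series rise and fall together, which is why plotting them over the same period is a natural next step.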
An interesting point is that the ROAS outlier didn't happen on Black Friday; it happened on November 30th.
Looking at the data, this is probably because Black Friday is the day we spent the most on advertising. Even though revenue was high that day, the highest ROAS actually occurred on November 30th, the day with the highest return relative to the amount spent.
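The reasoning here is that ROAS is a ratio, so the day with the largest spend is not necessarily the day with the largest ROAS. The toy example below illustrates this with invented numbers; the dates, costs and revenues are hypothetical and are not the values from the dataset.

```python
import pandas as pd

# Invented numbers, for illustration only.
df = pd.DataFrame(
    {
        "cost": [500.0, 50.0],
        "revenue": [2000.0, 400.0],
    },
    index=pd.to_datetime(["2021-11-26", "2021-11-30"]),
)

# ROAS = revenue / cost
df["roas"] = df["revenue"] / df["cost"]

top_spend_day = df["cost"].idxmax()   # day with the highest spend
top_roas_day = df["roas"].idxmax()    # day with the highest return per unit spent

print(df)
print("Highest spend:", top_spend_day.date())
print("Highest ROAS:", top_roas_day.date())
```

In this toy case the first day earns more in absolute terms, while the second day returns more per unit spent.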
1) Is it possible to see any kind of seasonality? Did the values increase or decrease? By how much?
2) Are XX’s results and delivery stable?
1) We can notice some seasonality in impressions and clicks. The plots above show that Thursday, Friday and Saturday are the days with the most impressions and clicks. This may be because these days are closer to the weekend, and also because there was a Black Friday event this month.
2) When it comes to XX's results and delivery, the last graph shows very clearly that the gains driven by XX far exceed the advertising costs and thus generate
The full analysis can be seen in the Jupyter notebook in this repository: XXCompany_Technical_Test.ipynb