This repository contains the report that I made for my Statistics and Data Analysis (SEE2003) class's final project, as well as the code used to analyse the data, and the data itself.
I chose to analyse air quality data in Hong Kong by comparing pollutant concentrations in 2019 (when COVID-19 was a non-factor for most of the year) with those in 2020 (affected in many ways by COVID-19 lockdowns and restrictions). I hypothesised that air pollutant concentrations would be lower in 2020 than in 2019, perhaps due to the effects of COVID-19 restrictions.
My data analysis later supported this hypothesis. I then attempted to explain the differences observed in air pollutant concentrations and looked for other trends.
I used Python in Jupyter Notebooks for this project. I made use of NumPy and pandas to clean and organise my data, Matplotlib and Seaborn to visualise it, and SciPy to conduct more detailed analyses.
The Jupyter notebook contains all the code that I used.
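As a rough illustration of the kind of year-on-year comparison described above, here is a minimal sketch. It is not the code from the notebook: the helper `compare_years`, the column names `date` and `NO2`, and the choice of Welch's t-test are all assumptions made for this example, and the data is synthetic placeholder data rather than real Hong Kong measurements.

```python
import numpy as np
import pandas as pd
from scipy import stats


def compare_years(df, pollutant, year_a=2019, year_b=2020):
    """Compare one pollutant's readings between two years.

    Hypothetical helper: assumes `df` has a datetime column named "date"
    and one numeric column per pollutant.
    """
    a = df.loc[df["date"].dt.year == year_a, pollutant].dropna()
    b = df.loc[df["date"].dt.year == year_b, pollutant].dropna()
    # Welch's t-test: does not assume equal variances in the two years.
    t_stat, p_value = stats.ttest_ind(a, b, equal_var=False)
    return {
        "mean_a": a.mean(),
        "mean_b": b.mean(),
        "t_stat": t_stat,
        "p_value": p_value,
    }


# Synthetic placeholder data (NOT real measurements): daily values with a
# lower baseline in 2020 than in 2019, plus random noise.
rng = np.random.default_rng(0)
dates = pd.date_range("2019-01-01", "2020-12-31", freq="D")
values = np.where(dates.year == 2019, 40.0, 35.0) + rng.normal(0, 5, len(dates))
demo = pd.DataFrame({"date": dates, "NO2": values})

result = compare_years(demo, "NO2")
print(result)
```

With real data, the same function could be called once per pollutant column. A non-parametric test such as `scipy.stats.mannwhitneyu` may be a better fit if the concentrations are strongly skewed.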
All other details about this project can be found in the report itself, also in this repository.