PhD thesis' data and code for interactive exploration and reproducibility

matey97/thesis

Human Activity Recognition with Consumer Devices and Real-Life Perspectives

This repository collects all the resources employed during the development of the doctoral thesis in computer science titled "Human Activity Recognition with Consumer Devices and Real-Life Perspectives", authored by Miguel Matey Sanz and supervised by Dr. Carlos Granell Canut and Dr. Sven Casteleyn at the Universitat Jaume I.

The repository employs Quarto to present the main outcomes of the thesis in a book format deployed on GitHub Pages. The generated Quarto Book constitutes a "lite" version of the thesis. Therefore, the thesis document should be consulted for more context, references, or discussions.

Contents

The repository includes all the data, code and other resources employed throughout the development of the thesis:

  • data: contains the data employed in each one of the chapters of the thesis.
  • figs: contains the figures for each chapter of the thesis.
  • libs: Python library containing all the code employed to execute the experiments (libs/chapter*/pipeline/) and analyses (libs/chapter*/analysis/) presented in the thesis.
  • reference: contains .qmd files with the documentation of the most important code resources in libs. Generated using quartodoc 🚀.
  • *.qmd files: Quarto Markdown documents.
  • *.ipynb files: Jupyter notebooks containing the analyses whose results are presented in the thesis.
  • requirements.txt: Python libraries employed to execute experiments and analyses. All these experiments and analyses have been executed using Python 3.9.
  • .docker: contains a Dockerfile to build a Docker image with a computational environment to reproduce the experiments and analyses.

Reproducibility

From the beginning of this thesis, the reproducibility of the results has been a paramount objective. Therefore, all the outcomes presented in the thesis document can be reproduced using the *.ipynb notebooks. In addition, since all the scripts used to run the experiments are provided, the experiments themselves can also be replicated.

Reproducibility setup

Several options are offered to set up a computational environment for reproducing the analyses: online with Binder, locally, or locally with Docker.

Reproduce online with Binder

Binder

Binder creates custom computing environments in the cloud that can be shared with many remote users. To open the Binder computing environment, click on the "Binder" badge above. You can also open the Binder environment while exploring the Quarto Book by clicking on the "Launch Binder" link provided in some sections.

Note

Building the computing environment in Binder can be slow.

Reproduce locally

Install Python 3.9, download or clone the repository, open a command line in the root of the repository, and install the required software by executing the following command:

pip install -r requirements.txt

Tip

Using a virtual environment, such as the ones provided by Conda or venv, is recommended.
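
Since the experiments and analyses were executed with Python 3.9, it can help to confirm the interpreter version before installing the requirements. The following is a minimal sketch and is not part of the repository; the function name `matches_required` is hypothetical, and other Python versions may still work even though only 3.9 is stated here.

```python
import sys

# Version stated in this README for running the experiments and analyses.
REQUIRED = (3, 9)


def matches_required(version_info=sys.version_info, required=REQUIRED):
    """Return True when the major.minor version equals the required one."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if matches_required():
        print(f"Python {current} matches the version used in the thesis.")
    else:
        print(f"Python {current} found; the thesis used Python 3.9.")
```

Run it with the interpreter of the virtual environment you intend to use, so that the check reflects the environment where `pip install -r requirements.txt` will be executed.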

Reproduce locally with Docker

Install Docker to build an image from the provided .docker/Dockerfile, which includes a Jupyter environment, and to run a container based on that image.

Download the repository, open a command line in the root of the repository and:

  1. Build the image (don't forget the final .):
docker build --file .docker/Dockerfile --tag thesis .
  2. Run the image:
docker run -it -p 8888:8888 thesis
  3. Click on the login link shown in the console (or copy and paste it into the browser) to access a Jupyter environment.

Reproduce the analyses

The Python scripts employed to execute the experiments described in the thesis are located in libs/chapter*/pipeline/[n]_*.py, where n determines the order in which the scripts must be executed. Re-running these scripts is not needed, since their outputs are already stored in the data/chapter*/ directories.
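
To see the execution order implied by the numeric prefix, the scripts can be listed and sorted by n. This is a minimal sketch under the assumption that n is a plain integer prefix followed by an underscore; the helper name `ordered_pipeline_scripts` is hypothetical and does not exist in the repository. It only lists the scripts and does not run them.

```python
import re
from pathlib import Path

# Assumption: script names start with an integer prefix, e.g. "1_*.py".
PREFIX = re.compile(r"^(\d+)_")


def ordered_pipeline_scripts(repo_root="."):
    """Return pipeline scripts sorted by chapter folder, then by prefix n."""
    scripts = []
    for path in Path(repo_root).glob("libs/chapter*/pipeline/*.py"):
        match = PREFIX.match(path.name)
        if match:  # skip files without a numeric prefix
            chapter = path.parent.parent.name
            scripts.append((chapter, int(match.group(1)), path))
    # Chapter names are compared as text; n is compared as a number,
    # so "10_*.py" is placed after "2_*.py".
    scripts.sort(key=lambda item: (item[0], item[1]))
    return [path for _, _, path in scripts]


if __name__ == "__main__":
    for script in ordered_pipeline_scripts("."):
        print(script)
```

Sorting on the integer value rather than the file name avoids the usual pitfall of alphabetical ordering when prefixes have a different number of digits.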

Note

When executing a script with a component of randomness (e.g., ML and DL models), the results obtained might differ from the reported ones.
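
The effect of randomness on repeated runs can be illustrated with a toy example. This is a generic sketch using only the standard library, not the thesis code: `toy_metric` and its values are made up for illustration, and whether the pipeline scripts fix any seeds is not stated here.

```python
import random


def toy_metric(seed=None):
    """Return a made-up metric with a small random component.

    Hypothetical stand-in for a model evaluation; the numbers carry
    no meaning and are not results from the thesis.
    """
    rng = random.Random(seed)  # seed=None draws from system entropy
    return round(0.5 + rng.uniform(-0.05, 0.05), 4)


if __name__ == "__main__":
    # Two unseeded runs usually give different values.
    print(toy_metric(), toy_metric())
    # Two runs with the same seed give the same value.
    print(toy_metric(seed=7), toy_metric(seed=7))
```

Even with fixed seeds, results of ML and DL training can still vary across hardware and library versions, so small deviations from the reported values are to be expected.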

Caution

It is not recommended to execute these scripts, since they can run for hours, days or weeks depending on the computer's hardware.

To reproduce the outcomes presented in the thesis, open the desired Jupyter notebook (*.ipynb) file and execute its cells to generate the reported results from the data produced by the experiments (libs/chapter*/pipeline/[n]_*.py scripts).

License

License: GPL v3 · License: ODbL · License: CC BY-NC-SA 4.0

All the code contained in the .ipynb notebooks and the libs folder is licensed under the GPL-3.0 License.

The data contained in the data folder is licensed under the Open Data Commons Open Database License (ODbL).

The remaining documents included in this repository are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA 4.0).

Funding

This thesis has been funded by the Spanish Ministry of Universities with a predoctoral grant (FPU19/05352) and a research stay grant (EST23/00320). Financial support for derived activities of this dissertation (e.g., publications, conferences, etc.) was received from the SyMptOMS-ET project (PID2020-120250RB-I00), funded by MICIU/AEI/10.13039/501100011033.