UCB-stat-159-s23/project-group10

Cancer OLS Challenge Project

Group 10: Danielle Killeen, Didi Duan, Noah Tran, Brandon Rodriguez, Jiayi Qiu


This repository collects our team's work exploring and analyzing the correlation between cancer-related death rates and socioeconomic indicators in the United States. Our main goal is to identify the most critical factors that contribute to cancer fatalities by using advanced statistical techniques and machine learning models. The datasets used in this repository come from the public-access data website data.world: https://data.world/nrippner/ols-regression-challenge.

Binder Link: Binder

GitHub Pages Link: https://ucb-stat-159-s23.github.io/project-group10/Main.html

Instructions for replication:

Install cancerolstools package

The custom package cancerolstools can be installed using pip install .

You can run tests on the package using the command pytest cancerolstools.

Makefile support

The Makefile supports 5 operations: creating the environment, building the JupyterBook, running all the notebooks, cleaning up the folders, and printing documentation.

Project structure:

The package cancerolstools contains 3 scripts. Each one roughly provides the functions required by one of the 3 notebooks below.

The notebook Data-Preparation.ipynb provides the steps to preprocess and merge the data, including a preliminary step to map the anomalies in the visualization.
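As a rough illustration of what a "preprocess and merge" step can look like, here is a minimal sketch using only the Python standard library. It is not the code from Data-Preparation.ipynb or from cancerolstools: the inline tables, the column names (region, death_rate, household_size), and the helper functions are all hypothetical and chosen only to show an inner join followed by dropping rows with missing values.

```python
import csv
import io

# Hypothetical inline tables standing in for two CSV files.
# None of these column names or values come from the project's data.
MAIN_CSV = """region,death_rate
A,170.5
B,
C,182.1
"""

EXTRA_CSV = """region,household_size
A,2.5
B,2.6
D,2.4
"""


def read_rows(text):
    """Parse CSV text into a list of dict rows (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_merge(left, right, key):
    """Inner join two lists of dict rows on `key`.

    Assumes `key` is unique in `right`; rows without a match are dropped.
    """
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def drop_missing(rows):
    """Keep only rows in which no field is an empty string."""
    return [row for row in rows if all(value != "" for value in row.values())]


merged = inner_merge(read_rows(MAIN_CSV), read_rows(EXTRA_CSV), "region")
clean = drop_missing(merged)
```

In this toy example, region C and region D fall out of the join because each appears in only one table, and region B is then removed because its death_rate field is empty.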

The notebook Data-Visualization.ipynb will provide 2 different visualizations intended to guide the analysis.

The notebook Regression-AnalysisV2.ipynb is a Python adaptation of original R code that applies several linear regression techniques to the data. It fits a basic model, applies LASSO penalization, and runs a nonparametric bootstrap.
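The three techniques named above can be sketched in a few lines for the simplest case of one predictor. This is an illustrative sketch under stated assumptions, not the notebook's code: the data are synthetic (generated here, not taken from the data.world dataset), the function names are hypothetical, and only the standard library is used. With a single predictor, the LASSO slope has a closed form (soft-thresholding of the covariance), which keeps the example short; the notebook presumably works with many predictors and a library solver.

```python
import math
import random
import statistics


def ols_fit(x, y):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lasso_slope(x, y, alpha):
    """LASSO slope for one predictor with an unpenalized intercept.

    Minimizes (1 / 2n) * sum((y - a - b * x)^2) + alpha * |b|.
    The solution soft-thresholds the sample covariance by alpha.
    """
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    rho = sxy / n
    shrunk = max(abs(rho) - alpha, 0.0)
    return math.copysign(shrunk, rho) / (sxx / n)


def bootstrap_slope_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Percentile interval for the OLS slope from a nonparametric bootstrap.

    Resamples (x, y) pairs with replacement and refits the model each time.
    """
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        slopes.append(ols_fit([x[i] for i in idx], [y[i] for i in idx])[1])
    slopes.sort()
    lower = slopes[int((1 - level) / 2 * n_boot)]
    upper = slopes[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Synthetic data (an assumption for illustration): y = 150 + 1.5 * x + noise.
data_rng = random.Random(159)
x = [data_rng.uniform(5, 35) for _ in range(200)]
y = [150 + 1.5 * xi + data_rng.gauss(0, 10) for xi in x]

intercept, slope = ols_fit(x, y)
penalized = lasso_slope(x, y, alpha=5.0)
ci_low, ci_high = bootstrap_slope_ci(x, y)
print(f"OLS slope: {slope:.3f}, LASSO slope: {penalized:.3f}")
print(f"95% bootstrap interval for the slope: ({ci_low:.3f}, {ci_high:.3f})")
```

The penalized slope is never larger in magnitude than the OLS slope, and a large enough alpha shrinks it to exactly zero, which is the property that makes LASSO useful for variable selection when there are many candidate predictors.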

LICENSE contains information on the license.

environment.yml provides requirements to build the environment to replicate the results.

_config.yml, _toc.yml, and requirements.txt are used for building the JupyterBook.

setup.cfg, setup.py, pyproj.toml are files for the package cancerolstools.

About

project-project-group10 created by GitHub Classroom