This repository serves as a showcase of my skills and accomplishments in the field of data science. It includes a collection of projects that demonstrate my proficiency in data analysis, machine learning, and statistical modeling.

yestab335/data-science

Data Science Reports

Welcome to the Data Science Reports repository! This collection gathers the data science projects I completed during my studies at the University of Rhode Island.

Project Timeline

Freshman Year

  1. Project: passwords
  • Description: What makes a password strong? Is it the number of characters it contains, or perhaps the category it belongs to? In this project, I explore the intricacies of what makes a password strong or weak.
  • Files:
    • main.R: The main project file used for importing the dataset and previewing it before analysis.
    • variation.R: The project file used to explore the variation within the given dataset.
    • variationModeling.R: The project file used to model the variation with regression and linear models.
    • passwords.Rproj: RStudio project configuration to replicate the project on your own machine.
    • passwords.csv: The main dataset used in the project.
    • passwordStrengths.csv: Second dataset used in the project to compare with the main dataset.
    • passwords_analysis.ipynb: Jupyter Notebook containing the analysis code.
    • passwords_analysis.pdf: PDF version of the Jupyter Notebook containing the analysis code.
  2. Project: StudentSurvey
  • Description: Data from a survey of students in introductory statistics courses.
  • Files:
    • main.R: The main project file used for importing the dataset and analysis.
    • StudentSurvey.Rproj: RStudio project configuration to replicate the project on your own machine.
    • StudentSurvey_analysis.ipynb: Jupyter Notebook containing the analysis code.
    • StudentSurvey_analysis.pdf: PDF version of the Jupyter Notebook containing the analysis code.
  3. Project: Crash Reporting
  • Description: Data on vehicular accidents in the United States, along with a custom survey of other students.
  • Files:
    • .style.yapf: The style rules for the Python code formatter YAPF (Yet Another Python Formatter).
    • background.py: Helper functions supporting the interactive aspects of the analysis.
    • main.py: The main project file used for importing the dataset, necessary dependencies, and analysis.
    • requirements.txt: List of required dependencies for the project.
    • survey.csv: CSV file of survey responses used in the project analysis.
    • user_options.py: File consisting of functions for the user's options within the terminal.
    • vehicle_accidents.csv: The main dataset used in the project.
    • visualizations.py: Functions for data visualizations used in the analysis.

Sophomore Year

  1. Project: ChangeOfCareer
  • Description: Project-based research to determine which factors contribute to a change of career.
  • Files:
    • .style.yapf: The style rules for the Python code formatter YAPF (Yet Another Python Formatter).
    • career_change_prediction_dataset.csv: The main dataset used in the project.
    • confint.py: Script for calculating the 95% confidence interval for a machine learning algorithm.
    • datacleaner.py: Script for automatically going through the dataset and removing all missing and duplicate values.
    • main.py: The main project file used for importing the dataset, necessary dependencies, and analysis.
    • ml.py: File consisting of different machine learning algorithms used in this analysis.
    • models.py: File consisting of more complex models, including cross-validation and confidence intervals.
    • notebook.ipynb: Jupyter Notebook containing the analysis code.
    • requirements.txt: List of required dependencies for the project.
    • treevis.py: Script for printing and visualizing decision trees.
    • visualizations.py: Functions for data visualizations used in the analysis.
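
As a rough illustration of the kind of calculation confint.py is described as performing, here is a minimal sketch of a 95% confidence interval for a classifier's accuracy using the normal approximation. This is not the code from the repository: the function name `accuracy_confidence_interval`, the choice of the normal-approximation (Wald) method, and the counts in the example call are all assumptions made for illustration.

```python
import math


def accuracy_confidence_interval(correct, total, z=1.96):
    """Return a (lower, upper) normal-approximation interval for accuracy.

    Hypothetical helper, not taken from confint.py.
    z=1.96 corresponds to a two-sided 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of predictions")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")

    accuracy = correct / total
    # Standard error of a proportion: sqrt(p * (1 - p) / n)
    margin = z * math.sqrt(accuracy * (1 - accuracy) / total)
    # Clamp to [0, 1], since accuracy cannot fall outside that range
    return max(0.0, accuracy - margin), min(1.0, accuracy + margin)


# Illustrative counts only; these are not results from the project.
lower, upper = accuracy_confidence_interval(correct=170, total=200)
print(f"95% CI for accuracy: [{lower:.3f}, {upper:.3f}]")
```

The normal approximation is simple but can be unreliable for small test sets or accuracies near 0 or 1; a Wilson or bootstrap interval may be a better fit in those cases.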

Usage

Feel free to explore each project folder for more details on the analyses performed, methodologies employed, and conclusions drawn. The code is provided as R scripts, Python scripts, and Jupyter Notebooks, making it easy to replicate and build upon the analyses.

Contributions

If you find any areas of improvement or have suggestions for new analyses, feel free to open an issue or submit a pull request. Contributions are welcome and encouraged!

License

This repository is licensed under the Creative Commons Zero v1.0 Universal license, which allows you to use the code for any purpose without restrictions.

Thank you for visiting the Data Science Reports repository! Happy exploring!
