Educational materials for the Practical Predictive Analytics Seminar (PPAS)
This repository includes documents to supplement the 2020 PPAS. Registrants will receive additional directions on how to prepare for the seminar.
These documents are for educational purposes only.
File | Description |
---|---|
PPAS_ChallengeAnswers_2020.pdf | Full answers to the challenge questions |
PPAS_ChallengeAnswers_2020.rmd | R code for the challenge answers |
PPAS_ChallengeQuestions_2020.docx | These questions will test what you have learned in the introductory exercises. Hints on the second page suggest relevant R functions in case you get stuck. |
PPAS_DataPrep_2020.pdf | Markdown output showing the R code to clean the raw data for modeling |
PPAS_DataPrep_2020.Rmd | R code for data prep |
PPASExpandedData.Rdata | Expanded data set for the DataPrep R code |
PPASExpandedData.rds | The dataset that results from the DataPrep R code |
PPAS_IntrotoR_2020.pdf | PDF version of the introduction to R described in the next row |
PPAS_IntrotoR_2020.Rmd | Introduction to using R and to several key functions we will use during the seminar. This document covers statistical modeling functions only briefly; it is geared more toward ensuring you understand how such functions are called. |
PPAS_Modeling_Validation_2020.pdf | Modeling and validation guide |
PPAS_Modeling_Validation_2020.Rmd | Modeling and validation guide R code |
SampleStudy2.rpt | Insurance sample data set 2 |
SampleStudy3.rpt | Insurance sample data set 3 |
SampleStudy.rpt | Insurance sample data set 1 |

File | Description |
---|---|
01_FitLogisticGLM_challenge.R | R code with guided challenge to fit a GLM (challenge 1) |
01_FitLogisticGLM_challenge_markdown.Rmd | R markdown version of code with guided challenge to fit a GLM (challenge 1) |
01_FitLogisticGLM_solutions_markdown.Rmd | R markdown code with solutions to challenge 1 |
01_FitLogisticGLM_solutions_markdown.html | R markdown HTML output with solutions to challenge 1 |
02_heartdiseasedataset.RDS | Version of heart disease dataset created in the solution to challenge 2 |
02_PracticalDataConcerns_challenge.R | R code with guided challenge to tackle common practical data issues in modeling (challenge 2) |
02_PracticalDataConcerns_challenge_markdown.Rmd | R markdown version of code with guided challenge to tackle common practical data issues in modeling (challenge 2) |
02_PracticalDataConcerns_solutions_markdown.Rmd | R markdown code with solutions to challenge 2 |
02_PracticalDataConcerns_solutions_markdown.html | R markdown HTML output with solutions to challenge 2 |
02_SampleLogisticModel.RDS | A logistic GLM saved as an example solution |
03_ModelValidation_challenge.R | R code with guided challenge to validate a GLM fit (challenge 3) |
03_ModelValidation_challenge_markdown.Rmd | R markdown version of code with guided challenge to validate a GLM (challenge 3) |
03_ModelValidation_solutions_markdown.Rmd | R markdown code with solutions to challenge 3 |
03_ModelValidation_solutions_markdown.html | R markdown HTML output with solutions to challenge 3 |
heartdiseasedataset.csv | Original heart disease dataset from https://www.kaggle.com/mazharkarimi/heart-disease-and-stroke-prevention |
heartdiseasedataset.RDS | RDS version of original heart disease dataset |
heartdiseasedataset_modified.RDS | Modified version of the heart disease dataset for the challenges |
ModifyDatasetForPPAS.R | R script to modify the original heart disease dataset for challenges |
TitanicSurvival.csv | A .csv copy of base R's classic Titanic survival dataset |

File | Description |
---|---|
[2020 Virtual PPAS - Session 1 Code.RMD](presentation_documents/2020%20Virtual%20PPAS%20-%20Session%201%20Code.RMD) | R markdown code for Session 2 |
[2020 Virtual PPAS - Session 1 Data.csv](presentation_documents/2020%20Virtual%20PPAS%20-%20Session%201%20Data.csv) | Dataset accompanying Session 2 R markdown file (above) |
PAS_MachineLearning_code_20200922.Rmd | R markdown code presented in Session 4a (machine learning); note that the final two A/E plots contain an error |
PAS_MachineLearning_code_20200922.pdf | PDF markdown output presented in Session 4a (machine learning); note that the final two A/E plots contain an error |