SOA PPAS 2020

Educational materials for the Practical Predictive Analytics Seminar (PPAS)

This repository includes documents to supplement the 2020 PPAS. Registrants will receive additional directions on how to prepare for the seminar.

These documents are for educational purposes only.


Pre-seminar materials file notes

| File | Description |
| --- | --- |
| PPAS_ChallengeAnswers_2020.pdf | Full answers to the challenge questions |
| PPAS_ChallengeAnswers_2020.rmd | R code for the challenge answers |
| PPAS_ChallengeQuestions_2020.docx | These questions will test what you have learned in the introductory exercises. Hints on the second page suggest relevant R functions in case you get stuck. |
| PPAS_DataPrep_2020.pdf | Markdown output showing the R code to clean the raw data for modeling |
| PPAS_DataPrep_2020.Rmd | R code for data prep |
| PPASExpandedData.Rdata | Expanded data set for dataprep R code |
| PPASExpandedData.rds | The dataset that results from the DataPrep R code |
| PPAS_IntrotoR_2020.pdf | |
| PPAS_IntrotoR_2020.Rmd | Introduction to using R and several key functions we will be using during the seminar. This document only briefly covers statistical modeling functions; it is more geared to ensuring you understand how such functions are called. |
| PPAS_Modeling_Validation_2020.pdf | Modeling and validation guide |
| PPAS_Modeling_Validation_2020.Rmd | Modeling and validation guide R code |
| SampleStudy2.rpt | Insurance sample data set 2 |
| SampleStudy3.rpt | Insurance sample data set 3 |
| SampleStudy.rpt | Insurance sample data set 1 |

Hands-on activities file notes

| File | Description |
| --- | --- |
| 01_FitLogisticGLM_challenge.R | R code with guided challenge to fit a GLM (challenge 1) |
| 01_FitLogisticGLM_challenge_markdown.Rmd | R markdown version of code with guided challenge to fit a GLM (challenge 1) |
| 01_FitLogisticGLM_solutions_markdown.Rmd | R markdown code with solutions to challenge 1 |
| 01_FitLogisticGLM_solutions_markdown.html | R markdown HTML output with solutions to challenge 1 |
| 02_heartdiseasedataset.RDS | Version of the heart disease dataset created in the solution to challenge 2 |
| 02_PracticalDataConcerns_challenge.R | R code with guided challenge to tackle common practical data issues in modeling (challenge 2) |
| 02_PracticalDataConcerns_challenge_markdown.Rmd | R markdown version of code with guided challenge to tackle common practical data issues in modeling (challenge 2) |
| 02_PracticalDataConcerns_solutions_markdown.Rmd | R markdown code with solutions to challenge 2 |
| 02_PracticalDataConcerns_solutions_markdown.html | R markdown HTML output with solutions to challenge 2 |
| 02_SampleLogisticModel.RDS | A logistic GLM saved as an example solution |
| 03_ModelValidation_challenge.R | R code with guided challenge to validate a GLM fit (challenge 3) |
| 03_ModelValidation_challenge_markdown.Rmd | R markdown version of code with guided challenge to validate a GLM fit (challenge 3) |
| 03_ModelValidation_solutions_markdown.Rmd | R markdown code with solutions to challenge 3 |
| 03_ModelValidation_solutions_markdown.html | R markdown HTML output with solutions to challenge 3 |
| heartdiseasedataset.csv | Original heart disease dataset from https://www.kaggle.com/mazharkarimi/heart-disease-and-stroke-prevention |
| heartdiseasedataset.RDS | RDS version of the original heart disease dataset |
| heartdiseasedataset_modified.RDS | Modified version of the heart disease dataset for the challenges |
| ModifyDatasetForPPAS.R | R script to modify the original heart disease dataset for the challenges |
| TitanicSurvival.csv | A .csv copy of base R's classic Titanic survival dataset |

Presentation documents file notes

| File | Description |
| --- | --- |
| [2020 Virtual PPAS - Session 1 Code.RMD](presentation_documents/2020%20Virtual%20PPAS%20-%20Session%201%20Code.RMD) | R markdown code for Session 2 |
| [2020 Virtual PPAS - Session 1 Data.csv](presentation_documents/2020%20Virtual%20PPAS%20-%20Session%201%20Data.csv) | Dataset accompanying Session 2 R markdown file (above) |
| PAS_MachineLearning_code_20200922.Rmd | R markdown code presented in session 4a (machine learning); note an error on final two A/E plots |
| PAS_MachineLearning_code_20200922.pdf | PDF markdown output presented in session 4a (machine learning); note an error on final two A/E plots |