The bootcamp will be held August 20-21, so Day 1 is August 20 and Day 2 is August 21.
Unless otherwise noted, modules are about 75 minutes long: 40 minutes for presentation, 25 minutes for breakout and 10 minutes for discussion of solutions.
-
Day 1 morning (8:30-12:15) (learning R)
- Module 0: Introduction, what is R, starting R, why R? (Chris) (15 minutes)
- Module 1: Basics of R, with RStudio (Chris presents)
- R as a calculator
- helpful shortcuts: tab-complete, up arrow, Ctrl-{up arrow}
- vectors, indexing, and subset assignment
- some basic functions; help()
- vectorized calculations
- basic R objects: vectors, dataframes, lists
- basic graphics
- breakout problems
- Break (15 minutes)
- Module 2: Managing R and your analyses (Chris presents)
- managing R objects, the R workspace
- using packages (installing, loading, namespaces)
- the working directory
- file reading/writing
- Git, GitHub and version control
- getting R help online
- breakout problems
- Module 3: Working with data (Chris presents) (45 minutes)
- dataframes, lists, and matrices
- attributes, missing values and factors
- subsetting
- strings
- breakout problems
-
Lunch (on your own) (12:15-1:30)
-
Day 1 afternoon (1:30-5:00) (data processing and manipulation)
- Finish Module 3 as needed
- Module 4: Calculations (Alan presents)
- vectorized calculations and efficiency
- apply, lapply (map operations)
- tabulation, stratified analyses
- merging/joining tables
- breakout problems
- Break (15 minutes)
- Module 5: Programming in R (Chris presents)
- writing your own functions, function arguments, functions as objects
- basic scoping and environments
- loops, if-else
- breakout problems/homework
-
Day 2 morning (9:00-12:45) (programming and data analysis)
- Module 6: Data manipulation using the tidyverse (Corrine presents)
- dplyr overview and piping
- stratified analyses: groupwise operations and split-apply-combine using dplyr
- reshaping and tidying data
- breakout problems
- Break (15 minutes)
- Module 7: Graphics (Florica presents)
- base R and ggplot2 overview
- ggplot2 basics
- using aesthetics to control plotting
- exporting graphics (vector/raster formats)
- breakout problems
- Module 8: Data analysis (Chris presents)
- regression, GLMs
- smoothing
- optimization
- simulation, sample()
- dates and times
- breakout problems
-
Lunch (on your own) (12:45-2:00)
-
Day 2 afternoon (2:00-4:30) (more advanced topics)
- Module 9: Workflows, coding practices, and project management (Chris presents) (60 minutes)
- debugging, timing, memory use
- scripting, source(), batch jobs
- good coding practices
- reproducible research
- Break (20 minutes)
- Module 10: Advanced topics morsels (Chris presents) (60 minutes)
- object-oriented programming (S3, S4, R6)
- computing on the language (using R to write and evaluate R code)
- errors and try-catch
- encodings
- working with databases
- parallel processing: future, parallel lapply, parallel for loops, RNG issues
- Module 11: Wrapping up (Chris presents) (5 minutes)
- R inconsistencies and different ways to do things
- Where to learn more (campus and non-campus resources)