nalbarr/coursera-johnhopkins-datascience

coursera-johnhopkins-datascience

This is the main repository for my coursework in the Coursera Data Science Specialization by Johns Hopkins University.

Instructors

  • Roger D. Peng, PhD, Associate Professor, Biostatistics
  • Brian Caffo, PhD, Professor, Biostatistics
  • Jeff Leek, PhD, Associate Professor, Biostatistics

Specialization

My Learning Objectives

  • Learn from Roger Peng and the other instructors; understand the Johns Hopkins perspective on data science
  • Understand the R community within healthcare and biostatistics
  • Learn R as a language and understand its tooling and dependencies
  • Perform literature searches as they apply to healthcare use cases, using R to publish research results

Eureka! moment

If you plan to only audit the classes, the one thing to focus on is:

  • Course 3: Getting and Cleaning Data (analyzing the UCI Human Activity Recognition (HAR) Data Set)
  • During this project I felt most productive and synthesized multiple concepts and skills. It also felt closer to real-world work, since it required more domain analysis and data wrangling.

Key Takeaways

  • R is a domain-specific language (DSL) that most applied statisticians will use top down; the bottom-up approach would be Python (NumPy, SciPy, etc.)
  • R is easy to learn; the style is mostly procedural and assignment-based
  • Data frames, Shiny, etc. are nice to use
  • The OO/module paradigm is complex, with too many ways of doing the same thing
  • For complex data science projects that require pipelines
  • RStudio has a community behind it (i.e., heavy backing from Microsoft)

Course work

Related
