BST 262: Computing for Big Data - 2017/2018
Week 1 - Basic tools
- Lecture 1. Unix scripting, make
- Lecture 2. Version control: Git and GitHub
Week 2 - Creating and maintaining R packages
- Lecture 3. Rationale, package structure, available tools
- Lecture 4. Basics of software engineering: unit testing, continuous integration, code coverage
Week 3 - Software optimization
- Lecture 5. Measuring performance: profiling and benchmarking tools
- Lecture 6. Improving performance: an introduction to C/C++, Rcpp
Week 4 - Databases
- Lecture 7. Overview of SQL and NoSQL databases
- Lecture 8. R database interfaces
Week 5 - Analyzing data that does not fit in memory
- Lecture 9. Pure R solutions, JVM solutions
- Lecture 10. An introduction to parallel computing; clusters and cloud computing
Week 6 - Visualization
- Lecture 11. Principles of visualization
- Lecture 12. JavaScript and D3, maps and GIS
Weeks 7 & 8 - Guest lectures (order and precise schedule TBD)
- Lecture 13. Software project management
- Lecture 14. R and Spark
- Lecture 15. Advanced GIS and remote sensing
- Lecture 16. Cluster architecture