After following this two-day lesson, learners will be able to:
- identify key open energy data sources suitable for answering energy research questions
- read in tabular data from XML, JSON, and Parquet formats using `pandas`
- request data stored in cloud buckets
- request data from a variety of Application Programming Interfaces (APIs)
- scrape data from webpages using `beautifulsoup`
- visualize data to quickly understand patterns and anomalies
- write Python classes and functions to break down complex cleaning tasks into reusable and discrete steps
- write automated tests to ensure that their code works as expected
- troubleshoot performance issues and handle data that is too large to fit in memory
- automatically detect unexpected values in inputs and outputs by writing data validation tests
- transform a local codebase into a collaborative project using GitHub repositories, code documentation, and virtual environments.
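
As a rough illustration of three of the objectives above (breaking cleaning into small functions, writing an automated test, and validating data), here is a minimal sketch. It deliberately uses only the Python standard library rather than `pandas` so that it runs without extra installs. The sample records, the field names (`plant`, `capacity_mw`), and the function names are all hypothetical and are not taken from the lesson material.

```python
import json

# Hypothetical sample records; in practice these might come from a file or an API.
RAW_JSON = """
[
  {"plant": "Plant A", "fuel": "solar", "capacity_mw": "120.5"},
  {"plant": "Plant B", "fuel": "wind", "capacity_mw": "-3"},
  {"plant": "Plant C", "fuel": "gas", "capacity_mw": null}
]
"""


def parse_records(raw):
    """Parse a JSON array of objects into a list of dicts."""
    return json.loads(raw)


def clean_capacity(records):
    """Convert capacity_mw from string to float, keeping missing values as None."""
    cleaned = []
    for record in records:
        value = record["capacity_mw"]
        capacity = None if value is None else float(value)
        # Build a new dict so the input records are left unchanged.
        cleaned.append({**record, "capacity_mw": capacity})
    return cleaned


def find_invalid(records):
    """Return the names of plants whose capacity is missing or negative."""
    return [
        record["plant"]
        for record in records
        if record["capacity_mw"] is None or record["capacity_mw"] < 0
    ]


records = clean_capacity(parse_records(RAW_JSON))
invalid = find_invalid(records)

# A tiny automated check of the cleaning step.
assert records[0]["capacity_mw"] == 120.5

print(invalid)
```

Keeping each step in its own function means each one can be tested in isolation, and the validation step can be reused on both the inputs and the outputs of a cleaning pipeline.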