Materials for Python Programming for Data Science WOW Course - this page will be updated as the course progresses.
The class workspace on Slack is https://pp4ds-ox.slack.com. I encourage you to post any questions in the Slack channel in case your classmates can help. Nick (your tutor; [email protected]) will also check Slack and provide support where possible. Download Slack from: https://slack.com/get
If you do not wish to use Slack, you can use Canvas to contact me and other students.
To use Jupyter yourself, I recommend you download and install Anaconda, a Python Data Science Platform, from here. Make sure you download the Python 3 version of Anaconda, ideally Python 3.8. You can also install Jupyter if you have a standard Python distribution installed. Ask your tutors for assistance if you need to install Jupyter on your own machine.
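Once Anaconda (or a standard Python distribution) is installed, a quick check like the one below can confirm that the interpreter is recent enough and that Jupyter can be found. This is only a sketch: the function name `check_environment` is made up for illustration, treating 3.8 as a minimum version is an assumption based on the recommendation above, and looking for the `notebook` or `jupyterlab` packages is just one way of detecting a Jupyter install.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 8)):
    """Summarise whether this interpreter looks ready for the course.

    min_version is an assumption: the course recommends Python 3.8,
    so anything older is flagged here.
    """
    # find_spec returns None when a top-level package is not installed.
    jupyter_found = any(
        importlib.util.find_spec(name) is not None
        for name in ("notebook", "jupyterlab")
    )
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "python_ok": sys.version_info[:2] >= min_version,
        "jupyter_found": jupyter_found,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `jupyter_found` is reported as `False`, Jupyter is probably not installed for the interpreter you ran the check with; ask your tutor for help.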
To get the contents of this repository, I recommend that you install Git SCM, a source code management tool that will help you keep up to date with the repository. I will be adding content as the course progresses, and Git will allow you to pull new material as it becomes available.
You can also run live online versions of the notebooks, launched by Binder, by clicking on the Binder buttons below, without having to install anything yourself. Please note that Binder is still in beta testing and is hosted by the University of California, Berkeley, so it may occasionally not work as expected (but is quite reliable).
Click on the green 'Code' button at the top right of this page, then either open the repository in an IDE such as Visual Studio Code or clone it via GitHub Desktop. Either way, cloning will create a local copy of this directory on your machine.
If you want to run the notebooks on your own computer at home, apart from installing Jupyter/Anaconda as described above, you will need to install Git, a source code management tool, from here. Windows users can also get Git here: https://gitforwindows.org/. Once installed, you need to open the command line ("Command Prompt" on Windows or "Terminal" on macOS) to run some commands.
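As a rough illustration of the clone-then-pull workflow, the sketch below works out which Git command you would type at the command line: `git clone` the first time, `git pull` afterwards. The repository URL is a placeholder (copy the real one from the 'Code' button), and the folder name `pp4ds` and the helper `git_command` are hypothetical names chosen for this example.

```python
import shutil
from pathlib import Path

# Placeholder only: replace with the URL shown under the 'Code' button.
REPO_URL = "https://github.com/<user>/<repository>.git"


def git_command(target_dir, repo_url=REPO_URL):
    """Return the git command (as a list of arguments) for target_dir."""
    target = Path(target_dir)
    if (target / ".git").is_dir():
        # A local copy already exists: pull the newly added material.
        return ["git", "-C", str(target), "pull"]
    # No local copy yet: clone the repository into target_dir.
    return ["git", "clone", repo_url, str(target)]


if __name__ == "__main__":
    if shutil.which("git") is None:
        print("Git was not found on your PATH; install it first.")
    else:
        # 'pp4ds' is a hypothetical name for the local course folder.
        print(" ".join(git_command("pp4ds")))
```

The script only prints the command rather than running it, so you can check it before typing it into Command Prompt or Terminal yourself.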
Week 1: Introduction to Data Science
Week 2: Python basics: built-in types, functions and methods, if statement
Week 3: Python data structures: list, dicts, tuples, sets; for loops
Week 4: NumPy and the SciPy ecosystem. Basic statistics with NumPy
Week 5: Pandas for data science I
Week 6: Pandas for data science II
Week 7: Data visualisation: matplotlib and seaborn
Week 8: Object-oriented programming: classes, inheritance, and applications
Week 9: Data gathering and cleaning. Text pre-processing
Week 10: Introduction to experimental design and statistical tests. Time-series analysis
- Lecture notes (face to face course): download
- Exercise 01A: Notebook Basic
- Exercise 01B: Running Code
- Exercise 01C: Working with Markdown
- Exercise 01D: Notebook Exercises
- Lecture notes (face to face course): download
- Exercise 02: Expressions
- Exercise 02: solutions
- Lecture notes (face to face course): download
- Exercise 03: Data Structures and Loops
- Exercise 03: solutions
- Lecture notes (face to face course): download
- Exercise 04: NumPy
- Exercise 04: solutions
- Lecture Notes (face to face course): download
- First Assignment: direct link
- Lecture Notes: download
- Exercise 06: Titanic Dataset
- Lecture Notes: download
- Exercise 07: IMDB Movies
- Lecture Notes: download
- Demo 08A: Overview of Matplotlib
- Demo 08B: Titanic Dataset exploration with Seaborn
- Second Assignment:
- Lecture Notes: download