convert HW1 to Markdown
No need for it to be a notebook.
afeld committed Jan 18, 2025
1 parent 42185ae commit 75f11cc
Showing 4 changed files with 84 additions and 163 deletions.
1 change: 0 additions & 1 deletion .github/workflows/notebooks.yml
@@ -10,7 +10,6 @@ jobs:
matrix:
notebook:
- hw_0.ipynb
- hw_1.ipynb
- hw_2.ipynb
- hw_3.ipynb
- hw_4.ipynb
94 changes: 0 additions & 94 deletions hw_1.ipynb

This file was deleted.

51 changes: 51 additions & 0 deletions hw_1.md
@@ -0,0 +1,51 @@
# Homework 1

_[General assignment information](assignments.md)_

## Coding

1. [Find a dataset.](final_project/resources.md#open-data-portals)
   - It must have:
     - At least one numeric column
     - Between one thousand and one million rows
       - If it's larger than that, you can filter it down.
   - Don't spend too long on this step.
1. If there's more than one numeric column, pick one.
1. Create a new notebook.
1. Using pandas:
   1. Read in the data.
   1. Compute:
      - The mean
      - The median
      - The mode
   1. Do a `groupby()` with an [aggregation](https://pandas.pydata.org/docs/user_guide/groupby.html#aggregation).
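
As a rough illustration of the pandas steps above, here is a minimal sketch. The inline CSV, the column names `borough` and `trip_distance`, and the variable names are all hypothetical stand-ins, not part of the assignment; a real dataset would be read from a file or URL.

```python
import io

import pandas as pd

# Hypothetical stand-in for a downloaded CSV; with a real dataset you would
# pass a file path or URL to pd.read_csv() instead.
csv_text = """borough,trip_distance
Manhattan,1.2
Manhattan,3.4
Brooklyn,2.0
Brooklyn,2.0
Queens,7.5
"""

# Read in the data
df = pd.read_csv(io.StringIO(csv_text))

# Pick one numeric column and compute the summary statistics
column = df["trip_distance"]
mean_value = column.mean()
median_value = column.median()
# mode() returns a Series, because several values can tie for most frequent
mode_values = column.mode()

# groupby() with an aggregation: mean trip distance per borough
mean_by_borough = df.groupby("borough")["trip_distance"].mean()

print(mean_value, median_value, mode_values.tolist())
print(mean_by_borough)
```

Other aggregations such as `sum()`, `count()`, or `max()` can be used in place of `mean()` on the grouped column.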

Now [turn in the assignment](assignments.md).

## Tutorials

1. Read [The Joys (and Woes) of the Craft of Software Engineering](https://cs.calvin.edu/courses/cs/262/kvlinden/references/brooksJoysAndWoes.html)
   - Note not _everything_ in there is applicable to data analysis
1. Filtering/indexing `DataFrame`s
   - [Filter specific rows from a `DataFrame`](https://pandas.pydata.org/pandas-docs/stable/getting_started/intro_tutorials/03_subset_data.html#how-do-i-filter-specific-rows-from-a-dataframe)
   - [Boolean indexing](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#boolean-indexing)
1. Learn about functions
   - [Video](https://www.youtube.com/watch?v=9Os0o3wzS_I&list=PL-osiE80TeTskrapNbzXhwoFUiLCjGgY7&index=8)
   - [Blog post](https://python.land/introduction-to-python/functions)
1. Coding Style Guides - Please skim these; I don't expect you to understand and follow everything in them. The most important guidelines to pay attention to are indentation and keeping each statement on its own line.
   - [The Hitchhiker’s Guide to Python](https://docs.python-guide.org/writing/style/)
   - [PEP 8](https://www.python.org/dev/peps/pep-0008/)
1. [Guide to commenting your code](https://realpython.com/python-comments-guide/)
1. [Quartz Guide to Bad Data](https://github.com/Quartz/bad-data-guide#readme)
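
For the filtering and boolean indexing tutorials listed above, a small sketch of the idea may help. The `DataFrame` contents and the names `is_expensive`, `expensive`, and `expensive_manhattan` are made up for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame(
    {
        "borough": ["Manhattan", "Brooklyn", "Queens", "Manhattan"],
        "fare": [12.5, 8.0, 30.0, 5.5],
    }
)

# A comparison on a column produces a boolean Series: one True/False per row
is_expensive = df["fare"] > 10

# Indexing with that Series keeps only the rows where it is True
expensive = df[is_expensive]

# Combine conditions with & (and) or | (or); wrap each condition in parentheses
expensive_manhattan = df[(df["fare"] > 10) & (df["borough"] == "Manhattan")]

print(expensive)
print(expensive_manhattan)
```

The same pattern can be used to filter a large dataset down to a manageable number of rows.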

### Optional

- [Learn about data dictionaries](https://analystanswers.com/what-is-a-data-dictionary-a-simple-thorough-overview/)
- Glance through pandas' [comparison with other tools](https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/index.html) for any you are familiar with
- More on indexing:
  - [How to Select Rows from Pandas DataFrame](https://datatofish.com/select-rows-pandas-dataframe/)
  - Selecting Subsets of Data in Pandas: [Part 1](https://medium.com/dunder-data/selecting-subsets-of-data-in-pandas-6fcd0170be9c) and [Part 2](https://medium.com/dunder-data/selecting-subsets-of-data-in-pandas-39e811c81a0c)

## Participation

Reminder about the [between-class participation requirement](syllabus.md#participation).
101 changes: 33 additions & 68 deletions lecture_1_exercise.ipynb
@@ -16,20 +16,13 @@
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2025-01-18T15:52:06.256870Z",
"iopub.status.busy": "2025-01-18T15:52:06.256737Z",
"iopub.status.idle": "2025-01-18T15:52:07.020501Z",
"shell.execute_reply": "2025-01-18T15:52:07.020017Z"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2025-01-18 10:52:06 URL:https://data.cityofnewyork.us/resource/m6nq-qud6.csv [136065] -> \"2021_yellow_taxi_trips.csv\" [1]\r\n"
"2025-01-18 12:39:39 URL:https://data.cityofnewyork.us/resource/m6nq-qud6.csv [136065] -> \"2021_yellow_taxi_trips.csv\" [1]\n"
]
}
],
@@ -47,46 +40,39 @@
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"execution": {
"iopub.execute_input": "2025-01-18T15:52:07.025080Z",
"iopub.status.busy": "2025-01-18T15:52:07.023943Z",
"iopub.status.idle": "2025-01-18T15:52:07.329638Z",
"shell.execute_reply": "2025-01-18T15:52:07.327840Z"
}
},
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2021_yellow_taxi_trips.csv lecture_1_exercise.ipynb\r\n",
"LICENSE.md lecture_1_exercise_solution.ipynb\r\n",
"Makefile lecture_2.ipynb\r\n",
"\u001b[1m\u001b[36m__pycache__\u001b[m\u001b[m lecture_2.slides.html\r\n",
"\u001b[1m\u001b[36m_build\u001b[m\u001b[m lecture_2_exercise.ipynb\r\n",
"_config.yml lecture_3.ipynb\r\n",
"\u001b[1m\u001b[36m_static\u001b[m\u001b[m lecture_3.slides.html\r\n",
"_toc.yml lecture_3_exercise.ipynb\r\n",
"assignments.md lecture_3_exercise_solution.ipynb\r\n",
"conf.py lecture_4.ipynb\r\n",
"curve.ipynb lecture_4.slides.html\r\n",
"\u001b[35mdata\u001b[m\u001b[m lecture_5.ipynb\r\n",
"\u001b[1m\u001b[36mextras\u001b[m\u001b[m lecture_5.slides.html\r\n",
"\u001b[1m\u001b[36mfinal_project\u001b[m\u001b[m lecture_5_exercise_solution.ipynb\r\n",
"final_project.md lecture_6.ipynb\r\n",
"home.md lecture_6.slides.html\r\n",
"hw_0.ipynb lectures.md\r\n",
"hw_1.ipynb \u001b[1m\u001b[36mmeta\u001b[m\u001b[m\r\n",
"hw_2.ipynb meta.md\r\n",
"hw_3.ipynb nbdime_config.json\r\n",
"hw_4.ipynb pyproject.toml\r\n",
"index.md registration.md\r\n",
"joining_late.md resources.md\r\n",
"lecture_0.ipynb \u001b[35msolutions\u001b[m\u001b[m\r\n",
"lecture_0.slides.html syllabus.md\r\n",
"lecture_1.ipynb \u001b[1m\u001b[36mtmp\u001b[m\u001b[m\r\n",
"lecture_1.slides.html\r\n"
"2021_yellow_taxi_trips.csv lecture_1_exercise.ipynb\n",
"LICENSE.md lecture_1_exercise_solution.ipynb\n",
"Makefile lecture_2.ipynb\n",
"\u001b[1m\u001b[36m__pycache__\u001b[m\u001b[m lecture_2.slides.html\n",
"\u001b[1m\u001b[36m_build\u001b[m\u001b[m lecture_2_exercise.ipynb\n",
"_config.yml lecture_3.ipynb\n",
"\u001b[1m\u001b[36m_static\u001b[m\u001b[m lecture_3.slides.html\n",
"_toc.yml lecture_3_exercise.ipynb\n",
"assignments.md lecture_3_exercise_solution.ipynb\n",
"conf.py lecture_4.ipynb\n",
"curve.ipynb lecture_4.slides.html\n",
"\u001b[35mdata\u001b[m\u001b[m lecture_5.ipynb\n",
"\u001b[1m\u001b[36mextras\u001b[m\u001b[m lecture_5.slides.html\n",
"\u001b[1m\u001b[36mfinal_project\u001b[m\u001b[m lecture_5_exercise_solution.ipynb\n",
"final_project.md lecture_6.ipynb\n",
"home.md lecture_6.slides.html\n",
"hw_0.ipynb lectures.md\n",
"hw_1.md \u001b[1m\u001b[36mmeta\u001b[m\u001b[m\n",
"hw_2.ipynb meta.md\n",
"hw_3.ipynb nbdime_config.json\n",
"hw_4.ipynb pyproject.toml\n",
"index.md registration.md\n",
"joining_late.md resources.md\n",
"lecture_0.ipynb \u001b[35msolutions\u001b[m\u001b[m\n",
"lecture_0.slides.html syllabus.md\n",
"lecture_1.ipynb \u001b[1m\u001b[36mtmp\u001b[m\u001b[m\n",
"lecture_1.slides.html\n"
]
}
],
@@ -113,14 +99,7 @@
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"execution": {
"iopub.execute_input": "2025-01-18T15:52:07.343336Z",
"iopub.status.busy": "2025-01-18T15:52:07.341837Z",
"iopub.status.idle": "2025-01-18T15:52:07.361088Z",
"shell.execute_reply": "2025-01-18T15:52:07.354933Z"
}
},
"metadata": {},
"outputs": [],
"source": [
"# your code here"
@@ -138,14 +117,7 @@
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"execution": {
"iopub.execute_input": "2025-01-18T15:52:07.377337Z",
"iopub.status.busy": "2025-01-18T15:52:07.376933Z",
"iopub.status.idle": "2025-01-18T15:52:07.395369Z",
"shell.execute_reply": "2025-01-18T15:52:07.387834Z"
}
},
"metadata": {},
"outputs": [],
"source": [
"# your code here"
@@ -163,14 +135,7 @@
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"execution": {
"iopub.execute_input": "2025-01-18T15:52:07.402215Z",
"iopub.status.busy": "2025-01-18T15:52:07.401931Z",
"iopub.status.idle": "2025-01-18T15:52:07.414170Z",
"shell.execute_reply": "2025-01-18T15:52:07.410143Z"
}
},
"metadata": {},
"outputs": [],
"source": [
"# your code here"
