Transit Bunching & Gtfs Digest Portfolio #1235

Merged
merged 13 commits into from
Sep 28, 2024
29 changes: 26 additions & 3 deletions gtfs_digest/03_report.ipynb
@@ -53,7 +53,9 @@
"outputs": [],
"source": [
"# Comment out and leave this cell right below pandas\n",
"organization_name = \"Marin County Transit District\""
"# organization_name = \"Marin County Transit District\"\n",
"# organization_name = \"Monterey-Salinas Transit\"\n",
"organization_name = \"City of Visalia\""
]
},
{
@@ -65,8 +67,8 @@
},
"outputs": [],
"source": [
"%%capture_parameters\n",
"organization_name"
"#%%capture_parameters\n",
"#organization_name"
]
},
{
@@ -478,6 +480,27 @@
" pass"
]
},
{
"cell_type": "markdown",
"id": "43045583-c228-4fe1-9b49-b7b20d9352ae",
"metadata": {},
"source": [
"## Metrics for All Routes "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5837257e-9a54-4c36-8130-d0e6ff709025",
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" display(section2.agency_overview(sched_vp_df))\n",
"except:\n",
" display(Markdown(f\"\"\"{organization_name} only has schedule data.\"\"\"))"
]
},
{
"cell_type": "markdown",
"id": "f1b398b9-63ec-4edf-860f-194fc08e3066",
16 changes: 9 additions & 7 deletions gtfs_digest/27_transit_bunching_seconds.ipynb
@@ -93,8 +93,7 @@
"metadata": {},
"source": [
"### Get high frequency routes\n",
"* Group by mean frequency minutes for the operator-route-direction grain.\n",
"* Use mean?"
"* Katrina: <i>but want to understand how the original column is calculated (over what time period). I would also count the agencies/organizations represented in that subset to see if it fits our preconceptions about which agencies run frequent routes. Also check mix of buses/trains.</i>"
]
},
{
@@ -936,7 +935,7 @@
"metadata": {},
"source": [
"### `rt_stop_times3`: Some scheduled arrival seconds span longer than a day: filter them out\n",
"* There are 86,400 seconds in a day"
"* Katrina: <i>I assume the scheduled arrival sec > 86400 are after midnight, don't need to throw these out. Does rt arrival sec behave the same way, or do you need to create a datetime?</i>"
]
},
{
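Reviewer note on the hunk above: in GTFS, stop times are counted from the start of the service day, so values above 86,400 seconds are valid and denote arrivals after midnight on the same service day. The sketch below shows one way to turn such values into timestamps instead of filtering them out. The service date and the helper name `seconds_to_datetime` are assumptions made for illustration; whether `rt_arrival_sec` follows the same convention still needs to be checked against the data.

```python
import pandas as pd


def seconds_to_datetime(sec: pd.Series, service_date: str) -> pd.Series:
    """Convert seconds since the start of the service day into timestamps.

    Hypothetical helper: values above 86,400 roll over to the next calendar day.
    """
    return pd.Timestamp(service_date) + pd.to_timedelta(sec, unit="s")


# Toy data; the service date below is an arbitrary example, not from the notebook.
df = pd.DataFrame({"scheduled_arrival_sec": [86_340, 86_460, 90_000]})
df["scheduled_arrival_dt"] = seconds_to_datetime(
    df["scheduled_arrival_sec"], "2024-09-18"
)
print(df)
```

With a datetime column, arrivals on either side of midnight can be compared directly, which matters when computing the time between consecutive vehicles.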
@@ -1086,7 +1085,8 @@
"metadata": {},
"source": [
"### `rt_stop_times4`: Sort so `stop sequence` for the `operator-stop_id-route-id_direction_id` will be in order.\n",
"* Comparing bunching by STOP, so we have to look at the `stop sequence-stop_id.`"
"* Comparing bunching by STOP, so we have to look at the `stop sequence-stop_id.`\n",
"* Katrina: <i>Maybe you want to sort by rt arrival seconds instead of scheduled?</i>"
]
},
{
@@ -1196,7 +1196,7 @@
" \"shape_array_key\",\n",
" \"direction_id\",\n",
" \"stop_sequence\",\n",
" \"scheduled_arrival_sec\",\n",
" \"rt_arrival_sec\",\n",
" ]\n",
").reset_index(drop=True)"
]
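Reviewer note on the hunk above: sorting by `rt_arrival_sec` rather than `scheduled_arrival_sec` matters when vehicles arrive out of scheduled order. The following is a minimal sketch on toy data, using the sort keys shown in the diff; `trip_id` and all values are made up for illustration.

```python
import pandas as pd

# Toy data: three trips serving the same stop; trip t3 overtakes trip t2.
stop_times = pd.DataFrame(
    {
        "shape_array_key": ["a", "a", "a"],
        "direction_id": [0, 0, 0],
        "stop_sequence": [5, 5, 5],
        "trip_id": ["t1", "t2", "t3"],  # hypothetical column, for readability
        "scheduled_arrival_sec": [28_800, 29_400, 30_000],
        "rt_arrival_sec": [28_900, 30_100, 30_050],
    }
)

sort_cols = ["shape_array_key", "direction_id", "stop_sequence", "rt_arrival_sec"]

# Order rows by the time vehicles actually arrived at the stop.
by_actual = stop_times.sort_values(sort_cols).reset_index(drop=True)
print(by_actual[["trip_id", "scheduled_arrival_sec", "rt_arrival_sec"]])
```

Sorted this way, each row's predecessor is the vehicle a rider actually saw before it, even when that differs from the scheduled order.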
@@ -1283,7 +1283,8 @@
"metadata": {},
"source": [
"#### `rt_stop_times5`: Filter out values in `delay` that are in the 1 hour zone\n",
"* Actual arrival times should not be more than an hour later or earlier than scheduled."
"* Actual arrival times should not be more than an hour later or earlier than scheduled.\n",
"* Katrina: <i>I am not sure if you need to throw out \">1 hour delay\" trips, the customer experience we're interested in is actual wait times between stop arrivals</i>"
]
},
{
@@ -1667,7 +1668,8 @@
"source": [
"### Delete out rows that are `nan`??\n",
"* I am not sure if `nans` impact calculations of the mean scheduled headway and whatnot?\n",
"* These `nans` are because the first `operator-route-stop_id-stop_sequence` combo won't have anything to compare it to."
"* These `nans` are because the first `operator-route-stop_id-stop_sequence` combo won't have anything to compare it to.\n",
"* <i>I would fill in the actual/schedule headway columns with 0 rather than dropping the first row in each grouping. I wonder if it makes sense to use a more descriptive column name than headway, such as \"minutes since last vehicle\"</i>"
]
},
{
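Reviewer note on the `nan` question above: pandas skips `NaN` by default when computing a mean, so the first row in each group does not distort the average unless it is filled with a value. The sketch below compares both treatments on toy data. The grouping is simplified to `stop_id` only, and the column name `actual_headway_sec` is an assumption for illustration.

```python
import pandas as pd

# Toy data: real-time arrivals at two stops.
arrivals = pd.DataFrame(
    {
        "stop_id": ["s1", "s1", "s1", "s2", "s2"],
        "rt_arrival_sec": [28_800, 29_400, 29_460, 28_900, 29_800],
    }
)

arrivals = arrivals.sort_values(["stop_id", "rt_arrival_sec"]).reset_index(drop=True)

# Seconds since the previous vehicle at the same stop.
# The first arrival in each group has no predecessor, so diff() yields NaN.
arrivals["actual_headway_sec"] = arrivals.groupby("stop_id")["rt_arrival_sec"].diff()

# Series.mean() ignores NaN by default (skipna=True).
mean_skipping_nan = arrivals["actual_headway_sec"].mean()

# Filling with 0 adds artificial zero-length headways and lowers the mean.
mean_with_zero_fill = arrivals["actual_headway_sec"].fillna(0).mean()

print(mean_skipping_nan, mean_with_zero_fill)
```

On this toy data the two means are 520 and 312 seconds, so filling with 0 changes the summary statistics while leaving the `NaN` in place does not.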