From d5437d69c734d0ce0bfd614efa8495028a3cd874 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Mon, 18 Nov 2024 19:55:44 +0000 Subject: [PATCH 01/13] start explore notebook --- ntd/proposed_changes_25-26.ipynb | 41 ++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) create mode 100644 ntd/proposed_changes_25-26.ipynb diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb new file mode 100644 index 000000000..04608e70d --- /dev/null +++ b/ntd/proposed_changes_25-26.ipynb @@ -0,0 +1,41 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a5010c8b-6fe2-49cd-8dfe-681e8de340d3", + "metadata": {}, + "source": [ + "# NTD Proposed Changes 2025-2026 Analysis" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8c8956ce-24a5-4d6b-8101-1ef5a547d3ca", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.13" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From b342290a8301a06061e183fa0d9630a547ae3476 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Mon, 18 Nov 2024 21:55:53 +0000 Subject: [PATCH 02/13] starting to add task details --- ntd/proposed_changes_25-26.ipynb | 58 +++++++++++++++++++++++++++++++- 1 file changed, 57 insertions(+), 1 deletion(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 04608e70d..1373f8990 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -8,10 +8,66 @@ "# NTD Proposed Changes 2025-2026 Analysis" ] }, + { + "cell_type": "markdown", + "id": "8e1248e8-1f49-462c-ba0c-424a2240f5ee", + "metadata": {}, + "source": [ + "Proposed Change 
Text:\n", + "\n", + "https://www.federalregister.gov/documents/2024/10/31/2024-25341/national-transit-database-proposed-reporting-changes-and-clarifications-for-report-years-2025-and\n" + ] + }, + { + "cell_type": "markdown", + "id": "923637b0-7d95-4614-a425-10a6238a3f1d", + "metadata": {}, + "source": [ + "Task:\n", + " - As Caltrans DDS prepare to submit comments of these proposed changes to NTD, the Data Science branch is tasked to address 3 areas the proposed changes may affect the branch.\n", + " - Analysis of the following areas to be submitted to the Transit Quality Branch by 11/27/2024" + ] + }, + { + "cell_type": "markdown", + "id": "afdfd59c-d9c1-4d48-b6e4-55e9af08be29", + "metadata": {}, + "source": [ + "## Area 1 - NTD Reporting Streamlining\n", + ">E: The first is on the topic of NTD reporting streamlining. There are a few items being proposed that may add additional reporting burden and some that propose to streamline things. Is this an area that you all have an opinion on?\n", + "\n", + ">K: will take a look based on our understanding of common past reporting errors identified in the NTD Modernization project and provide some comments. " + ] + }, + { + "cell_type": "markdown", + "id": "97f07bb1-4fd2-4ad0-bb65-2419928d5326", + "metadata": {}, + "source": [ + "## Area 2 - Rural, Full Reporters to Reduced Reporters\n", + "> E: The second area is on the topic of NTD data coming through as a result of a potential reduction in some full reporters in rural areas. I’m not sure which ones these would be because the rulemaking wouldn’t affect all reporters. I’m also not sure what data wouldn’t be reported as a result. 
Is this something the analyst team can look into further… ie see if the proposed change in Section G would affect any California agencies and what data we may not receive from NTD as a result.\n", + "\n", + "> K: identify which CA agencies are full reporters in rural areas that meet the criteria in section G, although I think we’d do it based on 2023 NTD data and FTA would do it on 2024 data. I am not sure if there would be data loss based on FTA’s assessment that these agencies were historically Rural reporters." + ] + }, + { + "cell_type": "markdown", + "id": "65592033-b6e9-4bf7-a24a-6b1d6f7a087b", + "metadata": {}, + "source": [ + "## Area 3 - Volunteer Reporters\n", + ">E: The third area is that I noticed that it was mentioned in Section H that voluntary reports may help a state receive more money. Given this helpful piece of information, I had 3 follow-up ideas. \n", + ">1. Can we figure out how it helps increase funding by being a voluntary reporter \n", + ">2. Can we do an audit to see which agencies aren’t reporters that maybe could be and \n", + ">3. What monetary benefit could be gained if we helped these agencies become voluntary reporters.\n", + "\n", + ">K: would auditing this entail simply looking at organizations that don’t have an NTD ID? Or something else?" 
+ ] + }, { "cell_type": "code", "execution_count": null, - "id": "8c8956ce-24a5-4d6b-8101-1ef5a547d3ca", + "id": "b9f386ff-59fa-4eec-add5-b8c16a820cb3", "metadata": {}, "outputs": [], "source": [] From 3604212c7c655ada6df68e00b60230b6fc1641bd Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Tue, 19 Nov 2024 00:17:39 +0000 Subject: [PATCH 03/13] more content to area 1 --- ntd/proposed_changes_25-26.ipynb | 104 +++++++++++++++++++++++++++++-- 1 file changed, 100 insertions(+), 4 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 1373f8990..03f5f225d 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -28,15 +28,107 @@ " - Analysis of the following areas to be submitted to the Transit Quality Branch by 11/27/2024" ] }, + { + "cell_type": "code", + "execution_count": 1, + "id": "b5679a16-211e-400a-bf08-4c377b4b27ed", + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "\n", + "from calitp_data_analysis.tables import tbls\n", + "from siuba import _, collect, count, filter, show_query\n", + "\n", + "pd.set_option(\"display.max_columns\", None)\n", + "pd.set_option(\"display.max_rows\", None)" + ] + }, { "cell_type": "markdown", "id": "afdfd59c-d9c1-4d48-b6e4-55e9af08be29", "metadata": {}, "source": [ "## Area 1 - NTD Reporting Streamlining\n", - ">E: The first is on the topic of NTD reporting streamlining. There are a few items being proposed that may add additional reporting burden and some that propose to streamline things. Is this an area that you all have an opinion on?\n", + ">E: The first is on the topic of NTD reporting streamlining. There are `a few items being proposed that may add additional reporting burden` and `some that propose to streamline things`. 
Is this an area that you all have an opinion on?\n", "\n", - ">K: will take a look based on our understanding of common past reporting errors identified in the NTD Modernization project and provide some comments. " + ">K: will take a look based on our understanding of `common past reporting errors identified in the NTD Modernization project` and provide some comments. \n", + "\n", + ">E: That sounds great" + ] + }, + { + "attachments": { + "ba55d0e0-457e-411f-8851-c0eda48ae5ab.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "id": "d81a5cc6-0f45-4980-9f16-a9bac97483e1", + "metadata": {}, + "source": [ + "### Revisit the NTD Modernization - Issues Analysis\n", + "Slalom completed an analysis of the most common erorrs in NTD reporting for 2020, 2021 and 2022 by frequency, type and agency. \n", + "\n", + "As reported, 3 issues account for ~25% of all errors:\n", + ">1. RR20F-005: The cost per hour changed by 30% or more. \n", + ">2. A10-033: The number of General Purpose Maintenance Facilities differs from previous year. \n", + ">3. 
RR20F-146: The miles per vehicle changed by 20% or more.\n", + "\n", + "Slalom identified a list of the top 23 errors by frequency and most comments to determine which errors should be prioritized first.\n", + "\n", + "Sorted by issue ID:\n", + "\n", + "![image.png](attachment:ba55d0e0-457e-411f-8851-c0eda48ae5ab.png)\n", + "- 2x A10 errors\n", + "- 2x A30 errors\n", + "- 18x RR20F errors\n", + "- 1x RR20U error" + ] + }, + { + "cell_type": "markdown", + "id": "70b0e807-80d6-49d7-bad3-eb2df8b7663b", + "metadata": {}, + "source": [ + "### Notes from Proposed Changes document\n", + "\n", + "- **Sec B: increases reporting burden for all**\n", + " - requires agencies to submit shapes.txt\n", + " - align agency_id to NTD ID\n", + " \n", + "- **Sec C: decreases reporting burden for all**\n", + " - FTA acknowledges that the A-15 and A-10 forms are causing discrepancies\n", + " - removes A-10 form\n", + " - moves ADA data and other data from A-10 form to new extended A-15 form\n", + " - clarifies what a \"station\" or \"facility\" is and how to count them for reporting\n", + " - establishes a standardized reporting method for passenger stations and facilities\n", + " - **`Sec C directly relates to the issues analysis: the A10 error was the 2nd most common error in reporting`**\n", + "\n", + "- **Sec D: slightly increases reporting burden for all**\n", + " - proposes to add new categories to A-20 form:\n", + " \t1. “Track—Turntable,” \n", + "\t\t2. “Power and Signal—Pump Rooms”\n", + " 3. “Power and Signal—Fan Plants” \n", + " - Adds a \"decade of construction\" field to these categories. 
This way FTA can more accurately capture when an asset was reconstructed or renovated.\n", + "\n", + "- **Sec E: may slightly increase reporting burden for some**\n", + " - clarifies what counts as a cyber security event and adds more choices to better describe what type of event has happened\n", + " - also expands what counts as IT infrastructure and adds a hierarchy of events\n", + " - I believe this reporting only applies IF the agency experiences a security event. So an agency that has not had any security events will not have an increased reporting burden\n", + "\n", + "- **Sec F: may slightly increase reporting burden for some**\n", + " - revises the NTD major event reporting requirements to capture the new “disabling damage” event category\n", + " - similar to Sec E, only applies if a safety event occurs\n", + "\n", + "- **Sec G: decreases reporting burden for some**\n", + " - there are some unique rural operators that operate in multiple, small areas, but have the full reporter status\n", + " - proposes that these unique operators get a waiver to turn them into reduced reporters\n", + " - FTA estimates this affects 10-15 agencies.\n", + "\n", + "- **Sec H: may significantly increase reporting burden for some**\n", + " - propose to have a new category in NTD reporting field to allow transit agencies to declare if they are a voluntary reporter or not\n", + " - if a transit agency decides to become a voluntary reporter, they must report everything that would be applicalbe to their agency(?)" ] }, { @@ -47,7 +139,9 @@ "## Area 2 - Rural, Full Reporters to Reduced Reporters\n", "> E: The second area is on the topic of NTD data coming through as a result of a potential reduction in some full reporters in rural areas. I’m not sure which ones these would be because the rulemaking wouldn’t affect all reporters. I’m also not sure what data wouldn’t be reported as a result. 
Is this something the analyst team can look into further… ie see if the proposed change in Section G would affect any California agencies and what data we may not receive from NTD as a result.\n", "\n", - "> K: identify which CA agencies are full reporters in rural areas that meet the criteria in section G, although I think we’d do it based on 2023 NTD data and FTA would do it on 2024 data. I am not sure if there would be data loss based on FTA’s assessment that these agencies were historically Rural reporters." + "> K: identify which CA agencies are full reporters in rural areas that meet the criteria in section G, although I think we’d do it based on 2023 NTD data and FTA would do it on 2024 data. I am not sure if there would be data loss based on FTA’s assessment that these agencies were historically Rural reporters.\n", + "\n", + "> E: 2.\tYeah, I’m not expecting there would be much impact here, but perhaps it could affect something like Fresno County or something that provides a lot of rural service, but is large enough to be a full reporter. It’s worth flagging and understanding if it could be an impact or not. " ] }, { @@ -61,7 +155,9 @@ ">2. Can we do an audit to see which agencies aren’t reporters that maybe could be and \n", ">3. What monetary benefit could be gained if we helped these agencies become voluntary reporters.\n", "\n", - ">K: would auditing this entail simply looking at organizations that don’t have an NTD ID? Or something else?" + ">K: would auditing this entail simply looking at organizations that don’t have an NTD ID? Or something else?\n", + "\n", + ">E: Yes, I think we would look at all transit agencies in the transit database that operate fixed route service that don’t appear to have an NTD ID and are not present in NTD data." 
] }, { From 15f5da77da23a75238c3194ee069365f954f2c60 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Tue, 19 Nov 2024 18:33:10 +0000 Subject: [PATCH 04/13] comments to area 1 --- ntd/proposed_changes_25-26.ipynb | 39 +++++++++++++++++++++++++++----- 1 file changed, 33 insertions(+), 6 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 03f5f225d..c5e6c2794 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -3,7 +3,9 @@ { "cell_type": "markdown", "id": "a5010c8b-6fe2-49cd-8dfe-681e8de340d3", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "# NTD Proposed Changes 2025-2026 Analysis" ] @@ -47,7 +49,10 @@ { "cell_type": "markdown", "id": "afdfd59c-d9c1-4d48-b6e4-55e9af08be29", - "metadata": {}, + "metadata": { + "jp-MarkdownHeadingCollapsed": true, + "tags": [] + }, "source": [ "## Area 1 - NTD Reporting Streamlining\n", ">E: The first is on the topic of NTD reporting streamlining. There are `a few items being proposed that may add additional reporting burden` and `some that propose to streamline things`. Is this an area that you all have an opinion on?\n", @@ -65,9 +70,11 @@ }, "cell_type": "markdown", "id": "d81a5cc6-0f45-4980-9f16-a9bac97483e1", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ - "### Revisit the NTD Modernization - Issues Analysis\n", + "### Revisit the NTD Modernization - Issues Analysis\n", "Slalom completed an analysis of the most common erorrs in NTD reporting for 2020, 2021 and 2022 by frequency, type and agency. 
\n", "\n", "As reported, 3 issues account for ~25% of all errors:\n", @@ -91,7 +98,7 @@ "id": "70b0e807-80d6-49d7-bad3-eb2df8b7663b", "metadata": {}, "source": [ - "### Notes from Proposed Changes document\n", + "### Notes from Proposed Changes document\n", "\n", "- **Sec B: increases reporting burden for all**\n", " - requires agencies to submit shapes.txt\n", @@ -128,7 +135,27 @@ "\n", "- **Sec H: may significantly increase reporting burden for some**\n", " - propose to have a new category in NTD reporting field to allow transit agencies to declare if they are a voluntary reporter or not\n", - " - if a transit agency decides to become a voluntary reporter, they must report everything that would be applicalbe to their agency(?)" + " - if a transit agency decides to become a voluntary reporter, they must complete the NTD report in its entirety. \n", + " - These reporters voluntarily comply with all NTD reporting requirements under the NTD rule (49 CFR Part 630) and the USOA." + ] + }, + { + "cell_type": "markdown", + "id": "e701c5eb-db38-428d-a810-76d186f3cffc", + "metadata": { + "tags": [] + }, + "source": [ + "## Comments for Area 1\n", + "Regarding the possible affects on the Data Science Branch, we have the biggest interest in Sections C and G as it affects metrics we typically use in analyses.\n", + "\n", + "Slalom conducted an analysis that looked into the types of reporting errors Caltrans received from NTD for 3 reporting years. The analysis found that Form A-10 errors were quite common. Section C of the proposed changes state that FTA is also awear of the issues in the A-10 and A-15 form. The proposed change eliminates the A-10 entirely and moves some of the initial A-10 metrics over to a new, extended A-15 form. This change aims to reduce the reporting burden for transit agencies.\n", + "\n", + "Sections G concerns changing rural operators with full reporter responsibilities to be reduced reporters. 
FTA aims to decrease the reporting burden, but this change affects an estimated 10-15 operators.\n", + "\n", + "Section H proposes a change to the NTD reporting platform to include a field that identify Voluntary reporters. This slightly increases the reporting burden for all NTD reporters. Sections E and F may conditionally increase the reporting burden for some operators, if the operator experiences cyber security or safetly events. Section D slightly incresses the reporting burden by proposing new categories in to A-20 form.\n", + "\n", + "\n" ] }, { From 6fa7473d759b87f10a6bfda745719af549448e77 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Wed, 20 Nov 2024 00:28:08 +0000 Subject: [PATCH 05/13] added notes to area 2 --- ntd/proposed_changes_25-26.ipynb | 67 ++++++++++++++++++++++++++++---- 1 file changed, 60 insertions(+), 7 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index c5e6c2794..2fff16752 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -147,13 +147,17 @@ }, "source": [ "## Comments for Area 1\n", - "Regarding the possible affects on the Data Science Branch, we have the biggest interest in Sections C and G as it affects metrics we typically use in analyses.\n", + "Regarding the possible affects on the Data Science Branch, we have the biggest interest in Sections C and G as it affects NTD metrics we typically use in analyses.\n", "\n", - "Slalom conducted an analysis that looked into the types of reporting errors Caltrans received from NTD for 3 reporting years. The analysis found that Form A-10 errors were quite common. Section C of the proposed changes state that FTA is also awear of the issues in the A-10 and A-15 form. The proposed change eliminates the A-10 entirely and moves some of the initial A-10 metrics over to a new, extended A-15 form. 
This change aims to reduce the reporting burden for transit agencies.\n", + "Slalom conducted an analysis that looked into the types of reporting errors Caltrans received from NTD for 3 reporting years. The analysis found that Form A-10 errors were quite common. Section C of the proposed changes state that FTA is also awear of the issues in the A-10 and A-15 form. The proposed changes eliminates the A-10 entirely and moves some of the initial A-10 metrics over to a new, extended A-15 form. This change aims to reduce the reporting burden for transit agencies.\n", "\n", - "Sections G concerns changing rural operators with full reporter responsibilities to be reduced reporters. FTA aims to decrease the reporting burden, but this change affects an estimated 10-15 operators.\n", + "Sections G concerns changing rural operators with full reporter responsibilities to be reduced reporters. FTA aims to decrease the reporting burden, but this change affects an estimated 10-15 operators. going from full reporter to reduced reporters would mean the operator does not need to report data related to passenger miles or monthly service or safety stats\n", "\n", - "Section H proposes a change to the NTD reporting platform to include a field that identify Voluntary reporters. This slightly increases the reporting burden for all NTD reporters. Sections E and F may conditionally increase the reporting burden for some operators, if the operator experiences cyber security or safetly events. Section D slightly incresses the reporting burden by proposing new categories in to A-20 form.\n", + "\n", + "\n", + "---\n", + "\n", + "Section H proposes a change to the NTD reporting platform to include a field that identify Voluntary reporters. This slightly increases the reporting burden for all NTD reporters. Sections E and F may conditionally increase the reporting burden for some operators, if the operator experiences cyber security or safety events. 
Finally, Section D slightly incresses the reporting burden by proposing new categories in to A-20 form.\n", "\n", "\n" ] @@ -164,11 +168,60 @@ "metadata": {}, "source": [ "## Area 2 - Rural, Full Reporters to Reduced Reporters\n", - "> E: The second area is on the topic of NTD data coming through as a result of a potential reduction in some full reporters in rural areas. I’m not sure which ones these would be because the rulemaking wouldn’t affect all reporters. I’m also not sure what data wouldn’t be reported as a result. Is this something the analyst team can look into further… ie see if the proposed change in Section G would affect any California agencies and what data we may not receive from NTD as a result.\n", + "> E: The second area is on the topic of NTD data coming through as a result of a `potential reduction in some full reporters in rural areas`. I’m not sure which ones these would be because the rulemaking wouldn’t affect all reporters. I’m also `not sure what data wouldn’t be reported as a result`. Is this something the analyst team can look into further… ie see if the proposed change in Section G would affect any California agencies and what data we may not receive from NTD as a result.\n", + "\n", + "> K: `identify which CA agencies are full reporters in rural areas that meet the criteria in section G`, although I think we’d do it based on `2023 NTD data` and FTA would do it on 2024 data. I am not sure if there would be data loss based on FTA’s assessment that these agencies were historically Rural reporters.\n", + "\n", + "> E: 2.\tYeah, I’m not expecting there would be much impact here, but perhaps it `could affect something like Fresno County` or something that provides a lot of rural service, but is large enough to be a full reporter. It’s worth flagging and understanding if it could be an impact or not. 
" + ] + }, + { + "cell_type": "markdown", + "id": "d96201a0-75f0-4d6e-b9fb-26d6849e428d", + "metadata": {}, + "source": [ + "### Understanding the difference between urban Full Reporters and urban Reduced Reporters\n", + "Per NTD reporting manual\n", + ">Full Reporters must provide the Annual Report, as well as Monthly Ridership (MR) and monthly Safety and Security reports. All other reporter types file their reports on an annual basis.\n", + "\n", + ">Full Reporters must report data for total revenues earned during the fiscal year. Reduced Reporters only report operating and capital expenditures incurred in the fiscal year, by source of revenue.\n", + "\n", + "**List of form used by both Urban Full and Reduced Reporters:**\n", + "1. Basic Information (Form P-10)\n", + "2. Modes and Types of Service (Form P-20)\n", + "3. Reporter Users (Form P-30)\n", + "4. General Transit Feed Specification Data for Fixed Route Modes (Form P-50)\n", + "5. Identification (Form B-10)\n", + "6. Geospatial Data for Demand Response Modes (Form B-15)\n", + "7. Contractual Relationship Data Requirements (Form B-30)\n", + "8. Transit Asset Management Performance Measure Targets (Form A-90)\n", + "9. Stations and Maintenance Facilities (Form A-10)\n", + "10. Transit Asset Management Facilities Inventory (Form A-15)\n", + "11. Revenue Vehicle Inventory (Form A-30)\n", + "12. Service Vehicle Inventory (Form A-35)\n", + "13. Reporting Federal Funding Allocation Data (Form FFA-10)\n", + "14. CEO Certification (Form D-10)\n", + "\n", + "\n", + "**List of unique forms for Urban Full Reporters:**\n", + "1. Reportable Segments (Form P-40)\n", + "2. Funding Sources (Form F-10)\n", + "3. Capital Expenses (Form F-20)\n", + "4. Operating Expenses: Uniform System of Accounts Functions and Object Classes (Form F-30)\n", + "5. Operating Expenses: Uniform System of Accounts Object Classes — Reconciling Items (Form F-40)\n", + "6. 
Uniform System of Accounts Object Classes: Financial Statement (Form F-60)\n", + "7. Monthly Ridership Reporting (Form MR-20)\n", + "8. Weekly Reference Reporting (Form WE-20)\n", + "9. Transit Way Mileage (Form A-20)\n", + "10. Employees (Form R-10)\n", + "11. Maintenance Performance (Form R-20)\n", "\n", - "> K: identify which CA agencies are full reporters in rural areas that meet the criteria in section G, although I think we’d do it based on 2023 NTD data and FTA would do it on 2024 data. I am not sure if there would be data loss based on FTA’s assessment that these agencies were historically Rural reporters.\n", "\n", - "> E: 2.\tYeah, I’m not expecting there would be much impact here, but perhaps it could affect something like Fresno County or something that provides a lot of rural service, but is large enough to be a full reporter. It’s worth flagging and understanding if it could be an impact or not. " + "**List of unique forms for Urban Reduced Reporters:**\n", + "1. S&S-60 Safety Data Form\n", + "2. Reduced Reporting Form (Form RR-20)\n", + "3. Transit Asset Management Performance Measure Targets (Form A-90)\n", + " \n" ] }, { From ca16af9b36be5ea7eff7de43e2ad865bb2f2e397 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Wed, 20 Nov 2024 23:34:53 +0000 Subject: [PATCH 06/13] more content to area 1. 
started area 2 --- ntd/proposed_changes_25-26.ipynb | 36 ++++++++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 4 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 2fff16752..31b1bc123 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -26,8 +26,8 @@ "metadata": {}, "source": [ "Task:\n", - " - As Caltrans DDS prepare to submit comments of these proposed changes to NTD, the Data Science branch is tasked to address 3 areas the proposed changes may affect the branch.\n", - " - Analysis of the following areas to be submitted to the Transit Quality Branch by 11/27/2024" + "- As Caltrans DDS prepare to submit comments of these proposed changes to NTD, the Data Science branch is tasked to address 3 areas the proposed changes may affect the branch.\n", + "- Analysis of the following areas to be submitted to the Transit Quality Branch by 11/27/2024" ] }, { @@ -165,7 +165,9 @@ { "cell_type": "markdown", "id": "97f07bb1-4fd2-4ad0-bb65-2419928d5326", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "## Area 2 - Rural, Full Reporters to Reduced Reporters\n", "> E: The second area is on the topic of NTD data coming through as a result of a `potential reduction in some full reporters in rural areas`. I’m not sure which ones these would be because the rulemaking wouldn’t affect all reporters. I’m also `not sure what data wouldn’t be reported as a result`. Is this something the analyst team can look into further… ie see if the proposed change in Section G would affect any California agencies and what data we may not receive from NTD as a result.\n", @@ -221,9 +223,35 @@ "1. S&S-60 Safety Data Form\n", "2. Reduced Reporting Form (Form RR-20)\n", "3. Transit Asset Management Performance Measure Targets (Form A-90)\n", - " \n" + "---\n", + "So if an operator, under this proposed change, goes from Full to Reduced reporter, we can expect to miss data from 11 forms. 
However, those unique forms don't look familiar in the NTD validation report pipeline, so I'm unsure what kind of impact the Data Science branch would see.\n", + "\n", + "Will need to see if there are equivalent forms between Full and Reduced reporters that report similar data but in different forms." ] }, + { + "cell_type": "markdown", + "id": "8cf55bc0-8bfc-451f-a663-5dbe44a14c81", + "metadata": {}, + "source": [ + "### Query the warehouse to find all the reporters that meet Sec G criteria\n", + "\n", + "Sec G criteria:\n", + "- Receives funding under 49 U.S.C. `5311`,\n", + "- Reports `one or more` primary or secondary `UZA`s on their Federal Funding Allocation form (`FFA-10`),\n", + "- Operates `more than 30` Vehicles Operated in Maximum Service (`VOMS`),\n", + "- Operates `fewer total VOMS in urbanized areas (UZAs)` than `rural (non-UZA) areas`, and\n", + "- Allocates `more total Vehicle Revenue Miles (VRM) to non-UZAs` than `UZAs`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", + "metadata": {}, + "outputs": [], + "source": [] + }, { "cell_type": "markdown", "id": "65592033-b6e9-4bf7-a24a-6b1d6f7a087b", From 33d79122a3235deeb92f50b4479f65e59c2813d0 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Thu, 21 Nov 2024 19:53:18 +0000 Subject: [PATCH 07/13] queried all the NTD tables to try and find agencies with multiple UZA --- ntd/proposed_changes_25-26.ipynb | 534 ++++++++++++++++++++++++++++++- 1 file changed, 531 insertions(+), 3 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 31b1bc123..b983588e5 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -32,7 +32,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "id": "b5679a16-211e-400a-bf08-4c377b4b27ed", "metadata": {}, "outputs": [], @@ -244,13 +244,541 @@ "- Allocates `more total Vehicle Revenue Miles (VRM) to non-UZAs` than 
`UZAs`.\n" ] }, + { + "cell_type": "markdown", + "id": "80ff21e9-de8b-4dc4-b431-0a57f3919308", + "metadata": {}, + "source": [ + "#### dim_annual_ntd_agency_information" + ] + }, { "cell_type": "code", - "execution_count": null, + "execution_count": 26, "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", "metadata": {}, "outputs": [], - "source": [] + "source": [ + "ntd_agency_info = (tbls.mart_ntd.dim_annual_ntd_agency_information()\n", + " >> filter(_._is_current == True,\n", + " _.year == 2022\n", + " )\n", + " >> collect()\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "id": "1fac9f3a-3bd9-4c4c-9f83-86002a1e56f8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 2969 entries, 0 to 2968\n", + "Data columns (total 46 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 2969 non-null object \n", + " 1 year 2969 non-null int64 \n", + " 2 ntd_id 2969 non-null object \n", + " 3 number_of_state_counties 54 non-null float64 \n", + " 4 tam_tier 2969 non-null object \n", + " 5 personal_vehicles 1151 non-null float64 \n", + " 6 density 1125 non-null float64 \n", + " 7 uza_name 1125 non-null object \n", + " 8 tribal_area_name 138 non-null object \n", + " 9 service_area_sq_miles 1044 non-null float64 \n", + " 10 total_voms 2969 non-null float64 \n", + " 11 city 2944 non-null object \n", + " 12 fta_recipient_id 2242 non-null float64 \n", + " 13 region 2969 non-null float64 \n", + " 14 state_admin_funds_expended 54 non-null float64 \n", + " 15 zip_code_ext 2463 non-null float64 \n", + " 16 zip_code 2944 non-null float64 \n", + " 17 ueid 2575 non-null object \n", + " 18 address_line_2 323 non-null object \n", + " 19 number_of_counties_with_service 54 non-null float64 \n", + " 20 reporter_acronym 1499 non-null object \n", + " 21 original_due_date 2959 non-null float64 \n", + " 22 sq_miles 1124 non-null float64 \n", + " 23 address_line_1 
2914 non-null object \n", + " 24 p_o__box 833 non-null object \n", + " 25 fy_end_date 2969 non-null int64 \n", + " 26 reported_by_ntd_id 1775 non-null object \n", + " 27 population 1125 non-null float64 \n", + " 28 reporting_module 2969 non-null object \n", + " 29 service_area_pop 1045 non-null float64 \n", + " 30 subrecipient_type 1770 non-null object \n", + " 31 state 2969 non-null object \n", + " 32 volunteer_drivers 1151 non-null float64 \n", + " 33 primary_uza 0 non-null object \n", + " 34 doing_business_as 584 non-null object \n", + " 35 reporter_type 2969 non-null object \n", + " 36 legacy_ntd_id 2093 non-null object \n", + " 37 voms_do 1747 non-null float64 \n", + " 38 url 2876 non-null object \n", + " 39 reported_by_name 1775 non-null object \n", + " 40 voms_pt 671 non-null float64 \n", + " 41 organization_type 2915 non-null object \n", + " 42 agency_name 2969 non-null object \n", + " 43 _valid_from 2969 non-null datetime64[ns, UTC]\n", + " 44 _valid_to 2969 non-null datetime64[ns, UTC]\n", + " 45 _is_current 2969 non-null bool \n", + "dtypes: bool(1), datetime64[ns, UTC](2), float64(18), int64(2), object(23)\n", + "memory usage: 1.0+ MB\n" + ] + } + ], + "source": [ + "ntd_agency_info.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "id": "31591169-cdb1-45db-880a-6cdbcc013fb2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "ntd_id\n", + "80285 1\n", + "40225 1\n", + "40223 1\n", + "40222 1\n", + "40221 1\n", + "Name: uza_name, dtype: int64" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_agency_info.groupby(\"ntd_id\")[\"uza_name\"].nunique().sort_values(ascending=False).head()" + ] + }, + { + "cell_type": "markdown", + "id": "b6a189aa-2820-4efc-9d4f-5638c6d11379", + "metadata": {}, + "source": [ + "#### dim_annual_funding_sources" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "id": "fc8f9407-af22-4805-bb0e-571cb7acd495", + 
"metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 6771 entries, 0 to 6770\n", + "Data columns (total 28 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 funding_source 6771 non-null object \n", + " 1 agency 6771 non-null object \n", + " 2 agency_voms 6771 non-null float64\n", + " 3 city 6689 non-null object \n", + " 4 fuel_tax 2257 non-null float64\n", + " 5 fta_capital_program_5309 2257 non-null float64\n", + " 6 fta_rural_progam_5311 2257 non-null float64\n", + " 7 fta_urbanized_area_formula 2257 non-null float64\n", + " 8 general_funds 4514 non-null float64\n", + " 9 income_tax 2257 non-null float64\n", + " 10 ntd_id 6771 non-null object \n", + " 11 organization_type 6771 non-null object \n", + " 12 other_dot_funds 2257 non-null float64\n", + " 13 other_federal_funds 2257 non-null float64\n", + " 14 other_fta_funds 2257 non-null float64\n", + " 15 other_funds 2257 non-null float64\n", + " 16 other_taxes 2257 non-null float64\n", + " 17 primary_uza_population 3405 non-null float64\n", + " 18 property_tax 2257 non-null float64\n", + " 19 reduced_reporter_funds 4514 non-null float64\n", + " 20 report_year 6771 non-null object \n", + " 21 reporter_type 6771 non-null object \n", + " 22 sales_tax 2257 non-null float64\n", + " 23 state 6771 non-null object \n", + " 24 tolls 2257 non-null float64\n", + " 25 transportation_funds 2257 non-null float64\n", + " 26 primary_uza_code 3405 non-null object \n", + " 27 primary_uza_name 3405 non-null object \n", + "dtypes: float64(18), object(10)\n", + "memory usage: 1.4+ MB\n" + ] + } + ], + "source": [ + "ntd_funding_sources = (tbls.mart_ntd.dim_annual_funding_sources()\n", + " >> filter(_.report_year == \"2023\",\n", + " # _.year == 2022\n", + " )\n", + " >> collect()\n", + " )\n", + "ntd_funding_sources.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "id": 
"f18b8c63-6052-4849-b1be-721a5cf54ecf", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "ntd_id\n", + "1 1\n", + "416 1\n", + "414 1\n", + "41199 1\n", + "41182 1\n", + "Name: primary_uza_name, dtype: int64" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_funding_sources.groupby(\"ntd_id\")[\"primary_uza_name\"].nunique().sort_values(ascending=False).head()" + ] + }, + { + "cell_type": "markdown", + "id": "28e56afc-2d63-43a6-871b-b6554035ed7e", + "metadata": {}, + "source": [ + "#### dim_annual_service_agencies" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "id": "cf4cc544-785a-4476-bdda-9019e6a5e6de", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 2201 entries, 0 to 2200\n", + "Data columns (total 36 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 2201 non-null object \n", + " 1 report_year 2201 non-null object \n", + " 2 ntd_id 2201 non-null object \n", + " 3 agency 2201 non-null object \n", + " 4 reporter_type 2201 non-null object \n", + " 5 organization_type 2201 non-null object \n", + " 6 city 2173 non-null object \n", + " 7 state 2201 non-null object \n", + " 8 agency_voms 2201 non-null float64\n", + " 9 primary_uza_code 1089 non-null float64\n", + " 10 primary_uza_name 1089 non-null object \n", + " 11 primary_uza_area_sq_miles 2201 non-null object \n", + " 12 primary_uza_population 1089 non-null float64\n", + " 13 service_area_sq_miles 1059 non-null float64\n", + " 14 service_area_population 1060 non-null float64\n", + " 15 actual_vehicles_passenger_car_deadhead_hours 2201 non-null float64\n", + " 16 actual_vehicles_passenger_car_hours 2201 non-null float64\n", + " 17 actual_vehicles_passenger_car_miles 2201 non-null float64\n", + " 18 actual_vehicles_passenger_car_revenue_hours 2201 non-null float64\n", + " 19 
actual_vehicles_passenger_car_revenue_miles 2201 non-null float64\n", + " 20 actual_vehicles_passenger_deadhead_miles 2201 non-null float64\n", + " 21 scheduled_vehicles_passenger_car_revenue_miles 2201 non-null float64\n", + " 22 charter_service_hours 380 non-null float64\n", + " 23 school_bus_hours 357 non-null float64\n", + " 24 trains_in_operation 2201 non-null float64\n", + " 25 directional_route_miles 2201 non-null float64\n", + " 26 passenger_miles 2201 non-null float64\n", + " 27 train_miles 2201 non-null float64\n", + " 28 train_revenue_miles 2201 non-null float64\n", + " 29 train_deadhead_miles 79 non-null float64\n", + " 30 train_hours 2201 non-null float64\n", + " 31 train_revenue_hours 2201 non-null float64\n", + " 32 train_deadhead_hours 79 non-null float64\n", + " 33 ada_upt 382 non-null float64\n", + " 34 sponsored_service_upt 1741 non-null float64\n", + " 35 unlinked_passenger_trips_upt 2201 non-null float64\n", + "dtypes: float64(26), object(10)\n", + "memory usage: 619.2+ KB\n" + ] + } + ], + "source": [ + "ntd_service_agencies = (tbls.mart_ntd.dim_annual_service_agencies ()\n", + " >> filter(_.report_year == \"2023\",\n", + " # _.year == 2022\n", + " )\n", + " >> collect()\n", + " )\n", + "ntd_service_agencies.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "id": "2901be3a-5c7f-4eaa-a45c-509d32bef198", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "ntd_id\n", + "1 1\n", + "44907 1\n", + "80302 1\n", + "80303 1\n", + "85 1\n", + "Name: primary_uza_name, dtype: int64" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_service_agencies.groupby(\"ntd_id\")[\"primary_uza_name\"].nunique().sort_values(ascending=False).head()" + ] + }, + { + "cell_type": "markdown", + "id": "623ce1ce-bb7a-450a-b139-6f2cf6ac375b", + "metadata": {}, + "source": [ + "#### dim_annual_service_mode_time_periods" + ] + }, + { + "cell_type": "code", + 
"execution_count": 45, + "id": "ae899649-d9fa-4652-884b-f8912a768ff1", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 9114 entries, 0 to 9113\n", + "Data columns (total 70 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 9114 non-null object \n", + " 1 report_year 9114 non-null object \n", + " 2 type_of_service 9114 non-null object \n", + " 3 ntd_id 9114 non-null object \n", + " 4 agency 9114 non-null object \n", + " 5 mode 9114 non-null object \n", + " 6 mode_name 9114 non-null object \n", + " 7 time_period 9114 non-null object \n", + " 8 time_service_begins 4821 non-null object \n", + " 9 time_service_ends 4808 non-null object \n", + " 10 reporter_type 9114 non-null object \n", + " 11 city 9093 non-null object \n", + " 12 state 9114 non-null object \n", + " 13 actual_vehicles_passenger_car_deadhead_hours 9114 non-null float64\n", + " 14 actual_vehicles_passenger_car_hours 9114 non-null float64\n", + " 15 actual_vehicles_passenger_car_miles 9114 non-null float64\n", + " 16 actual_vehicles_passenger_car_revenue_hours 9114 non-null float64\n", + " 17 actual_vehicles_passenger_car_revenue_miles 9114 non-null float64\n", + " 18 actual_vehicles_passenger_deadhead_miles 9114 non-null float64\n", + " 19 ada_upt 0 non-null object \n", + " 20 agency_voms 9114 non-null float64\n", + " 21 aptl_questionable 833 non-null object \n", + " 22 average_passenger_trip_length_aptl_ 3133 non-null float64\n", + " 23 average_speed 3133 non-null float64\n", + " 24 average_speed_questionable 336 non-null object \n", + " 25 brt_non_statutory_mixed_traffic 0 non-null object \n", + " 26 charter_service_hours 0 non-null object \n", + " 27 days_of_service_operated 1908 non-null float64\n", + " 28 days_not_operated_strikes 1850 non-null float64\n", + " 29 days_not_operated_emergencies 1850 non-null float64\n", + " 30 deadhead_hours_questionable 77 non-null object \n", + 
" 31 deadhead_miles_questionable 98 non-null object \n", + " 32 directional_route_miles 9114 non-null float64\n", + " 33 directional_route_miles_questionable 42 non-null object \n", + " 34 mixed_traffic_right_of_way 0 non-null object \n", + " 35 mode_voms 0 non-null object \n", + " 36 mode_voms_questionable 56 non-null object \n", + " 37 organization_type 9114 non-null object \n", + " 38 passenger_miles 9114 non-null float64\n", + " 39 passenger_miles_questionable 784 non-null object \n", + " 40 passengers_per_hour 9114 non-null float64\n", + " 41 passengers_per_hour_questionable 476 non-null object \n", + " 42 primary_uza_area_sq_miles 9114 non-null float64\n", + " 43 primary_uza_code 9114 non-null float64\n", + " 44 primary_uza_name 9114 non-null object \n", + " 45 primary_uza_population 9114 non-null float64\n", + " 46 scheduled_revenue_miles_questionable 7 non-null object \n", + " 47 scheduled_vehicles_passenger_car_revenue_miles 9114 non-null float64\n", + " 48 school_bus_hours 0 non-null object \n", + " 49 service_area_population 9114 non-null float64\n", + " 50 service_area_sq_miles 9114 non-null float64\n", + " 51 sponsored_service_upt 0 non-null object \n", + " 52 train_deadhead_hours 770 non-null float64\n", + " 53 train_deadhead_miles 770 non-null float64\n", + " 54 train_hours 9114 non-null float64\n", + " 55 train_hours_questionable 0 non-null object \n", + " 56 trains_in_operation 9114 non-null float64\n", + " 57 trains_in_operation_questionable 0 non-null object \n", + " 58 train_miles 9114 non-null float64\n", + " 59 train_miles_questionable 0 non-null object \n", + " 60 train_revenue_hours 9114 non-null float64\n", + " 61 train_revenue_hours_questionable 0 non-null object \n", + " 62 train_revenue_miles 9114 non-null float64\n", + " 63 train_revenue_miles_questionable 0 non-null object \n", + " 64 unlinked_passenger_trips_upt 9114 non-null float64\n", + " 65 unlinked_passenger_trips_questionable 385 non-null object \n", + " 66 
vehicle_hours_questionable 84 non-null object \n", + " 67 vehicle_miles_questionable 98 non-null object \n", + " 68 vehicle_revenue_hours_questionable 252 non-null object \n", + " 69 vehicle_revenue_miles_questionable 301 non-null object \n", + "dtypes: float64(29), object(41)\n", + "memory usage: 4.9+ MB\n" + ] + } + ], + "source": [ + "ntd_service_mode = (tbls.mart_ntd.dim_annual_service_mode_time_periods()\n", + " >> filter(_.report_year == \"2023\",\n", + " # _.year == 2022\n", + " )\n", + " >> collect()\n", + " )\n", + "ntd_service_mode.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "id": "9373a629-6506-4dcd-af77-c068947bd017", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "ntd_id\n", + "1 1\n", + "60032 1\n", + "60024 1\n", + "60022 1\n", + "60019 1\n", + "Name: primary_uza_code, dtype: int64" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_service_mode.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=True).head()" + ] + }, + { + "cell_type": "markdown", + "id": "424fa017-17db-4bf9-8456-db7036fb656e", + "metadata": {}, + "source": [ + "#### dim_monthly_ridership_with_adjustments" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "id": "b76deddb-d6f6-4305-8a92-66c2092b88ed", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 27600 entries, 0 to 27599\n", + "Data columns (total 22 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 27600 non-null object \n", + " 1 ntd_id 27600 non-null object \n", + " 2 legacy_ntd_id 26568 non-null object \n", + " 3 agency 27600 non-null object \n", + " 4 reporter_type 27600 non-null object \n", + " 5 period_year_month 27600 non-null object \n", + " 6 period_year 27600 non-null object \n", + " 7 period_month 27600 non-null object \n", + " 8 primary_uza_name 
27528 non-null object \n", + " 9 primary_uza_code 27528 non-null object \n", + " 10 _3_mode 27600 non-null object \n", + " 11 mode 27600 non-null object \n", + " 12 mode_name 27576 non-null object \n", + " 13 service_type 27600 non-null object \n", + " 14 mode_type_of_service_status 27600 non-null object \n", + " 15 tos 27600 non-null object \n", + " 16 upt 15650 non-null float64 \n", + " 17 vrm 15614 non-null float64 \n", + " 18 vrh 15556 non-null float64 \n", + " 19 voms 15556 non-null float64 \n", + " 20 _dt 27600 non-null object \n", + " 21 execution_ts 27600 non-null datetime64[ns, UTC]\n", + "dtypes: datetime64[ns, UTC](1), float64(4), object(17)\n", + "memory usage: 4.6+ MB\n" + ] + } + ], + "source": [ + "ntd_monthly_ridership = (tbls.mart_ntd.dim_monthly_ridership_with_adjustments ()\n", + " >> filter(_.period_year == \"2023\",\n", + " # _.year == 2022\n", + " )\n", + " >> collect()\n", + " )\n", + "ntd_monthly_ridership.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "id": "31f8ce6d-a7a9-4fb9-b33f-605cd38ee183", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "ntd_id\n", + "00001 1\n", + "60012 1\n", + "50522 1\n", + "55311 1\n", + "55312 1\n", + "Name: primary_uza_code, dtype: int64" + ] + }, + "execution_count": 53, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_monthly_ridership.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=False).head()" + ] }, { "cell_type": "markdown", From 7062f13489c699d16212c3000a0459b327d7e55d Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Thu, 21 Nov 2024 22:41:58 +0000 Subject: [PATCH 08/13] quired dim_orgs for area 3 --- ntd/proposed_changes_25-26.ipynb | 327 ++++++++++++++++++++++++++++++- 1 file changed, 324 insertions(+), 3 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index b983588e5..1c7f03877 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ 
b/ntd/proposed_changes_25-26.ipynb @@ -755,7 +755,7 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 55, "id": "31f8ce6d-a7a9-4fb9-b33f-605cd38ee183", "metadata": {}, "outputs": [ @@ -771,7 +771,7 @@ "Name: primary_uza_code, dtype: int64" ] }, - "execution_count": 53, + "execution_count": 55, "metadata": {}, "output_type": "execute_result" } @@ -798,9 +798,330 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 60, "id": "b9f386ff-59fa-4eec-add5-b8c16a820cb3", "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 1186 entries, 0 to 1185\n", + "Data columns (total 24 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 1186 non-null object \n", + " 1 source_record_id 1186 non-null object \n", + " 2 name 1186 non-null object \n", + " 3 organization_type 1186 non-null object \n", + " 4 roles 1186 non-null object \n", + " 5 itp_id 167 non-null float64 \n", + " 6 details 186 non-null object \n", + " 7 caltrans_district 508 non-null object \n", + " 8 website 1151 non-null object \n", + " 9 reporting_category 184 non-null object \n", + " 10 hubspot_company_record_id 279 non-null object \n", + " 11 gtfs_static_status 1186 non-null object \n", + " 12 gtfs_realtime_status 1186 non-null object \n", + " 13 _deprecated__assessment_status 1186 non-null bool \n", + " 14 manual_check__contact_on_website 665 non-null object \n", + " 15 alias 1186 non-null object \n", + " 16 is_public_entity 1186 non-null bool \n", + " 17 ntd_id 0 non-null object \n", + " 18 ntd_id_2022 0 non-null object \n", + " 19 public_currently_operating 1186 non-null bool \n", + " 20 public_currently_operating_fixed_route 1186 non-null bool \n", + " 21 _is_current 1186 non-null bool \n", + " 22 _valid_from 1186 non-null datetime64[ns, UTC]\n", + " 23 _valid_to 1186 non-null datetime64[ns, UTC]\n", + "dtypes: bool(5), datetime64[ns, UTC](2), 
float64(1), object(16)\n", + "memory usage: 182.0+ KB\n" + ] + } + ], + "source": [ + "dim_orgs = (tbls.mart_transit_database.dim_organizations()\n", + " >> filter(_._is_current == True,\n", + " _.ntd_id.isna()\n", + " )\n", + " >> collect()\n", + " )\n", + "dim_orgs.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "id": "d34f841d-110d-4294-886f-61f9e836836f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
keysource_record_idnameorganization_typerolesitp_iddetailscaltrans_districtwebsitereporting_categoryhubspot_company_record_idgtfs_static_statusgtfs_realtime_status_deprecated__assessment_statusmanual_check__contact_on_websitealiasis_public_entityntd_idntd_id_2022public_currently_operatingpublic_currently_operating_fixed_route_is_current_valid_from_valid_to
0eea7326fc87a575ce26e24cf56b8ff37recsupkiKC6Y6fFfVDAVNon-Profit Organization[]NaNDAV operates a fleet of vehicles around the co...11 - San Diegohttps://www.dav.org/veterans/i-need-a-ride/NoneNoneStatic OKRT IncompleteFalseUnknown[Disabled Veterans of America]FalseNoneNoneFalseFalseTrue2023-05-25 00:00:00+00:002098-12-31 23:59:59.999999+00:00
145227f9806ad4ffd1e8d4309e07f707erecmCZYY7aXn5MS9bIBICompany[]NaNNoneNonehttps://www.ibigroup.com/NoneNoneStatic OKRT IncompleteFalseUnknown[]FalseNoneNoneFalseFalseTrue2024-05-04 00:00:00+00:002098-12-31 23:59:59.999999+00:00
28a644e497439d25c01619c8ec6c85c44recEN9M5vpVOk2JRISAPCompany[]NaNNoneNonecrystalreports.comNoneNoneStatic OKRT IncompleteFalseUnknown[]FalseNoneNoneFalseFalseTrue2023-04-29 00:00:00+00:002098-12-31 23:59:59.999999+00:00
3fa22729bf0698cc6d2ac4f4a41e861b0recIAaOHjseoeNpTxUTACompany[]NaNNoneNonehttp://www.utatransit.net/NoneNoneStatic OKRT IncompleteFalseUnknown[]FalseNoneNoneFalseFalseTrue2023-04-29 00:00:00+00:002098-12-31 23:59:59.999999+00:00
43fd4d81306c49cb718c37b48fcbe585crecveQ8PTsiKdT7RUAinaCompany[]NaNOffices in Boston and Finland; no CA officesNonehttps://www.ainaptt.com/NoneNoneStatic OKRT IncompleteFalseUnknown[]FalseNoneNoneFalseFalseTrue2023-05-25 00:00:00+00:002098-12-31 23:59:59.999999+00:00
\n", + "
" + ], + "text/plain": [ + " key source_record_id name \\\n", + "0 eea7326fc87a575ce26e24cf56b8ff37 recsupkiKC6Y6fFfV DAV \n", + "1 45227f9806ad4ffd1e8d4309e07f707e recmCZYY7aXn5MS9b IBI \n", + "2 8a644e497439d25c01619c8ec6c85c44 recEN9M5vpVOk2JRI SAP \n", + "3 fa22729bf0698cc6d2ac4f4a41e861b0 recIAaOHjseoeNpTx UTA \n", + "4 3fd4d81306c49cb718c37b48fcbe585c recveQ8PTsiKdT7RU Aina \n", + "\n", + " organization_type roles itp_id \\\n", + "0 Non-Profit Organization [] NaN \n", + "1 Company [] NaN \n", + "2 Company [] NaN \n", + "3 Company [] NaN \n", + "4 Company [] NaN \n", + "\n", + " details caltrans_district \\\n", + "0 DAV operates a fleet of vehicles around the co... 11 - San Diego \n", + "1 None None \n", + "2 None None \n", + "3 None None \n", + "4 Offices in Boston and Finland; no CA offices None \n", + "\n", + " website reporting_category \\\n", + "0 https://www.dav.org/veterans/i-need-a-ride/ None \n", + "1 https://www.ibigroup.com/ None \n", + "2 crystalreports.com None \n", + "3 http://www.utatransit.net/ None \n", + "4 https://www.ainaptt.com/ None \n", + "\n", + " hubspot_company_record_id gtfs_static_status gtfs_realtime_status \\\n", + "0 None Static OK RT Incomplete \n", + "1 None Static OK RT Incomplete \n", + "2 None Static OK RT Incomplete \n", + "3 None Static OK RT Incomplete \n", + "4 None Static OK RT Incomplete \n", + "\n", + " _deprecated__assessment_status manual_check__contact_on_website \\\n", + "0 False Unknown \n", + "1 False Unknown \n", + "2 False Unknown \n", + "3 False Unknown \n", + "4 False Unknown \n", + "\n", + " alias is_public_entity ntd_id ntd_id_2022 \\\n", + "0 [Disabled Veterans of America] False None None \n", + "1 [] False None None \n", + "2 [] False None None \n", + "3 [] False None None \n", + "4 [] False None None \n", + "\n", + " public_currently_operating public_currently_operating_fixed_route \\\n", + "0 False False \n", + "1 False False \n", + "2 False False \n", + "3 False False \n", + "4 False False \n", + 
"\n", + " _is_current _valid_from _valid_to \n", + "0 True 2023-05-25 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", + "1 True 2024-05-04 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", + "2 True 2023-04-29 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", + "3 True 2023-04-29 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", + "4 True 2023-05-25 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 " + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "dim_orgs.head()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "86359867-8b8a-41a9-8579-b38b7eb89bb3", + "metadata": {}, "outputs": [], "source": [] } From 8e10c74c5b3d48440eb398ebe3f476047e3a877e Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Fri, 22 Nov 2024 00:15:59 +0000 Subject: [PATCH 09/13] more content --- ntd/proposed_changes_25-26.ipynb | 37 ++++++++++++++++++++++++++++++-- 1 file changed, 35 insertions(+), 2 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 1c7f03877..860cf6e6f 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -533,7 +533,40 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 65, + "id": "702e1dba-17a5-4364-a5c5-0c04955f9fab", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "City, County or Local Government Unit or Department of Transportation 1103\n", + "Independent Public Agency or Authority of Transit Service 422\n", + "Private-Non-Profit Corporation 358\n", + "Tribe 113\n", + "MPO, COG or Other Planning Agency 63\n", + "Area Agency on Aging 43\n", + "State Government Unit or Department of Transportation 27\n", + "Private-For-Profit Corporation 23\n", + "University 20\n", + "Other Publicly-Owned or Privately Chartered Corporation 15\n", + "Private Provider Reporting on Behalf of a Public Entity 7\n", + "Subsidiary Unit of a Transit Agency, Reporting Separately 
7\n", + "Name: organization_type, dtype: int64" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_service_agencies.organization_type.value_counts()" + ] + }, + { + "cell_type": "code", + "execution_count": 63, "id": "2901be3a-5c7f-4eaa-a45c-509d32bef198", "metadata": {}, "outputs": [ @@ -549,7 +582,7 @@ "Name: primary_uza_name, dtype: int64" ] }, - "execution_count": 44, + "execution_count": 63, "metadata": {}, "output_type": "execute_result" } From fdd0b6b8d27e4fe9e3a5aaf6eaaddfbe3209cb08 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Sat, 23 Nov 2024 00:31:13 +0000 Subject: [PATCH 10/13] decided on 2 tables --- ntd/proposed_changes_25-26.ipynb | 406 ++++++++++++++++--------------- 1 file changed, 213 insertions(+), 193 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 860cf6e6f..e040d285e 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -32,7 +32,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 1, "id": "b5679a16-211e-400a-bf08-4c377b4b27ed", "metadata": {}, "outputs": [], @@ -246,135 +246,27 @@ }, { "cell_type": "markdown", - "id": "80ff21e9-de8b-4dc4-b431-0a57f3919308", - "metadata": {}, - "source": [ - "#### dim_annual_ntd_agency_information" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", - "metadata": {}, - "outputs": [], - "source": [ - "ntd_agency_info = (tbls.mart_ntd.dim_annual_ntd_agency_information()\n", - " >> filter(_._is_current == True,\n", - " _.year == 2022\n", - " )\n", - " >> collect()\n", - " )" - ] - }, - { - "cell_type": "code", - "execution_count": 27, - "id": "1fac9f3a-3bd9-4c4c-9f83-86002a1e56f8", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "RangeIndex: 2969 entries, 0 to 2968\n", - "Data columns (total 46 columns):\n", - " # Column 
Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 key 2969 non-null object \n", - " 1 year 2969 non-null int64 \n", - " 2 ntd_id 2969 non-null object \n", - " 3 number_of_state_counties 54 non-null float64 \n", - " 4 tam_tier 2969 non-null object \n", - " 5 personal_vehicles 1151 non-null float64 \n", - " 6 density 1125 non-null float64 \n", - " 7 uza_name 1125 non-null object \n", - " 8 tribal_area_name 138 non-null object \n", - " 9 service_area_sq_miles 1044 non-null float64 \n", - " 10 total_voms 2969 non-null float64 \n", - " 11 city 2944 non-null object \n", - " 12 fta_recipient_id 2242 non-null float64 \n", - " 13 region 2969 non-null float64 \n", - " 14 state_admin_funds_expended 54 non-null float64 \n", - " 15 zip_code_ext 2463 non-null float64 \n", - " 16 zip_code 2944 non-null float64 \n", - " 17 ueid 2575 non-null object \n", - " 18 address_line_2 323 non-null object \n", - " 19 number_of_counties_with_service 54 non-null float64 \n", - " 20 reporter_acronym 1499 non-null object \n", - " 21 original_due_date 2959 non-null float64 \n", - " 22 sq_miles 1124 non-null float64 \n", - " 23 address_line_1 2914 non-null object \n", - " 24 p_o__box 833 non-null object \n", - " 25 fy_end_date 2969 non-null int64 \n", - " 26 reported_by_ntd_id 1775 non-null object \n", - " 27 population 1125 non-null float64 \n", - " 28 reporting_module 2969 non-null object \n", - " 29 service_area_pop 1045 non-null float64 \n", - " 30 subrecipient_type 1770 non-null object \n", - " 31 state 2969 non-null object \n", - " 32 volunteer_drivers 1151 non-null float64 \n", - " 33 primary_uza 0 non-null object \n", - " 34 doing_business_as 584 non-null object \n", - " 35 reporter_type 2969 non-null object \n", - " 36 legacy_ntd_id 2093 non-null object \n", - " 37 voms_do 1747 non-null float64 \n", - " 38 url 2876 non-null object \n", - " 39 reported_by_name 1775 non-null object \n", - " 40 voms_pt 671 non-null float64 \n", - " 41 organization_type 2915 non-null 
object \n", - " 42 agency_name 2969 non-null object \n", - " 43 _valid_from 2969 non-null datetime64[ns, UTC]\n", - " 44 _valid_to 2969 non-null datetime64[ns, UTC]\n", - " 45 _is_current 2969 non-null bool \n", - "dtypes: bool(1), datetime64[ns, UTC](2), float64(18), int64(2), object(23)\n", - "memory usage: 1.0+ MB\n" - ] - } - ], - "source": [ - "ntd_agency_info.info()" - ] - }, - { - "cell_type": "code", - "execution_count": 40, - "id": "31591169-cdb1-45db-880a-6cdbcc013fb2", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "ntd_id\n", - "80285 1\n", - "40225 1\n", - "40223 1\n", - "40222 1\n", - "40221 1\n", - "Name: uza_name, dtype: int64" - ] - }, - "execution_count": 40, - "metadata": {}, - "output_type": "execute_result" - } - ], + "id": "35ebc22e-ea54-452d-b6eb-c9b0e2c1fbdb", + "metadata": { + "tags": [] + }, "source": [ - "ntd_agency_info.groupby(\"ntd_id\")[\"uza_name\"].nunique().sort_values(ascending=False).head()" + "### Any operators that operate in more than 1 UZA?" 
] }, { "cell_type": "markdown", "id": "b6a189aa-2820-4efc-9d4f-5638c6d11379", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "#### dim_annual_funding_sources" ] }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 19, "id": "fc8f9407-af22-4805-bb0e-571cb7acd495", "metadata": {}, "outputs": [ @@ -383,47 +275,49 @@ "output_type": "stream", "text": [ "\n", - "RangeIndex: 6771 entries, 0 to 6770\n", + "RangeIndex: 134 entries, 0 to 133\n", "Data columns (total 28 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", - " 0 funding_source 6771 non-null object \n", - " 1 agency 6771 non-null object \n", - " 2 agency_voms 6771 non-null float64\n", - " 3 city 6689 non-null object \n", - " 4 fuel_tax 2257 non-null float64\n", - " 5 fta_capital_program_5309 2257 non-null float64\n", - " 6 fta_rural_progam_5311 2257 non-null float64\n", - " 7 fta_urbanized_area_formula 2257 non-null float64\n", - " 8 general_funds 4514 non-null float64\n", - " 9 income_tax 2257 non-null float64\n", - " 10 ntd_id 6771 non-null object \n", - " 11 organization_type 6771 non-null object \n", - " 12 other_dot_funds 2257 non-null float64\n", - " 13 other_federal_funds 2257 non-null float64\n", - " 14 other_fta_funds 2257 non-null float64\n", - " 15 other_funds 2257 non-null float64\n", - " 16 other_taxes 2257 non-null float64\n", - " 17 primary_uza_population 3405 non-null float64\n", - " 18 property_tax 2257 non-null float64\n", - " 19 reduced_reporter_funds 4514 non-null float64\n", - " 20 report_year 6771 non-null object \n", - " 21 reporter_type 6771 non-null object \n", - " 22 sales_tax 2257 non-null float64\n", - " 23 state 6771 non-null object \n", - " 24 tolls 2257 non-null float64\n", - " 25 transportation_funds 2257 non-null float64\n", - " 26 primary_uza_code 3405 non-null object \n", - " 27 primary_uza_name 3405 non-null object \n", - "dtypes: float64(18), object(10)\n", - "memory usage: 1.4+ MB\n" + " 0 
funding_source 134 non-null object \n", + " 1 agency 134 non-null object \n", + " 2 agency_voms 134 non-null float64\n", + " 3 city 134 non-null object \n", + " 4 fuel_tax 0 non-null object \n", + " 5 fta_capital_program_5309 134 non-null float64\n", + " 6 fta_rural_progam_5311 134 non-null float64\n", + " 7 fta_urbanized_area_formula 134 non-null float64\n", + " 8 general_funds 0 non-null object \n", + " 9 income_tax 0 non-null object \n", + " 10 ntd_id 134 non-null object \n", + " 11 organization_type 134 non-null object \n", + " 12 other_dot_funds 134 non-null float64\n", + " 13 other_federal_funds 134 non-null float64\n", + " 14 other_fta_funds 134 non-null float64\n", + " 15 other_funds 0 non-null object \n", + " 16 other_taxes 0 non-null object \n", + " 17 primary_uza_population 134 non-null float64\n", + " 18 property_tax 0 non-null object \n", + " 19 reduced_reporter_funds 0 non-null object \n", + " 20 report_year 134 non-null object \n", + " 21 reporter_type 134 non-null object \n", + " 22 sales_tax 0 non-null object \n", + " 23 state 134 non-null object \n", + " 24 tolls 0 non-null object \n", + " 25 transportation_funds 0 non-null object \n", + " 26 primary_uza_code 134 non-null object \n", + " 27 primary_uza_name 134 non-null object \n", + "dtypes: float64(8), object(20)\n", + "memory usage: 29.4+ KB\n" ] } ], "source": [ + "# Has 5311 data for operators, and UZA, VOMS, \n", "ntd_funding_sources = (tbls.mart_ntd.dim_annual_funding_sources()\n", " >> filter(_.report_year == \"2023\",\n", - " # _.year == 2022\n", + " _.fta_rural_progam_5311 > 0,\n", + " _.reporter_type == \"Full Reporter\"\n", " )\n", " >> collect()\n", " )\n", @@ -432,42 +326,39 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 20, "id": "f18b8c63-6052-4849-b1be-721a5cf54ecf", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "ntd_id\n", - "1 1\n", - "416 1\n", - "414 1\n", - "41199 1\n", - "41182 1\n", - "Name: primary_uza_name, dtype: int64" + "Full 
Reporter 134\n", + "Name: reporter_type, dtype: int64" ] }, - "execution_count": 38, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "ntd_funding_sources.groupby(\"ntd_id\")[\"primary_uza_name\"].nunique().sort_values(ascending=False).head()" + "ntd_funding_sources[\"reporter_type\"].value_counts()" ] }, { "cell_type": "markdown", "id": "28e56afc-2d63-43a6-871b-b6554035ed7e", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "#### dim_annual_service_agencies" ] }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 7, "id": "cf4cc544-785a-4476-bdda-9019e6a5e6de", "metadata": {}, "outputs": [ @@ -522,18 +413,143 @@ } ], "source": [ + "# Has UZA, VRM and VOMS. No 5311 funds\n", "ntd_service_agencies = (tbls.mart_ntd.dim_annual_service_agencies ()\n", " >> filter(_.report_year == \"2023\",\n", - " # _.year == 2022\n", + " _.agency_voms > 30\n", " )\n", " >> collect()\n", " )\n", "ntd_service_agencies.info()" ] }, + { + "cell_type": "markdown", + "id": "80ff21e9-de8b-4dc4-b431-0a57f3919308", + "metadata": { + "tags": [] + }, + "source": [ + "#### dim_annual_ntd_agency_information" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", + "metadata": {}, + "outputs": [], + "source": [ + "# has UZA data for operator\n", + "\n", + "#ntd_agency_info = (tbls.mart_ntd.dim_annual_ntd_agency_information()\n", + "# >> filter(_._is_current == True,\n", + "# _.year == 2022\n", + "# )\n", + "# >> collect()\n", + "# )" + ] + }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 3, + "id": "1fac9f3a-3bd9-4c4c-9f83-86002a1e56f8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "RangeIndex: 2969 entries, 0 to 2968\n", + "Data columns (total 46 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 2969 non-null object \n", + " 1 year 2969 
non-null int64 \n", + " 2 ntd_id 2969 non-null object \n", + " 3 number_of_state_counties 54 non-null float64 \n", + " 4 tam_tier 2969 non-null object \n", + " 5 personal_vehicles 1151 non-null float64 \n", + " 6 density 1125 non-null float64 \n", + " 7 uza_name 1125 non-null object \n", + " 8 tribal_area_name 138 non-null object \n", + " 9 service_area_sq_miles 1044 non-null float64 \n", + " 10 total_voms 2969 non-null float64 \n", + " 11 city 2944 non-null object \n", + " 12 fta_recipient_id 2242 non-null float64 \n", + " 13 region 2969 non-null float64 \n", + " 14 state_admin_funds_expended 54 non-null float64 \n", + " 15 zip_code_ext 2463 non-null float64 \n", + " 16 zip_code 2944 non-null float64 \n", + " 17 ueid 2575 non-null object \n", + " 18 address_line_2 323 non-null object \n", + " 19 number_of_counties_with_service 54 non-null float64 \n", + " 20 reporter_acronym 1499 non-null object \n", + " 21 original_due_date 2959 non-null float64 \n", + " 22 sq_miles 1124 non-null float64 \n", + " 23 address_line_1 2914 non-null object \n", + " 24 p_o__box 833 non-null object \n", + " 25 fy_end_date 2969 non-null int64 \n", + " 26 reported_by_ntd_id 1775 non-null object \n", + " 27 population 1125 non-null float64 \n", + " 28 reporting_module 2969 non-null object \n", + " 29 service_area_pop 1045 non-null float64 \n", + " 30 subrecipient_type 1770 non-null object \n", + " 31 state 2969 non-null object \n", + " 32 volunteer_drivers 1151 non-null float64 \n", + " 33 primary_uza 0 non-null object \n", + " 34 doing_business_as 584 non-null object \n", + " 35 reporter_type 2969 non-null object \n", + " 36 legacy_ntd_id 2093 non-null object \n", + " 37 voms_do 1747 non-null float64 \n", + " 38 url 2876 non-null object \n", + " 39 reported_by_name 1775 non-null object \n", + " 40 voms_pt 671 non-null float64 \n", + " 41 organization_type 2915 non-null object \n", + " 42 agency_name 2969 non-null object \n", + " 43 _valid_from 2969 non-null datetime64[ns, UTC]\n", + " 44 
_valid_to 2969 non-null datetime64[ns, UTC]\n", + " 45 _is_current 2969 non-null bool \n", + "dtypes: bool(1), datetime64[ns, UTC](2), float64(18), int64(2), object(23)\n", + "memory usage: 1.0+ MB\n" + ] + } + ], + "source": [ + "#ntd_agency_info.info()" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "31591169-cdb1-45db-880a-6cdbcc013fb2", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "ntd_id\n", + "80285 1\n", + "40225 1\n", + "40223 1\n", + "40222 1\n", + "40221 1\n", + "Name: uza_name, dtype: int64" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ntd_agency_info.groupby(\"ntd_id\")[\"uza_name\"].nunique().sort_values(ascending=False).head()" + ] + }, + { + "cell_type": "code", + "execution_count": 8, "id": "702e1dba-17a5-4364-a5c5-0c04955f9fab", "metadata": {}, "outputs": [ @@ -555,7 +571,7 @@ "Name: organization_type, dtype: int64" ] }, - "execution_count": 65, + "execution_count": 8, "metadata": {}, "output_type": "execute_result" } @@ -566,7 +582,7 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 9, "id": "2901be3a-5c7f-4eaa-a45c-509d32bef198", "metadata": {}, "outputs": [ @@ -582,7 +598,7 @@ "Name: primary_uza_name, dtype: int64" ] }, - "execution_count": 63, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } @@ -594,14 +610,16 @@ { "cell_type": "markdown", "id": "623ce1ce-bb7a-450a-b139-6f2cf6ac375b", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "#### dim_annual_service_mode_time_periods" ] }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 10, "id": "ae899649-d9fa-4652-884b-f8912a768ff1", "metadata": {}, "outputs": [ @@ -690,18 +708,18 @@ } ], "source": [ - "ntd_service_mode = (tbls.mart_ntd.dim_annual_service_mode_time_periods()\n", - " >> filter(_.report_year == \"2023\",\n", - " # _.year == 2022\n", - " )\n", - " >> collect()\n", - " )\n", - 
"ntd_service_mode.info()" + "#ntd_service_mode = (tbls.mart_ntd.dim_annual_service_mode_time_periods()\n", + "# >> filter(_.report_year == \"2023\",\n", + "# # _.year == 2022\n", + "# )\n", + "# >> collect()\n", + "# )\n", + "#ntd_service_mode.info()" ] }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 11, "id": "9373a629-6506-4dcd-af77-c068947bd017", "metadata": {}, "outputs": [ @@ -717,7 +735,7 @@ "Name: primary_uza_code, dtype: int64" ] }, - "execution_count": 50, + "execution_count": 11, "metadata": {}, "output_type": "execute_result" } @@ -729,14 +747,16 @@ { "cell_type": "markdown", "id": "424fa017-17db-4bf9-8456-db7036fb656e", - "metadata": {}, + "metadata": { + "tags": [] + }, "source": [ "#### dim_monthly_ridership_with_adjustments" ] }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 12, "id": "b76deddb-d6f6-4305-8a92-66c2092b88ed", "metadata": {}, "outputs": [ @@ -777,18 +797,18 @@ } ], "source": [ - "ntd_monthly_ridership = (tbls.mart_ntd.dim_monthly_ridership_with_adjustments ()\n", - " >> filter(_.period_year == \"2023\",\n", - " # _.year == 2022\n", - " )\n", - " >> collect()\n", - " )\n", - "ntd_monthly_ridership.info()" + "#ntd_monthly_ridership = (tbls.mart_ntd.dim_monthly_ridership_with_adjustments ()\n", + "# >> filter(_.period_year == \"2023\",\n", + "# # _.year == 2022\n", + "# )\n", + "# >> collect()\n", + "# )\n", + "#ntd_monthly_ridership.info()" ] }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 13, "id": "31f8ce6d-a7a9-4fb9-b33f-605cd38ee183", "metadata": {}, "outputs": [ @@ -804,13 +824,13 @@ "Name: primary_uza_code, dtype: int64" ] }, - "execution_count": 55, + "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "ntd_monthly_ridership.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=False).head()" + "#ntd_monthly_ridership.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=False).head()" 
] }, { @@ -831,7 +851,7 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 14, "id": "b9f386ff-59fa-4eec-add5-b8c16a820cb3", "metadata": {}, "outputs": [ @@ -885,7 +905,7 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 15, "id": "d34f841d-110d-4294-886f-61f9e836836f", "metadata": {}, "outputs": [ @@ -1141,7 +1161,7 @@ "4 True 2023-05-25 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 " ] }, - "execution_count": 62, + "execution_count": 15, "metadata": {}, "output_type": "execute_result" } From 6a63cf5cf4bf7ee19a39a4c6ec38c26bcf22bb66 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Mon, 25 Nov 2024 18:44:02 +0000 Subject: [PATCH 11/13] reading in UZA data from feature server --- ntd/proposed_changes_25-26.ipynb | 543 +++++++++++++++---------------- 1 file changed, 264 insertions(+), 279 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index e040d285e..2b60e7807 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -32,7 +32,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 2, "id": "b5679a16-211e-400a-bf08-4c377b4b27ed", "metadata": {}, "outputs": [], @@ -40,6 +40,7 @@ "import pandas as pd\n", "\n", "from calitp_data_analysis.tables import tbls\n", + "import geopandas as gpd\n", "from siuba import _, collect, count, filter, show_query\n", "\n", "pd.set_option(\"display.max_columns\", None)\n", @@ -423,130 +424,6 @@ "ntd_service_agencies.info()" ] }, - { - "cell_type": "markdown", - "id": "80ff21e9-de8b-4dc4-b431-0a57f3919308", - "metadata": { - "tags": [] - }, - "source": [ - "#### dim_annual_ntd_agency_information" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", - "metadata": {}, - "outputs": [], - "source": [ - "# has UZA data for operator\n", - "\n", - "#ntd_agency_info = (tbls.mart_ntd.dim_annual_ntd_agency_information()\n", - "# >> 
filter(_._is_current == True,\n", - "# _.year == 2022\n", - "# )\n", - "# >> collect()\n", - "# )" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "id": "1fac9f3a-3bd9-4c4c-9f83-86002a1e56f8", - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "RangeIndex: 2969 entries, 0 to 2968\n", - "Data columns (total 46 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 key 2969 non-null object \n", - " 1 year 2969 non-null int64 \n", - " 2 ntd_id 2969 non-null object \n", - " 3 number_of_state_counties 54 non-null float64 \n", - " 4 tam_tier 2969 non-null object \n", - " 5 personal_vehicles 1151 non-null float64 \n", - " 6 density 1125 non-null float64 \n", - " 7 uza_name 1125 non-null object \n", - " 8 tribal_area_name 138 non-null object \n", - " 9 service_area_sq_miles 1044 non-null float64 \n", - " 10 total_voms 2969 non-null float64 \n", - " 11 city 2944 non-null object \n", - " 12 fta_recipient_id 2242 non-null float64 \n", - " 13 region 2969 non-null float64 \n", - " 14 state_admin_funds_expended 54 non-null float64 \n", - " 15 zip_code_ext 2463 non-null float64 \n", - " 16 zip_code 2944 non-null float64 \n", - " 17 ueid 2575 non-null object \n", - " 18 address_line_2 323 non-null object \n", - " 19 number_of_counties_with_service 54 non-null float64 \n", - " 20 reporter_acronym 1499 non-null object \n", - " 21 original_due_date 2959 non-null float64 \n", - " 22 sq_miles 1124 non-null float64 \n", - " 23 address_line_1 2914 non-null object \n", - " 24 p_o__box 833 non-null object \n", - " 25 fy_end_date 2969 non-null int64 \n", - " 26 reported_by_ntd_id 1775 non-null object \n", - " 27 population 1125 non-null float64 \n", - " 28 reporting_module 2969 non-null object \n", - " 29 service_area_pop 1045 non-null float64 \n", - " 30 subrecipient_type 1770 non-null object \n", - " 31 state 2969 non-null object \n", - " 32 volunteer_drivers 1151 non-null 
float64 \n", - " 33 primary_uza 0 non-null object \n", - " 34 doing_business_as 584 non-null object \n", - " 35 reporter_type 2969 non-null object \n", - " 36 legacy_ntd_id 2093 non-null object \n", - " 37 voms_do 1747 non-null float64 \n", - " 38 url 2876 non-null object \n", - " 39 reported_by_name 1775 non-null object \n", - " 40 voms_pt 671 non-null float64 \n", - " 41 organization_type 2915 non-null object \n", - " 42 agency_name 2969 non-null object \n", - " 43 _valid_from 2969 non-null datetime64[ns, UTC]\n", - " 44 _valid_to 2969 non-null datetime64[ns, UTC]\n", - " 45 _is_current 2969 non-null bool \n", - "dtypes: bool(1), datetime64[ns, UTC](2), float64(18), int64(2), object(23)\n", - "memory usage: 1.0+ MB\n" - ] - } - ], - "source": [ - "#ntd_agency_info.info()" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "id": "31591169-cdb1-45db-880a-6cdbcc013fb2", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "ntd_id\n", - "80285 1\n", - "40225 1\n", - "40223 1\n", - "40222 1\n", - "40221 1\n", - "Name: uza_name, dtype: int64" - ] - }, - "execution_count": 4, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "ntd_agency_info.groupby(\"ntd_id\")[\"uza_name\"].nunique().sort_values(ascending=False).head()" - ] - }, { "cell_type": "code", "execution_count": 8, @@ -607,6 +484,53 @@ "ntd_service_agencies.groupby(\"ntd_id\")[\"primary_uza_name\"].nunique().sort_values(ascending=False).head()" ] }, + { + "cell_type": "markdown", + "id": "80ff21e9-de8b-4dc4-b431-0a57f3919308", + "metadata": { + "tags": [] + }, + "source": [ + "#### dim_annual_ntd_agency_information" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", + "metadata": {}, + "outputs": [], + "source": [ + "# has UZA data for operator\n", + "\n", + "#ntd_agency_info = (tbls.mart_ntd.dim_annual_ntd_agency_information()\n", + "# >> filter(_._is_current == True,\n", + "# _.year == 2022\n", + "# 
)\n", + "# >> collect()\n", + "# )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1fac9f3a-3bd9-4c4c-9f83-86002a1e56f8", + "metadata": {}, + "outputs": [], + "source": [ + "#ntd_agency_info.info()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "31591169-cdb1-45db-880a-6cdbcc013fb2", + "metadata": {}, + "outputs": [], + "source": [ + "#ntd_agency_info.groupby(\"ntd_id\")[\"uza_name\"].nunique().sort_values(ascending=False).head()" + ] + }, { "cell_type": "markdown", "id": "623ce1ce-bb7a-450a-b139-6f2cf6ac375b", @@ -619,94 +543,10 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": null, "id": "ae899649-d9fa-4652-884b-f8912a768ff1", "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "RangeIndex: 9114 entries, 0 to 9113\n", - "Data columns (total 70 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 key 9114 non-null object \n", - " 1 report_year 9114 non-null object \n", - " 2 type_of_service 9114 non-null object \n", - " 3 ntd_id 9114 non-null object \n", - " 4 agency 9114 non-null object \n", - " 5 mode 9114 non-null object \n", - " 6 mode_name 9114 non-null object \n", - " 7 time_period 9114 non-null object \n", - " 8 time_service_begins 4821 non-null object \n", - " 9 time_service_ends 4808 non-null object \n", - " 10 reporter_type 9114 non-null object \n", - " 11 city 9093 non-null object \n", - " 12 state 9114 non-null object \n", - " 13 actual_vehicles_passenger_car_deadhead_hours 9114 non-null float64\n", - " 14 actual_vehicles_passenger_car_hours 9114 non-null float64\n", - " 15 actual_vehicles_passenger_car_miles 9114 non-null float64\n", - " 16 actual_vehicles_passenger_car_revenue_hours 9114 non-null float64\n", - " 17 actual_vehicles_passenger_car_revenue_miles 9114 non-null float64\n", - " 18 actual_vehicles_passenger_deadhead_miles 9114 non-null float64\n", - " 19 ada_upt 0 
non-null object \n", - " 20 agency_voms 9114 non-null float64\n", - " 21 aptl_questionable 833 non-null object \n", - " 22 average_passenger_trip_length_aptl_ 3133 non-null float64\n", - " 23 average_speed 3133 non-null float64\n", - " 24 average_speed_questionable 336 non-null object \n", - " 25 brt_non_statutory_mixed_traffic 0 non-null object \n", - " 26 charter_service_hours 0 non-null object \n", - " 27 days_of_service_operated 1908 non-null float64\n", - " 28 days_not_operated_strikes 1850 non-null float64\n", - " 29 days_not_operated_emergencies 1850 non-null float64\n", - " 30 deadhead_hours_questionable 77 non-null object \n", - " 31 deadhead_miles_questionable 98 non-null object \n", - " 32 directional_route_miles 9114 non-null float64\n", - " 33 directional_route_miles_questionable 42 non-null object \n", - " 34 mixed_traffic_right_of_way 0 non-null object \n", - " 35 mode_voms 0 non-null object \n", - " 36 mode_voms_questionable 56 non-null object \n", - " 37 organization_type 9114 non-null object \n", - " 38 passenger_miles 9114 non-null float64\n", - " 39 passenger_miles_questionable 784 non-null object \n", - " 40 passengers_per_hour 9114 non-null float64\n", - " 41 passengers_per_hour_questionable 476 non-null object \n", - " 42 primary_uza_area_sq_miles 9114 non-null float64\n", - " 43 primary_uza_code 9114 non-null float64\n", - " 44 primary_uza_name 9114 non-null object \n", - " 45 primary_uza_population 9114 non-null float64\n", - " 46 scheduled_revenue_miles_questionable 7 non-null object \n", - " 47 scheduled_vehicles_passenger_car_revenue_miles 9114 non-null float64\n", - " 48 school_bus_hours 0 non-null object \n", - " 49 service_area_population 9114 non-null float64\n", - " 50 service_area_sq_miles 9114 non-null float64\n", - " 51 sponsored_service_upt 0 non-null object \n", - " 52 train_deadhead_hours 770 non-null float64\n", - " 53 train_deadhead_miles 770 non-null float64\n", - " 54 train_hours 9114 non-null float64\n", - " 55 
train_hours_questionable 0 non-null object \n", - " 56 trains_in_operation 9114 non-null float64\n", - " 57 trains_in_operation_questionable 0 non-null object \n", - " 58 train_miles 9114 non-null float64\n", - " 59 train_miles_questionable 0 non-null object \n", - " 60 train_revenue_hours 9114 non-null float64\n", - " 61 train_revenue_hours_questionable 0 non-null object \n", - " 62 train_revenue_miles 9114 non-null float64\n", - " 63 train_revenue_miles_questionable 0 non-null object \n", - " 64 unlinked_passenger_trips_upt 9114 non-null float64\n", - " 65 unlinked_passenger_trips_questionable 385 non-null object \n", - " 66 vehicle_hours_questionable 84 non-null object \n", - " 67 vehicle_miles_questionable 98 non-null object \n", - " 68 vehicle_revenue_hours_questionable 252 non-null object \n", - " 69 vehicle_revenue_miles_questionable 301 non-null object \n", - "dtypes: float64(29), object(41)\n", - "memory usage: 4.9+ MB\n" - ] - } - ], + "outputs": [], "source": [ "#ntd_service_mode = (tbls.mart_ntd.dim_annual_service_mode_time_periods()\n", "# >> filter(_.report_year == \"2023\",\n", @@ -719,29 +559,12 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "id": "9373a629-6506-4dcd-af77-c068947bd017", "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "ntd_id\n", - "1 1\n", - "60032 1\n", - "60024 1\n", - "60022 1\n", - "60019 1\n", - "Name: primary_uza_code, dtype: int64" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], + "outputs": [], "source": [ - "ntd_service_mode.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=True).head()" + "#ntd_service_mode.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=True).head()" ] }, { @@ -756,46 +579,10 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "id": "b76deddb-d6f6-4305-8a92-66c2092b88ed", "metadata": {}, - "outputs": [ - { - "name": "stdout", - 
"output_type": "stream", - "text": [ - "\n", - "RangeIndex: 27600 entries, 0 to 27599\n", - "Data columns (total 22 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 key 27600 non-null object \n", - " 1 ntd_id 27600 non-null object \n", - " 2 legacy_ntd_id 26568 non-null object \n", - " 3 agency 27600 non-null object \n", - " 4 reporter_type 27600 non-null object \n", - " 5 period_year_month 27600 non-null object \n", - " 6 period_year 27600 non-null object \n", - " 7 period_month 27600 non-null object \n", - " 8 primary_uza_name 27528 non-null object \n", - " 9 primary_uza_code 27528 non-null object \n", - " 10 _3_mode 27600 non-null object \n", - " 11 mode 27600 non-null object \n", - " 12 mode_name 27576 non-null object \n", - " 13 service_type 27600 non-null object \n", - " 14 mode_type_of_service_status 27600 non-null object \n", - " 15 tos 27600 non-null object \n", - " 16 upt 15650 non-null float64 \n", - " 17 vrm 15614 non-null float64 \n", - " 18 vrh 15556 non-null float64 \n", - " 19 voms 15556 non-null float64 \n", - " 20 _dt 27600 non-null object \n", - " 21 execution_ts 27600 non-null datetime64[ns, UTC]\n", - "dtypes: datetime64[ns, UTC](1), float64(4), object(17)\n", - "memory usage: 4.6+ MB\n" - ] - } - ], + "outputs": [], "source": [ "#ntd_monthly_ridership = (tbls.mart_ntd.dim_monthly_ridership_with_adjustments ()\n", "# >> filter(_.period_year == \"2023\",\n", @@ -808,29 +595,227 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "id": "31f8ce6d-a7a9-4fb9-b33f-605cd38ee183", "metadata": {}, + "outputs": [], + "source": [ + "#ntd_monthly_ridership.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=False).head()" + ] + }, + { + "cell_type": "markdown", + "id": "ac729467-8a82-4b52-836f-453eaa140355", + "metadata": {}, + "source": [ + "## get UZA geojson data from rest server" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": 
"8c636d17-5819-45e1-94ec-9a4e20549441", + "metadata": {}, + "outputs": [], + "source": [ + "rest_server_link = \"https://services.arcgis.com/xOi1kZaI0eWDREZv/ArcGIS/rest/services/FTA_Administrative_Boundaries/FeatureServer/5/query?where=1%3D1&objectIds=&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&resultType=none&distance=0.0&units=esriSRUnit_Meter&relationParam=&returnGeodetic=false&outFields=*&returnGeometry=true&returnCentroid=false&returnEnvelope=false&featureEncoding=esriDefault&multipatchOption=xyFootprint&maxAllowableOffset=&geometryPrecision=&outSR=&defaultSR=&datumTransformation=&applyVCSProjection=false&returnIdsOnly=false&returnUniqueIdsOnly=false&returnCountOnly=false&returnExtentOnly=false&returnQueryGeometry=false&returnDistinctValues=false&cacheHint=false&collation=&orderByFields=&groupByFieldsForStatistics=&outStatistics=&having=&resultOffset=&resultRecordCount=&returnZ=false&returnM=false&returnTrueCurves=false&returnExceededLimitFeatures=true&quantizationParameters=&sqlFormat=none&f=pgeojson&token=\"" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "bc95b052-812d-49aa-8d4f-bf50e5f5f3b5", + "metadata": {}, + "outputs": [], + "source": [ + "uza_data = gpd.read_file(rest_server_link)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5ba7b042-a925-4c86-ae81-ca85c7226a97", + "metadata": {}, + "outputs": [], + "source": [ + "display(\n", + " uza_data.info(),\n", + " uza_data.crs\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "677a3de4-307d-4522-8f6e-c47d45cb2feb", + "metadata": { + "tags": [] + }, + "source": [ + "#### get CA UZAs" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "10504da7-54db-48f9-9128-cb07d96cd0ef", + "metadata": {}, + "outputs": [], + "source": [ + "ca_uza = uza_data[uza_data['NAMELSAD'].str.contains(\", CA\")].reset_index(drop=True)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": 
"317ccaca-1b4b-4f48-8b2f-0ce66100ab2d", + "metadata": {}, + "outputs": [], + "source": [ + "display(\n", + " ca_uza.info(),\n", + " ca_uza.crs\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "674a7714-4c74-4f94-b777-970396b49e59", + "metadata": {}, "outputs": [ { "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
\n", + "
" + ], "text/plain": [ - "ntd_id\n", - "00001 1\n", - "60012 1\n", - "50522 1\n", - "55311 1\n", - "55312 1\n", - "Name: primary_uza_code, dtype: int64" + " OBJECTID UACE NAMELSAD POP \\\n", + "0 2662 79417 Santa Maria, CA Urban Area 143609 \n", + "1 2663 38215 Hemet, CA Urban Area 173194 \n", + "2 2664 51445 Los Angeles--Long Beach--Anaheim, CA Urban Area 12237376 \n", + "3 2665 78661 San Diego, CA Urban Area 3070300 \n", + "4 2666 75340 Riverside--San Bernardino, CA Urban Area 2276703 \n", + "\n", + " AREALANDSQMI AREAWATERSQMI POPDEN Shape__Area Shape__Length \\\n", + "0 27.06 0.08 5306.84 1.047746e+08 9.341327e+04 \n", + "1 37.06 0.12 4673.61 1.396604e+08 1.241736e+05 \n", + "2 1636.83 18.64 7476.28 6.251949e+09 1.263335e+06 \n", + "3 674.72 14.61 4550.45 2.540300e+09 9.807105e+05 \n", + "4 608.56 2.58 3741.10 2.308176e+09 1.118771e+06 \n", + "\n", + " geometry \n", + "0 MULTIPOLYGON (((-120.43534 34.98668, -120.4352... \n", + "1 MULTIPOLYGON (((-116.86277 33.74268, -116.8627... \n", + "2 MULTIPOLYGON (((-117.99338 34.16800, -117.9929... \n", + "3 MULTIPOLYGON (((-117.03839 32.80671, -117.0382... \n", + "4 MULTIPOLYGON (((-117.15890 33.92832, -117.1584... " ] }, - "execution_count": 13, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "#ntd_monthly_ridership.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=False).head()" + "ca_uza.head()" ] }, { From a13ebf6ff37c3d86c41e6618a06affa7b9756dff Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Mon, 25 Nov 2024 23:59:47 +0000 Subject: [PATCH 12/13] merged 2 tables to find agencies that meet section G criteria. 
cleaned up notebook cells and formatting --- ntd/proposed_changes_25-26.ipynb | 1268 ++++++++++++++++++++++++------ 1 file changed, 1014 insertions(+), 254 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 2b60e7807..742e19f11 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -51,7 +51,6 @@ "cell_type": "markdown", "id": "afdfd59c-d9c1-4d48-b6e4-55e9af08be29", "metadata": { - "jp-MarkdownHeadingCollapsed": true, "tags": [] }, "source": [ @@ -97,7 +96,10 @@ { "cell_type": "markdown", "id": "70b0e807-80d6-49d7-bad3-eb2df8b7663b", - "metadata": {}, + "metadata": { + "jp-MarkdownHeadingCollapsed": true, + "tags": [] + }, "source": [ "### Notes from Proposed Changes document\n", "\n", @@ -155,14 +157,19 @@ "Sections G concerns changing rural operators with full reporter responsibilities to be reduced reporters. FTA aims to decrease the reporting burden, but this change affects an estimated 10-15 operators. going from full reporter to reduced reporters would mean the operator does not need to report data related to passenger miles or monthly service or safety stats\n", "\n", "\n", - "\n", - "---\n", - "\n", "Section H proposes a change to the NTD reporting platform to include a field that identify Voluntary reporters. This slightly increases the reporting burden for all NTD reporters. Sections E and F may conditionally increase the reporting burden for some operators, if the operator experiences cyber security or safety events. 
Finally, Section D slightly incresses the reporting burden by proposing new categories in to A-20 form.\n", "\n", "\n" ] }, + { + "cell_type": "markdown", + "id": "fb1166dc-1a56-40dd-98d5-708484c9b009", + "metadata": {}, + "source": [ + "---" + ] + }, { "cell_type": "markdown", "id": "97f07bb1-4fd2-4ad0-bb65-2419928d5326", @@ -183,7 +190,8 @@ "id": "d96201a0-75f0-4d6e-b9fb-26d6849e428d", "metadata": {}, "source": [ - "### Understanding the difference between urban Full Reporters and urban Reduced Reporters\n", + "### Understanding the difference between urban Full Reporters and urban Reduced Reporters\n", + "\n", "Per NTD reporting manual\n", ">Full Reporters must provide the Annual Report, as well as Monthly Ridership (MR) and monthly Safety and Security reports. All other reporter types file their reports on an annual basis.\n", "\n", @@ -224,18 +232,46 @@ "1. S&S-60 Safety Data Form\n", "2. Reduced Reporting Form (Form RR-20)\n", "3. Transit Asset Management Performance Measure Targets (Form A-90)\n", - "---\n", - "So if an operator, under this proposed change, goes from Full to Reduced reporter, we can expect to miss data from 11 forms. However, those unique forms dont look familar in the ntd validation report pipeline so im unsure what kind of impact the data science branch would see.\n", + "---\n" + ] + }, + { + "cell_type": "markdown", + "id": "15a76a53-1c43-4feb-994e-978bf7223578", + "metadata": {}, + "source": [ + "### FTA's proposed solution for rural operators that are Full Reporters\n", "\n", - "Will need to see if theres equivilant forms between Full and Reduced reporters that report similar data but in different forms." + ">FTA proposes a waiver process in which reporters that predominantly serve rural areas may request an exemption from filing as a Full Reporter. 
Effectively, this would mean that operators receiving the waiver would report as Reduced Reporters instead.\n", + ">\n", + ">FTA proposes to use data from the most recent year's validated and accepted data to evaluate eligibility for this waiver, and FTA would grant the waiver if each of the above criteria are met. Based on current available data, **FTA estimates that approximately 10-15 agencies would be eligible for this waiver.**\n", + ">\n", + ">FTA would automatically identify agencies that qualify for this waiver ... All eligible reporters then would be presented with the option to request the waiver annually during the Report Year Kick-Off (RYKO) process\n", + "\n" ] }, { "cell_type": "markdown", - "id": "8cf55bc0-8bfc-451f-a663-5dbe44a14c81", + "id": "9cac279c-0075-4c2f-83e8-7acf9e289fd4", "metadata": {}, "source": [ - "### Query the warehouse to get find all the reporters that meet Sec G criteria\n", + "## Comments for Area 2\n", + "\n", + "If an operator goes from Full Reporter to Reduced Reporter under this proposed change, we can expect to lose data from 11 forms. However, those unique forms don't look familiar from the NTD validation report pipeline, so I'm unsure what kind of impact the Data Science branch would see.\n", + "\n", + "We will need to check whether there are equivalent forms between Full and Reduced Reporters that report similar data in different forms.\n", + "\n", + "FTA's method for presenting the waiver process to eligible reporters seems a little unclear. I assume FTA will notify only the reporters that meet all the criteria in Section G, and NOT all reporters. " + ] + }, + { + "cell_type": "markdown", + "id": "8cf55bc0-8bfc-451f-a663-5dbe44a14c81", + "metadata": { + "tags": [] + }, + "source": [ + "### Query the warehouse to find all the reporters that meet Sec G criteria\n", "\n", "Sec G criteria:\n", "- Receives funding under 49 U.S.C. `5311`,\n", @@ -252,7 +288,7 @@ "tags": [] }, "source": [ - "### Any operators that operate in more than 1 UZA?" 
+ "#### Identify Agencies that meet Section G criteria" ] }, { @@ -262,12 +298,14 @@ "tags": [] }, "source": [ - "#### dim_annual_funding_sources" + "##### dim_annual_funding_sources\n", + "- for 5311 agencies (rural operators)\n", + "- Also has UZA and VOMS" ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 62, "id": "fc8f9407-af22-4805-bb0e-571cb7acd495", "metadata": {}, "outputs": [ @@ -277,39 +315,22 @@ "text": [ "\n", "RangeIndex: 134 entries, 0 to 133\n", - "Data columns (total 28 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 funding_source 134 non-null object \n", - " 1 agency 134 non-null object \n", - " 2 agency_voms 134 non-null float64\n", - " 3 city 134 non-null object \n", - " 4 fuel_tax 0 non-null object \n", - " 5 fta_capital_program_5309 134 non-null float64\n", - " 6 fta_rural_progam_5311 134 non-null float64\n", - " 7 fta_urbanized_area_formula 134 non-null float64\n", - " 8 general_funds 0 non-null object \n", - " 9 income_tax 0 non-null object \n", - " 10 ntd_id 134 non-null object \n", - " 11 organization_type 134 non-null object \n", - " 12 other_dot_funds 134 non-null float64\n", - " 13 other_federal_funds 134 non-null float64\n", - " 14 other_fta_funds 134 non-null float64\n", - " 15 other_funds 0 non-null object \n", - " 16 other_taxes 0 non-null object \n", - " 17 primary_uza_population 134 non-null float64\n", - " 18 property_tax 0 non-null object \n", - " 19 reduced_reporter_funds 0 non-null object \n", - " 20 report_year 134 non-null object \n", - " 21 reporter_type 134 non-null object \n", - " 22 sales_tax 0 non-null object \n", - " 23 state 134 non-null object \n", - " 24 tolls 0 non-null object \n", - " 25 transportation_funds 0 non-null object \n", - " 26 primary_uza_code 134 non-null object \n", - " 27 primary_uza_name 134 non-null object \n", - "dtypes: float64(8), object(20)\n", - "memory usage: 29.4+ KB\n" + "Data columns (total 11 columns):\n", + " # 
Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 funding_source 134 non-null object \n", + " 1 agency 134 non-null object \n", + " 2 agency_voms 134 non-null float64\n", + " 3 fta_rural_progam_5311 134 non-null float64\n", + " 4 ntd_id 134 non-null object \n", + " 5 organization_type 134 non-null object \n", + " 6 primary_uza_population 134 non-null float64\n", + " 7 report_year 134 non-null object \n", + " 8 reporter_type 134 non-null object \n", + " 9 primary_uza_code 134 non-null object \n", + " 10 primary_uza_name 134 non-null object \n", + "dtypes: float64(3), object(8)\n", + "memory usage: 11.6+ KB\n" ] } ], @@ -318,35 +339,30 @@ "ntd_funding_sources = (tbls.mart_ntd.dim_annual_funding_sources()\n", " >> filter(_.report_year == \"2023\",\n", " _.fta_rural_progam_5311 > 0,\n", - " _.reporter_type == \"Full Reporter\"\n", + " _.reporter_type == \"Full Reporter\",\n", + " _.primary_uza_code.notna()\n", " )\n", " >> collect()\n", " )\n", + "\n", + "keep_cols_0=[\n", + " \"funding_source\",\n", + " \"agency\",\n", + " \"agency_voms\",\n", + " \"fta_rural_progam_5311\",\n", + " \"ntd_id\",\n", + " \"organization_type\",\n", + " \"primary_uza_population\",\n", + " \"report_year\",\n", + " \"reporter_type\",\n", + " \"primary_uza_code\",\n", + " \"primary_uza_name\"\n", + "]\n", + "\n", + "ntd_funding_sources = ntd_funding_sources[keep_cols_0]\n", "ntd_funding_sources.info()" ] }, - { - "cell_type": "code", - "execution_count": 20, - "id": "f18b8c63-6052-4849-b1be-721a5cf54ecf", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "Full Reporter 134\n", - "Name: reporter_type, dtype: int64" - ] - }, - "execution_count": 20, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "ntd_funding_sources[\"reporter_type\"].value_counts()" - ] - }, { "cell_type": "markdown", "id": "28e56afc-2d63-43a6-871b-b6554035ed7e", @@ -354,12 +370,13 @@ "tags": [] }, "source": [ - "#### dim_annual_service_agencies" 
+ "##### dim_annual_service_agencies\n", + "- for UZA, VOMS and VRM" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 63, "id": "cf4cc544-785a-4476-bdda-9019e6a5e6de", "metadata": {}, "outputs": [ @@ -368,239 +385,951 @@ "output_type": "stream", "text": [ "\n", - "RangeIndex: 2201 entries, 0 to 2200\n", - "Data columns (total 36 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 key 2201 non-null object \n", - " 1 report_year 2201 non-null object \n", - " 2 ntd_id 2201 non-null object \n", - " 3 agency 2201 non-null object \n", - " 4 reporter_type 2201 non-null object \n", - " 5 organization_type 2201 non-null object \n", - " 6 city 2173 non-null object \n", - " 7 state 2201 non-null object \n", - " 8 agency_voms 2201 non-null float64\n", - " 9 primary_uza_code 1089 non-null float64\n", - " 10 primary_uza_name 1089 non-null object \n", - " 11 primary_uza_area_sq_miles 2201 non-null object \n", - " 12 primary_uza_population 1089 non-null float64\n", - " 13 service_area_sq_miles 1059 non-null float64\n", - " 14 service_area_population 1060 non-null float64\n", - " 15 actual_vehicles_passenger_car_deadhead_hours 2201 non-null float64\n", - " 16 actual_vehicles_passenger_car_hours 2201 non-null float64\n", - " 17 actual_vehicles_passenger_car_miles 2201 non-null float64\n", - " 18 actual_vehicles_passenger_car_revenue_hours 2201 non-null float64\n", - " 19 actual_vehicles_passenger_car_revenue_miles 2201 non-null float64\n", - " 20 actual_vehicles_passenger_deadhead_miles 2201 non-null float64\n", - " 21 scheduled_vehicles_passenger_car_revenue_miles 2201 non-null float64\n", - " 22 charter_service_hours 380 non-null float64\n", - " 23 school_bus_hours 357 non-null float64\n", - " 24 trains_in_operation 2201 non-null float64\n", - " 25 directional_route_miles 2201 non-null float64\n", - " 26 passenger_miles 2201 non-null float64\n", - " 27 train_miles 2201 non-null float64\n", - " 28 
train_revenue_miles 2201 non-null float64\n", - " 29 train_deadhead_miles 79 non-null float64\n", - " 30 train_hours 2201 non-null float64\n", - " 31 train_revenue_hours 2201 non-null float64\n", - " 32 train_deadhead_hours 79 non-null float64\n", - " 33 ada_upt 382 non-null float64\n", - " 34 sponsored_service_upt 1741 non-null float64\n", - " 35 unlinked_passenger_trips_upt 2201 non-null float64\n", - "dtypes: float64(26), object(10)\n", - "memory usage: 619.2+ KB\n" + "RangeIndex: 65 entries, 0 to 64\n", + "Data columns (total 13 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 65 non-null object \n", + " 1 report_year 65 non-null object \n", + " 2 ntd_id 65 non-null object \n", + " 3 agency 65 non-null object \n", + " 4 reporter_type 65 non-null object \n", + " 5 organization_type 65 non-null object \n", + " 6 city 65 non-null object \n", + " 7 state 65 non-null object \n", + " 8 agency_voms 65 non-null float64\n", + " 9 primary_uza_code 63 non-null float64\n", + " 10 primary_uza_name 63 non-null object \n", + " 11 primary_uza_population 63 non-null float64\n", + " 12 actual_vehicles_passenger_car_revenue_miles 65 non-null float64\n", + "dtypes: float64(4), object(9)\n", + "memory usage: 6.7+ KB\n" ] } ], "source": [ - "# Has UZA, VRM and VOMS. No 5311 funds\n", + "# Has UZA, VRM and VOMS. 
\n", + "\n", "ntd_service_agencies = (tbls.mart_ntd.dim_annual_service_agencies ()\n", " >> filter(_.report_year == \"2023\",\n", - " _.agency_voms > 30\n", + " _.agency_voms > 30,\n", + " _.state == \"CA\",\n", + " _.primary_uza_code.notna()\n", " )\n", " >> collect()\n", " )\n", + "\n", + "keep_col_1 =[\n", + " \"key\",\n", + " \"report_year\",\n", + " \"ntd_id\",\n", + " \"agency\",\n", + " \"reporter_type\",\n", + " \"organization_type\",\n", + " \"city\",\n", + " \"state\",\n", + " \"agency_voms\",\n", + " \"primary_uza_code\",\n", + " \"primary_uza_name\",\n", + " \"primary_uza_population\",\n", + " \"actual_vehicles_passenger_car_revenue_miles\"\n", + "]\n", + "\n", + "ntd_service_agencies = ntd_service_agencies[keep_col_1]\n", + "\n", "ntd_service_agencies.info()" ] }, - { - "cell_type": "code", - "execution_count": 8, - "id": "702e1dba-17a5-4364-a5c5-0c04955f9fab", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "City, County or Local Government Unit or Department of Transportation 1103\n", - "Independent Public Agency or Authority of Transit Service 422\n", - "Private-Non-Profit Corporation 358\n", - "Tribe 113\n", - "MPO, COG or Other Planning Agency 63\n", - "Area Agency on Aging 43\n", - "State Government Unit or Department of Transportation 27\n", - "Private-For-Profit Corporation 23\n", - "University 20\n", - "Other Publicly-Owned or Privately Chartered Corporation 15\n", - "Private Provider Reporting on Behalf of a Public Entity 7\n", - "Subsidiary Unit of a Transit Agency, Reporting Separately 7\n", - "Name: organization_type, dtype: int64" - ] - }, - "execution_count": 8, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "ntd_service_agencies.organization_type.value_counts()" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "2901be3a-5c7f-4eaa-a45c-509d32bef198", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "ntd_id\n", - "1 1\n", - "44907 1\n", - "80302 
1\n", - "80303 1\n", - "85 1\n", - "Name: primary_uza_name, dtype: int64" - ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "ntd_service_agencies.groupby(\"ntd_id\")[\"primary_uza_name\"].nunique().sort_values(ascending=False).head()" - ] - }, { "cell_type": "markdown", - "id": "80ff21e9-de8b-4dc4-b431-0a57f3919308", - "metadata": { - "tags": [] - }, - "source": [ - "#### dim_annual_ntd_agency_information" - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "id": "20505f79-5721-4789-b10a-61bbdbb57b9e", + "id": "6680f16e-0074-45a1-a15e-d0c49278b59f", "metadata": {}, - "outputs": [], "source": [ - "# has UZA data for operator\n", - "\n", - "#ntd_agency_info = (tbls.mart_ntd.dim_annual_ntd_agency_information()\n", - "# >> filter(_._is_current == True,\n", - "# _.year == 2022\n", - "# )\n", - "# >> collect()\n", - "# )" + "##### Merge dataframes to get 5311 agencies in CA with >30 VOMS with UZA names" ] }, { "cell_type": "code", - "execution_count": null, - "id": "1fac9f3a-3bd9-4c4c-9f83-86002a1e56f8", + "execution_count": 74, + "id": "da7fcca3-3d8e-43ec-875e-1aefffb27f63", "metadata": {}, "outputs": [], "source": [ - "#ntd_agency_info.info()" + "on_list =[\n", + " \"report_year\",\n", + " \"agency\",\n", + " \"reporter_type\",\n", + " \"organization_type\",\n", + " \"agency_voms\",\n", + " \"primary_uza_name\",\n", + " \"ntd_id\",\n", + " \"primary_uza_population\"\n", + "]\n", + "merge = ntd_service_agencies.merge(\n", + " ntd_funding_sources, \n", + " how=\"inner\", \n", + " on= on_list, \n", + " indicator=True )" ] }, { "cell_type": "code", - "execution_count": null, - "id": "31591169-cdb1-45db-880a-6cdbcc013fb2", + "execution_count": 75, + "id": "3b5a9740-fecb-4d59-9765-bbbe7f5566fe", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Int64Index: 26 entries, 0 to 25\n", + "Data columns (total 17 
columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 26 non-null object \n", + " 1 report_year 26 non-null object \n", + " 2 ntd_id 26 non-null object \n", + " 3 agency 26 non-null object \n", + " 4 reporter_type 26 non-null object \n", + " 5 organization_type 26 non-null object \n", + " 6 city 26 non-null object \n", + " 7 state 26 non-null object \n", + " 8 agency_voms 26 non-null float64 \n", + " 9 primary_uza_code_x 26 non-null float64 \n", + " 10 primary_uza_name 26 non-null object \n", + " 11 primary_uza_population 26 non-null float64 \n", + " 12 actual_vehicles_passenger_car_revenue_miles 26 non-null float64 \n", + " 13 funding_source 26 non-null object \n", + " 14 fta_rural_progam_5311 26 non-null float64 \n", + " 15 primary_uza_code_y 26 non-null object \n", + " 16 _merge 26 non-null category\n", + "dtypes: category(1), float64(5), object(11)\n", + "memory usage: 3.6+ KB\n" + ] + } + ], "source": [ - "#ntd_agency_info.groupby(\"ntd_id\")[\"uza_name\"].nunique().sort_values(ascending=False).head()" + "merge.info()" ] }, { "cell_type": "markdown", - "id": "623ce1ce-bb7a-450a-b139-6f2cf6ac375b", - "metadata": { - "tags": [] - }, - "source": [ - "#### dim_annual_service_mode_time_periods" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "ae899649-d9fa-4652-884b-f8912a768ff1", + "id": "8b683dcd-ad9e-4157-a079-d74c9394a460", "metadata": {}, - "outputs": [], "source": [ - "#ntd_service_mode = (tbls.mart_ntd.dim_annual_service_mode_time_periods()\n", - "# >> filter(_.report_year == \"2023\",\n", - "# # _.year == 2022\n", - "# )\n", - "# >> collect()\n", - "# )\n", - "#ntd_service_mode.info()" + "##### Which agencies match the Section G criteria?" 
] }, { "cell_type": "code", - "execution_count": null, - "id": "9373a629-6506-4dcd-af77-c068947bd017", + "execution_count": 76, + "id": "702e1dba-17a5-4364-a5c5-0c04955f9fab", "metadata": {}, - "outputs": [], - "source": [ - "#ntd_service_mode.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=True).head()" - ] - }, - { - "cell_type": "markdown", - "id": "424fa017-17db-4bf9-8456-db7036fb656e", - "metadata": { - "tags": [] - }, + "outputs": [ + { + "data": { + "text/plain": [ + "The Eastern Contra Costa Transit Authority, dba: Tri Delta Transit 1\n", + "Santa Cruz Metropolitan Transit District 1\n", + "San Joaquin Regional Transit District, dba: San Joaquin RTD 1\n", + "Kings County Area Public Transit Agency 1\n", + "San Luis Obispo Regional Transit Authority 1\n", + "Butte County Association of Governments, dba: Butte Regional Transit/B-Line 1\n", + "County of Sonoma , dba: Sonoma County Transit 1\n", + "Napa Valley Transportation Authority 1\n", + "Santa Clara Valley Transportation Authority, dba: Valley Transportation Authority 1\n", + "Livermore / Amador Valley Transit Authority 1\n", + "Antelope Valley Transit Authority 1\n", + "Monterey-Salinas Transit 1\n", + "Transit Joint Powers Authority for Merced County, dba: Merced The Bus 1\n", + "Victor Valley Transit Authority 1\n", + "County of Placer, dba: Placer County Transit/TART 1\n", + "Yolo County Transportation District 1\n", + "North County Transit District 1\n", + "San Diego Metropolitan Transit System 1\n", + "SunLine Transit Agency, dba: SunLine 1\n", + "Stanislaus Regional Transit Authority 1\n", + "Marin County Transit District, dba: Marin Transit 1\n", + "San Mateo County Transit District, dba: SamTrans 1\n", + "Ventura County Transportation Commission 1\n", + "Tulare County Regional Transit Agency 1\n", + "City of Visalia , dba: Visalia Transit 1\n", + "Riverside Transit Agency 1\n", + "Name: agency, dtype: int64" + ] + }, + "execution_count": 76, + "metadata": {}, + 
"output_type": "execute_result" + } + ], "source": [ - "#### dim_monthly_ridership_with_adjustments" + "merge[\"agency\"].value_counts()" ] }, { "cell_type": "code", - "execution_count": null, - "id": "b76deddb-d6f6-4305-8a92-66c2092b88ed", + "execution_count": 71, + "id": "7871860f-0ec7-4704-b577-c3279dc7ac53", "metadata": {}, - "outputs": [], + "outputs": [ + { + "data": { + "text/html": [ + "
" + ], + "text/plain": [ + " key report_year ntd_id \\\n", + "0 7b92911062e8311df2aaefa803453213 2023 90162 \n", + "1 231d30a033adffad7c05ad22cdffc83a 2023 90006 \n", + "2 43d7e995716ec1e5bb708810a28f4ce2 2023 90091 \n", + "3 18a26ef2624bf2302b13c3da290e4a31 2023 90310 \n", + "4 0074aba1a1468f0ef01ba5bb8591f4ea 2023 90164 \n", + "5 c66136eb07990c683307924891da05a5 2023 90009 \n", + "6 9011ecad1a04b79627d40c2e3ec4a918 2023 90234 \n", + "7 2ee6240880777b119fcf8dd9da330af9 2023 90306 \n", + "8 f34f1051a6c81d1a4149fb93da4ff392 2023 90079 \n", + "9 9c81bfb86d1bbeaac2e00c6c0142c6cb 2023 90026 \n", + "10 732ac248345346f575122087bdbc3edd 2023 90030 \n", + "11 0129dd27f70c835fb794803cd65f48ce 2023 90090 \n", + "12 c99398cad6d9daad6b305d869ec1a780 2023 90196 \n", + "13 2b062c505686f95e61a8714144a1758b 2023 90148 \n", + "14 22c34fbd0b7d09bf89346cc62aad17d5 2023 90173 \n", + "15 423019d2bdcac472b0cd1d81bc8380b7 2023 90062 \n", + "16 942e25509363518941fb783119b4d2ae 2023 90121 \n", + "17 11e9e6577aa4afc22d4ed41b15e4479c 2023 90144 \n", + "18 91df0abfcbd3472d38a4224067582dc3 2023 90013 \n", + "19 06b3bf7a35f002aa4ed531265a0b63eb 2023 90088 \n", + "20 82dd8c5bb6b173b51e9e0b89a6bdd806 2023 90089 \n", + "21 7bd3fc4c7aee7a83869619c0aa800a85 2023 90208 \n", + "22 c4d6006dd6de8e33faf71155d16b3f68 2023 90206 \n", + "23 56c88becd84cdf6874e9f3abe03eb6f0 2023 90200 \n", + "24 b54cef7138d0057e9085da47db94506e 2023 90012 \n", + "25 e835f62b4fa6954b297ae0788141e1fe 2023 90031 \n", + "\n", + " agency reporter_type \\\n", + "0 The Eastern Contra Costa Transit Authority, db... 
Full Reporter \n", + "1 Santa Cruz Metropolitan Transit District Full Reporter \n", + "2 City of Visalia , dba: Visalia Transit Full Reporter \n", + "3 Tulare County Regional Transit Agency Full Reporter \n", + "4 Ventura County Transportation Commission Full Reporter \n", + "5 San Mateo County Transit District, dba: SamTrans Full Reporter \n", + "6 Marin County Transit District, dba: Marin Transit Full Reporter \n", + "7 Stanislaus Regional Transit Authority Full Reporter \n", + "8 SunLine Transit Agency, dba: SunLine Full Reporter \n", + "9 San Diego Metropolitan Transit System Full Reporter \n", + "10 North County Transit District Full Reporter \n", + "11 Yolo County Transportation District Full Reporter \n", + "12 County of Placer, dba: Placer County Transit/TART Full Reporter \n", + "13 Victor Valley Transit Authority Full Reporter \n", + "14 Transit Joint Powers Authority for Merced Coun... Full Reporter \n", + "15 Monterey-Salinas Transit Full Reporter \n", + "16 Antelope Valley Transit Authority Full Reporter \n", + "17 Livermore / Amador Valley Transit Authority Full Reporter \n", + "18 Santa Clara Valley Transportation Authority, d... Full Reporter \n", + "19 Napa Valley Transportation Authority Full Reporter \n", + "20 County of Sonoma , dba: Sonoma County Transit Full Reporter \n", + "21 Butte County Association of Governments, dba: ... Full Reporter \n", + "22 San Luis Obispo Regional Transit Authority Full Reporter \n", + "23 Kings County Area Public Transit Agency Full Reporter \n", + "24 San Joaquin Regional Transit District, dba: Sa... Full Reporter \n", + "25 Riverside Transit Agency Full Reporter \n", + "\n", + " organization_type city state \\\n", + "0 Independent Public Agency or Authority of Tran... Antioch CA \n", + "1 Independent Public Agency or Authority of Tran... Santa Cruz CA \n", + "2 City, County or Local Government Unit or Depar... Visalia CA \n", + "3 Independent Public Agency or Authority of Tran... 
Visalia CA \n", + "4 Independent Public Agency or Authority of Tran... Camarillo CA \n", + "5 Independent Public Agency or Authority of Tran... San Carlos CA \n", + "6 Independent Public Agency or Authority of Tran... San Rafael CA \n", + "7 Independent Public Agency or Authority of Tran... Modesto CA \n", + "8 Independent Public Agency or Authority of Tran... Thousand Palms CA \n", + "9 Independent Public Agency or Authority of Tran... San Diego CA \n", + "10 Independent Public Agency or Authority of Tran... Oceanside CA \n", + "11 Independent Public Agency or Authority of Tran... Woodland CA \n", + "12 City, County or Local Government Unit or Depar... Auburn CA \n", + "13 Independent Public Agency or Authority of Tran... Hesperia CA \n", + "14 Independent Public Agency or Authority of Tran... Merced CA \n", + "15 Independent Public Agency or Authority of Tran... Monterey CA \n", + "16 Independent Public Agency or Authority of Tran... Lancaster CA \n", + "17 Independent Public Agency or Authority of Tran... Livermore CA \n", + "18 Independent Public Agency or Authority of Tran... San Jose CA \n", + "19 Independent Public Agency or Authority of Tran... Napa CA \n", + "20 City, County or Local Government Unit or Depar... Santa Rosa CA \n", + "21 MPO, COG or Other Planning Agency Chico CA \n", + "22 Independent Public Agency or Authority of Tran... San Luis Obispo CA \n", + "23 Independent Public Agency or Authority of Tran... Hanford CA \n", + "24 Independent Public Agency or Authority of Tran... Stockton CA \n", + "25 Independent Public Agency or Authority of Tran... 
Riverside CA \n", + "\n", + " agency_voms primary_uza_code_x primary_uza_name \\\n", + "0 98.0 2683.0 Antioch, CA \n", + "1 93.0 79336.0 Santa Cruz, CA \n", + "2 37.0 90946.0 Visalia, CA \n", + "3 41.0 90946.0 Visalia, CA \n", + "4 45.0 66673.0 Oxnard--San Buenaventura (Ventura), CA \n", + "5 331.0 78904.0 San Francisco--Oakland, CA \n", + "6 78.0 78904.0 San Francisco--Oakland, CA \n", + "7 102.0 58006.0 Modesto, CA \n", + "8 82.0 41347.0 Indio--Palm Desert--Palm Springs, CA \n", + "9 774.0 78661.0 San Diego, CA \n", + "10 189.0 78661.0 San Diego, CA \n", + "11 40.0 77068.0 Sacramento, CA \n", + "12 47.0 77068.0 Sacramento, CA \n", + "13 277.0 90541.0 Victorville--Hesperia--Apple Valley, CA \n", + "14 45.0 56251.0 Merced, CA \n", + "15 165.0 80362.0 Seaside--Monterey--Pacific Grove, CA \n", + "16 86.0 67140.0 Palmdale--Lancaster, CA \n", + "17 49.0 50533.0 Livermore--Pleasanton--Dublin, CA \n", + "18 512.0 79039.0 San Jose, CA \n", + "19 44.0 61057.0 Napa, CA \n", + "20 47.0 79498.0 Santa Rosa, CA \n", + "21 37.0 16318.0 Chico, CA \n", + "22 41.0 79147.0 San Luis Obispo, CA \n", + "23 57.0 36703.0 Hanford, CA \n", + "24 93.0 85087.0 Stockton, CA \n", + "25 211.0 75340.0 Riverside--San Bernardino, CA \n", + "\n", + " primary_uza_population actual_vehicles_passenger_car_revenue_miles \\\n", + "0 326205.0 3200288.0 \n", + "1 169038.0 2975126.0 \n", + "2 160578.0 2197535.0 \n", + "3 160578.0 2105383.0 \n", + "4 376117.0 1618834.0 \n", + "5 3515933.0 7793698.0 \n", + "6 3515933.0 3017303.0 \n", + "7 357301.0 4216882.0 \n", + "8 361075.0 3938721.0 \n", + "9 3070300.0 34095949.0 \n", + "10 3070300.0 8204114.0 \n", + "11 1946618.0 1984049.0 \n", + "12 1946618.0 1920826.0 \n", + "13 355816.0 9070059.0 \n", + "14 150052.0 2027756.0 \n", + "15 123495.0 4249833.0 \n", + "16 359559.0 3530244.0 \n", + "17 240381.0 1328472.0 \n", + "18 1837446.0 21779295.0 \n", + "19 84619.0 1345314.0 \n", + "20 297329.0 1737914.0 \n", + "21 111411.0 1182984.0 \n", + "22 56904.0 1542116.0 \n", + 
"23 66638.0 1522582.0 \n", + "24 414847.0 2612286.0 \n", + "25 2276703.0 9238973.0 \n", + "\n", + " funding_source fta_rural_progam_5311 primary_uza_code_y _merge \n", + "0 federal 172914.0 2683.0 both \n", + "1 federal 263285.0 79336.0 both \n", + "2 federal 491222.0 90946.0 both \n", + "3 federal 3779135.0 90946.0 both \n", + "4 federal 390779.0 66673.0 both \n", + "5 federal 783902.0 78904.0 both \n", + "6 federal 543303.0 78904.0 both \n", + "7 federal 916161.0 58006.0 both \n", + "8 federal 1348953.0 41347.0 both \n", + "9 federal 878451.0 78661.0 both \n", + "10 federal 1665659.0 78661.0 both \n", + "11 federal 200210.0 77068.0 both \n", + "12 federal 1051362.0 77068.0 both \n", + "13 federal 1125644.0 90541.0 both \n", + "14 federal 1347802.0 56251.0 both \n", + "15 federal 2013121.0 80362.0 both \n", + "16 federal 2212409.0 67140.0 both \n", + "17 federal 106451.0 50533.0 both \n", + "18 federal 380553.0 79039.0 both \n", + "19 federal 1052042.0 61057.0 both \n", + "20 federal 550600.0 79498.0 both \n", + "21 federal 1605128.0 16318.0 both \n", + "22 federal 762127.0 79147.0 both \n", + "23 federal 586765.0 36703.0 both \n", + "24 federal 1269577.0 85087.0 both \n", + "25 federal 682130.0 75340.0 both " + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "#ntd_monthly_ridership = (tbls.mart_ntd.dim_monthly_ridership_with_adjustments ()\n", - "# >> filter(_.period_year == \"2023\",\n", - "# # _.year == 2022\n", - "# )\n", - "# >> collect()\n", - "# )\n", - "#ntd_monthly_ridership.info()" + "merge" ] }, { - "cell_type": "code", - "execution_count": null, - "id": "31f8ce6d-a7a9-4fb9-b33f-605cd38ee183", + "cell_type": "markdown", + "id": "90c1d67a-50bd-49d5-a409-f69b4c983d2c", "metadata": {}, - "outputs": [], "source": [ - "#ntd_monthly_ridership.groupby(\"ntd_id\")[\"primary_uza_code\"].nunique().sort_values(ascending=False).head()" + "# TBD\n", + "- get UZA geometry data\n", + "- get list of bus stop 
point locations for the agencies identified above\n", + "- overlay bus stop point location on UZA geometry to find stops that are outside of the UZA for an agency to help answer\n", + ">- Operates fewer total VOMS in urbanized areas (UZAs) than rural (non-UZA) areas, and\n", + ">- Allocates more total Vehicle Revenue Miles (VRM) to non-UZAs than UZAs." ] }, { @@ -608,7 +1337,7 @@ "id": "ac729467-8a82-4b52-836f-453eaa140355", "metadata": {}, "source": [ - "## get UZA geojson data from rest server" + "### get UZA geojson data from FTA rest server" ] }, { @@ -818,6 +1547,37 @@ "ca_uza.head()" ] }, + { + "cell_type": "code", + "execution_count": 27, + "id": "bfde52af-be7c-4d89-aa93-6c5e1959914d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + }, + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "ca_uza.plot()" + ] + }, { "cell_type": "markdown", "id": "65592033-b6e9-4bf7-a24a-6b1d6f7a087b", From af2a0afae59ed8f7b74ff89405a3cb44aab5fbf2 Mon Sep 17 00:00:00 2001 From: csuyat-dot Date: Tue, 26 Nov 2024 00:16:34 +0000 Subject: [PATCH 13/13] more clean up --- ntd/proposed_changes_25-26.ipynb | 866 +++++++++++++++++++++++-------- 1 file changed, 648 insertions(+), 218 deletions(-) diff --git a/ntd/proposed_changes_25-26.ipynb b/ntd/proposed_changes_25-26.ipynb index 742e19f11..3ea7f9cc1 100644 --- a/ntd/proposed_changes_25-26.ipynb +++ b/ntd/proposed_changes_25-26.ipynb @@ -261,7 +261,7 @@ "\n", "Will need to see if theres equivilant forms between Full and Reduced reporters that report similar data but in different forms.\n", "\n", - "FTA's method for presenting elibible reporters of the waiver process seem a little unclear. I assume FTA will notify only the reporters that meet all the criteria in section G, and NOT all reporters. " + "FTA's method for presenting elibible reporters of the waiver process seem a little unclear. I assume FTA will notify only the reporters that meet all the criteria in section G, and NOT all reporters. Notifying all reporters of a possible waiver they might not be eligible for would cause a lot of confusion." 
] }, { @@ -281,16 +281,6 @@ "- Allocates `more total Vehicle Revenue Miles (VRM) to non-UZAs` than `UZAs`.\n" ] }, - { - "cell_type": "markdown", - "id": "35ebc22e-ea54-452d-b6eb-c9b0e2c1fbdb", - "metadata": { - "tags": [] - }, - "source": [ - "#### Identify Agencies that meet Section G criteria" - ] - }, { "cell_type": "markdown", "id": "b6a189aa-2820-4efc-9d4f-5638c6d11379", @@ -298,7 +288,7 @@ "tags": [] }, "source": [ - "##### dim_annual_funding_sources\n", + "#### dim_annual_funding_sources\n", "- for 5311 agencies (rural operators)\n", "- Also has UZA and VOMS" ] @@ -370,7 +360,7 @@ "tags": [] }, "source": [ - "##### dim_annual_service_agencies\n", + "#### dim_annual_service_agencies\n", "- for UZA, VOMS and VRM" ] }, @@ -445,7 +435,7 @@ "id": "6680f16e-0074-45a1-a15e-d0c49278b59f", "metadata": {}, "source": [ - "##### Merge dataframes to get 5311 agencies in CA with >30 VOMS with UZA names" + "#### Merge dataframes to get 5311 agencies in CA with >30 VOMS with UZA names" ] }, { @@ -519,7 +509,7 @@ "id": "8b683dcd-ad9e-4157-a079-d74c9394a460", "metadata": {}, "source": [ - "##### Who are the agencies that match Section G Critera?" + "#### Who are the agencies that match Section G Criteria?" 
] }, { @@ -1319,6 +1309,17 @@ "merge" ] }, + { + "cell_type": "code", + "execution_count": 77, + "id": "13075ec9-4faf-4280-8930-97b6a28aac45", + "metadata": {}, + "outputs": [], + "source": [ + "# Export to GCS\n", + "merge.to_csv(\"gs://calitp-analytics-data/data-analyses/ntd/proposed_changes_agencies.csv\")" + ] + }, { "cell_type": "markdown", "id": "90c1d67a-50bd-49d5-a409-f69b4c983d2c", @@ -1596,7 +1597,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 86, "id": "b9f386ff-59fa-4eec-add5-b8c16a820cb3", "metadata": {}, "outputs": [ @@ -1605,53 +1606,111 @@ "output_type": "stream", "text": [ "\n", - "RangeIndex: 1186 entries, 0 to 1185\n", - "Data columns (total 24 columns):\n", - " # Column Non-Null Count Dtype \n", - "--- ------ -------------- ----- \n", - " 0 key 1186 non-null object \n", - " 1 source_record_id 1186 non-null object \n", - " 2 name 1186 non-null object \n", - " 3 organization_type 1186 non-null object \n", - " 4 roles 1186 non-null object \n", - " 5 itp_id 167 non-null float64 \n", - " 6 details 186 non-null object \n", - " 7 caltrans_district 508 non-null object \n", - " 8 website 1151 non-null object \n", - " 9 reporting_category 184 non-null object \n", - " 10 hubspot_company_record_id 279 non-null object \n", - " 11 gtfs_static_status 1186 non-null object \n", - " 12 gtfs_realtime_status 1186 non-null object \n", - " 13 _deprecated__assessment_status 1186 non-null bool \n", - " 14 manual_check__contact_on_website 665 non-null object \n", - " 15 alias 1186 non-null object \n", - " 16 is_public_entity 1186 non-null bool \n", - " 17 ntd_id 0 non-null object \n", - " 18 ntd_id_2022 0 non-null object \n", - " 19 public_currently_operating 1186 non-null bool \n", - " 20 public_currently_operating_fixed_route 1186 non-null bool \n", - " 21 _is_current 1186 non-null bool \n", - " 22 _valid_from 1186 non-null datetime64[ns, UTC]\n", - " 23 _valid_to 1186 non-null datetime64[ns, UTC]\n", - "dtypes: bool(5), datetime64[ns, 
UTC](2), float64(1), object(16)\n", - "memory usage: 182.0+ KB\n" + "RangeIndex: 36 entries, 0 to 35\n", + "Data columns (total 9 columns):\n", + " # Column Non-Null Count Dtype \n", + "--- ------ -------------- ----- \n", + " 0 key 36 non-null object\n", + " 1 source_record_id 36 non-null object\n", + " 2 name 36 non-null object\n", + " 3 organization_type 36 non-null object\n", + " 4 caltrans_district 34 non-null object\n", + " 5 reporting_category 27 non-null object\n", + " 6 is_public_entity 36 non-null bool \n", + " 7 ntd_id 0 non-null object\n", + " 8 public_currently_operating_fixed_route 36 non-null bool \n", + "dtypes: bool(2), object(7)\n", + "memory usage: 2.2+ KB\n" ] } ], "source": [ "dim_orgs = (tbls.mart_transit_database.dim_organizations()\n", " >> filter(_._is_current == True,\n", - " _.ntd_id.isna()\n", + " _.ntd_id.isna(),\n", + " _.public_currently_operating_fixed_route == True\n", " )\n", " >> collect()\n", " )\n", + "\n", + "keep_cols_2 =[\n", + " \"key\",\n", + " \"source_record_id\",\n", + " \"name\",\n", + " \"organization_type\",\n", + " \"caltrans_district\",\n", + " \"reporting_category\",\n", + " \"is_public_entity\",\n", + " \"ntd_id\",\n", + " \"public_currently_operating_fixed_route\", \n", + "]\n", + "\n", + "dim_orgs =dim_orgs[keep_cols_2]\n", + "\n", "dim_orgs.info()" ] }, { "cell_type": "code", - "execution_count": 15, - "id": "d34f841d-110d-4294-886f-61f9e836836f", + "execution_count": 87, + "id": "86359867-8b8a-41a9-8579-b38b7eb89bb3", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "POINT 1\n", + "Amtrak 1\n", + "City of Mission Viejo 1\n", + "City of Mountain View 1\n", + "City of Newport Beach 1\n", + "City of Rancho Cordova 1\n", + "Plumas Transit Systems 1\n", + "Yosemite National Park 1\n", + "Los Angeles World Airports 1\n", + "City of San Juan Capistrano 1\n", + "City of South San Francisco 1\n", + "San Diego International Airport 1\n", + "Solano Transportation Authority 1\n", + "San Joaquin Joint 
Powers Authority 1\n", + "San Francisco International Airport 1\n", + "Capitol Corridor Joint Powers Authority 1\n", + "Tahoe Truckee Area Regional Transportation 1\n", + "City of Laguna Niguel 1\n", + "Curry Public Transit 1\n", + "City of San Clemente 1\n", + "City of Banning 1\n", + "Commute.org 1\n", + "City of Ripon 1\n", + "City of Clovis 1\n", + "City of Irvine 1\n", + "Presidio Trust 1\n", + "City of Alameda 1\n", + "City of Beaumont 1\n", + "Cloverdale Transit 1\n", + "City of La Puente 1\n", + "City of Morro Bay 1\n", + "Santa Cruz Harbor 1\n", + "City of Dana Point 1\n", + "City of Menlo Park 1\n", + "City of Santa Cruz 1\n", + "Dumbarton Bridge Regional Operations Consortium 1\n", + "Name: name, dtype: int64" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "dim_orgs[\"name\"].value_counts()" + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "id": "c053a7eb-c927-461a-8c3d-33281616610f", "metadata": {}, "outputs": [ { @@ -1679,246 +1738,617 @@ " source_record_id\n", " name\n", " organization_type\n", - " roles\n", - " itp_id\n", - " details\n", " caltrans_district\n", - " website\n", " reporting_category\n", - " hubspot_company_record_id\n", - " gtfs_static_status\n", - " gtfs_realtime_status\n", - " _deprecated__assessment_status\n", - " manual_check__contact_on_website\n", - " alias\n", " is_public_entity\n", " ntd_id\n", - " ntd_id_2022\n", - " public_currently_operating\n", " public_currently_operating_fixed_route\n", - " _is_current\n", - " _valid_from\n", - " _valid_to\n", " \n", " \n", " \n", " \n", " 0\n", - " eea7326fc87a575ce26e24cf56b8ff37\n", - " recsupkiKC6Y6fFfV\n", - " DAV\n", - " Non-Profit Organization\n", - " []\n", - " NaN\n", - " DAV operates a fleet of vehicles around the co...\n", - " 11 - San Diego\n", - " https://www.dav.org/veterans/i-need-a-ride/\n", + " 1d03fdbc2e4dfd044465b4437e70c9e0\n", + " recQRoE5mcCn6kdti\n", + " POINT\n", + " Independent 
Agency\n", + " 01 - Eureka\n", + " Other Public Transit\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 1\n", + " e8f13022ac8ecff976188979ac903d4a\n", + " recKsb5FnJy70up78\n", + " Amtrak\n", + " Federal Government\n", + " 03 - Marysville\n", + " Other Public Transit\n", + " True\n", " None\n", - " Static OK\n", - " RT Incomplete\n", - " False\n", - " Unknown\n", - " [Disabled Veterans of America]\n", - " False\n", + " True\n", + " \n", + " \n", + " 2\n", + " d4d53f5a85cec17582000a9e67a7d642\n", + " reczvlrgxLUDiBgAy\n", + " Commute.org\n", + " Joint Powers Agency\n", + " 04 - Oakland\n", + " Other Public Transit\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 3\n", + " 5c27763b1f9b74c1e0ef8a976c87da0e\n", + " reccEj7tecw0n60FO\n", + " City of Ripon\n", + " City/Town\n", + " 10 - Stockton\n", + " Other Public Transit\n", + " True\n", " None\n", - " False\n", - " False\n", " True\n", - " 2023-05-25 00:00:00+00:00\n", - " 2098-12-31 23:59:59.999999+00:00\n", " \n", " \n", - " 1\n", - " 45227f9806ad4ffd1e8d4309e07f707e\n", - " recmCZYY7aXn5MS9b\n", - " IBI\n", - " Company\n", - " []\n", - " NaN\n", + " 4\n", + " 17b1533e3db2090ee03baec634cecc2e\n", + " rec2JrGQTZZh54ieL\n", + " City of Clovis\n", + " City/Town\n", + " 06 - Fresno\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 5\n", + " de0dd501fabd3026dbda6de0946eaf35\n", + " recK4si1uIoj6HfrO\n", + " City of Irvine\n", + " City/Town\n", + " 12 - Irvine\n", " None\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 6\n", + " 4d66bee10854cee5940d2e62aebf657a\n", + " recsBfXgev9ICDCY1\n", + " Presidio Trust\n", + " Federal Government\n", + " 04 - Oakland\n", + " Other Transit\n", + " True\n", " None\n", - " https://www.ibigroup.com/\n", + " True\n", + " \n", + " \n", + " 7\n", + " 848ce6656bd68e56a1ccf915ae2eee23\n", + " reczluQLW1y5oQqF8\n", + " City of Alameda\n", + " City/Town\n", + " 04 - Oakland\n", + " Other Public 
Transit\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 8\n", + " 41f7aaa3446116fd1124b8ef1966ff14\n", + " recuGkFhN2WXGK67H\n", + " City of Banning\n", + " City/Town\n", " None\n", - " Static OK\n", - " RT Incomplete\n", - " False\n", - " Unknown\n", - " []\n", - " False\n", " None\n", + " True\n", " None\n", - " False\n", - " False\n", " True\n", - " 2024-05-04 00:00:00+00:00\n", - " 2098-12-31 23:59:59.999999+00:00\n", " \n", " \n", - " 2\n", - " 8a644e497439d25c01619c8ec6c85c44\n", - " recEN9M5vpVOk2JRI\n", - " SAP\n", - " Company\n", - " []\n", - " NaN\n", + " 9\n", + " 2452010e249a5f27fc5cc30910643214\n", + " reciWrBgYsAIm9eKK\n", + " City of Beaumont\n", + " City/Town\n", + " 08 - San Bernardino\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 10\n", + " 1c1a9850ea30294d4cdf272a80ea4f69\n", + " reczHeS54sIl4NOcF\n", + " City of La Puente\n", + " City/Town\n", + " 07 - Los Angeles\n", + " Other Public Transit\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 11\n", + " 48a96c9f6a0b599ea8207b39183fefc7\n", + " recH53ghrYpk4gKhe\n", + " City of Morro Bay\n", + " City/Town\n", + " 05 - San Luis Obispo\n", + " Other Public Transit\n", + " True\n", " None\n", - " crystalreports.com\n", + " True\n", + " \n", + " \n", + " 12\n", + " 79e42a2d6056cc685387ef9187daf794\n", + " recNSGNAyM91vsTU7\n", + " Santa Cruz Harbor\n", + " Independent Agency\n", + " 05 - San Luis Obispo\n", + " Other Public Transit\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 13\n", + " 97e128b56fff3421b4102df3c66d8f56\n", + " receSouvQI31vHz4D\n", + " City of Dana Point\n", + " City/Town\n", + " 12 - Irvine\n", + " Other Public Transit\n", + " True\n", " None\n", - " Static OK\n", - " RT Incomplete\n", - " False\n", - " Unknown\n", - " []\n", - " False\n", + " True\n", + " \n", + " \n", + " 14\n", + " 52a6eb3bb3ba5c35224f4f0e7088790e\n", + " recLHHpOMdW69jAsc\n", + " City of Menlo Park\n", + " City/Town\n", 
+ " 04 - Oakland\n", " None\n", + " True\n", " None\n", - " False\n", - " False\n", " True\n", - " 2023-04-29 00:00:00+00:00\n", - " 2098-12-31 23:59:59.999999+00:00\n", " \n", " \n", - " 3\n", - " fa22729bf0698cc6d2ac4f4a41e861b0\n", - " recIAaOHjseoeNpTx\n", - " UTA\n", - " Company\n", - " []\n", - " NaN\n", + " 15\n", + " fa120b24b8ee045bdc28cbe0167038fd\n", + " reczbRiAs0zFytcvm\n", + " City of Santa Cruz\n", + " City/Town\n", + " 05 - San Luis Obispo\n", " None\n", + " True\n", " None\n", - " http://www.utatransit.net/\n", + " True\n", + " \n", + " \n", + " 16\n", + " 3a814fcf275575080638856737804f69\n", + " recRM3c9Zfaft4V2B\n", + " Cloverdale Transit\n", + " Independent Agency\n", + " 04 - Oakland\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 17\n", + " 313153a638c1f0f51deaa7a60e8001ba\n", + " recwtQ7m3C59jbnrc\n", + " City of San Clemente\n", + " City/Town\n", + " 12 - Irvine\n", + " Other Public Transit\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 18\n", + " a1e93b65b71053a93b8bad41a29a44e5\n", + " recfehHpcFaXUXhkt\n", + " Curry Public Transit\n", + " Independent Agency\n", + " 01 - Eureka\n", + " Other Public Transit\n", + " True\n", " None\n", - " Static OK\n", - " RT Incomplete\n", - " False\n", - " Unknown\n", - " []\n", - " False\n", + " True\n", + " \n", + " \n", + " 19\n", + " 2d6fbf795084dff514d88c3594f76e15\n", + " recwBSFrVmbeGqn0g\n", + " City of Laguna Niguel\n", + " City/Town\n", + " 12 - Irvine\n", " None\n", + " True\n", " None\n", - " False\n", - " False\n", " True\n", - " 2023-04-29 00:00:00+00:00\n", - " 2098-12-31 23:59:59.999999+00:00\n", " \n", " \n", - " 4\n", - " 3fd4d81306c49cb718c37b48fcbe585c\n", - " recveQ8PTsiKdT7RU\n", - " Aina\n", - " Company\n", - " []\n", - " NaN\n", - " Offices in Boston and Finland; no CA offices\n", + " 20\n", + " 25232b6a56d0264dcddb4913bf968434\n", + " reckGS8egMZryjbX7\n", + " City of Mission Viejo\n", + " City/Town\n", + " 12 - 
Irvine\n", + " None\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 21\n", + " 5dca73aae32d569ea5f40e417d45b72d\n", + " rec4pDiUorjWbUfvU\n", + " City of Mountain View\n", + " City/Town\n", + " 04 - Oakland\n", + " None\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 22\n", + " 0de6e548537b6a25aa8b873a2ea91519\n", + " rectXzoXm6gBuNBHK\n", + " City of Newport Beach\n", + " City/Town\n", + " 12 - Irvine\n", + " None\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 23\n", + " 26b06404aa62a3d05e9ffd0ffc1c2343\n", + " rec43oyrfhtPDdRHj\n", + " City of Rancho Cordova\n", + " City/Town\n", + " 03 - Marysville\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 24\n", + " 82701eb1de9e67b3fec7cb153d2c44c8\n", + " reccfMGlQeXIrHcad\n", + " Plumas Transit Systems\n", + " Independent Agency\n", + " 02 - Redding\n", + " Core\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 25\n", + " 4e807100eaf481d56bd883afda553331\n", + " recg58MziBRsVfavn\n", + " Yosemite National Park\n", + " Federal Government\n", + " 10 - Stockton\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 26\n", + " 227f25d23231d03b9e6d0827bedef573\n", + " recdLxGPqFmJLG21a\n", + " Los Angeles World Airports\n", + " Independent Agency\n", + " 07 - Los Angeles\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 27\n", + " 2868bbafaa4e330da76b27a34c9dec84\n", + " recEHMkhVmzSWO9nZ\n", + " City of San Juan Capistrano\n", + " City/Town\n", + " 12 - Irvine\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 28\n", + " f674be485552837860bfb987316fb170\n", + " recPtsCi89lKcXaTW\n", + " City of South San Francisco\n", + " City/Town\n", + " 04 - Oakland\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 29\n", + " 
76677611969904bf1f4ad09f6dd631cf\n", + " recfbLFdDnCxgIfAB\n", + " San Diego International Airport\n", + " Independent Agency\n", + " 11 - San Diego\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 30\n", + " d14e900babddcfc79465fb18bae82826\n", + " rec7ShjfgRPLU0yjY\n", + " Solano Transportation Authority\n", + " Independent Agency\n", + " 04 - Oakland\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 31\n", + " 81b4e86d203f5f57fb689fadcb652b56\n", + " recvUlrKS1N2mvAwk\n", + " San Joaquin Joint Powers Authority\n", + " Independent Agency\n", + " 10 - Stockton\n", + " Core\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 32\n", + " 360e8fb158de1a3cc5638e3af7dcc8de\n", + " recd6X5l7vkBXk9hc\n", + " San Francisco International Airport\n", + " Independent Agency\n", + " 04 - Oakland\n", + " Other Public Transit\n", + " True\n", + " None\n", + " True\n", + " \n", + " \n", + " 33\n", + " 5df3b35b1c099c207e09fa6c81a207eb\n", + " recvEBkSBc7UxlarC\n", + " Capitol Corridor Joint Powers Authority\n", + " Independent Agency\n", + " 04 - Oakland\n", + " Other Public Transit\n", + " True\n", " None\n", - " https://www.ainaptt.com/\n", + " True\n", + " \n", + " \n", + " 34\n", + " 3d442fdd55188cad1f1e2206602bad31\n", + " reco3K1VuCBdGRhtV\n", + " Tahoe Truckee Area Regional Transportation\n", + " Independent Agency\n", + " 03 - Marysville\n", + " Core\n", + " True\n", " None\n", + " True\n", + " \n", + " \n", + " 35\n", + " 100b94a15d097469a628eb45e9b22bea\n", + " recn8zTmGbYZv1qxV\n", + " Dumbarton Bridge Regional Operations Consortium\n", + " Joint Powers Agency\n", " None\n", - " Static OK\n", - " RT Incomplete\n", - " False\n", - " Unknown\n", - " []\n", - " False\n", " None\n", + " True\n", " None\n", - " False\n", - " False\n", " True\n", - " 2023-05-25 00:00:00+00:00\n", - " 2098-12-31 23:59:59.999999+00:00\n", " \n", " \n", "\n", "" ], "text/plain": [ - " key 
source_record_id name \\\n", - "0 eea7326fc87a575ce26e24cf56b8ff37 recsupkiKC6Y6fFfV DAV \n", - "1 45227f9806ad4ffd1e8d4309e07f707e recmCZYY7aXn5MS9b IBI \n", - "2 8a644e497439d25c01619c8ec6c85c44 recEN9M5vpVOk2JRI SAP \n", - "3 fa22729bf0698cc6d2ac4f4a41e861b0 recIAaOHjseoeNpTx UTA \n", - "4 3fd4d81306c49cb718c37b48fcbe585c recveQ8PTsiKdT7RU Aina \n", - "\n", - " organization_type roles itp_id \\\n", - "0 Non-Profit Organization [] NaN \n", - "1 Company [] NaN \n", - "2 Company [] NaN \n", - "3 Company [] NaN \n", - "4 Company [] NaN \n", - "\n", - " details caltrans_district \\\n", - "0 DAV operates a fleet of vehicles around the co... 11 - San Diego \n", - "1 None None \n", - "2 None None \n", - "3 None None \n", - "4 Offices in Boston and Finland; no CA offices None \n", - "\n", - " website reporting_category \\\n", - "0 https://www.dav.org/veterans/i-need-a-ride/ None \n", - "1 https://www.ibigroup.com/ None \n", - "2 crystalreports.com None \n", - "3 http://www.utatransit.net/ None \n", - "4 https://www.ainaptt.com/ None \n", + " key source_record_id \\\n", + "0 1d03fdbc2e4dfd044465b4437e70c9e0 recQRoE5mcCn6kdti \n", + "1 e8f13022ac8ecff976188979ac903d4a recKsb5FnJy70up78 \n", + "2 d4d53f5a85cec17582000a9e67a7d642 reczvlrgxLUDiBgAy \n", + "3 5c27763b1f9b74c1e0ef8a976c87da0e reccEj7tecw0n60FO \n", + "4 17b1533e3db2090ee03baec634cecc2e rec2JrGQTZZh54ieL \n", + "5 de0dd501fabd3026dbda6de0946eaf35 recK4si1uIoj6HfrO \n", + "6 4d66bee10854cee5940d2e62aebf657a recsBfXgev9ICDCY1 \n", + "7 848ce6656bd68e56a1ccf915ae2eee23 reczluQLW1y5oQqF8 \n", + "8 41f7aaa3446116fd1124b8ef1966ff14 recuGkFhN2WXGK67H \n", + "9 2452010e249a5f27fc5cc30910643214 reciWrBgYsAIm9eKK \n", + "10 1c1a9850ea30294d4cdf272a80ea4f69 reczHeS54sIl4NOcF \n", + "11 48a96c9f6a0b599ea8207b39183fefc7 recH53ghrYpk4gKhe \n", + "12 79e42a2d6056cc685387ef9187daf794 recNSGNAyM91vsTU7 \n", + "13 97e128b56fff3421b4102df3c66d8f56 receSouvQI31vHz4D \n", + "14 52a6eb3bb3ba5c35224f4f0e7088790e recLHHpOMdW69jAsc \n", 
+ "15 fa120b24b8ee045bdc28cbe0167038fd reczbRiAs0zFytcvm \n", + "16 3a814fcf275575080638856737804f69 recRM3c9Zfaft4V2B \n", + "17 313153a638c1f0f51deaa7a60e8001ba recwtQ7m3C59jbnrc \n", + "18 a1e93b65b71053a93b8bad41a29a44e5 recfehHpcFaXUXhkt \n", + "19 2d6fbf795084dff514d88c3594f76e15 recwBSFrVmbeGqn0g \n", + "20 25232b6a56d0264dcddb4913bf968434 reckGS8egMZryjbX7 \n", + "21 5dca73aae32d569ea5f40e417d45b72d rec4pDiUorjWbUfvU \n", + "22 0de6e548537b6a25aa8b873a2ea91519 rectXzoXm6gBuNBHK \n", + "23 26b06404aa62a3d05e9ffd0ffc1c2343 rec43oyrfhtPDdRHj \n", + "24 82701eb1de9e67b3fec7cb153d2c44c8 reccfMGlQeXIrHcad \n", + "25 4e807100eaf481d56bd883afda553331 recg58MziBRsVfavn \n", + "26 227f25d23231d03b9e6d0827bedef573 recdLxGPqFmJLG21a \n", + "27 2868bbafaa4e330da76b27a34c9dec84 recEHMkhVmzSWO9nZ \n", + "28 f674be485552837860bfb987316fb170 recPtsCi89lKcXaTW \n", + "29 76677611969904bf1f4ad09f6dd631cf recfbLFdDnCxgIfAB \n", + "30 d14e900babddcfc79465fb18bae82826 rec7ShjfgRPLU0yjY \n", + "31 81b4e86d203f5f57fb689fadcb652b56 recvUlrKS1N2mvAwk \n", + "32 360e8fb158de1a3cc5638e3af7dcc8de recd6X5l7vkBXk9hc \n", + "33 5df3b35b1c099c207e09fa6c81a207eb recvEBkSBc7UxlarC \n", + "34 3d442fdd55188cad1f1e2206602bad31 reco3K1VuCBdGRhtV \n", + "35 100b94a15d097469a628eb45e9b22bea recn8zTmGbYZv1qxV \n", "\n", - " hubspot_company_record_id gtfs_static_status gtfs_realtime_status \\\n", - "0 None Static OK RT Incomplete \n", - "1 None Static OK RT Incomplete \n", - "2 None Static OK RT Incomplete \n", - "3 None Static OK RT Incomplete \n", - "4 None Static OK RT Incomplete \n", + " name organization_type \\\n", + "0 POINT Independent Agency \n", + "1 Amtrak Federal Government \n", + "2 Commute.org Joint Powers Agency \n", + "3 City of Ripon City/Town \n", + "4 City of Clovis City/Town \n", + "5 City of Irvine City/Town \n", + "6 Presidio Trust Federal Government \n", + "7 City of Alameda City/Town \n", + "8 City of Banning City/Town \n", + "9 City of Beaumont City/Town \n", + "10 City of 
La Puente City/Town \n", + "11 City of Morro Bay City/Town \n", + "12 Santa Cruz Harbor Independent Agency \n", + "13 City of Dana Point City/Town \n", + "14 City of Menlo Park City/Town \n", + "15 City of Santa Cruz City/Town \n", + "16 Cloverdale Transit Independent Agency \n", + "17 City of San Clemente City/Town \n", + "18 Curry Public Transit Independent Agency \n", + "19 City of Laguna Niguel City/Town \n", + "20 City of Mission Viejo City/Town \n", + "21 City of Mountain View City/Town \n", + "22 City of Newport Beach City/Town \n", + "23 City of Rancho Cordova City/Town \n", + "24 Plumas Transit Systems Independent Agency \n", + "25 Yosemite National Park Federal Government \n", + "26 Los Angeles World Airports Independent Agency \n", + "27 City of San Juan Capistrano City/Town \n", + "28 City of South San Francisco City/Town \n", + "29 San Diego International Airport Independent Agency \n", + "30 Solano Transportation Authority Independent Agency \n", + "31 San Joaquin Joint Powers Authority Independent Agency \n", + "32 San Francisco International Airport Independent Agency \n", + "33 Capitol Corridor Joint Powers Authority Independent Agency \n", + "34 Tahoe Truckee Area Regional Transportation Independent Agency \n", + "35 Dumbarton Bridge Regional Operations Consortium Joint Powers Agency \n", "\n", - " _deprecated__assessment_status manual_check__contact_on_website \\\n", - "0 False Unknown \n", - "1 False Unknown \n", - "2 False Unknown \n", - "3 False Unknown \n", - "4 False Unknown \n", + " caltrans_district reporting_category is_public_entity ntd_id \\\n", + "0 01 - Eureka Other Public Transit True None \n", + "1 03 - Marysville Other Public Transit True None \n", + "2 04 - Oakland Other Public Transit True None \n", + "3 10 - Stockton Other Public Transit True None \n", + "4 06 - Fresno Other Public Transit True None \n", + "5 12 - Irvine None True None \n", + "6 04 - Oakland Other Transit True None \n", + "7 04 - Oakland Other Public Transit 
True None \n", + "8 None None True None \n", + "9 08 - San Bernardino Other Public Transit True None \n", + "10 07 - Los Angeles Other Public Transit True None \n", + "11 05 - San Luis Obispo Other Public Transit True None \n", + "12 05 - San Luis Obispo Other Public Transit True None \n", + "13 12 - Irvine Other Public Transit True None \n", + "14 04 - Oakland None True None \n", + "15 05 - San Luis Obispo None True None \n", + "16 04 - Oakland Other Public Transit True None \n", + "17 12 - Irvine Other Public Transit True None \n", + "18 01 - Eureka Other Public Transit True None \n", + "19 12 - Irvine None True None \n", + "20 12 - Irvine None True None \n", + "21 04 - Oakland None True None \n", + "22 12 - Irvine None True None \n", + "23 03 - Marysville Other Public Transit True None \n", + "24 02 - Redding Core True None \n", + "25 10 - Stockton Other Public Transit True None \n", + "26 07 - Los Angeles Other Public Transit True None \n", + "27 12 - Irvine Other Public Transit True None \n", + "28 04 - Oakland Other Public Transit True None \n", + "29 11 - San Diego Other Public Transit True None \n", + "30 04 - Oakland Other Public Transit True None \n", + "31 10 - Stockton Core True None \n", + "32 04 - Oakland Other Public Transit True None \n", + "33 04 - Oakland Other Public Transit True None \n", + "34 03 - Marysville Core True None \n", + "35 None None True None \n", "\n", - " alias is_public_entity ntd_id ntd_id_2022 \\\n", - "0 [Disabled Veterans of America] False None None \n", - "1 [] False None None \n", - "2 [] False None None \n", - "3 [] False None None \n", - "4 [] False None None \n", - "\n", - " public_currently_operating public_currently_operating_fixed_route \\\n", - "0 False False \n", - "1 False False \n", - "2 False False \n", - "3 False False \n", - "4 False False \n", - "\n", - " _is_current _valid_from _valid_to \n", - "0 True 2023-05-25 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", - "1 True 2024-05-04 00:00:00+00:00 
2098-12-31 23:59:59.999999+00:00 \n", - "2 True 2023-04-29 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", - "3 True 2023-04-29 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 \n", - "4 True 2023-05-25 00:00:00+00:00 2098-12-31 23:59:59.999999+00:00 " + " public_currently_operating_fixed_route \n", + "0 True \n", + "1 True \n", + "2 True \n", + "3 True \n", + "4 True \n", + "5 True \n", + "6 True \n", + "7 True \n", + "8 True \n", + "9 True \n", + "10 True \n", + "11 True \n", + "12 True \n", + "13 True \n", + "14 True \n", + "15 True \n", + "16 True \n", + "17 True \n", + "18 True \n", + "19 True \n", + "20 True \n", + "21 True \n", + "22 True \n", + "23 True \n", + "24 True \n", + "25 True \n", + "26 True \n", + "27 True \n", + "28 True \n", + "29 True \n", + "30 True \n", + "31 True \n", + "32 True \n", + "33 True \n", + "34 True \n", + "35 True " ] }, - "execution_count": 15, + "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "dim_orgs.head()" + "dim_orgs" ] }, { "cell_type": "code", "execution_count": null, - "id": "86359867-8b8a-41a9-8579-b38b7eb89bb3", + "id": "76aed8ed-9f26-45f9-9153-9ac1d8e31a86", "metadata": {}, "outputs": [], "source": []