Currently, the generate_reports_data.py script queries several different tables (e.g. fct_monthly_reports_site_organization_gtfs_vendors and fct_daily_reports_site_organization_scheduled_service_summary), and the results are "joined" together only by being written into the same output folders. Rather than trying to combine these artifacts and/or add validation with something like Pydantic on top of the existing queries, it should be possible to create a single dbt model whose grain is year-month-itp_id, so that rows are 1:1 with final report pages. BigQuery rows can contain JSON and arrays to represent the nested nature of some of this data.
If this model is implemented, the "data generation" script could consist of just querying this single model and writing a single artifact per row (with some additional fields added post-query, such as RT feed URLs, that are more difficult to produce in BigQuery).
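As a rough illustration of what the simplified script might look like, here is a minimal sketch of the "write one artifact per row" step. Everything specific in it is an assumption for illustration: the function name `write_report_artifacts`, the nested field names in `SAMPLE_ROW`, and the `year/month/itp_id/report.json` output layout are hypothetical, not taken from the existing script. The query itself is left out; rows are assumed to arrive as mapping-like objects with `year`, `month`, and `itp_id` keys.

```python
import json
from pathlib import Path
from typing import Any, Iterable, List, Mapping

# Hypothetical example of one row at the year-month-itp_id grain.
# The nested field names are illustrative only.
SAMPLE_ROW = {
    "year": 2023,
    "month": 4,
    "itp_id": 10,
    "gtfs_vendors": [{"vendor": "Example Vendor", "component": "Scheduling"}],
    "scheduled_service_summary": {"weekday_service_hours": 120.5},
}


def write_report_artifacts(
    rows: Iterable[Mapping[str, Any]], output_dir: Path
) -> List[Path]:
    """Write one JSON file per (year, month, itp_id) row and return the paths."""
    written: List[Path] = []
    seen = set()
    for row in rows:
        key = (int(row["year"]), int(row["month"]), int(row["itp_id"]))
        # The model is supposed to be 1:1 with report pages, so a repeated
        # key means the grain assumption is broken; fail loudly.
        if key in seen:
            raise ValueError(f"duplicate row for grain {key}")
        seen.add(key)

        year, month, itp_id = key
        path = output_dir / str(year) / f"{month:02d}" / str(itp_id) / "report.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        # default=str lets non-JSON-native values (e.g. dates) serialize.
        path.write_text(json.dumps(dict(row), indent=2, sort_keys=True, default=str))
        written.append(path)
    return written
```

Post-query fields such as RT feed URLs could be merged into each row before it is passed to this function. The duplicate-key check is a cheap guard on the 1:1 grain and could sit alongside, or be replaced by, a uniqueness test on the dbt model itself.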