-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(tableau): set ingestion stage report and pertimers #12234
base: master
Are you sure you want to change the base?
chore(tableau): set ingestion stage report and pertimers #12234
Conversation
emit_custom_sql_datasources_timer: Dict[str, float] = dataclass_field( | ||
default_factory=TopKDict | ||
) | ||
emit_upstream_tables_timer: Dict[str, float] = dataclass_field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a huge fan of this - does it make sense to have a SiteReport
type, and then the TableauSourceReport
has a dict[str (site id), SiteReport]
?
how many sites are there reasonably going to be? the main limitation is that we don't want the report size to grow in an unbounded way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually maybe this is ok - we really only care about the ones that take a long time, right
yield from self.emit_dashboards() | ||
with PerfTimer() as timer: | ||
yield from self.emit_dashboards() | ||
self.report.emit_dashboards_timer[self.site_id] = round( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PerfTimer objects can go directly into the report, and will get automatically formatted nicely - it implements as_obj
, which is used here
if isinstance(some_val, SupportsAsObj): |
so we could do something like with self.report.emit_dashboards_timer.setdefault(self.site_id, PerfTimer()):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
although we might need to make some tweaks to TopKDict to make that work, so may not be worth it
@@ -3457,33 +3493,88 @@ def _create_workbook_properties( | |||
return {"permissions": json.dumps(groups)} if len(groups) > 0 else None | |||
|
|||
def ingest_tableau_site(self): | |||
self.report.report_ingestion_stage_start( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
eventually I want to make these stages context managers - explicitly reporting start/end feels pretty error-prone
emit_custom_sql_datasources_timer: Dict[str, float] = dataclass_field( | ||
default_factory=TopKDict | ||
) | ||
emit_upstream_tables_timer: Dict[str, float] = dataclass_field( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually maybe this is ok - we really only care about the ones that take a long time, right
self._fetch_groups() | ||
with PerfTimer() as timer: | ||
self._fetch_groups() | ||
self.report.fetch_groups_timer[self.site_id] = round( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be nicer to have site names in here instead of site ids, assuming they're unique
Locally tested:
Checklist