feat(ingestion/grafana): Add datasets and charts to dashboards with lineage and tags. Lineage back to source #12417
base: master
metadata-ingestion/src/datahub/ingestion/source/grafana/models.py
metadata-ingestion/src/datahub/ingestion/source/grafana/field_utils.py
Looking good. I'd much appreciate a "Concept Mapping" section in docs.
metadata-ingestion/src/datahub/ingestion/source/grafana/grafana_source.py
metadata-ingestion/src/datahub/ingestion/source/grafana/grafana_source.py
)

# Generate dashboard container first
yield from gen_containers(
If I understand correctly, this is a new addition on top of the dashboard entity, which was the only entity present earlier. So there is one container per dashboard, encompassing the panel (chart) and datasource (dataset) entities?
If so, I'm wondering whether datasources are dashboard-scoped or global in Grafana.
That is correct. Data sources (the database name, type, and the database itself) are global, but the SQL for each visual is set in each chart.
platform_instance=self.source_config.platform_instance,

yield from add_dataset_to_container(
container_key=folder_key,
You can set parent_container_key = folder_key in gen_containers for the dashboard instead.
@mayurinehate, would you be able to suggest the appropriate changes? It is not clear to me what the proposal is.
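One possible reading of the suggestion, as a minimal sketch: the parent link to the folder is passed in when the dashboard container is generated, rather than attached afterwards in a separate step. The classes and the `gen_containers_sketch` function below are hypothetical stand-ins written for illustration; they are not DataHub's `gen_containers`, which takes more parameters and emits more metadata.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass(frozen=True)
class ContainerKey:
    """Hypothetical stand-in for a container key (folder or dashboard)."""

    name: str


@dataclass(frozen=True)
class ContainerEvent:
    """Hypothetical stand-in for the metadata emitted for one container."""

    key: ContainerKey
    parent: Optional[ContainerKey]


def gen_containers_sketch(
    container_key: ContainerKey,
    parent_container_key: Optional[ContainerKey] = None,
) -> Iterator[ContainerEvent]:
    # The parent link is emitted together with the container itself,
    # so no separate "add to container" call is needed afterwards.
    yield ContainerEvent(key=container_key, parent=parent_container_key)


folder_key = ContainerKey("folder:team-a")
dashboard_key = ContainerKey("dashboard:latency")

# The dashboard container is nested under the folder in a single step.
events = list(
    gen_containers_sketch(dashboard_key, parent_container_key=folder_key)
)
```

Under this reading, the separate call that attaches the dashboard to folder_key would no longer be needed.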
metadata-ingestion/src/datahub/ingestion/source/grafana/grafana_source.py
return None

ds_type, ds_uid = self._extract_datasource_info(panel.datasource)
raw_sql = self._extract_raw_sql(panel.targets)
Just curious: do we ingest the raw query used in a panel anywhere in the DataHub model? I believe the query language depends on the source the panel integrates with.
Would you be able to expand upon this?
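To make the question concrete, here is a minimal sketch of how a datasource reference and a raw query might be read from a panel's JSON. It assumes a panel shaped like a Grafana SQL panel, with a `datasource` object carrying `type` and `uid` and targets carrying `rawSql`. The function names mirror the diff but are hypothetical reimplementations, not the PR's code. Other datasource types store their query under a different key (Prometheus targets, for example, use `expr`), which is the language dependence raised above.

```python
from typing import Any, Dict, List, Optional, Tuple, Union


def extract_datasource_info(
    datasource: Union[str, Dict[str, Any], None],
) -> Tuple[str, str]:
    """Return (type, uid) from a panel's datasource reference."""
    if isinstance(datasource, dict):
        return (
            datasource.get("type", "unknown"),
            datasource.get("uid", "unknown"),
        )
    if isinstance(datasource, str):
        # Assumed legacy form: only a name is available, no type.
        return "unknown", datasource
    return "unknown", "unknown"


def extract_raw_sql(targets: List[Dict[str, Any]]) -> Optional[str]:
    """Return the first non-empty rawSql string found in the targets."""
    for target in targets:
        raw_sql = target.get("rawSql")
        if raw_sql:
            return raw_sql
    # Non-SQL targets (e.g. {"expr": "up"}) carry no rawSql key.
    return None


# Illustrative panel; the values are made up for this example.
panel = {
    "title": "Orders per day",
    "datasource": {"type": "postgres", "uid": "pg-main"},
    "targets": [
        {
            "refId": "A",
            "rawSql": "SELECT day, count(*) AS orders FROM sales GROUP BY day",
        }
    ],
}

ds_type, ds_uid = extract_datasource_info(panel["datasource"])
raw_sql = extract_raw_sql(panel["targets"])
```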
metadata-ingestion/src/datahub/ingestion/source/grafana/lineage.py
metadata-ingestion/src/datahub/ingestion/source/grafana/types.py
 service_account = grafana_client.create_service_account(
-    name="example-service-account", role="Viewer"
+    name="example-service-account", role="Admin"
Any specific reason for changing the role here? Do we require elevated permissions for the changes in this PR?
You need the Admin permission to access the data sources. Much of the information used for lineage requires Admin access; without it you can't compile the lineage.
from_start = sql.index("from")
select_part = sql[select_start:from_start].strip()

columns = [col.strip().split()[-1].strip() for col in select_part.split(",")]
Assuming that splitting on "," will actually split by column may be a wrong assumption, e.g.: SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM users
Not sure if we need to address this complexity though. WDYT?
I've added some more logic that targets this type of SQL
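For reference, one way to handle that case is to split only on commas that sit outside parentheses and quotes. The sketch below is an illustration of that idea under simple assumptions (no comments inside the SQL, balanced quotes); the function names are hypothetical and this is not necessarily the logic added in the PR. A full SQL parser would be more robust for nested subqueries or dialect-specific syntax.

```python
from typing import List, Optional


def split_select_columns(select_part: str) -> List[str]:
    """Split a SELECT list on commas outside parentheses and quotes."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0  # current parenthesis nesting level
    quote: Optional[str] = None  # the quote character currently open, if any
    for ch in select_part:
        if quote:
            if ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == "," and depth == 0:
            # Top-level comma: close the current column expression.
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    parts.append("".join(current).strip())
    # Drop empty fragments, e.g. from a trailing comma.
    return [part for part in parts if part]


def column_names(select_part: str) -> List[str]:
    """Take the last whitespace-separated token of each item (the alias, if any)."""
    return [item.split()[-1] for item in split_select_columns(select_part)]


select_part = "CONCAT(first_name, ' ', last_name) AS full_name, id"
names = column_names(select_part)
```

With the reviewer's example, a plain split on "," breaks the CONCAT call into three fragments, while the depth-aware split keeps it as a single column expression.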
Concept map has now been added.
…datahub into grafana-improvements
Adds functionality to the existing Grafana connector. The existing connector supports dashboard identification only. The changes implement the following:
Checklist