🔥🧠Usage of the data catalogue (DataHub) #256
timburke-hackit
started this conversation in
Firebreak April 24
🌵What is the problem or issue we're trying to address?
One of the main components we've talked about a lot but haven't used is the data catalogue tool, DataHub.
Most of the team, myself included, weren't around when it was evaluated and implemented, and as a result it has become pretty neglected.
🎯How is this affecting producers, consumers or platform engineers?
Consumers
One of the stated aims of the platform is to democratise access to data and make it easier for analysts to know what data is available, how it can be used, who owns it and how to get access to it. These are all things the catalogue is meant to address but currently doesn't.
Producers
Aren't routinely documenting the data they bring on to the platform, so they're missing the opportunity to describe the data they own in a central, managed location that they can refer other users to, rather than doing this ad hoc every time there are new use cases or requests.
Platform Engineers
Have to liaise with producers to understand the intended purpose of data on the platform, and end up with individual team members holding siloed knowledge about it. This could limit their ability to work across the platform rather than specialising in a niche [I can probably add to this - just getting something down].
📝What is the proposed task?
No response
🤔How might this work be carried out?
No response
⌛How urgent is this work?
No response
💪How much effort do you think this will take?
No response
🛠️What skills are needed?
Kafka
AWS Glue
Terraform
📃Additional Info:
The original decision to use DataHub is logged in ADR 011 (2021-10-07):
Using DataHub as a Data Catalogue - ADR 011