improve destination fingerprinting and info #751
Labels
tech-debt
Leftovers from previous sprint that should be fixed over time
Comments
rudolfix added the tech-debt label and removed the devel label on Dec 23, 2023
This was referenced Jun 26, 2024
Background
Building on #746, we want robust fingerprinting and destination info to include in LoadInfo and pipeline traces. Fingerprinting is used to anonymously identify cloud destinations and lets us build proper tracing and data lineage.
Tasks
filesystem fingerprint: include only the scheme and netloc. Currently we also include the path in the bucket, which is too much. For a local filesystem configuration, return a hash of an empty string.
Implementation
Please do 3 PRs for 3 tasks
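The filesystem fingerprint task could be sketched roughly as follows. This is only an illustration: the function name `filesystem_fingerprint` is hypothetical, and SHA-256 is an assumption standing in for whatever digest the project already uses.

```python
import hashlib
from urllib.parse import urlparse


def filesystem_fingerprint(bucket_url: str) -> str:
    """Hypothetical helper: hash only the scheme and netloc of a bucket URL."""
    parsed = urlparse(bucket_url)
    # Bare paths, file:// URLs and Windows drive letters (parsed as a
    # one-character scheme) are treated as local: return the hash of "".
    if parsed.scheme in ("", "file") or len(parsed.scheme) == 1:
        return hashlib.sha256(b"").hexdigest()
    # Drop path, query and fragment so the path inside the bucket
    # never contributes to the fingerprint.
    location = f"{parsed.scheme}://{parsed.netloc}"
    return hashlib.sha256(location.encode("utf-8")).hexdigest()
```

With this sketch, `s3://my-bucket/a` and `s3://my-bucket/b/c` yield the same fingerprint, while any local path yields the hash of the empty string.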
destination info contains:
local destination: for destinations that may run locally (duckdb, postgres, weaviate, qdrant, filesystem etc.) we should start generating fingerprints. For local locations (file:// on filesystem; localhost, 127.0.0.1 and IPv6 localhost in a connection string) a local flag must be set to true in info.
anonymous_id as used by telemetry and path to file/database when applicable.
anonymous_id should be detached from telemetry code and become independent (maybe part of paths.py module)
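One possible way to derive the local flag from a bucket URL or connection string is sketched below. The name `is_local_destination` is hypothetical, and the rule that a URL without a host is not flagged is an assumption of this sketch, not something the issue specifies.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_destination(location: str) -> bool:
    """Hypothetical check for the local flag on a URL or connection string."""
    parsed = urlparse(location)
    # Bare paths and file:// URLs point at the local machine.
    if parsed.scheme in ("", "file"):
        return True
    host = parsed.hostname  # lowercased, IPv6 brackets already stripped
    if host is None:
        # Assumption: without a host we cannot tell, so do not flag as local.
        return False
    if host == "localhost":
        return True
    try:
        # Loopback covers 127.0.0.1 (all of 127.0.0.0/8) and IPv6 ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a regular DNS name, not an IP literal
```

Note that a hostless URL such as `duckdb:///local.db` is not flagged by this sketch, so destinations that always use local files would need their own rule.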