chore: break Glue support into its own crate without rusoto #1825

Merged 1 commit on Dec 11, 2023

rtyler
Member

@rtyler rtyler commented Nov 8, 2023

This change also pilots a removal of Rusoto in favor of the AWS SDK for Rust, which AWS now supports and funds the development of.

The API surface is largely the same, but I believe this move will also ensure that we're much more consistent in handling AWS environment variables for some things.

Related to #1601

@cmackenzie1
Contributor

I just have a comment about the crate name. Is the goal to have deltalake-catalog-<glue|unity|etc>, or should it just be deltalake-catalog with feature flags for glue, unity and the like? I've never interacted with the catalogs and am not sure whether the more detailed crates are going to be a pain or a blessing.

@rtyler
Member Author

rtyler commented Nov 9, 2023

@cmackenzie1 Yes, the plan is to put each catalog provider into its own crate. There's not really any common code between something like Unity, which will likely have dependencies on Databricks SDKs, and Glue, which has AWS SDK dependencies.

Some of the theory behind parcelling these things out is that we can rev releases of these less frequently and use semver to make it easier to pull in updates.

@rtyler rtyler force-pushed the datacatalog-glue-1601 branch from 3e31e5b to 47cb7db Compare November 10, 2023 13:47
@github-actions github-actions bot added binding/rust Issues for the Rust crate crate/core labels Nov 10, 2023
@rtyler rtyler added this to the Rust v0.17 milestone Nov 10, 2023
@rtyler
Member Author

rtyler commented Nov 10, 2023

The benchmark test failure I don't understand at all, but also don't think it has any relation to this pull request 😕

@rtyler rtyler marked this pull request as ready for review November 10, 2023 14:51
Collaborator

@roeap roeap left a comment

Looking good! Exciting to see the split-up take shape!

Main question is around the build scripts. While I don't feel too strongly about it, elsewhere we use makefiles rather than individual scripts. Then again, locally I have a justfile 😆.

Comment on lines 30 to 33
/// Error from a specific catalog provider
#[error("Catalog implementation error: {0}")]
Error(Box<dyn std::error::Error + Send + Sync + 'static>),

Collaborator

should integrations raise the above Generic error, which also contains an identifier from the specific catalogue? This pattern worked quite nicely in object store :).

Member Author

Can you link me to what you're referring to @roeap ? I can make the changes here once I understand what you're suggesting 😄

Collaborator

certainly :).

Generic {
    /// Name of the catalog
    catalog: &'static str,
    /// Error message
    source: Box<dyn std::error::Error + Send + Sync + 'static>,
},
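
To illustrate the pattern suggested above, here is a minimal sketch of how a catalog integration might wrap its own error in a `Generic` variant that carries the catalog name. The type names `DataCatalogError` and `GlueError` and the `"glue"` identifier are assumptions made for illustration, not code taken from this pull request. The sketch also implements `Display` and `Error` by hand so it has no dependencies, whereas the snippet under review uses a derive-style `#[error(...)]` attribute.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical shared error type for catalog providers (name assumed).
#[derive(Debug)]
pub enum DataCatalogError {
    /// Error raised by a specific catalog provider.
    Generic {
        /// Name of the catalog
        catalog: &'static str,
        /// Underlying provider error
        source: Box<dyn Error + Send + Sync + 'static>,
    },
}

impl fmt::Display for DataCatalogError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DataCatalogError::Generic { catalog, source } => {
                write!(f, "Catalog implementation error in {catalog}: {source}")
            }
        }
    }
}

impl Error for DataCatalogError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            DataCatalogError::Generic { source, .. } => {
                // Drop the Send + Sync bounds to match the trait signature.
                let inner: &(dyn Error + 'static) = &**source;
                Some(inner)
            }
        }
    }
}

/// Hypothetical provider-specific error, standing in for an SDK error.
#[derive(Debug)]
pub struct GlueError(pub String);

impl fmt::Display for GlueError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for GlueError {}

/// The integration tags its errors with a catalog identifier on conversion.
impl From<GlueError> for DataCatalogError {
    fn from(e: GlueError) -> Self {
        DataCatalogError::Generic {
            catalog: "glue",
            source: Box::new(e),
        }
    }
}
```

With a conversion like this in place, provider code can use `?` on its own errors, and callers still see which catalog an error came from without the shared type depending on any provider SDK.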

- glue = ["deltalake-core/glue"]
- glue-native-tls = ["deltalake-core/glue-native-tls"]
+ glue = ["deltalake-catalog-glue"]
+ glue-native-tls = ["deltalake-catalog-glue/native-tls"]
Collaborator

was just wondering if there maybe is a way to propagate this from the general native-tls feature, and if this would even be wanted?

Member Author

I have not found a good way to propagate features, it's a little tedious book-keeping 😢

Collaborator

not entirely sure, but I found the example below.

[dependencies]
serde = { version = "1.0.133", optional = true }
rgb = { version = "0.8.25", optional = true }

[features]
serde = ["dep:serde", "rgb?/serde"]

The docs there say:

In this example, enabling the serde feature will enable the serde dependency. It will also enable the serde feature for the rgb dependency, but only if something else has enabled the rgb dependency.

So if I read that correctly, we may be able to add "deltalake-catalog-glue?/native-tls" to the native-tls feature, and it would only take effect if glue is enabled?
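
Applied to this crate layout, the suggestion might look roughly like the fragment below. This is only a sketch: the version number is a placeholder, and any other entries the real native-tls feature needs are left out. The `?/` form is Cargo's weak dependency feature syntax, available since Rust 1.60.

```toml
[dependencies]
# Placeholder version, for illustration only.
deltalake-catalog-glue = { version = "0.1", optional = true }

[features]
# Enabling `glue` pulls in the optional Glue catalog crate.
glue = ["deltalake-catalog-glue"]
# Enables `native-tls` on the Glue crate only if something else
# (for example the `glue` feature) has already enabled that dependency.
native-tls = ["deltalake-catalog-glue?/native-tls"]
```

With this layout, a separate glue-native-tls feature would no longer be needed for the Glue case, since enabling both `glue` and `native-tls` has the same effect.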

Member Author

Coming back to this, I think this is a good idea, but it will require a features rework in the meta-crate, since there we also have s3-native-tls, for example. I'll tackle that in a follow-up pull request.

@rtyler rtyler force-pushed the datacatalog-glue-1601 branch from 47cb7db to 716f20e Compare November 18, 2023 23:32
@rtyler rtyler force-pushed the datacatalog-glue-1601 branch from 716f20e to a99c4d3 Compare November 18, 2023 23:33
@rtyler rtyler enabled auto-merge (rebase) November 18, 2023 23:33
Collaborator

@roeap roeap left a comment

I think we may have one relevant typo, otherwise looking good!

crates/deltalake-catalog-glue/src/lib.rs (resolved)
@roeap roeap disabled auto-merge November 21, 2023 06:55
@roeap roeap enabled auto-merge (squash) November 21, 2023 06:56
roeap previously approved these changes Nov 21, 2023
@roeap
Collaborator

roeap commented Nov 21, 2023

@rtyler - meant to resolve merge conflicts, but broke formatting :(

let catalog = GlueDataCatalog::from_env()
    .await
    .expect("Failed to load catalog from the environment");
println!("caralog: {catalog:?}");
Contributor

Suggested change
- println!("caralog: {catalog:?}");
+ println!("catalog: {catalog:?}");

🤷

@rtyler rtyler force-pushed the datacatalog-glue-1601 branch 2 times, most recently from bfd8d9b to 8ceca20 Compare December 11, 2023 19:04
@roeap roeap merged commit 6d07bc5 into delta-io:main Dec 11, 2023
21 checks passed
4 participants