Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deltalake-* crates use different version than specified in Cargo.toml, leading to unexpected behavior #2847

Closed
yonipeleg33 opened this issue Sep 5, 2024 · 12 comments · Fixed by #2875
Assignees
Labels
binding/rust Issues for the Rust crate bug Something isn't working

Comments

@yonipeleg33
Copy link

yonipeleg33 commented Sep 5, 2024

Environment

Delta-rs version:

0.18.2, 0.19.0 (you'll see below how 😄 )

Binding:

Rust

Environment:

  • Cloud provider: Any
  • OS: Linux
  • Other:

Bug

What happened:

Having these dependencies in Cargo.toml:

deltalake = { version = "0.18.2", default-features = false }
deltalake-azure = "0.1.4"
deltalake-aws = "0.1.4"

Results in two deltalake-core versions being used: 0.18.2, and 0.19.0.
From my Cargo.lock:

[[package]]
name = "deltalake"
version = "0.18.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "805d65924fabb84b02503ba2e2c0b1b53cdc4cf27c31305f3dd991b984a53593"
dependencies = [
 "deltalake-core 0.18.2", # <-------------- first version
]
# ...
[[package]]
name = "deltalake-aws"
version = "0.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8ae61c8427536aa2215b13109d10f4cc14ab036ad0bd3bb11ec7b54a4938f03f"
dependencies = [
  # ...
 "deltalake-core 0.19.0", # <-------------- second version
  # ...
]

This state leads to unexpected behavior. I can imagine there are other scenarios that it affects, but the one I stumbled upon is register_handlers:

When calling deltalake_*::register_handlers, it registers the handlers by calling deltalake_core::storage::factories - which will be version 0.19.0 in my case.
Then, calling deltalake::storage::store_for, goes to the 0.18.2 version, as specified in my Cargo.toml - this is a different source code, on which the handlers are not registered!
So I end up as if I didn't call register_handlers at all, and get InvalidTableLocation error 😞

Everything works fine if I bump my deltalake dependency to 0.19.0.

What you expected to happen:
This scenario wouldn't be possible - Either such side effects of using multiple versions of deltalake-core don't exists, or - preferably - I shouldn't end up with multiple versions of delalake-core in such a naive setup (unless I explicitly want to).

How to reproduce it:

  1. Create a cargo project:
$ cargo new --bin <something>
  1. Edit Cargo.toml - have the following dependencies:
[dependencies]
deltalake = { version = "0.18.2", default-features = false }
deltalake-azure = "0.1.4"
deltalake-aws = "0.1.4"
url = "2.5.2"
  1. Edit main.rs:
fn main() {
    deltalake_azure::register_handlers(None);
    deltalake_aws::register_handlers(None);
    let azure_url = url::Url::parse("abfss://[email protected]/example.txt").unwrap();
    let aws_url = url::Url::parse("s3://foo/bar").unwrap();
    let azure_err = deltalake::storage::store_for(&azure_url).unwrap_err();
    let aws_err = deltalake::storage::store_for(&aws_url).unwrap_err();

    assert!(matches!(azure_err, deltalake::DeltaTableError::InvalidTableLocation(_)));
    assert!(matches!(aws_err, deltalake::DeltaTableError::InvalidTableLocation(_)));
}
  1. Run it, and expect it to exit successfully (i.e. the error assertions pass):
$ cargo run

More details:

Maybe I'm stepping out of line here, but I think the deltalake-* crates shouldn't allow a range of versions as they do today (see here for example).

@yonipeleg33 yonipeleg33 added the bug Something isn't working label Sep 5, 2024
@yonipeleg33
Copy link
Author

yonipeleg33 commented Sep 5, 2024

Unfortunately, even when specifying the dependencies this way:

deltalake = { version = "0.18.2", default-features = false, features = ["azure", "deltalake-aws", "s3"] }

You'll end up with a similar Cargo.lock file, reproducing the same issue...
See here - Cargo is allowed to update version "1.1.1" to "1.1.4" when running cargo update (as this is a SemVer compatible update).
(I personally didn't run cargo update on this project, but I did try a clean build (after cargo clean and rm Cargo.lock) - so I'm guessing the same rules apply).

@rtyler rtyler self-assigned this Sep 5, 2024
@rtyler rtyler added the binding/rust Issues for the Rust crate label Sep 5, 2024
@rtyler
Copy link
Member

rtyler commented Sep 5, 2024

@yonipeleg33 I think the imprecise version range in the meta crate (deltalake) needs to be narrowed as you point out. I am curious why your Cargo.toml is specifying the meta crate and then the sub-crates in this manner?

@yonipeleg33
Copy link
Author

yonipeleg33 commented Sep 5, 2024

@rtyler thank you for the quick response!

I am curious why your Cargo.toml is specifying the meta crate and then the sub-crates in this manner?

I originally used the re-exported features from deltalake, as shown in my second comment (#2847 (comment)).
Is it unusual? Is there a better way to specify the versions?

@rtyler
Copy link
Member

rtyler commented Sep 5, 2024

When using the deltalake with the s3 feature, the deltalake-aws crate's symbols are re-exported under deltalake::aws so you should only need to add deltalake as a dependency if you're wanting all those functionalities

@yonipeleg33
Copy link
Author

I tried with this Cargo.toml:

deltalake = { version = "0.18.2", default-features = false, features = ["s3"] }

And this main.rs:

fn main() {
    deltalake::aws::register_handlers(None);
    let aws_url = url::Url::parse("s3://foo/bar").unwrap();
    let aws_err = deltalake::storage::store_for(&aws_url).unwrap_err();

    assert!(matches!(aws_err, deltalake::DeltaTableError::InvalidTableLocation(_)));
}

Then running:

$ cargo clean
$ rm Cargo.lock
$ cargo run

The issue sill reproduces, and I still see the same entries in my Cargo.lock as originally reported.

@rtyler
Copy link
Member

rtyler commented Sep 5, 2024

I was not able to reproduce with:

[package]
name = "issue-2847"
version = "0.1.0"
edition = "2021"

[dependencies]
deltalake = { version = "0.18.2", features = ["s3"] }
url = "2.5.2"
fn main() {
    println!("Start");
    deltalake::aws::register_handlers(None);
    let aws_url = url::Url::parse("s3://foo/bar").unwrap();
    let aws_err = deltalake::storage::store_for(&aws_url).unwrap_err();

    assert!(matches!(aws_err, deltalake::DeltaTableError::InvalidTableLocation(_)));
    println!("Go go go!");
}
plantain% cargo run
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.14s
     Running `target/debug/issue-2847`
Start
Go go go!
plantain%

@yonipeleg33
Copy link
Author

yonipeleg33 commented Sep 5, 2024

This might be confusing, maybe I should've provided a code that fails rather than passes to showcase the error 🙂
The "Go go go!" print is running after an assert! that the result is indeed an error - so you were, in fact, able to reproduce...

This code is probably less confusing - try this one instead:

fn main() -> Result<(), DeltaTableError> {
    deltalake::aws::register_handlers(None);
    let aws_url = url::Url::parse("s3://foo/bar").unwrap();
    deltalake::storage::store_for(&aws_url)?;
    println!("it works!"); // fails before this line :(
    Ok(())
}

@yonipeleg33 yonipeleg33 changed the title deltalake-* crates uses different version than specified in Cargo.toml, leading to unexpected behavior deltalake-* crates use different version than specified in Cargo.toml, leading to unexpected behavior Sep 5, 2024
@rtyler
Copy link
Member

rtyler commented Sep 5, 2024

Okay, I have replicated the issue. I did mess up some of the version ranges previously and the 0.19.0 release does fix the problem you describe

@yonipeleg33
Copy link
Author

Okay, I have replicated the issue. I did mess up some of the version ranges previously and the 0.19.0 release does fix the problem you describe

Yeah, but IIUC, it works because this is currently the latest version.
It can happen again, nothing in this setup is specific to the mentioned versions

@yonipeleg33
Copy link
Author

The same code with 0.17.3 fails as well, checked now

@rtyler
Copy link
Member

rtyler commented Sep 10, 2024

Looking at the cargo tree output with the reproduction case, what's interesting is that with

deltalake = { version = "0.18.2", default-features = false }
deltalake-aws = "0.1.4"

There's a root dependency of [email protected] which depends on [email protected] . The [email protected] dependency somehow pulls a [email protected] dependency.

Considering the version range in the deltalake-aws crates Cargo.toml I am surprised that Cargo is pulling redundant dependencies here, since 0.18.2 does fit in the version range of >=0.17.0, <0.20.0` 😕

This appears to be because Cargo treats all minor versions under 0.x as semantically incompatible for semver dependency resolution

That's annoying, and another good reason for us to consider pushing to 1.0 across the board 😄

rtyler added a commit to rtyler/delta-rs that referenced this issue Sep 10, 2024
Until we release 1.0, every 0.x release is going to be treated as a
"major" or incompatible change by Cargo

Fixes delta-io#2847
@yonipeleg33
Copy link
Author

yonipeleg33 commented Sep 10, 2024

@rtyler Thank you for handling this so quickly, I see the draft PR! 👀

This appears to be because Cargo treats all minor versions under 0.x as semantically incompatible for semver dependency resolution

I think this is more about semver compatibility rather than incompatibility: Cargo sees that deltalake-* depends on deltalake-core on a range of versions, and does a best effort to fetch the highest compatible version of deltalake-core for deltalake-* - which will be 0.19.0, because it doesn't change the left-most non-0 in the version.
From the link you attached:

Versions are considered compatible if their left-most non-zero major/minor/patch component is the same.

github-merge-queue bot pushed a commit that referenced this issue Sep 10, 2024
Until we release 1.0, every 0.x release is going to be treated as a
"major" or incompatible change by Cargo

Fixes #2847
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/rust Issues for the Rust crate bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants