Add streaming functionality for list objects #54

Merged 1 commit on Oct 5, 2023

Conversation

donatello
Member

Currently, there is a problem in `client.list_objects`: when not traversing recursively, we can access only the objects within a path, not the prefixes. This is because `result_fn` receives only object properties:

minio-rs/src/s3/args.rs

Lines 1115 to 1135 in 8ecabea

pub struct ListObjectsArgs<'a> {
    pub extra_headers: Option<&'a Multimap>,
    pub extra_query_params: Option<&'a Multimap>,
    pub region: Option<&'a str>,
    pub bucket: &'a str,
    pub delimiter: Option<&'a str>,
    pub use_url_encoding_type: bool,
    pub marker: Option<&'a str>, // only for ListObjectsV1.
    pub start_after: Option<&'a str>, // only for ListObjectsV2.
    pub key_marker: Option<&'a str>, // only for GetObjectVersions.
    pub max_keys: Option<u16>,
    pub prefix: Option<&'a str>,
    pub continuation_token: Option<&'a str>, // only for ListObjectsV2.
    pub fetch_owner: bool, // only for ListObjectsV2.
    pub version_id_marker: Option<&'a str>, // only for GetObjectVersions.
    pub include_user_metadata: bool, // MinIO extension for ListObjectsV2.
    pub recursive: bool,
    pub use_api_v1: bool,
    pub include_versions: bool,
    pub result_fn: &'a dyn Fn(Vec<Item>) -> bool,
}

The result function is called as follows:

minio-rs/src/s3/client.rs

Lines 2810 to 2835 in 8ecabea

while !stop {
    if args.include_versions {
        let resp = self.list_object_versions(&lov_args).await?;
        stop = !resp.is_truncated;
        if resp.is_truncated {
            lov_args.key_marker = resp.next_key_marker;
            lov_args.version_id_marker = resp.next_version_id_marker;
        }
        stop = stop || !(args.result_fn)(resp.contents);
    } else if args.use_api_v1 {
        let resp = self.list_objects_v1(&lov1_args).await?;
        stop = !resp.is_truncated;
        if resp.is_truncated {
            lov1_args.marker = resp.next_marker;
        }
        stop = stop || !(args.result_fn)(resp.contents);
    } else {
        let resp = self.list_objects_v2(&lov2_args).await?;
        stop = !resp.is_truncated;
        if resp.is_truncated {
            lov2_args.start_after = resp.start_after;
            lov2_args.continuation_token = resp.next_continuation_token;
        }
        stop = stop || !(args.result_fn)(resp.contents);
    }
}

So there appears to be no way to iterate over prefixes through this function interface.

Of course, we can use the lower-level `list_objects_v2` in the client to do this, but a different interface might be easier to use.
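
As a rough illustration of what such an interface could look like, the sketch below yields objects and prefixes through one lazily paginated sequence. It uses a plain synchronous `Iterator` as a stand-in for an async stream, and every name in it (`Entry`, `ListPage`, `EntryIter`, `demo_pages`) is hypothetical; it is not the API proposed in this PR.

```rust
use std::collections::VecDeque;

/// Hypothetical listing entry: either an object or a common prefix ("folder").
#[derive(Debug, Clone, PartialEq)]
pub enum Entry {
    Object { key: String, size: u64 },
    Prefix(String),
}

/// Hypothetical page of results carrying objects and prefixes together.
#[derive(Debug, Clone)]
pub struct ListPage {
    pub objects: Vec<(String, u64)>,
    pub common_prefixes: Vec<String>,
}

/// Flattens pages into entries, pulling the next page only when the
/// buffered entries run out.
pub struct EntryIter<I: Iterator<Item = ListPage>> {
    pages: I,
    buffered: VecDeque<Entry>,
}

impl<I: Iterator<Item = ListPage>> EntryIter<I> {
    pub fn new(pages: I) -> Self {
        EntryIter { pages, buffered: VecDeque::new() }
    }
}

impl<I: Iterator<Item = ListPage>> Iterator for EntryIter<I> {
    type Item = Entry;

    fn next(&mut self) -> Option<Entry> {
        loop {
            if let Some(entry) = self.buffered.pop_front() {
                return Some(entry);
            }
            // No buffered entries left: fetch the next page, or finish.
            let page = self.pages.next()?;
            self.buffered
                .extend(page.common_prefixes.into_iter().map(Entry::Prefix));
            self.buffered.extend(
                page.objects
                    .into_iter()
                    .map(|(key, size)| Entry::Object { key, size }),
            );
        }
    }
}

/// Two sample pages standing in for paginated server responses.
pub fn demo_pages() -> Vec<ListPage> {
    vec![
        ListPage {
            objects: vec![("docs/readme.md".to_string(), 12)],
            common_prefixes: vec!["docs/img/".to_string()],
        },
        ListPage {
            objects: vec![("docs/todo.md".to_string(), 7)],
            common_prefixes: vec![],
        },
    ]
}
```

With this shape the caller decides when to stop (for example with `take` or `break`) instead of returning `false` from a callback, and prefixes arrive in the same sequence as objects.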

This PR is currently in draft for discussion.

@balamurugana
Member

@donatello You have to use the lower-level S3 APIs for this specific need. The objective of the higher-level `list_objects()` is to provide one view over all the lower-level object listing APIs. Since you are interested only in prefixes, and the S3 specification offers no way to fetch prefixes without object information, using the lower-level S3 APIs is the right choice.
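
For context on why prefixes and objects arrive together: in a delimiter-based, non-recursive listing, keys that contain the delimiter after the requested prefix are rolled up into common prefixes, and the remaining keys are returned as objects in the same response. The sketch below imitates that roll-up in memory. `list_one_level` is a hypothetical helper written for illustration, not an S3 or minio-rs call, and it ignores pagination and key limits.

```rust
use std::collections::BTreeSet;

/// Splits `keys` the way a delimiter-based listing does.
/// Returns `(objects, common_prefixes)` for the given `prefix`.
pub fn list_one_level(
    keys: &[&str],
    prefix: &str,
    delimiter: &str,
) -> (Vec<String>, Vec<String>) {
    let mut objects = Vec::new();
    // A set, because many keys can roll up into the same common prefix.
    let mut prefixes = BTreeSet::new();
    for key in keys {
        // Keys outside the requested prefix are not part of the listing.
        let Some(rest) = key.strip_prefix(prefix) else { continue };
        match rest.find(delimiter) {
            // Delimiter found after the prefix: roll the key up into a
            // common prefix that ends with the delimiter.
            Some(pos) if !delimiter.is_empty() => {
                let end = pos + delimiter.len();
                prefixes.insert(format!("{}{}", prefix, &rest[..end]));
            }
            // No delimiter (or an empty one): the key is a plain object.
            _ => objects.push(key.to_string()),
        }
    }
    (objects, prefixes.into_iter().collect())
}
```

For example, listing the keys `a.txt`, `photos/2023/x.jpg`, `photos/cat.jpg`, and `videos/v.mp4` with an empty prefix and `/` as the delimiter gives one object (`a.txt`) and two common prefixes (`photos/` and `videos/`).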

@donatello
Member Author

@balamurugana I actually want to list both folders and objects at a prefix and get all of them. Adding a streaming API for this should be OK; we already have one in other SDKs.

donatello force-pushed the list-improv branch 2 times, most recently from d957f08 to 4122be8 (September 29, 2023 16:15)
donatello marked this pull request as ready for review (September 29, 2023 16:15)
donatello force-pushed the list-improv branch 2 times, most recently from 71d800f to 7cbdbca (September 29, 2023 16:41)
@donatello
Member Author

@balamurugana This is ready.

donatello requested a review from balamurugana (October 3, 2023 23:00)
donatello changed the title from "Add streaming functionality for list objects v2" to "Add streaming functionality for list objects" (Oct 3, 2023)
donatello force-pushed the list-improv branch 2 times, most recently from df09d8d to 490a4fb (October 3, 2023 23:12)
- High level streaming API - `list_objects`

- Three lower level streaming APIs - `list_objects_v1_stream`,
`list_objects_v2_stream` and `list_object_versions_stream`

- `encoding_type` parameter is set by default to "url".
balamurugana merged commit 8fb211a into minio:master (Oct 5, 2023)
1 check passed