
Storing file greater than 2GB doesn't work #136

Open
ixchelchakchel opened this issue Apr 17, 2023 · 3 comments

Comments

@ixchelchakchel

Hi,
I am trying to use the library to store a large file (> 2GB) and am having trouble doing so. Here is my code for reference:

use google_cloud_default::WithAuthExt;
use google_cloud_storage::client::{Client, ClientConfig};
use google_cloud_storage::http::objects::upload::{Media, UploadObjectRequest, UploadType};

#[tokio::main]
async fn main() {
    let config = ClientConfig::default().with_auth().await
        .expect("failed getting credential");
    let client = Client::new(config);
    let data = vec![1u8; 3_000_000_000]; // ~3 GB payload built in memory
    println!("Storing {:.2}GB of file", data.len() as f64 / 1073741824f64);
    let upload_type = UploadType::Simple(Media::new("test_file.bin"));
    client
        .upload_object(
            &UploadObjectRequest {
                bucket: "my_bucket_name".to_string(),
                ..Default::default()
            },
            data,
            &upload_type,
            None
        )
        .await.expect("failed storing file");
    println!("Stored successfully");
}

This doesn't work: the upload seems to hang with no network activity, and I have waited for over 2 hours. Whereas if I reduce the data to less than 2GB, e.g. let data = vec![1; 2_000_000_000];, it works.

Any ideas on how to troubleshoot this would be really helpful.

@yoshidan
Owner

You may be able to handle this with reqwest options, etc., but you probably don't need to upload that much data as a single in-memory buffer, so it is better to use upload_streamed_object. You can send more than 2GB of data in chunks of a few MB each, as shown in the sample code below.

        // 1,000 chunks of 3 MB each = 3 GB in total
        let chunks = vec![vec![1u8; 3_000_000]; 1_000];
        let chunks: Vec<Result<Vec<u8>, std::io::Error>> = chunks.into_iter().map(Ok).collect();
        let stream = futures_util::stream::iter(chunks);

        // declare the total content length (3 GB) on the Media
        let mut media = Media::new("test_file.bin");
        media.content_length = Some(3_000_000_000);
        let upload_type = UploadType::Simple(media);

        let result = client
            .upload_streamed_object(
                &UploadObjectRequest {
                    bucket: bucket_name.to_string(),
                    ..Default::default()
                },
                stream,
                &upload_type,
            )
            .await;
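
If the 3GB payload actually lives on disk rather than in memory, the same streamed upload can read the file in small chunks via tokio_util's ReaderStream (its Result<Bytes, io::Error> items should satisfy the stream bounds of upload_streamed_object). A minimal sketch, with large_file.bin and the upload_from_disk helper as placeholder names:

use google_cloud_storage::client::Client;
use google_cloud_storage::http::objects::upload::{Media, UploadObjectRequest, UploadType};
use tokio_util::io::ReaderStream;

// Hypothetical helper: stream a file from disk instead of building the whole Vec in memory.
async fn upload_from_disk(client: &Client, bucket: &str) -> Result<(), Box<dyn std::error::Error>> {
    let file = tokio::fs::File::open("large_file.bin").await?;
    let size = file.metadata().await?.len();

    // ReaderStream yields Result<Bytes, io::Error> in small chunks.
    let stream = ReaderStream::new(file);

    let mut media = Media::new("large_file.bin");
    media.content_length = Some(size);
    let upload_type = UploadType::Simple(media);

    client
        .upload_streamed_object(
            &UploadObjectRequest {
                bucket: bucket.to_string(),
                ..Default::default()
            },
            stream,
            &upload_type,
        )
        .await?;
    Ok(())
}

This keeps memory usage at roughly the ReaderStream buffer size rather than the full object size.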

This method does not allow resuming from the middle if an error occurs partway through the upload, so a resumable upload is recommended instead.

@ixchelchakchel
Author

Thanks, I will try this.
What's the recommended way to do a fixed number of retries for the upload/download/remove methods in case of a network failure, a 503 error, or other errors?
In the meantime I will try to see if I can use the backon crate for the retries: https://docs.rs/backon/latest/backon/

@yoshidan
Owner

We do not prescribe a recommended way to perform a fixed number of retries, but with backon it can be accomplished with the following code.

// Imports below assume the module layout of google-cloud-storage and backon at the
// time of this thread; adjust the paths for your versions.
use backon::{ExponentialBuilder, Retryable};
use google_cloud_storage::http::objects::upload::{UploadObjectRequest, UploadType};
use google_cloud_storage::http::objects::Object;
use google_cloud_storage::http::resumable_upload_client::ChunkSize;
use google_cloud_storage::http::Error;

async fn upload(client: StorageClient, bucket_name: &str) {
    let metadata = Object {
        name: "testfile".to_string(),
        content_type: Some("video/mp4".to_string()),
        ..Default::default()
    };

    // start a resumable upload session
    let uploader = client
        .prepare_resumable_upload(
            &UploadObjectRequest {
                bucket: bucket_name.to_string(),
                ..Default::default()
            },
            &UploadType::Multipart(Box::new(metadata)),
        )
        .await
        .unwrap();

    // split the payload into chunks; every chunk except the last should be a
    // multiple of 256 KiB
    let chunk1_data: Vec<u8> = (0..256 * 1024).map(|i| (i % 256) as u8).collect();
    let chunk2_data: Vec<u8> = (1..256 * 1024 + 50).map(|i| (i % 256) as u8).collect();
    let total_size = Some(chunk1_data.len() as u64 + chunk2_data.len() as u64);
    let chunk1 = ChunkSize::new(0, chunk1_data.len() as u64 - 1, total_size);

    // upload chunk1 with retry
    let upload_chunk1 = || async {
        uploader.clone().upload_multiple_chunk(chunk1_data.clone(), &chunk1).await
    };
    upload_chunk1
        .retry(&ExponentialBuilder::default())
        .when(|e: &Error| match e {
            Error::Response(e) => e.is_retriable(),
            _ => false,
        })
        .await
        .unwrap();

    // chunk2 is uploaded the same way, with a ChunkSize starting at
    // chunk1_data.len() and ending at the last byte of the object
}
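
For the fixed number of retries asked about above, backon's ExponentialBuilder can cap the attempt count. A minimal sketch on top of the upload_chunk1 closure above, assuming backon's with_max_times and with_min_delay builder methods:

// Retry at most 3 times after the initial attempt, starting at a 500 ms delay.
let backoff = ExponentialBuilder::default()
    .with_max_times(3)
    .with_min_delay(std::time::Duration::from_millis(500));

upload_chunk1
    .retry(&backoff)
    .when(|e: &Error| matches!(e, Error::Response(r) if r.is_retriable()))
    .await
    .unwrap();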
