Introduce S3OutputStream that uploads data to S3 directly #45

Closed
wants to merge 1 commit into from

@twz123 twz123 commented Jun 29, 2015

  • S3OutputStream uses multipart uploads to transfer data to S3 without storing it on local disk first
  • FileSystemProvider returns this implementation for new OutputStreams instead of S3SeekableByteChannel if possible
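
The buffering idea behind the first bullet can be sketched roughly as follows. This is not the code from this pull request: `PartUploader` and `ChunkedUploadOutputStream` are hypothetical names, and the uploader callback stands in for the real S3 multipart calls so the sketch runs without the AWS SDK.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

/** Hypothetical callback; a real one would wrap the S3 multipart upload calls. */
interface PartUploader {
    void uploadPart(int partNumber, byte[] data) throws IOException;

    void complete(int partCount) throws IOException;
}

/** Buffers writes in memory and hands off one part each time the buffer fills. */
class ChunkedUploadOutputStream extends OutputStream {
    private final PartUploader uploader;
    private final byte[] buffer;
    private int filled = 0;
    private int partNumber = 0;
    private boolean closed = false;

    ChunkedUploadOutputStream(PartUploader uploader, int partSize) {
        if (partSize <= 0) {
            throw new IllegalArgumentException("partSize must be positive");
        }
        this.uploader = uploader;
        this.buffer = new byte[partSize];
    }

    @Override
    public void write(int b) throws IOException {
        ensureOpen();
        buffer[filled++] = (byte) b;
        if (filled == buffer.length) {
            flushPart();
        }
    }

    @Override
    public void write(byte[] src, int off, int len) throws IOException {
        ensureOpen();
        while (len > 0) {
            // Copy as much as still fits into the current part.
            int n = Math.min(len, buffer.length - filled);
            System.arraycopy(src, off, buffer, filled, n);
            filled += n;
            off += n;
            len -= n;
            if (filled == buffer.length) {
                flushPart();
            }
        }
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return;
        }
        closed = true;
        if (filled > 0) {
            flushPart(); // the last part may be shorter than partSize
        }
        // An empty stream reports zero parts; handling that case is left to the uploader.
        uploader.complete(partNumber);
    }

    private void flushPart() throws IOException {
        uploader.uploadPart(++partNumber, Arrays.copyOf(buffer, filled));
        filled = 0;
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("stream is closed");
        }
    }
}
```

Only one part is held in memory at a time, so nothing is staged on local disk. The price is that the object only becomes complete when `close()` is called.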

@jarnaiz
Member

jarnaiz commented Aug 14, 2015

Looks great, thanks! Did you run any performance tests?

@twz123
Author

twz123 commented Aug 17, 2015

No, I didn't measure performance. The motivation was very constrained local disk space: when producing large amounts of data for S3 in a streaming fashion, the data no longer has to be written to disk first. The local disk would have been too small for that anyway.

@twz123
Author

twz123 commented Aug 31, 2015

@jarnaiz How do you usually measure performance? Do you have a framework or a best practice? Which measurements would you like to see here?

@jarnaiz
Member

jarnaiz commented Sep 6, 2015

Hi @twz123, I don't have experience with any benchmark framework. I usually test manually: I print the response time of each operation and compare the results with the old version.
In my opinion, the results of the FileOperationsIT tests are enough.
The FileOperationsIT class covers all the basic operations. If you want, use JUnit's @After to print the timings for this branch and for master, and if you see big differences, try to investigate the test :)
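
A minimal sketch of that kind of manual timing, assuming plain Java rather than the project's test setup. `OperationTimer` and its methods are hypothetical names, not part of FileOperationsIT:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

/** Hypothetical helper that accumulates wall-clock time per labelled operation. */
class OperationTimer {
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    /** Runs the operation and adds its duration to the total for the label. */
    <T> T time(String label, Callable<T> operation) throws Exception {
        long start = System.nanoTime();
        try {
            return operation.call();
        } finally {
            elapsedNanos.merge(label, System.nanoTime() - start, Long::sum);
        }
    }

    /** Read-only view of the accumulated timings, in nanoseconds. */
    Map<String, Long> results() {
        return Collections.unmodifiableMap(elapsedNanos);
    }

    /** Prints one line per label; could be called from an @After method. */
    void print() {
        elapsedNanos.forEach((label, nanos) ->
                System.out.printf("%-30s %10.2f ms%n", label, nanos / 1_000_000.0));
    }
}
```

Run the same operations on this branch and on master and compare the printed lines. Single runs against S3 are noisy, so repeating each operation a few times gives a fairer picture.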

PS: close any download managers first; you need a stable internet connection ;)

Thanks!

@marksteele

👍
A default putObject S3 operation has a maximum size of 5 GB, so making this a streaming operation makes a lot of sense. It also removes buffering to the local drive, which is a big plus.
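
To make the size limits concrete, here is a rough sketch of the part-size arithmetic for multipart uploads. The constants are assumptions based on the limits AWS documents (a 5 MiB minimum for every part except the last, and at most 10,000 parts per upload); `PartSizing` is a hypothetical helper, not code from this pull request.

```java
/** Hypothetical helper for choosing a multipart part size. */
final class PartSizing {
    // Assumed S3 limits; check the current AWS documentation before relying on them.
    static final long MIN_PART_SIZE = 5L * 1024 * 1024; // 5 MiB
    static final int MAX_PARTS = 10_000;

    private PartSizing() {
    }

    /** Smallest allowed part size that keeps the upload within MAX_PARTS. */
    static long partSizeFor(long expectedObjectSize) {
        if (expectedObjectSize < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        long needed = (expectedObjectSize + MAX_PARTS - 1) / MAX_PARTS; // ceiling division
        return Math.max(MIN_PART_SIZE, needed);
    }

    /** Number of parts needed for the given object and part size. */
    static long partCount(long objectSize, long partSize) {
        return (objectSize + partSize - 1) / partSize; // ceiling division
    }
}
```

With 5 MiB parts, 10,000 parts cover about 48.8 GiB. A streaming writer usually does not know the final size up front, so the part size has to be picked from the largest object it expects to produce.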

@carlspring

Hi,

@twz123, we would like to thank you for your work on this and are sad to see that it was not accepted and merged in this project.

However, for anyone interested, we've used this as inspiration for how to implement it in our spin-off (rebranded fork), s3fs-nio, where we have now also migrated our code to AWS SDK v2 (see carlspring/s3fs-nio#63). As this upstream is now clearly dead (#135), please feel free to check out our work in carlspring/s3fs-nio#95.

We would like to rebuild a community around s3fs-nio and would be thrilled if you'd like to join us in our efforts to build a stable library that is well-maintained, documented and regularly released.

Kind regards,

Martin Todorov

@twz123
Author

twz123 commented Jan 18, 2021

Closing this, as it's solved in the successor project. 👍

@twz123 twz123 closed this Jan 18, 2021
@carlspring

Thanks, @twz123 ! :)

4 participants