-
Notifications
You must be signed in to change notification settings - Fork 837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BufWriter for Adapative Put / Multipart Upload #5431
Conversation
object_store/src/buffered.rs
Outdated
Self { | ||
capacity, | ||
store, | ||
state: BufWriterState::Buffer(path, Vec::with_capacity(1024)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's by design to keep a 1024
B buffer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a habit, as typically you want to avoid small bump allocations when pushing individual records. In this case poll_write is called with a slice of values, so this is not necessary. Will change
|
||
impl BufWriter { | ||
/// Create a new [`BufWriter`] from the provided [`ObjectStore`] and [`Path`] | ||
pub fn new(store: Arc<dyn ObjectStore>, path: Path) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An off-topic question: I remember we planned to support content_type and custom metadata. How can we accommodate this use case in our current API design?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spot, filed #5435
self.multipart_id = Some(id); | ||
} | ||
BufWriterState::Buffer(p, b) => { | ||
let buf = std::mem::take(b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In severe cases, poll_shutdown
might return an error. Should we ensure that retrying poll_shutdown
functions is safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by safe? What is the issue if poll_shutdown returns an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the confusion caused by my choice of the word "safe". I'm talking about this case:
- write some data in buffer.
- calling
poll_shutdown
, return error like500 Internal Server
. - calling
poll_shutdown
again for retrying.
But the buf
has been take in the first call, so the final content could be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #5437 to track this, as I think there is a broader issue here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Which issue does this PR close?
Closes #5205
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?