-
Notifications
You must be signed in to change notification settings - Fork 839
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RecordBatchWriter trait can't be used in dynamic context due to close(self) #6326
Comments
Ok trying to reimplement it myself, looks like the problem is that |
Well good news is that now we have finish(&mut self), added in March in #5471, so maybe we can switch to that? |
It is designed to be used with a generic type constraint. IIRC we switched to taking self to make it easier when working with non-static Write, and to prevent calling close multiple times So you would instead have something like
|
That works in simple cases, yes. I've tried this in my slightly more complex code and then spent unreasonable amount of time trying to make it work. Specifically, the writer in my case is stateful, so I created a struct with generic struct MyWriter<W: RecordBatchWriter> {
writer: W,
...
}
let writer: Box<dyn MyWriter> = match args.format {
"parquet" => Box::new(MyWriter::new(parquet::arrow::ArrowWriter::try_new(file, schema, None), ...)?),
...
}
use_my_writer_as_part_of_a_pipeline(writer); Now we can't really do that ^^ because MyWriter is not a trait, so we need to introduce another higher-level trait. Here's alternative proposal - can we add a |
One other consideration is the |
I believe the trait would still not be object safe, due to the presence of the close method
An iterator / stream can compute the next batch on demand
Or you can just define your own trait over the writers... |
It won't be object-safe, but we can use both write and finish methods dynamically - see playground.
Uffh I'd need to restructure the whole program to support that :)
That's what I'm doing now and it's sad that my own trait is so close to existing one and I have to reimplement basically the same things. |
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I'm trying to use
RecordBatchWriter
trait to introduce dynamism in the type of output - e.g. depending on the command line options, I'd like to write parquet, csv, json or ipc. I know that corresponding writers all implement this trait, thanks to #4206 and #4228 by @alexandreyc.Specifically, I was planning to use
Box<dyn RecordBatchWriter>
and pass it to the writing code, something like this:Unfortunately, due to
close
method in that trait consumingself
, we can't really use this approach, as now close() needs to know the exact type of the writer. Calling close() on the trait object gives compilation error "the size ofdyn RecordBatchWriter
cannot be statically determined". See example in Rust playground.Describe the solution you'd like
Add a
finish(&mut self)
method to the trait. This is backwards-compatible, as well as an established pattern e.g. see parquet writers.This will not make the trait object-safe, but will unblock usage as
Box<dyn RecordBatchWriter>
.Describe alternatives you've considered
Manually downcasting to specific types and then calling close() on them, but that doesn't seem to work, as RecordBatchWriter needs to explicitly have
as_any
method or similar.The text was updated successfully, but these errors were encountered: