-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove from_tabular_file
in ArrowDataset
#273
Comments
Agreed but I think the difference is that .from_dataset has the examples in-memory, while from_tabular_file doesn't actually load them (they are lazily loaded via a genexp). Correct me if I'm wrong @ivansmokovic Btw; what's the purpose of having a disk backed dataset if the dataset is already loaded in memory (e.g., using |
@mttk You are right. The reason to reimplement this in |
I'm fine with abstacting the logic out to avoid duplication or we can wait (to follow the Rule of three)
Not if we keep it private. |
@mariosasko @ivansmokovic @FilipBolt please take a look at #274 |
ArrowDataset.from_tabular_file is similar to TabularDataset's
__init__
. I think it's safe to remove this function. The same effect can be achieved with:E.g. #267 introduced some changes to TabularDataset's
__init__
and this logic was not added to ArrowDataset. By deleting from_tabular_file, we won't have such problems in the future.cc @ivansmokovic @mttk @FilipBolt
The text was updated successfully, but these errors were encountered: