Is your feature request related to a problem? Please describe.
As the number of datasets increases, with many containing the same records, it would be convenient to have a way to transparently deduplicate records across multiple datasets.
Describe the solution you'd like
Essentially, something that will aggregate datasets on the client side, along the lines of
aggregate_ds = ds1 + ds2 + ds3
or
aggregate_ds = client.get_collection(["ds1","ds2","ds3"], "torsiondrivedataset")
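The `+` form could be supported with a small wrapper whose `__add__` merges records and collapses shared record IDs. The sketch below is purely illustrative: the `Dataset` class and its `records` attribute are hypothetical stand-ins, not existing QCFractal/qcportal API.

```python
# Hypothetical sketch of the proposed "+" behavior: a thin dataset
# wrapper whose __add__ merges records, deduplicating by record ID.
# "Dataset" and "records" are assumed names, not real qcportal API.

class Dataset:
    def __init__(self, records):
        # records: dict mapping record ID -> record object
        self.records = dict(records)

    def __add__(self, other):
        merged = dict(self.records)
        # Records sharing an ID are assumed identical, so shared
        # IDs collapse to a single entry in the aggregate.
        merged.update(other.records)
        return Dataset(merged)

ds1 = Dataset({"td-1": "rec", "td-2": "rec"})
ds2 = Dataset({"td-2": "rec", "td-3": "rec"})
aggregate_ds = ds1 + ds2
print(sorted(aggregate_ds.records))  # ['td-1', 'td-2', 'td-3']
```

Because `__add__` returns a new `Dataset`, chains like `ds1 + ds2 + ds3` fall out for free.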
Describe alternatives you've considered
Right now, a pairwise comparison of record IDs for equality can solve this, but it would be great to make this more transparent.
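The current workaround can be sketched as a plain merge keyed on record ID, using dicts in place of real dataset objects (the function name and record shapes here are hypothetical):

```python
# Sketch of the pairwise record-ID deduplication described above,
# using plain dicts keyed by record ID in place of dataset objects.

def aggregate_records(*datasets):
    """Merge record dicts keyed by record ID, keeping the first copy seen."""
    merged = {}
    for ds in datasets:
        for record_id, record in ds.items():
            # Records sharing an ID are assumed identical,
            # so later duplicates are simply skipped.
            merged.setdefault(record_id, record)
    return merged

ds1 = {"rec-1": {"energy": -1.0}, "rec-2": {"energy": -2.0}}
ds2 = {"rec-2": {"energy": -2.0}, "rec-3": {"energy": -3.0}}
aggregate = aggregate_records(ds1, ds2)
print(sorted(aggregate))  # ['rec-1', 'rec-2', 'rec-3']
```

A built-in version of this on the client would hide the ID comparison entirely, which is the transparency asked for above.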