Is your feature request related to a problem? Please describe.
As the number of datasets increases, with many containing the same records, it would be convenient to have a way to transparently deduplicate records across multiple datasets.
Describe the solution you'd like
Essentially, something that will aggregate datasets on the client side, along the lines of
aggregate_ds = ds1 + ds2 + ds3
or
aggregate_ds = client.get_collection(["ds1","ds2","ds3"], "torsiondrivedataset")
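The `+` form could be supported with a small wrapper whose `__add__` merges records and collapses shared record IDs. The sketch below is purely illustrative: the `Dataset` class and its `records` attribute are hypothetical stand-ins, not existing QCFractal/qcportal API.

```python
# Hypothetical sketch of the proposed "+" behavior: a thin dataset
# wrapper whose __add__ merges records, deduplicating by record ID.
# "Dataset" and "records" are assumed names, not real qcportal API.

class Dataset:
    def __init__(self, records):
        # records: dict mapping record ID -> record object
        self.records = dict(records)

    def __add__(self, other):
        merged = dict(self.records)
        # Records sharing an ID are assumed identical, so shared
        # IDs collapse to a single entry in the aggregate.
        merged.update(other.records)
        return Dataset(merged)

ds1 = Dataset({"td-1": "rec", "td-2": "rec"})
ds2 = Dataset({"td-2": "rec", "td-3": "rec"})
aggregate_ds = ds1 + ds2
print(sorted(aggregate_ds.records))  # ['td-1', 'td-2', 'td-3']
```

Because `__add__` returns a new `Dataset`, chains like `ds1 + ds2 + ds3` fall out for free.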
Describe alternatives you've considered
Right now, a pairwise comparison of record IDs for equality can solve this, but it would be great to make this more transparent.
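The current workaround can be sketched as a plain merge keyed on record ID, using dicts in place of real dataset objects (the function name and record shapes here are hypothetical):

```python
# Sketch of the pairwise record-ID deduplication described above,
# using plain dicts keyed by record ID in place of dataset objects.

def aggregate_records(*datasets):
    """Merge record dicts keyed by record ID, keeping the first copy seen."""
    merged = {}
    for ds in datasets:
        for record_id, record in ds.items():
            # Records sharing an ID are assumed identical,
            # so later duplicates are simply skipped.
            merged.setdefault(record_id, record)
    return merged

ds1 = {"rec-1": {"energy": -1.0}, "rec-2": {"energy": -2.0}}
ds2 = {"rec-2": {"energy": -2.0}, "rec-3": {"energy": -3.0}}
aggregate = aggregate_records(ds1, ds2)
print(sorted(aggregate))  # ['rec-1', 'rec-2', 'rec-3']
```

A built-in version of this on the client would hide the ID comparison entirely, which is the transparency asked for above.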