howto define subset of source dataset #60

Open
coret opened this issue Jan 17, 2022 · 1 comment

Comments

coret commented Jan 17, 2022

When, for example, an aggregator (like NOB) publishes a dataset that originates from another organisation and is a subset of that organisation's data, such as only the World War II (WO2) related images from its entire image collection, how should this "subset" be described in the dataset description?

The property schema:isBasedOn links to the source dataset (provenance), but that link will probably point to the whole collection, because the organisation has described only its entire image collection as a dataset, not the subset.

The fact that the aggregator used only a selection (and may have transformed or enriched this subset) is of interest to the user of the dataset. Should it just go in schema:description?

bencomp commented Sep 21, 2022

The first thing that comes to mind is <LargeDataset> void:subset <SubsetOfLargeDataset> ., though that links two void:Datasets.
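
To make the options in this thread concrete, here is a minimal sketch that writes them out side by side as N-Triples: schema:isBasedOn for provenance, a schema:description note about the selection, and the void:subset link. It is an illustration, not an agreed answer to the issue. The example.org IRIs, the description text, the https://schema.org/ namespace form, and the helper names (iri, literal, triple, describe_subset) are all assumptions made up for this sketch.

```python
# Namespace IRIs for the schema.org and VoID vocabularies.
# Assumption: the https:// form of the schema.org namespace is used.
SCHEMA = "https://schema.org/"
VOID = "http://rdfs.org/ns/void#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical dataset IRIs, invented for this illustration only.
SOURCE = "https://example.org/datasets/image-collection"
SUBSET = "https://example.org/datasets/image-collection-wo2"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang="en"):
    """Format a language-tagged literal, escaping backslashes, quotes and newlines."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"@{lang}'


def triple(subject, predicate, obj):
    """Serialize one statement as a single N-Triples line."""
    return f"{subject} {predicate} {obj} ."


def describe_subset(source, subset, note):
    """Return the statements discussed in the thread for a derived subset."""
    return [
        # The published subset is itself a schema:Dataset.
        triple(iri(subset), iri(RDF_TYPE), iri(SCHEMA + "Dataset")),
        # Provenance: the subset is based on the (whole) source dataset.
        triple(iri(subset), iri(SCHEMA + "isBasedOn"), iri(source)),
        # Free-text note explaining that only a selection was used.
        triple(iri(subset), iri(SCHEMA + "description"), literal(note)),
        # VoID link from the large dataset to its subset; void:subset is
        # defined between void:Datasets, as noted in the comment above.
        triple(iri(source), iri(VOID + "subset"), iri(subset)),
    ]


lines = describe_subset(
    SOURCE,
    SUBSET,
    "Selection of WO2-related images from the source image collection.",
)
print("\n".join(lines))
```

The sketch emits both the schema.org statements and the void:subset statement so they can be compared; whether to publish one, the other, or both is exactly the open question here.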
