Explaining the data #4

Open
TeunvanSprundel opened this issue Feb 4, 2016 · 14 comments

Comments

@TeunvanSprundel
Contributor

One of the issues we're looking into at the moment is how to make sure all the different users understand the data. The authority responsible for a dataset uses formal definitions of objects, and some of these definitions can be unclear to users. For example, an end user and an authority can talk about the same thing but use different definitions. How can we make it clear to everyone what the data means?

@TeunvanSprundel
Contributor Author

Some comments from the municipality of Bergen op Zoom: explaining geo-data to end users can be a hard task. The municipality arranges meetings where the GIS experts explain to the users what data they are dealing with.
Obviously this is no solution for our platform, but it shows the importance of this issue.

@TeunvanSprundel
Contributor Author

Another authority, the "Omgevingsdienst Noordzeekanaalgebied", does not have this problem. They make sure that the only people working with their data are those who already know what the data definitions mean.

@TeunvanSprundel
Contributor Author

Some ideas we have are the following:

  • Work with aliases. The formal definition stays the one and only formal definition, but users can propose an alias.
  • Explain the definition using metadata. We can show this data with an (i) button.
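
To make the two ideas concrete, here is a minimal sketch of how a term could keep its single formal definition while collecting user-proposed aliases. The `Term` class, its fields, and the example values are hypothetical and only illustrate the idea; the definition text stands in for the metadata shown behind the (i) button.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A formally defined term plus user-proposed aliases (hypothetical model)."""
    formal_name: str
    definition: str  # metadata that an (i) button could display
    aliases: set = field(default_factory=set)

    def propose_alias(self, alias: str) -> None:
        # Aliases are additive; the formal name and definition never change.
        self.aliases.add(alias.strip().lower())

    def matches(self, query: str) -> bool:
        # A query matches on the formal name or on any proposed alias.
        q = query.strip().lower()
        return q == self.formal_name.lower() or q in self.aliases


# Example values are assumptions, not real platform data.
term = Term("Bevolkingskernen", "Placeholder definition supplied by the authority.")
term.propose_alias("urban areas")
term.propose_alias("Cities")
```

With this shape, a search for "cities" would find the term even though the authority only ever published the formal name.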

@TeunvanSprundel
Contributor Author

Other ideas are welcome, of course.

@lvdbrink

lvdbrink commented Feb 4, 2016

What about using linked data principles to link definitions to data objects and properties? @liekeverhelst can you comment?

For example, it would be useful to link to definitions like the ones published here: http://definities.geostandaarden.nl/concepten/imgeo/id/begrip/abri

Or here: http://imgeo.geostandaarden.nl/def/imgeo-object/paal/verkeersregelinstallatiepaal-niet-bgt
http://imgeo.geostandaarden.nl/def/imgeo-object/paal/lichtmast-niet-bgt

@liekeverhelst

Yes @lvdbrink @TeunvanSprundel, making Linked Data links to definitions is a very good option. Using dcterms:subject is probably the easiest way.
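
As a rough illustration of such a link, the sketch below writes a single N-Triples statement that connects a data object to a definition through dcterms:subject. The object URI under example.org and the helper name `subject_triple` are assumptions; the concept URI is the "abri" definition mentioned earlier in this thread.

```python
# Full URI of the Dublin Core "subject" property (dcterms:subject).
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"


def subject_triple(resource_uri: str, concept_uri: str) -> str:
    """Return one N-Triples line stating that the resource has the concept as subject."""
    return f"<{resource_uri}> <{DCTERMS_SUBJECT}> <{concept_uri}> ."


# The object URI is hypothetical; the concept URI comes from the thread.
triple = subject_triple(
    "http://example.org/objects/123",
    "http://definities.geostandaarden.nl/concepten/imgeo/id/begrip/abri",
)
print(triple)
```

A client that encounters the object can then follow the concept URI to fetch the definition, without the dataset having to copy the definition text.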

@ndkv

ndkv commented Feb 4, 2016

I think aliases are a great idea, especially when users are able to easily add their own.

Linking to definitions is a good idea, too. I do wonder whether those might be too specialized or complex for the audience @TeunvanSprundel is addressing.

@TeunvanSprundel do you have an overview of the problems that users run into? What is it they don't understand? Is it merely an ontology (terms) mismatch or are there more fundamental misunderstandings? In other words, who are the users you speak of and what are their needs?

I recently helped hackers at an open data challenge make sense of the geo data in atlasnatuurlijkkapitaal.nl. The datasets in ANK are, compared to other geo outlets, described in great detail. Each dataset comes with a bijsluiter (i.e. a leaflet) that explains what the dataset is about, how it was created, who owns it, where to look for more information, etc. See e.g. http://www.atlasnatuurlijkkapitaal.nl/web/ank/kaarten?x=160000&y=450000&zoom=2&l=4018&al=4018&bl=2&mapLayout=1&location=false&library=false

While the app builders didn't need (or want) more information, the participating data journalists and designers indicated that the provided information isn't enough for them to be able to create compelling stories and visualizations.

This extremely anecdotal story is meant to illustrate that different users require different pieces of information to complete their quest.

@ndkv

ndkv commented Feb 4, 2016

BTW, the mismatch in ontologies also makes data difficult to find.

Yesterday I needed the areas/outlines of Dutch cities, i.e. the urban areas in an average land use dataset. Searching for 'urban areas', 'cities', 'city area' and a bunch of others in NGR didn't result in anything useful. After a while, purely by luck, I discovered the CBS Bevolkingskernen (rough translation: inhabited kernels) dataset.

@lvdbrink

lvdbrink commented Feb 4, 2016

Wow, I wouldn't have been able to find that one either (Bevolkingskernen...)

@pvgenuchten

@lvdbrink linking SKOS concepts as keywords to a dataset catalogue record (aka bijsluiter or metadata) is common practice in both DCAT and ISO 19139. It's up to the catalogue search algorithm to make use of SKOS relations such as related, broader, and narrower to improve the search results. Also consider multilingual keywords such as elevation vs hoogte, which is a similar challenge.
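
To show what "making use of SKOS relations" could look like in a search, here is a minimal sketch of query expansion over a tiny hand-written thesaurus. The concept ids, the labels, and the `expand_query` function are all assumptions for illustration; a real catalogue would load published SKOS concepts rather than a Python dict, and the mapping of "bevolkingskern" to "urban area" is only a guess at a plausible label.

```python
# Tiny stand-in for a SKOS thesaurus: labels per language plus broader/related links.
CONCEPTS = {
    "urban-area": {
        "labels": {"en": ["urban area", "city"], "nl": ["bevolkingskern"]},
        "broader": ["land-use"],
        "related": [],
    },
    "land-use": {
        "labels": {"en": ["land use"], "nl": ["landgebruik"]},
        "broader": [],
        "related": [],
    },
    "elevation": {
        "labels": {"en": ["elevation"], "nl": ["hoogte"]},
        "broader": [],
        "related": [],
    },
}


def expand_query(query):
    """Return every label, in any language, of the matched, narrower, and related concepts."""
    q = query.strip().lower()
    # Concepts that carry the query as one of their labels.
    hits = {
        cid for cid, c in CONCEPTS.items()
        if any(q == label for labels in c["labels"].values() for label in labels)
    }
    expanded = set(hits)
    # Narrower concepts are those that name a hit as their broader concept.
    for cid, c in CONCEPTS.items():
        if hits & set(c["broader"]):
            expanded.add(cid)
    for cid in hits:
        expanded.update(CONCEPTS[cid]["related"])
    return sorted(
        label
        for cid in expanded
        for labels in CONCEPTS[cid]["labels"].values()
        for label in labels
    )


print(expand_query("city"))    # includes the Dutch label next to the English ones
print(expand_query("hoogte"))  # a Dutch query also yields the English label
```

The catalogue would then search for all returned labels instead of only the literal query string, which is how a search for "city" could surface a dataset that is only tagged with a Dutch keyword.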

@lvdbrink

lvdbrink commented Feb 4, 2016

Yes, using skos for aliases is a good idea!

@TeunvanSprundel Open Annotations could also be a solution for capturing user comments, aliases, etc.
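
A user-proposed alias captured as an annotation might look roughly like the sketch below. It follows the shape of the W3C Web Annotation Data Model, the successor to Open Annotation, but it has not been checked against a validator. The function name `alias_annotation`, the alias text, and the user URI are assumptions; the target is one of the definition URIs posted earlier in this thread.

```python
import json


def alias_annotation(target_uri, alias, creator_uri):
    """Build an annotation (as a plain dict) that attaches a user alias to a definition."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "creator": creator_uri,
        # The alias itself is carried as a short textual body.
        "body": {"type": "TextualBody", "value": alias, "purpose": "tagging"},
        # The formal definition stays untouched; the annotation only points at it.
        "target": target_uri,
    }


# Alias and user URI are hypothetical example values.
anno = alias_annotation(
    "http://imgeo.geostandaarden.nl/def/imgeo-object/paal/lichtmast-niet-bgt",
    "street light",
    "http://example.org/users/alice",
)
print(json.dumps(anno, indent=2))
```

Because the annotation lives outside the definition, users can add aliases or comments without the authority having to change its formal vocabulary.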

@matthew-spotzi
Contributor

@pvgenuchten SKOS sounds great, we are going to have a good look at it! Do you maybe have an example of SKOS being used in combination with geo-data?

@matthew-spotzi
Contributor

@ndkv As you already mentioned, different users require different pieces of information to complete their quest. Since we don't know our users (they range from GIS experts to citizens), we want to have as many possibilities available as we can.

@ndkv

ndkv commented Feb 5, 2016

Hmm... I'd argue you need to find out who the users are. Including as many possibilities as possible is a recipe for creating an overly complex "thing" that doesn't fit any of the users' needs in a useful way. You'd be pushing a solution instead of letting users pull one that works for them. We have enough 'pushed' stuff in the geo/government world already. It might be fruitful (and exciting!) to turn the process around this time.

Dealing with users is difficult and takes a lot of time. For the purpose of the testbed you can also limit yourself to one group of users (municipal workers, citizens, web developers, data scientists, etc.) and try to satisfy their needs in a robust manner. If the discovery process proves fruitful, you can document it and eventually apply it to the other groups.
