Allow more types (schemes) of URLs (URIs) in metadata fields #7117
Labels
Feature: Metadata
Type: Suggestion
an idea
User Role: Depositor
Creates datasets, uploads data, etc.
Disclaimer: I'm very aware of #6030, but we can't wait for it. If IQSS is unhappy with this, it can reside in our mini-fork.
tl;dr: Add a new metadata field type `uri_no` representing absolute, non-opaque URIs, which is the correct term for "URL". Using the `url` type is not sufficient, as it is HTTP/S only.
Context:
For Jülich DATA, we want our contributors to provide URLs to data or at least documentation of whereabouts, when they don't or can't upload the data. (Which is our major use case...)
Lots of our data references will reside on Windows network shares (`smb://foo.bar/share/folder`) and other obscure places (think `rsync://...`, `ipfs://...`, `s3://`, `gpfs://...`, `git+xxx://`, `http://`, `ftp://...`, ...). Thus we need broader support for practically any kind of URL, no matter whether a browser understands it.
Our use case is also described at https://jugit.fz-juelich.de/fdm/schemas/-/issues/2 and will be documented in depth in our guide.
Technical Background:
Please keep in mind that a "URL" (uniform resource locator) is only a colloquial term. It's a common practice to use it, but strictly speaking, URLs are a subset of URIs (uniform resource identifiers), defined in RFC 3986.
URIs become URLs by adding an "authority" part, usually meaning a network resource. It is whatever comes after the scheme (e.g. `http:`), like `//dataverse.org`, and before the "path", like `/login.xhtml`.
URIs without an authority are "opaque" (leaving out some other cases); URLs are always non-opaque.
Some good examples of commonly known opaque URIs: `tel:+13930303`, `isbn:1292-92219-1212`, `mailto:[email protected]`.
Such identifiers can be formalised even further as URNs (uniform resource names): `urn:isbn:1292-92219-1212`
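The distinction above can be checked directly with the JDK. This is a minimal sketch, assuming that `java.net.URI`'s own notion of "opaque" is close enough to the one used in this issue. The class name `UriKindsSketch` and the helper `classify` are made up for illustration, and the HTTP example is simply assembled from the fragments mentioned above.

```java
import java.net.URI;

public class UriKindsSketch {

    /** Classifies a URI string using java.net.URI's parser (hypothetical helper). */
    static String classify(String text) {
        URI uri = URI.create(text);
        if (!uri.isAbsolute()) {
            return "relative reference";           // no scheme at all
        }
        if (uri.isOpaque()) {
            return "opaque";                       // scheme, but no "/" after the colon
        }
        return uri.getRawAuthority() != null
                ? "non-opaque with authority"      // what people usually call a URL
                : "non-opaque without authority";
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://dataverse.org/login.xhtml",
            "smb://foo.bar/share/folder",
            "tel:+13930303",
            "isbn:1292-92219-1212",
            "urn:isbn:1292-92219-1212"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + classify(sample));
        }
    }
}
```

With the JDK's parser, the first two samples are reported as having an authority, while the `tel:`, `isbn:` and `urn:` samples are reported as opaque.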
In Java both concepts exist as `java.net.URL` and `java.net.URI`. The key difference is that a URL object in Java always has to be backed by a scheme handler, as the API promises you can open a stream for it.
Problem:
Currently, a `url` typed metadata field in a (custom) metadata block only supports URLs with the `http`, `https`, `file` and `jar` schemes, the ones Java guarantees a protocol handler for. (See the topic of protocol handlers in the Java docs.)
Also, existing usages of the `url` field type might implicitly rely on it being HTTP/S only. There are a lot of places in upstream metadata blocks where placeholders tell people they should provide a "full URL, starting with http://".
Suggestion:
Instead of changing the current `url` type and any logic behind it, I propose to add a new type `uri_no`, meaning "non-opaque URI", which is effectively an alias for URLs. This will exclude URNs and URIs without the authority part, leaving any kind of URL as allowed usage for the field.
This is a surprisingly small change, PR forthcoming. It doesn't have much UI impact; it's more on the metadata side of things.
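To illustrate the idea (this is not the forthcoming PR), here is a minimal sketch of the kind of check a `uri_no` field could perform, assuming validation is done with `java.net.URI` rather than `java.net.URL`. The class `UriNoSketch` and both method names are hypothetical and do not exist in Dataverse.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;

public class UriNoSketch {

    /** Hypothetical check: absolute, non-opaque URI that carries an authority part. */
    static boolean isNonOpaqueUri(String value) {
        if (value == null) {
            return false;
        }
        try {
            URI uri = new URI(value.trim());
            return uri.isAbsolute()                  // has a scheme
                    && !uri.isOpaque()               // excludes tel:, urn:, mailto: and friends
                    && uri.getRawAuthority() != null; // requires the "//host" part
        } catch (URISyntaxException e) {
            return false;                            // not parseable as a URI at all
        }
    }

    /** For contrast: can the JDK turn this into a java.net.URL, i.e. is there a scheme handler? */
    static boolean hasUrlHandler(String value) {
        try {
            URI.create(value).toURL();
            return true;
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false;                            // unknown protocol or not absolute
        }
    }

    public static void main(String[] args) {
        String share = "smb://foo.bar/share/folder";
        System.out.println("uri_no accepts smb share:  " + isNonOpaqueUri(share));
        System.out.println("java.net.URL handles smb:  " + hasUrlHandler(share));
        System.out.println("uri_no accepts urn:isbn:   " + isNonOpaqueUri("urn:isbn:1292-92219-1212"));
    }
}
```

On a stock JDK without extra protocol handlers, the SMB share passes the URI-based check but cannot be converted to a `java.net.URL`, which is the gap this suggestion is about. Note that a bare prefix such as `s3://` with nothing after it is rejected by this sketch, since the JDK parser expects an authority there.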