Related publication identifier URL does not correspond to the identifier #8657

Open
ErykKul opened this issue Apr 28, 2022 · 8 comments
Labels
Component: JSF Involves modifying JSF (Jakarta Server Faces) code, which is being replaced with React. Feature: Metadata Type: Suggestion an idea User Role: Depositor Creates datasets, uploads data, etc.

Comments

@ErykKul
Collaborator

ErykKul commented Apr 28, 2022

What steps does it take to reproduce the issue?
In the "Related publication" field of a dataset's metadata, enter an identifier (type and ID number) and fill in the URL with a link that does not refer to the identifier (some other web link to a page related to the publication).

  • Which page(s) does it occur on?
    When viewing a dataset.

  • What happens?
    The identifier link at the end of the text content of the "Related publication" does not contain the identifier URL, but it contains the provided link instead, like this:
    image

  • To whom does it occur (all users, curators, superusers)?
    All users.

  • What did you expect to happen?
    I would expect the identifier to have the correct URL, like this:
    image

Which version of Dataverse are you using?
5.10.1

Any related open or closed issues to this bug report?

Screenshot of the field content:
image

@jggautier
Contributor

jggautier commented Apr 28, 2022

Hi @ErykKul. I opened one of the issues you marked as related to this issue so I was interested in what you've done in your pull request.

I think it's an improvement but wanted to share what I found when I spun up your branch in an AWS instance, including some bugs that may not be related to your pull request but to how the AWS instance was created, in which case apologies for the noise:

  • The changes work only for DOIs, which I think is okay since DOIs are probably what's most entered in the related publication fields.

  • If other IDs are entered, the URL field isn't used to create a link when that ID is displayed on the dataset page. For example, when I enter a Handle, there's no longer a way to have that Handle point to a URL when the metadata is displayed. I think this should be fixed.

  • When I create a dataset and hover over or click any metadata field tooltip icons, the tooltip text no longer appears. It does appear when editing an already created dataset. Thinking about it more, maybe this bug was already in the develop branch? This bug isn't in two of the Dataverse installations running the latest released version of the Dataverse software (v5.10.1)

    Edit: This bug with the tooltip text wasn't introduced by your pull request. Sorry for the noise. I'll see if developers are already aware of it or open a different GitHub issue about it.

@qqmyers
Member

qqmyers commented Apr 28, 2022

This seems tricky - I could argue as a user that if I put in a URL, I'd be surprised when it doesn't get used/shown at all. (Some DOIs point behind paywalls so I could have a reason to cite the DOI but provide a link to an alternate ~copy/preprint)

FWIW: I could see a couple medium/longer term solutions that go in different directions:

  • Eventually, it seems like we'd want to restructure the related publications field (which is also desired so that one can specify the relationship to the dataset, to help with DataCite reporting), e.g. to have only one field for number/URL, and/or connect it with an external vocab script that could help ensure the DOI and URL match (and do nice things like retrieve the citation for DOIs, etc.). I think an external vocab script could be written now that would auto-suggest/auto-enter a URL for a DOI given the number, or raise a popup if you use a URL that doesn't match, etc. More could be done, but this could be a lightweight way to do something DOI-specific and/or site-specific without forking the codebase.
  • In the near term, adding more description about how to use/not use the fields could be done in the tool tip. In a month or two, the DataCommons project will also be adding the ability to add custom per-field instructions in templates, so one could make a more visible note right above the related publications field to e.g. suggest using the URL form of DOIs and leaving the number field blank, etc.
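
To make the "URL that doesn't match" check concrete, here is a minimal sketch in Java (the language of the Dataverse backend). It is not the external vocab script itself and is not existing Dataverse code: the class name, method name, and the list of accepted resolver prefixes are assumptions made for illustration. It treats DOIs as case-insensitive and accepts only doi.org / dx.doi.org URLs as "matching".

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Dataverse. */
final class DoiUrlMatchSketch {

    // Resolver prefixes treated as "DOI URLs" in this sketch (an assumption).
    private static final String[] RESOLVERS = {
        "https://doi.org/", "http://doi.org/",
        "https://dx.doi.org/", "http://dx.doi.org/"
    };

    /** Returns true if url is a resolver URL for the given DOI (with or without a "doi:" prefix). */
    static boolean urlMatchesDoi(String doi, String url) {
        if (doi == null || url == null) {
            return false;
        }
        String bare = doi.trim();
        if (bare.toLowerCase(Locale.ROOT).startsWith("doi:")) {
            bare = bare.substring(4).trim();
        }
        String u = url.trim();
        for (String base : RESOLVERS) {
            // Case-insensitive prefix test against each known resolver.
            if (u.regionMatches(true, 0, base, 0, base.length())) {
                return !bare.isEmpty() && u.substring(base.length()).equalsIgnoreCase(bare);
            }
        }
        return false; // not a DOI resolver URL at all, e.g. a preprint link
    }
}
```

A form could call such a check when both fields are filled in and show a hint, rather than an error, when it returns false, since a depositor may deliberately link to a preprint.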

@jggautier
Contributor

Very glad you mentioned these points!

> I could argue as a user that if I put in a URL, I'd be surprised when it doesn't get used/shown at all. (Some DOIs point behind paywalls so I could have a reason to cite the DOI but provide a link to an alternate ~copy/preprint)

In #5277 I started to write about the mismatch between the design of these fields and how they're actually used, and the problems this mismatch can cause.

For example the URL field was added only for creating a clickable link when the ID Type and ID Number fields are displayed on dataset pages. I agree that there's value in giving depositors a way to add a kind of alternative URL, as you wrote, but I'd lobby for making the original function of the existing URL field more explicit and exploring options for turning that URL field into an "alternative URL" field.

> In the near term, adding more description about how to use/not use the fields could be done in the tool tip

The changes made for #8127 about improving tooltips include editing the tooltips to clarify the purpose of some of the related publication fields. But like I think you've implied, that might not be enough if many depositors don't look at the tooltips, and something else might be better, like what's been discussed in the DataCommons project.

@ErykKul
Collaborator Author

ErykKul commented Apr 29, 2022

Hello,

Thank you for all suggestions.

As for the tooltips not working any more, I have traced it back to the commit bd7634f as done in the context of #7565
@sekmiller, can you take a look at it? I am new to AJAX and Bootstrap, so it would take me some time to figure it out. Reverting the mentioned commit fixes the issue (tooltips not working when adding a new dataset), so I am quite sure it comes from there.

The URL generation for identifiers in this pull request is done by the existing code for global identifiers. For now, only two types of identifiers are supported: doi and hdl (is that the Handle you mentioned?). If the generation fails, a warning is logged. Examples of supported identifiers, as found in the code:

    /** 
     *   Parse a Persistent Id and set the protocol, authority, and identifier
     * 
     *   Example 1: doi:10.5072/FK2/BYM3IW
     *       protocol: doi
     *       authority: 10.5072
     *       identifier: FK2/BYM3IW
     * 
     *   Example 2: hdl:1902.1/111012
     *       protocol: hdl
     *       authority: 1902.1
     *       identifier: 111012
     *
     * @param identifierString
     * @param separator the string that separates the authority from the identifier.
     * @param destination the global id that will contain the parsed data.
     * @return {@code destination}, after its fields have been updated, or
     *         {@code null} if parsing failed.
     */

The space after ":" is allowed, and you can see what is parsed by looking at the link text. @jggautier, if you give me an example, I could check what went wrong. Also, I can implement URLs for additional identifier types; for that I would need a specification of how to generate the URL from the identifier.
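
As a rough illustration of the parsing described in that Javadoc, the following self-contained Java sketch splits a persistent ID into protocol, authority, and identifier and builds a resolver URL. It is not the Dataverse implementation: the class and method names are hypothetical, and the use of https://doi.org/ and https://hdl.handle.net/ as resolver bases is an assumption of this sketch.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration; not the Dataverse global identifier code. */
final class PidUrlSketch {

    // Resolver base URLs assumed for this sketch.
    private static final String DOI_RESOLVER = "https://doi.org/";
    private static final String HDL_RESOLVER = "https://hdl.handle.net/";

    /**
     * Parses "protocol:authority/identifier" (a space after ':' is tolerated)
     * and returns a resolver URL, or empty if the string cannot be parsed
     * or the protocol is not doi/hdl.
     */
    static Optional<String> toUrl(String pid) {
        if (pid == null) {
            return Optional.empty();
        }
        int colon = pid.indexOf(':');
        if (colon < 1) {
            return Optional.empty();
        }
        String protocol = pid.substring(0, colon).trim().toLowerCase(Locale.ROOT);
        String rest = pid.substring(colon + 1).trim();
        // The first '/' separates the authority from the identifier.
        int slash = rest.indexOf('/');
        if (slash < 1 || slash == rest.length() - 1) {
            return Optional.empty();
        }
        String authority = rest.substring(0, slash);
        String identifier = rest.substring(slash + 1);
        switch (protocol) {
            case "doi":
                return Optional.of(DOI_RESOLVER + authority + "/" + identifier);
            case "hdl":
                return Optional.of(HDL_RESOLVER + authority + "/" + identifier);
            default:
                return Optional.empty(); // unsupported identifier type
        }
    }
}
```

With the two examples from the Javadoc, `toUrl("doi:10.5072/FK2/BYM3IW")` yields a doi.org URL and `toUrl("hdl: 1902.1/111012")` a hdl.handle.net URL; any other protocol returns an empty result, which is where a caller would log the warning.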

As for the URL provided in the field of related publication, I will add code to show it after the identifier, as it is done in the expanded metadata:

image

I will then add that link only if it is different from the identifier URL, to prevent doubling it.

Greetings,
Eryk

@pdurbin
Member

pdurbin commented Apr 29, 2022

@ErykKul huh. Thanks for the heads up about the broken tooltips when creating a dataset. I just demoed it to @sekmiller. Please feel free to open a separate issue for this as it definitely seems like a bug we should try to fix before it makes it into a release. Thankfully, it's only in the "develop" branch right now. (I just confirmed the demo server running 5.10.1 is fine.)

@ErykKul
Collaborator Author

ErykKul commented Apr 29, 2022

Hi,

I have worked out the rules for rendering of the identifier URL and/or the publication URL.

Identifier URL:

  • by default, we render the identifier URL if it is not empty and is not contained in the citation
  • however, if we do not render the publication URL, we render the identifier URL so that at least one clickable link is present

Publication URL:

  • we render the related publication URL if it is not empty, not equal to the identifier URL, and not contained in the citation
  • we also render the publication URL as a fallback when the identifier URL is empty, so that at least one clickable link is present
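
The rules above can be sketched as a small decision function. This is only one possible reading, not the code from the pull request: the class and method names are hypothetical, and "contained in the citation" is assumed to mean a plain substring test on the URL string.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not the code from the pull request. */
final class RelatedPublicationLinksSketch {

    /** Returns the links to render after the citation: identifier URL first, then publication URL. */
    static List<String> linksToRender(String citation, String identifierUrl, String publicationUrl) {
        String cit = citation == null ? "" : citation;
        String idUrl = identifierUrl == null ? "" : identifierUrl.trim();
        String pubUrl = publicationUrl == null ? "" : publicationUrl.trim();

        // Publication URL: non-empty, different from the identifier URL, not already in the citation...
        boolean pubByDefault = !pubUrl.isEmpty() && !pubUrl.equals(idUrl) && !cit.contains(pubUrl);
        // ...or used as a fallback when there is no identifier URL at all.
        boolean renderPub = pubByDefault || (idUrl.isEmpty() && !pubUrl.isEmpty());

        // Identifier URL: non-empty and not already in the citation...
        boolean idByDefault = !idUrl.isEmpty() && !cit.contains(idUrl);
        // ...or used as a fallback when the publication URL is not rendered.
        boolean renderId = idByDefault || (!renderPub && !idUrl.isEmpty());

        List<String> links = new ArrayList<>();
        if (renderId) {
            links.add(idUrl);
        }
        if (renderPub) {
            links.add(pubUrl);
        }
        return links;
    }
}
```

Under this reading, whenever at least one of the two URLs is non-empty, at least one link is returned, which is the guarantee both fallback rules aim for.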

This is quite complex; however, I expect two main, simple scenarios.

  1. The citation already contains the identifier, and the provided publication URL is the same as in the identifier (or is empty). In this case, we have a clickable identifier added after the citation:

image

  2. The citation contains the identifier, and the added publication URL contains the unique URL to the free version of the publication. In this case the publication URL is added after the citation while the identifier is not rendered as it is already present in the citation:

image

Does it seem to be sensible? I would appreciate some feedback.

Greetings,
Eryk

@ErykKul
Collaborator Author

ErykKul commented Apr 29, 2022

The issue for the tooltips:

@jggautier
Contributor

jggautier commented Apr 29, 2022

Hi @ErykKul. I like that in the second scenario, it's clear where the clickable URL is pointing. Definitely an improvement over how things have been displayed so far when the Related Publication's URL field is misused.

In the first scenario I think it would be more desirable to display the DOI URL in the Citation field as a clickable link and exclude the clickable identifier ("doi: 10.3390/v1309184"), although I kind of understand why that might be much more difficult.

Like you wrote, this is complex. I think it's because the Related Publication's Citation field may or may not contain some of the information that's already in the other Related Publication fields. There can be overlap. So trying to display this metadata by combining the values in all of those fields in a way that makes sense to people can be complex, especially when users use some of the fields in ways that the system, and therefore its designers, don't expect.

Regarding supporting the two types of identifiers you mentioned, doi and hdl, yeah by "Handle" I meant "hdl". Sorry I didn't realize that there was support for both types. In an earlier version of your branch I chose "handle" as the ID Type and put "1902.1/00519" in the ID Number field, as in the screenshot below, and after saving the dataset and viewing the metadata on the dataset page, a link wasn't created for that hdl, like it had been when I did the same for a doi.

Screen Shot 2022-04-29 at 12 44 55 PM

More broadly, this is looking to me like an effort to account for the misuse of the Related Publication's URL field. Do you agree?

You offered to implement URLs for additional identifier types, and if you did that, that would eliminate the need for the Related Publication's current URL field. In that case I think the community could, maybe in another effort, consider how to redefine that URL field as a field that depositors could use to add an alternative URL, like the "free version of the publication" that you mentioned, or maybe add a URL in cases where one can't be generated from the given PID.

But I'm worried about continuing to expand the scope of your great and timely efforts here, because as @qqmyers wrote there are some unexplored ideas about how to address more issues with the Related Publication fields, and these fields are being discussed in other efforts, like the mentioned Harvard Data Commons effort and NIH grant funded work that will include better Make Data Count support (which might include a redesign of the Related Publication fields).

I'd like to chat with some colleagues that have more perspective about all of these moving parts, too.

@pdurbin pdurbin added Type: Suggestion an idea Feature: Metadata User Role: Depositor Creates datasets, uploads data, etc. Component: JSF Involves modifying JSF (Jakarta Server Faces) code, which is being replaced with React. labels Oct 13, 2022
@DieuwertjeBloemen DieuwertjeBloemen moved this to Interesting/To keep an eye on in KU Leuven RDR Jul 10, 2024
Projects
Status: Interested
Status: Interesting/To keep an eye on

4 participants