Adjust sword api to handle new subject logic #1430

Closed
scolapasta opened this issue Feb 5, 2015 · 11 comments

@scolapasta
Contributor

@posixeleni suggested we use dcterms:subject for both keyword and subject. The JSON parser will handle this logic, but I think this means that SWORD no longer has to prefill N/A.

@pdurbin
Member

pdurbin commented Feb 9, 2015

@posixeleni suggested we use dcterms:subject for both keyword and subject

I would love to see dcterms:subject be used for "Subject" in Dataverse.

However, because "Subject" in Dataverse is a controlled vocabulary, I'm concerned that enforcing a controlled vocabulary on dcterms:subject has the potential to cause the OJS plugin to stop working. We used to let people put anything in dcterms:subject. (Based on some feedback from @jwhitney we've continued to work on OJS compatibility: #805 (comment) ).

What would we write at http://guides.dataverse.org/en/latest/api/sword.html#backward-incompatible-changes ? "You can still use dcterms:subject but now it maps to both Keyword and Subject and a controlled vocabulary is enforced."

@posixeleni
Contributor

@pdurbin: Given our concerns about SWORD with OJS, I have already spoken with @scolapasta that we should still allow backwards compatibility for the SWORD API. So we should still allow the ability to enter N/A as a subject which is something we have to do for the 3.6 to 4.0 migration anyway.

@pdurbin
Member

pdurbin commented Feb 9, 2015

So we should still allow the ability to enter N/A as a subject

@posixeleni I was actually happier when we autopopulated the subject with "Other" in #921. With the "N/A" thing we end up allowing the creation of what seems like not a fully legitimate dataset.

Let's say I create a dataset with SWORD and later I go in to the GUI to change the title. I change the title and click "Save Changes" at the top of the page and I get a validation error (which I can't even see until I scroll down) that says "Subject is required." All I can think is, "If Subject is required, why was I allowed to create a dataset via SWORD?"

My design principle with SWORD is that SWORD should allow you to create a completely legitimate dataset. You shouldn't see an error if you want to change the title with the GUI.

But let's talk about solutions. What are we proposing here? If dcterms:subject is filled in with values that match the controlled vocabulary we pass those values through to both Subject and Keyword? And if dcterms:subject is filled in with random words that don't fit our controlled vocabulary... we autopopulate Subject with "N/A"? Is that the plan?

@posixeleni
Contributor

From @pdurbin

But let's talk about solutions. What are we proposing here? If dcterms:subject is filled in with values that match the controlled vocabulary we pass those values through to both Subject and Keyword? And if dcterms:subject is filled in with random words that don't fit our controlled vocabulary... we autopopulate Subject with "N/A"? Is that the plan?

The plan from what I know is that if dcterms:subject matches any of the controlled vocab terms we have for subject then we map those terms to subject. ELSE we map "random" non-controlled vocabulary terms to keyword.

As far as what happens if there are no matching controlled vocabulary terms for subject, it depends what workflow we are talking about:

  1. For OJS SWORD (for backwards compatibility): assign "Other" or N/A. (IMHO "Other" is not a helpful term for users, so we should just put N/A; if they ever go to edit in the UI, they would have to pick a relevant controlled vocabulary subject term.)
  2. If someone deposits a Dataset with the Native API, then, consistent with the UI, we should return an error if they do not supply a Subject term that matches our controlled vocabulary (this can be clearly documented in our documentation and in the error message).
  3. For harvesting: we would need to mimic what happens in the OJS SWORD workflow (see point 1), since we shouldn't expect that people will have these subject terms, but if they happen to, we populate them.
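
The workflows listed above could be sketched roughly as follows. This is only an illustration in Python, not Dataverse's actual code: `CONTROLLED_SUBJECTS` is a small assumed subset of the vocabulary, and `Workflow` and `map_subject_terms` are hypothetical names. It also simplifies the Native API case by treating its input as a flat list of terms.

```python
from enum import Enum

# Assumption: a small illustrative subset of the controlled vocabulary.
CONTROLLED_SUBJECTS = {"Chemistry", "Engineering", "Physics"}


class Workflow(Enum):
    """Hypothetical labels for the deposit paths discussed in this thread."""
    SWORD = "sword"
    NATIVE_API = "native"
    HARVEST = "harvest"


def map_subject_terms(terms, workflow):
    """Split incoming terms into (subjects, keywords) for one deposit.

    Terms that match the controlled vocabulary become subjects;
    everything else becomes a keyword.
    """
    subjects = [t for t in terms if t in CONTROLLED_SUBJECTS]
    keywords = [t for t in terms if t not in CONTROLLED_SUBJECTS]

    if not subjects:
        if workflow is Workflow.NATIVE_API:
            # Consistent with the UI: reject the deposit outright.
            raise ValueError(
                "Subject is required and must match the controlled vocabulary."
            )
        # SWORD (OJS compatibility) and harvesting fall back to N/A.
        subjects = ["N/A"]

    return subjects, keywords
```

Under these assumptions, `map_subject_terms(["aircraft"], Workflow.SWORD)` yields an N/A subject plus one keyword, while the same input with `Workflow.NATIVE_API` raises an error.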

Does this sound right @scolapasta ?

@scolapasta
Contributor Author

Most of this sounds right. However, while I agree that backwards compatibility is generally important, I think with 4.0, we should be able to say, you need to make some modifications. Let's discuss this in person at the beginning of the beta 14 time frame.

@scolapasta scolapasta assigned posixeleni and unassigned pdurbin Feb 11, 2015
@scolapasta scolapasta modified the milestones: Beta 14 - Dataverse 4.0, In Review - Dataverse 4.0 Feb 11, 2015
@scolapasta scolapasta modified the milestones: Beta 14 - Dataverse 4.0, In Review - Dataverse 4.0 Feb 20, 2015
@pdurbin
Member

pdurbin commented Feb 21, 2015

"Decision for 4.0 - we will remove subject as required and also remove from the UI and API." -- @scolapasta at #1452 (comment)

@posixeleni
Contributor

@pdurbin I believe @scolapasta's comment in #1452 is about Subject at the Dataverse level, not the Dataset level. Subject will still be required at the Dataset level, as of the discussions during our meetings this past week.

@posixeleni posixeleni assigned scolapasta and unassigned posixeleni Feb 25, 2015
@posixeleni
Contributor

@scolapasta I spoke with OJS about their plugin and we will still need in 4.0 to support allowing N/A for Subject in the SWORD API. However, this is something that OJS will plan to work on in the future to have a dropdown list of our Subject Terms in their plugin.

Re-assigning to Gustavo to assign to a developer to work on this ticket.

@scolapasta
Contributor Author

I don't think there's any dev work needed, as the Json Parser logic will handle populating subject and keyword, and then sword will add N/A if no subject. (That is only until we get OJS to add subject; then we will turn this off, so anyone using our API needs to add subject correctly, or it will break when we deprecate this.)

Passing to @pdurbin to just do a sanity check that I am right about it being all set.

@scolapasta scolapasta assigned pdurbin and unassigned scolapasta Feb 26, 2015
@pdurbin
Member

pdurbin commented Feb 26, 2015

However, this is something that OJS will plan to work on in the future to have a dropdown list of our Subject Terms in their plugin.

That's why I stubbed out something in #1510 based on chat with @posixeleni

pdurbin added a commit that referenced this issue Mar 2, 2015
Discovered editing metadata via SWORD isn't working. Opened #1554

The new behavior for subject/keyword is not documented. Opened #1553
@pdurbin
Member

pdurbin commented Mar 2, 2015

the Json Parser logic will handle populating subject and keyword, and then sword will add N/A if no subject

As @scolapasta and I discussed, SWORD was always adding a subject field even if one already existed, resulting in a corrupt dataset. (Should the CreateDatasetCommand refuse to create a corrupt dataset? Probably.) As of 1bc8772, SWORD first checks whether there's an existing subject field before adding one.
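
A minimal sketch of that kind of check, written in Python rather than the project's Java and using hypothetical names (`ensure_subject`, a plain dict keyed by field name). It is not the code from 1bc8772, only an illustration of adding the placeholder when, and only when, no subject is present:

```python
def ensure_subject(fields, placeholder="N/A"):
    """Return a copy of `fields` that is guaranteed to have a subject.

    `fields` maps a field name to a list of values (an assumed shape).
    The placeholder is added only if there is no subject value already,
    so calling this twice never produces a duplicate subject field.
    """
    result = {name: list(values) for name, values in fields.items()}
    if not result.get("subject"):
        result["subject"] = [placeholder]
    return result
```

Because the function checks before adding, a dataset that arrives with `Chemistry` as its subject keeps exactly that one value.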

I'm not sure what the behavior is if you try to edit the subject and/or keyword via SWORD and when I tried to test this I found that the "replacing metadata for a dataset" function of SWORD doesn't work at all, unfortunately, at least on my machine. I created #1554 and marked it critical, noting that I have no idea when this was last tested or how long it's been broken. This seems rather important.

I'll pass this ticket to QA, but unfortunately how subject should be handled has not been documented, though there is now a ticket for this: #1553. I played around with combinations of strings that match and don't match our controlled vocabulary for "subject" and watched subject and keyword get populated:

   <dcterms:subject>Chemistry</dcterms:subject>
   <dcterms:subject>Engineering</dcterms:subject>
   <dcterms:subject>aircraft</dcterms:subject>
   <dcterms:subject>planes</dcterms:subject>
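
For anyone who wants to reproduce the split outside Dataverse, here is a rough Python sketch that reads those four elements and sorts them the way the plan earlier in this thread describes. The wrapping Atom `entry`, the two-term `CONTROLLED_SUBJECTS` set, and the `split_subjects` name are assumptions made for illustration; the real parsing lives in Dataverse's own code.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"
ATOM = "http://www.w3.org/2005/Atom"

# Assumption: illustrative subset of the controlled vocabulary.
CONTROLLED_SUBJECTS = {"Chemistry", "Engineering"}

# Assumption: a bare Atom entry wrapped around the four elements above.
ENTRY = f"""<entry xmlns="{ATOM}" xmlns:dcterms="{DCTERMS}">
   <dcterms:subject>Chemistry</dcterms:subject>
   <dcterms:subject>Engineering</dcterms:subject>
   <dcterms:subject>aircraft</dcterms:subject>
   <dcterms:subject>planes</dcterms:subject>
</entry>"""


def split_subjects(entry_xml):
    """Return (subjects, keywords) from the dcterms:subject elements."""
    root = ET.fromstring(entry_xml)
    # ElementTree addresses namespaced tags as "{namespace-uri}localname".
    values = [
        (el.text or "").strip()
        for el in root.findall(f"{{{DCTERMS}}}subject")
    ]
    subjects = [v for v in values if v in CONTROLLED_SUBJECTS]
    keywords = [v for v in values if v and v not in CONTROLLED_SUBJECTS]
    return subjects, keywords


print(split_subjects(ENTRY))
# prints (['Chemistry', 'Engineering'], ['aircraft', 'planes'])
```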

Finally, I noticed in #1246 that we don't want to display N/A values in the UI, but as of this writing it's easy to create a dataset like this via SWORD by simply not including any dcterms:subject elements in the XML:

[screenshot: na]
