Some minor RIF-CS feed modifications #400

Open
1 of 4 tasks
jangari opened this issue Jan 29, 2015 · 9 comments

@jangari

jangari commented Jan 29, 2015

We need to modify the code that generates the rif-cs feed to include more fields, and to account for some changes to our collections in RDA.

There are four main changes:

  1. Include Data access details
  2. Add elements for related registry objects
  3. Modify ethnologue links to reflect new URL
  4. Change PARADISEC address in several RIF-CS elements

I'll go through these in detail below

  • Include Data access details

The OAI-PMH feed currently maps the Data access rights field (of collections) to the RIF-CS element /OAI-PMH/ListRecords/record/metadata/registryObjects/registryObject/collection/rights/accessRights. We need to include the next field in the database, Data access details (which is the free text form of the access restrictions, like 'access by permission of depositor only', or whatever) in the element /rights/rightsStatement. As such, the rights element should look like this:

    <rights>
      <accessRights>[content of Data access conditions]</accessRights>
      <rightsStatement>[content of Data access details]</rightsStatement>
    </rights>

Importantly, this should only happen if the Data access details field has content. If not, then do not supply the rightsStatement element.
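
A minimal sketch of that conditional, assuming plain string building rather than whatever XML helper collection.rb really uses. The method name `rights_xml` and its two parameters are hypothetical, and the sample values are illustrative only:

```ruby
# Hypothetical helper; not the actual code in collection.rb.
# access_conditions: value mapped to accessRights.
# access_details:    free-text value for rightsStatement (may be nil or blank).
def rights_xml(access_conditions, access_details)
  lines = ['<rights>']
  lines << "  <accessRights>#{access_conditions.to_s.encode(xml: :text)}</accessRights>"

  # Emit rightsStatement only when the free-text field has real content.
  details = access_details.to_s.strip
  unless details.empty?
    lines << "  <rightsStatement>#{details.encode(xml: :text)}</rightsStatement>"
  end

  lines << '</rights>'
  lines.join("\n")
end

puts rights_xml('Open', 'access by permission of depositor only')
puts rights_xml('Closed', nil)
```

The second call prints a rights element with no rightsStatement child, which is the behaviour asked for when the field is empty.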

  • Add elements for related registry objects

I've created records in RDA for the NABU catalog, and one for the whole Paradisec collection which will have all collections as parts. As such, this chunk needs to be added to the OAI feed for every collection:

    <relatedObject>
      <key>catalog.paradisec.org.au/collection</key>
      <relation type="isPartOf"/>
    </relatedObject>
    <relatedObject>
      <key>catalog.paradisec.org.au</key>
      <relation type="isAvailableThrough"/>
    </relatedObject>
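
Since the chunk is identical for every collection, it could be emitted from a fixed list. This is only a sketch: the keys and relation types are the ones given above, while the constant and method names are hypothetical and not taken from collection.rb.

```ruby
# Keys and relation types are copied from this issue; the constant and
# method names are hypothetical.
RELATED_OBJECTS = [
  { key: 'catalog.paradisec.org.au/collection', relation: 'isPartOf' },
  { key: 'catalog.paradisec.org.au', relation: 'isAvailableThrough' }
].freeze

# Returns the same relatedObject elements for every collection.
def related_objects_xml
  RELATED_OBJECTS.map do |obj|
    <<~XML
      <relatedObject>
        <key>#{obj[:key]}</key>
        <relation type="#{obj[:relation]}"/>
      </relatedObject>
    XML
  end.join
end

puts related_objects_xml
```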

  • Modify ethnologue links to reflect new URL

Ethnologue have changed their permalink structure. While the old links currently redirect to the new ones, we shouldn't rely on that continuing, so we should update our links to the new structure.
Ethnologue links are generated from the ethnologue codes in the catalogue, as well as the standard language name (for the link title). The current structure is:

    <relatedInfo type="website">
      <title>Ethnologue entry for [language name]</title>
      <identifier type="uri">http://www.ethnologue.com/show_language.asp?code=[ethnologue code]</identifier>
    </relatedInfo>

This should change to:

    <relatedInfo type="website">
      <title>Ethnologue entry for [language name]</title>
      <identifier type="uri">http://www.ethnologue.com/language/[ethnologue code]</identifier>
    </relatedInfo>
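
Something like the following would produce the new form. It is a sketch only: the method and constant names are hypothetical, and the English/eng pair is just sample input.

```ruby
# New permalink prefix, as given in this issue.
ETHNOLOGUE_BASE = 'http://www.ethnologue.com/language/'

# Hypothetical helper: builds the relatedInfo element for one language.
def ethnologue_related_info(language_name, code)
  <<~XML
    <relatedInfo type="website">
      <title>Ethnologue entry for #{language_name.encode(xml: :text)}</title>
      <identifier type="uri">#{ETHNOLOGUE_BASE}#{code}</identifier>
    </relatedInfo>
  XML
end

puts ethnologue_related_info('English', 'eng')
```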

  • Change PARADISEC address in several RIF-CS elements

As we've moved offices from Sydney University to the Conservatorium, the address in all records is now outdated. The chunk below (which is in /OAI-PMH/ListRecords/record/metadata/registryObjects/registryObject/collection/location/address) needs to be updated to our current address. So change:

    <physical type="postalAddress">
      <addressPart type="text">PARADISEC Sydney, Department of Linguistics, second floor Transient Building F12, Fisher Road, The University of Sydney, Camperdown Campus, NSW 2006, AUSTRALIA, Phone: +61 2 9351 2002</addressPart>
    </physical>

to:

    <physical type="postalAddress">
      <addressPart type="addressLine">PARADISEC</addressPart>
      <addressPart type="addressLine">Sydney Conservatorium of Music, C41</addressPart>
      <addressPart type="addressLine">The University of Sydney</addressPart>
      <addressPart type="addressLine">NSW 2006</addressPart>
      <addressPart type="telephoneNumber">+61 2 9351 1279</addressPart>
    </physical>
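
One way to keep the address in a single place, sketched with plain string building. The address lines and phone number are the ones above; the constant and method names are hypothetical and not from collection.rb.

```ruby
# Address data copied from this issue; names are hypothetical.
PARADISEC_ADDRESS_LINES = [
  'PARADISEC',
  'Sydney Conservatorium of Music, C41',
  'The University of Sydney',
  'NSW 2006'
].freeze
PARADISEC_PHONE = '+61 2 9351 1279'

# Builds a physical postalAddress element with one addressPart per line,
# plus an optional telephoneNumber part.
def postal_address_xml(lines, phone = nil)
  parts = lines.map do |line|
    %(  <addressPart type="addressLine">#{line.encode(xml: :text)}</addressPart>)
  end
  if phone
    parts << %(  <addressPart type="telephoneNumber">#{phone.encode(xml: :text)}</addressPart>)
  end
  ['<physical type="postalAddress">', *parts, '</physical>'].join("\n")
end

puts postal_address_xml(PARADISEC_ADDRESS_LINES, PARADISEC_PHONE)
```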

Similarly for the postal addresses of depositors (/OAI-PMH/ListRecords/record/metadata/registryObjects/registryObject/party/location/address/physical), which currently look like:

    <physical type="postalAddress">
      <addressPart type="text">[depositor name] c/o PARADISEC, Department of Linguistics, The University of Sydney</addressPart>
    </physical>

These should be changed to look like:

    <physical type="postalAddress">
      <addressPart type="addressLine">[depositor name]</addressPart>
      <addressPart type="addressLine">c/o PARADISEC</addressPart>
      <addressPart type="addressLine">Sydney Conservatorium of Music, C41</addressPart>
      <addressPart type="addressLine">The University of Sydney</addressPart>
      <addressPart type="addressLine">NSW 2006</addressPart>
    </physical>
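
For the depositor case, the name is simply prepended to a fixed set of c/o lines. A sketch under the same assumptions as before: names are hypothetical, and 'Jane Citizen' is a made-up depositor.

```ruby
# Fixed lines copied from this issue; names are hypothetical.
CARE_OF_PARADISEC_LINES = [
  'c/o PARADISEC',
  'Sydney Conservatorium of Music, C41',
  'The University of Sydney',
  'NSW 2006'
].freeze

# Builds the depositor's postal address: their name, then the c/o lines.
def depositor_address_xml(depositor_name)
  lines = [depositor_name.to_s.strip] + CARE_OF_PARADISEC_LINES
  parts = lines.map do |line|
    %(  <addressPart type="addressLine">#{line.encode(xml: :text)}</addressPart>)
  end
  ['<physical type="postalAddress">', *parts, '</physical>'].join("\n")
end

puts depositor_address_xml('Jane Citizen')
```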

All of the above is generated by collection.rb, and once you find the rif-cs section of that file (line 243 onwards) it should all be fairly straightforward.

If I come across any more changes, I'll add them to this issue.

@silviapfeiffer
Contributor

Is this something for the new company or is it urgent and needs my attention?

@jangari
Author

jangari commented Feb 1, 2015

Nope, not urgent. Terem can take care of it.

@nthieberger
Collaborator

Jono, I noticed that RDA is still picking up only 245 of our 338 open collections. Can you please confirm that the above work in this ticket has been done and that all our collections are being served via RIF-CS?

@enwardy

enwardy commented Feb 15, 2018

Can you please confirm that the above work in this ticket has been done and that all our collections are being served via RIF-CS?

@nthieberger
Collaborator

RDA is picking up all of our collections now

@tom-h
Contributor

tom-h commented Sep 7, 2018

The RIF-CS feed is currently broken as far as I can tell. The feed at:

http://catalog.paradisec.org.au/oai/collection?verb=ListRecords&metadataPrefix=rif

...currently returns a 500 Internal Server Error. The harvest of records into Research Data Australia has been failing for quite a while now because of this.
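
A rough way to re-check the feed from a script, using Ruby's standard Net::HTTP. The host, path and query parameters come from the URL above; the method names are hypothetical, and a 200 status only shows that the endpoint answers, not that the RIF-CS it returns is valid.

```ruby
require 'net/http'
require 'uri'

FEED_HOST = 'catalog.paradisec.org.au'
FEED_PATH = '/oai/collection'

# Builds the ListRecords request URI for the RIF-CS feed.
def rif_feed_uri
  query = URI.encode_www_form(verb: 'ListRecords', metadataPrefix: 'rif')
  URI::HTTP.build(host: FEED_HOST, path: FEED_PATH, query: query)
end

# Performs a real HTTP GET and returns the numeric status code.
def rif_feed_status(uri = rif_feed_uri)
  Net::HTTP.get_response(uri).code.to_i
end

# True only for a plain 200; a 500 like the one reported here is a failure.
def feed_responding?(status)
  status == 200
end
```

Calling `feed_responding?(rif_feed_status)` makes the request and reports whether the endpoint answered with a 200.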

(Started poking around because RDA records are marked up with schema.org, and hence crawled by the new Google Datasets search. Unfortunately, not all PARADISEC records are showing up. So it would be good to fix this so that the latest PARADISEC records show up.)

@enwardy

enwardy commented Feb 26, 2019

supplanted by new issue #679

@tom-h
Contributor

tom-h commented Feb 26, 2019

The suggestions that @jangari makes at the top of this issue still hold as sensible additions to the RIF-CS feed and are unrelated (as far as I can tell) to the issue of the feed breaking. There are more that could be added to this too. For instance, the inclusion of the DOI as an identifier for the collection.
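
As a sketch of the DOI idea: the method name is hypothetical, the use of `type="doi"` on the RIF-CS identifier element is my assumption about how it would be expressed, and the DOI shown is a placeholder rather than a real one.

```ruby
# Hypothetical helper: returns an identifier element for a collection DOI,
# or nil when the collection has no DOI recorded.
def doi_identifier_xml(doi)
  value = doi.to_s.strip
  return nil if value.empty?

  %(<identifier type="doi">#{value.encode(xml: :text)}</identifier>)
end

puts doi_identifier_xml('10.0000/placeholder') # placeholder DOI, not a real one
```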

@johnf
Member

johnf commented May 8, 2024

@nthieberger Are the initial improvements still worth implementing?
