Affiliations not captured from crossref data #96

seasidesparrow · 2024-03-12T11:52:24Z

Describe the bug
For at least some Crossref records, each author's affiliation is returned in a nested structure, <affiliations><institution><institution_name> (see e.g. 10.1364/AO.505607). The crossref parser currently looks for the tag <affiliation> (singular, not <affiliations>) and extracts its contents with .get_text(), so it misses the nested structure entirely.
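A minimal sketch of one way to handle both tag layouts, not the adsingestp implementation: it uses the standard-library xml.etree.ElementTree rather than the .get_text() call the parser uses, and the sample XML, author names, and the helper names local_name, extract_affiliations and parse_contributors are all hypothetical, made up for illustration. It assumes each author is a <person_name> element and that affiliation text lives either in a flat <affiliation> or in <institution_name> elements nested under <affiliations>.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down contributor fragment (not a real Crossref record).
SAMPLE = """
<contributors>
  <person_name sequence="first" contributor_role="author">
    <given_name>Ada</given_name>
    <surname>Example</surname>
    <affiliations>
      <institution>
        <institution_name>Example University</institution_name>
      </institution>
      <institution>
        <institution_name>Example Observatory</institution_name>
      </institution>
    </affiliations>
  </person_name>
  <person_name sequence="additional" contributor_role="author">
    <given_name>Ben</given_name>
    <surname>Sample</surname>
    <affiliation>Sample Institute</affiliation>
  </person_name>
</contributors>
"""


def local_name(tag):
    """Drop any '{namespace}' prefix so matching works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def extract_affiliations(person):
    """Collect affiliation strings from one <person_name> element.

    Accepts both the flat <affiliation> tag and <institution_name> elements
    nested under <affiliations><institution>.
    """
    affs = []
    for elem in person.iter():
        if local_name(elem.tag) in ("affiliation", "institution_name"):
            # Join all nested text and collapse runs of whitespace.
            text = " ".join("".join(elem.itertext()).split())
            if text:
                affs.append(text)
    return affs


def parse_contributors(xml_text):
    """Return one dict per <person_name> with its surname and affiliations."""
    root = ET.fromstring(xml_text)
    authors = []
    for person in root.iter():
        if local_name(person.tag) != "person_name":
            continue
        surname = next(
            (child.text for child in person if local_name(child.tag) == "surname"),
            None,
        )
        authors.append(
            {"surname": surname, "affiliations": extract_affiliations(person)}
        )
    return authors


authors = parse_contributors(SAMPLE)
for author in authors:
    print(author["surname"], author["affiliations"])
```

Matching on the local tag name keeps the sketch independent of whether the harvested XML declares a namespace; a real fix would also need to decide how to treat other children of <institution>, which this sketch ignores.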

To Reproduce
Steps to reproduce the behavior: harvest the Crossref XML from their API, and parse it with adsingestp.parsers.crossref. Authors 1 and 5 will have ORCIDs, but there will not be any additional affiliation information.

seasidesparrow · 2024-03-12T14:02:29Z

The crossref parser actually parses Crossref XML that has passed through Habanero's content negotiation method, so it needs to be able to read data in the UNIXREF-XML query return format, documented here: https://www.crossref.org/schema/unixref1.1.xsd.

seasidesparrow added the bug Something isn't working label Mar 12, 2024

seasidesparrow self-assigned this Mar 12, 2024

seasidesparrow mentioned this issue May 3, 2024

Improves affiliation capture from Crossref #105

Merged
