DOI search - not working #236

Open
paulineobps opened this issue Aug 8, 2022 · 8 comments
Labels
ingest Related to the Ingest component

Comments

@paulineobps
Collaborator

On the new interface I tried two DOI searches. Both DOIs are in the database, but neither search worked:

http://dx.doi.org/10.25607/OBP-561
10.26198/gfgr-fq47

The search help says to use the format 10.26198/gfgr-fq47.

@paulineobps
Collaborator Author

Sorry Paul, I just cannot get any DOI search to work.

@paulpilone
Collaborator

@paulineobps I'm now able to find the document from your original comment using the search value 10.25607/OBP-561. When I use the value 10.26198/gfgr-fq47 I get 33 results, so I'm not sure whether that's correct. For example, here is one of my searches:

image

@paulineobps
Collaborator Author

paulineobps commented Aug 23, 2022 via email

@paulpilone
Collaborator

Ok I see it now. It actually looks like it's matching because of the 10.25607 and you can even find it because of OBP. I think Elasticsearch is parsing that URI on punctuation. This is tricky because it's a URI but we don't actually want to require the user to search for the entire URI - just a portion of it - but a portion we want to define. I'll try looking into this more when I can.
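To illustrate the suspected behavior, here is a minimal sketch of punctuation-based tokenization. The `approximateTokens` function is a hypothetical stand-in written for this example; it is only a rough approximation and not Elasticsearch's actual analyzer. It shows how a stored DOI URI could break into pieces such as `10.25607` and `obp`, any of which might then match a query on its own.

```javascript
// Rough approximation only: lowercases the text and splits on anything
// that is not a letter, digit, or dot. This is NOT Elasticsearch's real
// analyzer, just an illustration of splitting a URI on punctuation.
function approximateTokens(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9.]+/)
    .filter(Boolean);
}

const storedTokens = approximateTokens('http://dx.doi.org/10.25607/OBP-561');
const queryTokens = approximateTokens('10.25607/OBP-561');

// Tokens the query and the stored value have in common. With a loose
// (OR-style) match, sharing a single token is enough to produce a hit.
const sharedTokens = queryTokens.filter((t) => storedTokens.includes(t));

console.log(storedTokens);
console.log(sharedTokens);
```

Under this approximation, every document whose DOI shares the same `10.xxxxx` prefix would share a token with the query, which may be why a search can return more results than expected.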

paulpilone reopened this Aug 23, 2022
@paulpilone
Collaborator

paulpilone commented Aug 24, 2022

@paulineobps for the DOI, do we only care about the last two parts of the path? For example, the metadata field can be stored as the full URL http://dx.doi.org/10.25607/OBP-561, and then we can store 10.25607/OBP-561 separately so the user can find that portion.

The other option is that I handle DOI searches by prepending a wildcard to them, so the actual search becomes *10.25607/OBP-561 and the user can find a DOI using just its path.

I'm trying to understand if we want to support both options of searching or just require the user to follow the instructions in the search tips.
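The first option (storing the DOI path separately) could be sketched roughly as follows. The `extractDoi` helper and `DOI_PATTERN` are hypothetical names, and the pattern assumes a registrant code of 4 to 9 digits after `10.`, which is a common convention but an assumption here rather than something taken from this project.

```javascript
// Hypothetical helper: pulls the "10.<registrant>/<suffix>" portion out of
// a stored identifier so it could be indexed as its own exact-match value.
// Assumption: the registrant code is 4 to 9 digits long.
const DOI_PATTERN = /10\.\d{4,9}\/\S+/;

function extractDoi(identifier) {
  const match = DOI_PATTERN.exec(identifier.trim());
  // Return null when no DOI-shaped substring is present.
  return match ? match[0] : null;
}

console.log(extractDoi('http://dx.doi.org/10.25607/OBP-561'));
console.log(extractDoi('10.26198/gfgr-fq47'));
```

With something like this applied at ingest time, the full URL and the bare DOI could both be stored, and the search side would not need a wildcard.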

@paulineobps
Collaborator Author

Everything before 10.25607/OBP-561 might not be unique, but every DOI will have a 10.xxxxx/nr part, so go with that. The format is in the Search Tips, and for future work we have also asked for a short tip in the search box when you choose a particular search parameter. For the DOI search, the box would show text saying that the search should be in the format 10.xxxxx/nr (e.g. 10.25607/OBP-561 or 10.1021/acssensors.1c01685).
The 10.25607 is the unique DOI id for IOC (our parent organization), and the part after the following / is a unique numbering sequence within that organization, so OBP-561 is unique for OBPS.
Likewise, in 10.1021/acssensors.1c01685, 10.1021 is the American Chemical Society and acssensors.1c01685 is the unique DOI number sequence for articles in their journal ACS Sensors.
Does that help?
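The prefix/suffix structure described above can be sketched as a small check that a search box could run before submitting a DOI query. `splitDoi` is a hypothetical helper written for illustration; it only checks the general shape `10.<org>/<sequence>` and does not verify that the DOI actually exists.

```javascript
// Hypothetical helper: splits a DOI into the organization prefix and the
// organization-specific sequence, or returns null if the shape is wrong.
function splitDoi(doi) {
  const slash = doi.indexOf('/');
  const hasPrefix = doi.startsWith('10.') && slash > 3;
  const hasSuffix = slash !== -1 && slash < doi.length - 1;
  if (!hasPrefix || !hasSuffix) {
    return null;
  }
  return {
    prefix: doi.slice(0, slash), // e.g. "10.25607" (the organization)
    suffix: doi.slice(slash + 1), // e.g. "OBP-561" (unique within it)
  };
}

console.log(splitDoi('10.25607/OBP-561'));
console.log(splitDoi('10.1021/acssensors.1c01685'));
console.log(splitDoi('OBP-561'));
```

A `null` result could be used to show the format hint instead of running a search that is unlikely to match.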

@paulpilone
Collaborator

This has been temporarily fixed and I'm going to remove myself as the assignee. This should still be looked at and fixed in a more permanent way. See the linked PR for more information or look at this line:

return field === 'dc_identifier_doi' ? `*${encodedTerm}` : encodedTerm;
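For context, a minimal sketch of how a line like this might sit inside a function. The name `buildSearchTerm` and the use of `encodeURIComponent` to produce `encodedTerm` are assumptions made for illustration; the linked PR is the authority on what the real code does.

```javascript
// Hypothetical wrapper around the line quoted above.
function buildSearchTerm(field, term) {
  // Assumption: the term is URL-encoded before being sent to the search API.
  const encodedTerm = encodeURIComponent(term);
  // DOI searches get a leading wildcard so that a bare DOI such as
  // "10.25607/OBP-561" can match a value stored as a full URL.
  return field === 'dc_identifier_doi' ? `*${encodedTerm}` : encodedTerm;
}

console.log(buildSearchTerm('dc_identifier_doi', '10.25607/OBP-561'));
```

One reason this may only be a temporary fix: leading wildcards are generally among the more expensive query patterns in Elasticsearch, so indexing the bare DOI as its own value is likely the more durable approach.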

paulpilone removed their assignment Sep 23, 2022
paulpilone added the ingest (Related to the Ingest component) label Sep 23, 2022
@paulineobps
Collaborator Author

The format 10.1021/acssensors.1c01685 is working now.

"This should still be looked at and fixed in a more permanent way." Is this robust enough now?
