Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate OAIPMH server to Elasticsearch / Spring MVC #7583

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

josegar74
Copy link
Member

@josegar74 josegar74 commented Dec 28, 2023

This change requests migrates the OAIPMH server to Elasticsearch, previously in GeoNetwork 4.x using the search services, an exception was returned: ES search does not support OAI yet.

The API has been migrated from the old Jeeves API to Spring MVC with the endpoint http://localhost:8080/geonetwork/srv/api/oaipmh

Checklist

  • I have read the contribution guidelines
  • Pull request provided for main branch, backports managed with label
  • Good housekeeping of code, cleaning up comments, tests, and documentation
  • Clean commit history broken into understandable chucks, avoiding big commits with hundreds of files, cautious of reformatting and whitespace changes
  • Clean commit messages, longer verbose messages are encouraged
  • API Changes are identified in commit messages
  • Testing provided for features or enhancements using automatic tests)
  • User documentation provided for new features or enhancements in manual
  • Build documentation provided for development instructions in README.md files
  • Library management using pom.xml dependency management. Update build documentation with intended library use and library tutorials or documentation

@josegar74 josegar74 added this to the 4.4.2 milestone Dec 28, 2023
@josegar74 josegar74 requested a review from fxprunayre December 28, 2023 10:39
@josegar74 josegar74 added the api change Indicate a change in the API label Dec 28, 2023
@josegar74 josegar74 changed the title Migrate OAIPMH server to Elasticsearch Migrate OAIPMH server to Elasticsearch / Spring MVC Dec 28, 2023
@jahow
Copy link
Contributor

jahow commented Dec 28, 2023

Hi @josegar74, thanks a lot for the contribution! One of our customers was asking for this feature to be brought back in V4, so I think it would be fair if someone from Camptocamp was doing a review.

Can you please let us know if/when the feature is ready for review? Thanks

@josegar74
Copy link
Member Author

@jahow the feature is ready for review, thanks.

@fxprunayre fxprunayre modified the milestones: 4.4.2, 4.4.3 Jan 3, 2024
Copy link
Contributor

@jahow jahow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this work! The changes in general look OK to me, I think it's good that you are reusing most of the existing logic. I still noticed some bugs, see comments.

Here if a few things I have noticed while testing:

Optionally, since we're moving away from Jeeves, I've identifier a few things that could help reduce even more the dependency to this framework:


ObjectMapper objectMapper = new ObjectMapper();

String schema = (params.getChild(PARAM_METADATAPREFIX) != null) ? params.getChild(PARAM_METADATAPREFIX).getValue() : ISO19139SchemaPlugin.IDENTIFIER;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fallback seems useless to me since the request will give an error if no metadataPrefix parameter is given

@josegar74
Copy link
Member Author

Thanks for the review @jahow, I'll take a look in the coming days to fix the issues.

Related to this:

The query parameters (both their name and value) are case-sensitive: http://localhost:8080/geonetwork/srv/api/oaipmh?verb=listrecords will give an error, http://localhost:8080/geonetwork/srv/api/oaipmh?verb=ListRecords won't; I couldn't find whether this is spec-compliant, but it is definitely inconvenient (at least for verb)

I haven't change anything about this afaik. For sure, if it's not correct can be fixed, but not something related to the update to work with Elasticsearch.

@jahow
Copy link
Contributor

jahow commented Jan 9, 2024

I haven't change anything about this afaik. For sure, if it's not correct can be fixed, but not something related to the update to work with Elasticsearch.

Yes indeed, I understand it's not something that was touched by the migration. I was just mentioning it in case removing this inconvenience would be trivial, but I wouldn't say it's mandatory for this PR.

@josegar74
Copy link
Member Author

@jahow I have fixed the base url and also handled the identifier parameter (refactored the code to avoid usage of JDOM Element):


About the method toJeevesException is used in a harvester service: https://github.com/geonetwork/core-geonetwork/blob/main/harvesters/src/main/java/org/fao/geonet/services/harvesting/Info.java#L331-L332.

It is mapped in Jeeves with the name admin.harvester.info and apparently returns all sorts of information that can be relevant for harvesters, but to be honest I have no idea why OAIPMH server is used there. There is a OAIPHM harvester, but I don't see the relation between the harvester and the OAIPMH server.

Anyway, I prefer not to change this for now as it is not used in the OAIPMH server code.

Copy link
Contributor

@jahow jahow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @josegar74, everything works great now! SonarCloud is giving an error, not sure why it wasn't there before, but I guess this can be merged once the error is resolved.

@josegar74
Copy link
Member Author

@jahow and @juanluisrp I've added this change: 727e328, that restricts the allowed characters, but seems CodeQL, just checks the original value is from a user entry.

I've tried that change directly in the Lib.java methods, but same result.

I guess probably we can dismiss the alert as a false positive, but if you know a better way that can solve the issue?

@jahow
Copy link
Contributor

jahow commented Jan 12, 2024

@jahow and @juanluisrp I've added this change: 727e328, that restricts the allowed characters, but seems CodeQL, just checks the original value is from a user entry.

I've tried that change directly in the Lib.java methods, but same result.

I guess probably we can dismiss the alert as a false positive, but if you know a better way that can solve the issue?

I agree, we can probably flag this as false positive with your added check. In theory it would be better to not use user input at all, but seeing the process which leads to this error (44 steps!) I fear it would take this PR way too far.

@jahow
Copy link
Contributor

jahow commented Feb 6, 2024

@josegar74 could you please share the status of this PR? Do you still intend to merge this eventually? Thanks

@josegar74
Copy link
Member Author

@jahow I plan to merge it, but would like to try to solve the CodeQL issue reported.

It has been also done in my personal time, so need to fine time out of my work to finish it. But hopefully it should be available for 4.4.3

@fvanderbiest
Copy link

fvanderbiest commented Feb 6, 2024

Thanks Jose for your work on the subject !

At Camptocamp, we are expecting one of our customer to support this work.
If positive, we could probably help getting it merged.

@fxprunayre fxprunayre modified the milestones: 4.4.3, 4.4.4 Mar 13, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.4, 4.4.5 Apr 16, 2024
@fxprunayre fxprunayre modified the milestones: 4.4.5, 4.4.6 Jun 4, 2024
@aurisnoctis
Copy link

Hello people, do I understand right: if this pull request won't be merged, GeoNetwork 4.* will not be able to be harvested by the OAIPHM / OAI-PMH protocol?

I need to understand this as we are still using GeoNetwork 3 for compatibility with OAIPHM, which is the protocol used to harvest our GeoNetwork metadata catalogue and provide metadata to the public.

Thanks in advance!

@josegar74
Copy link
Member Author

josegar74 commented Aug 28, 2024

@aurisnoctis you are right, to be able to harvest GeoNetwork 4.2 / 4.4 using the OAIPHM / OAI-PMH protocol is required this pull request.

For now it seems no resources to review it.

@aurisnoctis
Copy link

@josegar74 @fvanderbiest

At Camptocamp, we are expecting one of our customer to support this work. If positive, we could probably help getting it merged.

Is it tied to financial issues? Can anybody put a price tag on this one so we can bring this up with our management at the German Meteorological Service?

@ticheler
Copy link
Member

Is it tied to financial issues? Can anybody put a price tag on this one so we can bring this up with our management at the German Meteorological Service?

Hi @aurisnoctis, Please contact me so we can discuss options around this code review Jose has been working on. You can send me an email at [email protected]

Kind regards,
Jeroen

@fvanderbiest
Copy link

@aurisnoctis feel free to drop us an email at [email protected] if you need an alternate quotation

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api change Indicate a change in the API backport 4.2.x changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants