Multi data source search #102

Status: Open (23 commits to merge into main)

khaledk2 (Collaborator) commented Sep 30, 2024

This PR lays the groundwork for searching data across multiple data sources.
Current status:

  • It can restore the database from a backup.
  • It queries the databases and pushes the data to the Elasticsearch indices.
  • The unit tests have been modified to restore and index two OMERO databases and then run all of the tests.
  • There is a new endpoint that returns the available data sources, i.e. /api/v1/resources/data_resources/.
  • It can return results from all sources or limit the search to a specific data source.
  • It has been deployed successfully on pilot-es1-omeroreadwrite.
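
As a rough illustration of the two data-source points in the list above, a client could build its request URLs along these lines. Only the `/api/v1/resources/data_resources/` path comes from this PR. The base URL, the `data_source` query parameter name, and the shape of the JSON response are assumptions made for the sketch, so check the actual API before relying on them.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Path taken from the PR description.
DATA_SOURCES_PATH = "/api/v1/resources/data_resources/"


def data_sources_url(base_url: str) -> str:
    """Return the URL of the endpoint that lists the available data sources."""
    return base_url.rstrip("/") + DATA_SOURCES_PATH


def with_data_source(url: str, data_source: Optional[str] = None) -> str:
    """Restrict a search URL to one data source.

    ASSUMPTION: the parameter name "data_source" is hypothetical.
    Passing None leaves the URL unchanged, i.e. search all sources.
    """
    if data_source is None:
        return url
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode({"data_source": data_source})


def fetch_data_sources(base_url: str):
    """Fetch and decode the list of data sources (response shape not assumed)."""
    with urlopen(data_sources_url(base_url)) as response:
        return json.load(response)
```

For example, `with_data_source(search_url, "omero_train")` would limit a search to a single source, where `search_url` stands for whichever search endpoint is being called.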

The indexing process (getting the data into the Elasticsearch indices) currently works by querying the OMERO databases and then indexing the returned data.
We should also account for cases in which we do not have access to the databases or backups.

  • The searchengine supports indexing data from CSV files.
    • This feature has not been used for a long time and needs to be fixed and maintained to cope with subsequent modifications.
  • In addition, we should support accepting data directly from JSON, or possibly calling an API to retrieve the data and then indexing it.
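
One possible shape for the CSV and JSON paths described above is to normalise both inputs into plain dictionaries and then serialise them in the newline-delimited format that the Elasticsearch bulk API accepts. This is a minimal sketch, not the searchengine's existing CSV indexer. The function names, the index name, and the idea of tagging each document with a `data_source` field are all hypothetical.

```python
import csv
import io
import json
from typing import Dict, Iterable, Iterator, List


def rows_from_csv(text: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row, keyed by the header line."""
    yield from csv.DictReader(io.StringIO(text))


def rows_from_json(text: str) -> List[dict]:
    """Parse a JSON array of objects."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of objects")
    return rows


def bulk_body(rows: Iterable[dict], index: str, data_source: str) -> str:
    """Build an Elasticsearch bulk request body (NDJSON).

    Each document line is preceded by an action line, and the body
    ends with a newline, as the bulk API requires.
    ASSUMPTION: the "data_source" field added to each document is illustrative.
    """
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps({**row, "data_source": data_source}))
    return ("\n".join(lines) + "\n") if lines else ""
```

The resulting string would be sent in a POST to the cluster's `_bulk` endpoint with the `Content-Type: application/x-ndjson` header. Because both readers produce the same dictionaries, an API-based source could be added later as a third reader without touching `bulk_body`.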

# set up the second database
rm app_data/omero_db_searchengine.zip
rm app_data/omero.pgdump
wget https://github.com/khaledk2/ice-archh-64/releases/download/new_re_db/omero_train.zip -P app_data
Member


Long-term, I assume this is an issue. I wonder to what extent we could try omero-test-infra for the GitHub tests and then do something larger on our own systems.

2 participants