Solr support in AtoM #1817
Open
anvit wants to merge 106 commits into qa/2.x from dev/solr-plugin-wip
anvit force-pushed the dev/solr-plugin-wip branch from 81d9072 to aa74860 (May 16, 2024 17:36)
anvit force-pushed the dev/solr-plugin-wip branch from aa74860 to 2ea4787 (May 16, 2024 17:42)
melaniekung force-pushed the dev/solr-plugin-wip branch 4 times, most recently from 5102508 to db7a3b7 (May 23, 2024 14:20)
anvit force-pushed the dev/solr-plugin-wip branch from bfeeaad to 9b2384c (May 23, 2024 19:02)
melaniekung force-pushed the dev/solr-plugin-wip branch 4 times, most recently from ce00129 to 7e5d6d7 (May 31, 2024 09:30)
melaniekung force-pushed the dev/solr-plugin-wip branch 6 times, most recently from 2443cec to 2990cdb (June 6, 2024 07:10)
anvit force-pushed the dev/solr-plugin-wip branch 4 times, most recently from 5f1265a to b663d51 (June 15, 2024 00:17)
anvit force-pushed the dev/solr-plugin-wip branch 2 times, most recently from da343ae to ca14a5b (June 18, 2024 22:55)
melaniekung force-pushed the dev/solr-plugin-wip branch from 2fd485a to 3d605c3 (June 19, 2024 13:33)
Update arSolrRangeQuery and ArSolrRangeQueryTest to account for types
Update arSolrBoolQuery to use the query params for each of the clauses instead of using edismax queries to extend support for query types that do not use edismax. Also change the _addQuery method to allow all queries that are instances of arSolrAbstractQuery instead of just arSolrQuery
Add support for sorting and aggregations to arSolrBoolQuery. TODO: Add tests for arSolrBoolQuery
Add arSolrTermsQuery and associated tests. Also fix typo in a property name in arSolrTermQuery.
Add arSolrIdsQuery and associated tests
Add a method to arSolrBoolQuery that sets the types for its child queries. Also add a method for setting filters for bool queries.
Add a method that appends types to the aggregations before the query params are generated.
Add a method to remove any term queries with a given field from the must clause in arSolrBoolQuery.
Change the generateQueryString method in arSolrPluginUtil to create a solr query from the input string and rename the method to generateQuery.
Set the version param from hit in arSolrResult
Change arSolrQuery to arSolrStringQuery to avoid confusion with Elastica's Query
Refactor code in arSolrPlugin that talked to solr into arSolrClient. Also renamed the query folder to client for clarity, and fixed a bug that was skipping over autocomplete fields in arSolrResult.
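The commit messages above describe a boolean query that builds Solr query parameters from its clauses and can drop term queries on a given field from its must clause. The following is a minimal Python sketch of that idea, not the plugin's PHP code: the class names `TermQuery` and `BoolQuery`, their methods, and the `title` field are hypothetical stand-ins chosen for illustration. It assumes Solr's standard (lucene) query parser, where `+` marks a required clause and `-` a prohibited one.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TermQuery:
    """Hypothetical stand-in for a term query on a single field."""
    field_name: str
    value: str

    def to_solr(self) -> str:
        # Quote the value and escape backslashes and double quotes inside it.
        escaped = self.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'{self.field_name}:"{escaped}"'


@dataclass
class BoolQuery:
    """Hypothetical stand-in for a boolean query with must/should/must_not clauses."""
    must: List[Any] = field(default_factory=list)
    should: List[Any] = field(default_factory=list)
    must_not: List[Any] = field(default_factory=list)

    def to_solr(self) -> str:
        parts = [f"+({q.to_solr()})" for q in self.must]
        parts += [f"({q.to_solr()})" for q in self.should]
        negatives = [f"-({q.to_solr()})" for q in self.must_not]
        if not parts:
            # With no positive clause, start from "match everything" so that
            # a purely negative query still selects documents.
            parts = ["*:*"]
        return " ".join(parts + negatives)

    def remove_must_with_term_field(self, field_name: str) -> None:
        # Drop only term queries on the given field; other must clauses stay.
        self.must = [
            q for q in self.must
            if not (isinstance(q, TermQuery) and q.field_name == field_name)
        ]

    def to_params(self) -> Dict[str, str]:
        # Parameters for a request to Solr's select handler.
        return {"q": self.to_solr(), "defType": "lucene"}


query = BoolQuery()
query.must.append(TermQuery("i18n.languages", "en"))
query.must.append(TermQuery("title", "fonds"))
query.remove_must_with_term_field("i18n.languages")
print(query.to_params())
```

Because each clause contributes its own query string, any query type that can render itself as a Solr query string can be nested inside the boolean query, including another boolean query.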
anvit force-pushed the dev/solr-plugin-wip branch from fd9068d to d1ceb6c (August 29, 2024 20:59)
anvit force-pushed the dev/solr-plugin-wip branch from 23f69d1 to f0876e4 (August 29, 2024 21:58)
Update arSolrPluginQuery to use a single bool query directly instead of using a query container and a separate boolean query.
Add updateDocument and updateDocumentById to arSolrClient which enable updating existing documents in the solr index. Also add functions to arSolrPlugin to call these from AtoM.
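One way to update an existing document in a Solr index is an atomic update: a JSON document posted to the update handler in which each changed field carries a `set` modifier. The page does not say whether arSolrClient works this way, so the Python sketch below is only an illustration under that assumption. The function name `build_atomic_update`, the collection name `atom`, and the assumption that the unique key field is called `id` are all hypothetical. The function only builds the request and does not send it.

```python
import json
from typing import Any, Dict, Tuple
from urllib.parse import urlencode


def build_atomic_update(
    base_url: str,
    collection: str,
    doc_id: str,
    changes: Dict[str, Any],
    commit: bool = True,
) -> Tuple[str, str]:
    """Return the (url, json_body) pair for a Solr atomic update request."""
    # Assumption: the schema's unique key field is named "id".
    doc: Dict[str, Any] = {"id": doc_id}
    for name, value in changes.items():
        # The "set" modifier replaces the field's current value.
        doc[name] = {"set": value}

    url = f"{base_url.rstrip('/')}/{collection}/update"
    if commit:
        # Ask Solr to commit so the change becomes visible to searches.
        url += "?" + urlencode({"commit": "true"})
    return url, json.dumps([doc])


url, body = build_atomic_update(
    "http://localhost:8983/solr", "atom", "42", {"title": "Updated title"}
)
print(url)
print(body)
```

The body would be sent as a POST with a `Content-Type: application/json` header. Atomic updates generally require the document's fields to be stored or to have docValues, so the schema would need to be checked before relying on this approach.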
Work in progress branch for adding support for Solr for searching within AtoM
Completed:
[…] http://localhost:8983/solr. The Solr dashboard also allows searching the indexed data.
[…] arSolrPlugin/lib/client folder. The query classes essentially set up query parameters for API requests to Solr. arSolrClient accepts the configuration that allows it to communicate with Solr, and has methods for sending different API requests to Solr.
Work in progress:
arSolrSearchTask is a CLI task that allows searching the Solr index for a few query types. Since queries can get fairly complicated, especially Boolean queries, this was meant for quick CLI testing until Solr is officially supported by the AtoM interface, and so it isn't very customizable. However, it could be useful for writing tests in the future.
TODO
Within arSolrPlugin
High priority (essential for browse or search actions):
[…] (arSolrPlugin): Currently the username and password are ignored, as the current Solr setup doesn't configure them either.
[…] (arSolrPluginQuery): Since there is no nested query class for Solr yet, this will need to be updated once that functionality is in place.
Medium priority (not essential for basic search but still important):
[…] (arSolrPlugin.class): This class will need a method to handle updating specific documents by query.
[…] (arSolrPlugin.class)
[…] (arSolrPlugin.class): Solr doesn't have a default pt_BR analyzer but has specific filter classes we can use.
[…] (arSolrPlugin.class): Will need to use Apache Tika to work with external docs.
Low priority (used by CLI tasks or other non-search-specific actions within AtoM):
Elastica\Scroll (arSolrPluginUtil): This doesn't have a Solr equivalent and will need to be handled.
[…] (apps/qubit/modules/search/actions/autocompleteAction.class.php)
[…] (lib/job/arUpdatePublicationStatusJob.class.php, https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting.html)
Lowest priority (good to have features):
Outside arSolrPlugin
AtoM extensively references Elastica, and the arElasticSearchPlugin is also deeply integrated into it. As of now, this is a list of all of the places outside the plugin itself that would need updates:
apps/qubit/modules/digitalobject/actions/imageflowComponent.class.php uses arElasticSearchPluginQuery, QubitSearch.
apps/qubit/modules/clipboard/actions/viewAction.class.php uses Elastica ResultSet, Response, Query, QueryTerms, QubitSearchPager, arElasticSearchPluginConfiguration.
apps/qubit/modules/default/actions/moveAction.class.php uses Elastica Query, BoolQuery, QueryTerm, QubitSearchPager, arElasticSearchPluginUtil, arElasticSearchPluginConfiguration.
apps/qubit/modules/default/actions/fullTreeViewAction.class.php uses Elastica QueryTerm, Elastica ResultSet (as arguments to methods), has several method names which reference ElasticSearch, arElasticSearchPluginQuery.
apps/qubit/modules/default/actions/browseAction.class.php uses arElasticSearchPluginQuery, arElasticSearchPluginConfiguration, QubitSearch.
👆🏼 NOTE: replace L#134-L#147 (the section that essentially removes must clauses for i18n.languages queries) with a call to the removeMustWithTermField method in arSolrBoolQuery.
apps/qubit/modules/repository/actions/holdingsAction.class.php uses Elastica QueryBool, QueryMatchAll, QueryTerm, Query, QubitSearch, arElasticSearchPluginConfiguration.
apps/qubit/modules/repository/actions/browseAction.class.php uses Elastica QueryMatchAll, Query, QueryTerm, arElasticSearchPluginUtil, QubitSearch.
apps/qubit/modules/repository/actions/maintainedActorsAction.class.php uses Elastica Query, QueryTerm, QubitSearch, QubitSearchPager, arElasticSearchPluginConfiguration.
apps/qubit/modules/taxonomy/actions/indexAction.class.php uses Elastica Query, BoolQuery, QueryTerm, arElasticSearchPluginUtil, arElasticSearchPluginConfiguration, QubitSearch, QubitSearchPager.
apps/qubit/modules/actor/actions/browseAction.class.php uses Elastica BoolQuery, QueryTerm, QueryExists, NestedQuery, arElasticSearchPluginUtil, QubitSearch, QubitSearchPager.
apps/qubit/modules/actor/actions/relatedInformationObjectsAction.class.php uses Elastica Query, BoolQuery, QueryTerm, NestedQuery, QubitSearchPager, QubitSearch, arElasticSearchPluginConfiguration.
apps/qubit/modules/search/actions/errorAction.class.php uses Elastica Exception, references ElasticSearch in error message.
apps/qubit/modules/search/actions/indexAction.class.php uses Elastica QueryTerm, QubitSearch, arElasticSearchPluginUtil.
apps/qubit/modules/search/actions/autocompleteAction.class.php uses Elastica Search, MultiSearch, Query, BoolQuery, Match, Term, QubitSearch.
apps/qubit/modules/search/actions/descriptionUpdatesAction.class.php uses Elastica Query, BoolQuery, QueryTerm, QueryRange, QubitSearch, QubitSearchPager, arElasticSearchPluginConfiguration.
apps/qubit/modules/term/actions/navigateRelatedComponent.class.php uses Elastica QueryTerm, QubitSearch, arElasticSearchPluginQuery.
apps/qubit/modules/term/actions/indexAction.class.php uses Elastica QueryTerms, Query, BoolQuery, QueryTerm, QubitSearch, QubitSearchPager.
apps/qubit/modules/informationobject/actions/inventoryAction.class.php uses Elastica BoolQuery, Query, QueryTerm, QueryTerms, QubitSearch, QubitSearchPager, arElasticSearchPluginConfiguration.
apps/qubit/modules/informationobject/actions/autocompleteAction.class.php uses Elastica Query, BoolQuery, MatchAll, QueryTerm, arElasticSearchPluginUtil, QubitSearch, QubitSearchPager.
lib/filter/QubitMeta.class.php references Elastica Exception.
lib/QubitLftSyncer.class.php uses Elastica Bulk, QueryTerm, Document, QubitSearch, arElasticSearchPluginQuery.
lib/search/QubitSearchPager.class.php uses Elastica ResultSet.
lib/helper/QubitHelper.php references Elastica Result.
lib/job/arUpdateEsActorRelationsJob.class.php references Elastica exception, QubitSearch, arElasticSearchActorPdo.
lib/job/arActorExportJob.class.php uses Elastica QueryTerms, arElasticSearchPluginUtil, QubitSearch.
lib/job/arRepositoryCsvExportJob.class.php uses Elastica QueryTerms, arElasticSearchPluginQuery, arElasticSearchPluginUtil, QubitSearch.
lib/job/arUpdatePublicationStatusJob.class.php uses Elastica AbstractScript, QueryTerm, QubitSearch.
lib/job/arInformationObjectExportJob.class.php uses Elastica QueryTerm, QueryTerms, arElasticSearchPluginUtil, arElasticSearchPluginQuery, QubitSearch.
lib/task/tools/updatePublicationStatusTask.class.php uses Elastica AbstractScript, QueryTerm, QubitSearch.
lib/task/propel/propelGenerateSlugsTask.class.php uses Elastica Query, BoolQuery, QueryTerm, QubitSearch.
lib/model/QubitInformationObject.php uses Elastica BoolQuery, Query, QueryMatch, QubitSearch.
lib/model/QubitTerm.php uses Elastica BoolQuery, QueryTerm, QubitSearch.
lib/task/search/arSearchStatusTask.class.php uses arElasticSearchPluginConfiguration, looks for class names starting with arElasticSearch in objectsAvailableToIndex.
lib/task/tools/installTask.class.php uses arElasticSearchPluginConfiguration.
lib/job/arUpdateEsIoDocumentsJob.class.php uses arElasticSearchInformationObject.
lib/job/arUpdateEsActorRelationsJob.class.php uses arElasticSearchActorPdo.
lib/job/arActorExportJob.class.php uses arElasticSearchPluginUtil, arElasticSearchPluginQuery.
lib/arInstall.class.php references arElasticSearchPlugin's search.yml and uses arElasticSearchConfigHandler.
lib/task/import/csvImportTask.class.php uses arElasticSearchInformationObjectPdo, QubitSearch.
lib/QubitMetsParser.class.php uses arElasticSearchPluginUtil.
lib/search/QubitSearch.class.php uses arElasticSearchPlugin.
lib/search/QubitSearchEngine.class.php references ElasticSearch.
lib/QubitFlatfileImport.class.php references ElasticSearch.
lib/task/propel/propelGenerateSlugsTask.class.php references ElasticSearch.
config/ProjectConfiguration.class.php sets up arElasticSearchPlugin.
plugins/qbAclPlugin/lib/QubitAclSearch.class.php uses Elastica Query, BoolQuery, QueryTerm.
plugins/sfSkosPlugin/test/unit/importTest.php uses Elastica Exception, QubitSearch.
plugins/arRestApiPlugin/lib/QubitApiAction.class.php uses Elastica Query.
plugins/arRestApiPlugin/modules/api/actions/informationobjectsBrowseAction.class.php uses arElasticSearchPluginConfiguration, arElasticSearchPluginQuery.
plugins/qtAccessionPlugin/modules/accession/actions/browseAction.class.php uses Elastica Query, BoolQuery, QueryMatchAll, QubitSearch, QubitSearchPager, arElasticSearchPluginUtil, arElasticSearchPluginConfiguration.
test/unit/escapeTermTest.php tests arElasticSearchPluginUtil::escapeTerm.
In addition to the list above, other tasks that would need to be completed in order to switch to Solr: