Database/Solr - Audit task to ensure these are sync'd appropriately #104

Open
ewlarson opened this issue Oct 25, 2024 · 0 comments
Labels: bug, enhancement

ewlarson commented Oct 25, 2024

A long-standing issue with Kithe and GeoBlacklight Admin (and GEOMG before it) is that an object can be saved to the database while the background process that indexes that record into Solr fails, usually because of locn_geometry parsing issues. The record then exists in the database but not in Solr.

We need to write a small task that audits the database entries against the Solr entries and produces a diff.
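
As a rough illustration of what that diff could look like, here is a minimal sketch in Python. The names `AuditDiff` and `diff_ids` are hypothetical and not part of Kithe or GeoBlacklight Admin. It assumes both ID lists have already been loaded into memory.

```python
from typing import Iterable, NamedTuple


class AuditDiff(NamedTuple):
    """Result of comparing database IDs with Solr IDs (hypothetical type)."""
    missing_from_solr: list[str]  # saved to the database but not indexed
    missing_from_db: list[str]    # indexed in Solr but no database row


def diff_ids(db_ids: Iterable[str], solr_ids: Iterable[str]) -> AuditDiff:
    """Compare the two ID collections and report what each side lacks."""
    db_set, solr_set = set(db_ids), set(solr_ids)
    return AuditDiff(
        missing_from_solr=sorted(db_set - solr_set),
        missing_from_db=sorted(solr_set - db_set),
    )


# Usage with made-up IDs, for illustration only
result = diff_ids(db_ids=["a-1", "b-2", "c-3"], solr_ids=["a-1", "c-3", "d-4"])
print(result.missing_from_solr)
print(result.missing_from_db)
```

Sets are used here rather than a line-by-line comparison of two sorted lists, so the result does not depend on PostgreSQL and Solr ordering mixed-case IDs the same way, which may vary with the database collation.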

Solr's /export handler looks promising:
https://solr.apache.org/guide/solr/latest/query-guide/exporting-result-sets.html

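Using /export will require a Solr schema config change to copy geomg_id_s into a docValues field (geomg_id_sdv in the example below), since /export can only sort on and return docValues fields. Afterwards you can run: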
http://localhost:8983/solr/blacklight-core/export?q=*:*&sort=geomg_id_sdv+asc&fl=geomg_id_sdv

and results will be JSON like this:

{
  "responseHeader":{"status":0},
  "response":{
    "numFound":23568,
    "docs":[{
        "geomg_id_sdv":"000894F6-E513-4D7C-BF72-1CB52D29D5B1"}
      ,{
        "geomg_id_sdv":"00090357-0df0-4e33-9bc8-aa3ce425ef09"}
      ,{
        "geomg_id_sdv":"000bf346-0aa5-40f2-8bb4-291197264a5e"}
      ,{
        "geomg_id_sdv":"00168679-2f35-4e6d-94d6-9b63bbefe685"}
      ,{
        "geomg_id_sdv":"00203c7f-b08b-46bb-a650-7c6e7925a554"}
      ,{
        "geomg_id_sdv":"00343406b1164a4690f23c307c25d679_3"}
      ,{
        "geomg_id_sdv":"0035018d-63a8-4682-95e5-d1c3d4104a7d"}
      ,{
        "geomg_id_sdv":"003a2c591c554cf3a116a113aa3c134a_0"}
      ,{
        "geomg_id_sdv":"003aa8db-4594-44bc-90f6-ebbac01d40de"}
      ,{
        "geomg_id_sdv":"003e5438-86a7-4cef-8a94-af364e25fd97"}
      ,{
        "geomg_id_sdv":"004156e4-173b-45b8-b9c4-5106a7deffbb"}

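The request and response above could be handled along these lines. This is only a sketch: the function names are hypothetical, the core URL and field name are taken from the example in this issue, and it assumes the complete response body is read before parsing.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_CORE_URL = "http://localhost:8983/solr/blacklight-core"  # from the example URL
ID_FIELD = "geomg_id_sdv"  # docValues copy of geomg_id_s


def export_url(core_url: str = SOLR_CORE_URL, field: str = ID_FIELD) -> str:
    """Build the /export URL that returns every ID, sorted ascending."""
    params = {"q": "*:*", "sort": f"{field} asc", "fl": field}
    return f"{core_url}/export?{urlencode(params)}"


def ids_from_export(body: str, field: str = ID_FIELD) -> list[str]:
    """Pull the ID values out of an /export JSON response body."""
    docs = json.loads(body)["response"]["docs"]
    # Skip any document that came back without the ID field.
    return [doc[field] for doc in docs if field in doc]


def fetch_solr_ids() -> list[str]:
    """Fetch and parse the export (needs a running Solr; not exercised here)."""
    with urlopen(export_url()) as response:
        return ids_from_export(response.read().decode("utf-8"))
```

For a very large index it may be worth parsing the response incrementally instead of loading it whole, but for an index of the size shown above (numFound 23568) reading it in one go should be manageable.
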
Similarly, we'll want to produce a sorted list of the matching IDs from PostgreSQL (friendlier_id, which corresponds to geomg_id_s in Solr) for comparison.

-- The limit only caps this sample; a full audit would presumably drop it.
select friendlier_id
from kithe_models
where kithe_models.type = 'Document'
order by friendlier_id asc
limit 1000;

Which returns...

000894F6-E513-4D7C-BF72-1CB52D29D5B1
00090357-0df0-4e33-9bc8-aa3ce425ef09
000a642584aa4b5c9485b6f17dc977a1_0
000b18d6-63b9-4314-9d4f-17c945ea09b7
000bf346-0aa5-40f2-8bb4-291197264a5e
0011D7A3-0EC0-4B1D-AF20-C055274B6DAE
00138f08-1327-4ae1-9b9e-9794140059eb
00140c37d72141eea4917e22817fe364_10
0015a391-7b23-4e4a-9b21-fff782c96e01
...
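
A sketch of how the database side could be read from a script, again in Python and with a hypothetical function name. It assumes a DB-API 2.0 connection (for PostgreSQL that would typically come from a driver such as psycopg, which is an assumption here) and runs the query above without the limit.

```python
from typing import Iterator

# Same query as in the issue, minus the sample limit
ID_QUERY = """
    select friendlier_id
    from kithe_models
    where kithe_models.type = 'Document'
    order by friendlier_id asc
"""


def iter_document_ids(connection, batch_size: int = 1000) -> Iterator[str]:
    """Yield friendlier_id values for Document rows, fetched in batches."""
    cursor = connection.cursor()
    try:
        cursor.execute(ID_QUERY)
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            for (friendlier_id,) in rows:
                yield friendlier_id
    finally:
        cursor.close()
```

Because it only relies on `cursor()`, `execute()`, and `fetchmany()`, the same function can be tried against an in-memory SQLite table with the same column names before pointing it at the real database.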
ewlarson added the bug and enhancement labels Oct 25, 2024
ewlarson self-assigned this Oct 25, 2024