Rake - solr:reindex #102

Open
ewlarson opened this issue Oct 24, 2024 · 3 comments

Comments

@ewlarson
Contributor

Migrate/backport this rake task from GEOMG. Testing against a production pgdump, I'm seeing this error:

rake aborted!9 Document:18E16E87-3A44-40F4-B9C7-4879F48E3C9F: |======================================================================                 | 30.83/s 86602/106455 81%  ETA: 00:10:44
NoMethodError: undefined method `dct_references_uri_key' for an instance of Kithe::Asset (NoMethodError)
@ewlarson ewlarson added the bug Something isn't working label Oct 24, 2024
@ewlarson ewlarson self-assigned this Oct 24, 2024
@ewlarson
Contributor Author

Adding a Rails-ish reindexing task with a rescue to try and capture whatever is amiss here.
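
For anyone following along, here is a minimal plain-Ruby sketch of the idea: walk the records in batches, rescue per-record failures, and keep going. The helper name `reindex_in_batches` and its parameters are hypothetical, and the block stands in for whatever call actually pushes a document to Solr; this is not the real rake task.

```ruby
# Hypothetical helper, not the project's actual rake task.
# Yields each document to the caller's block (the stand-in for the real
# index update) and rescues failures so one bad record cannot abort the run.
def reindex_in_batches(documents, batch_size: 1000, out: $stdout)
  total = 0
  failures = []

  documents.each_slice(batch_size) do |batch|
    batch.each do |doc|
      begin
        yield doc
      rescue StandardError => e
        # Record the failure and report it, then continue with the batch.
        failures << [doc, e]
        out.puts "Error updating index for document: #{doc}"
        out.puts e.message
      end
    end

    total += batch.size
    out.puts "Processed #{batch.size} documents in this batch, total processed: #{total}"
  end

  failures
end
```

Returning the list of failures means the offending ids can be inspected from the console afterwards instead of being scraped out of the log.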

@ewlarson
Contributor Author

Seeing just 1 document error...

Processed 1000 documents in this batch, total processed: 82000
Processed 1000 documents in this batch, total processed: 83000
Processed 1000 documents in this batch, total processed: 84000
Processed 1000 documents in this batch, total processed: 85000
Processed 1000 documents in this batch, total processed: 86000
Error updating index for document: 0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e
undefined method `dct_references_uri_key' for an instance of Kithe::Asset
Processed 1000 documents in this batch, total processed: 87000
Processed 1000 documents in this batch, total processed: 88000
Processed 1000 documents in this batch, total processed: 89000
Processed 1000 documents in this batch, total processed: 90000
Processed 1000 documents in this batch, total processed: 91000
Processed 1000 documents in this batch, total processed: 92000
Processed 1000 documents in this batch, total processed: 93000
Processed 1000 documents in this batch, total processed: 94000

From rails console...

irb(main):004> d = Document.find_by_friendlier_id("0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e")
  Document Load (2.2ms)  SELECT "kithe_models".* FROM "kithe_models" WHERE "kithe_models"."type" = $1 AND "kithe_models"."friendlier_id" = $2 LIMIT $3  [["type", "Document"], ["friendlier_id", "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e"], ["LIMIT", 1]]
=> 
#<Document:0x000000013afb21c0
...
irb(main):005> d
=> 
#<Document:0x000000013afb21c0
 id: "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558",
 title: "Moral statistics [France] {1833}",
 type: "Document",
 position: nil,
 json_attributes: "[FILTERED]",
 created_at: Thu, 29 Feb 2024 08:44:17.000000000 CST -06:00,
 updated_at: Fri, 01 Mar 2024 17:18:40.720658000 CST -06:00,
 parent_id: nil,
 friendlier_id: "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e",
 file_data: nil,
 representative_id: "461ee342-dcf9-432e-b977-0f7dcce15085",
 leaf_representative_id: "461ee342-dcf9-432e-b977-0f7dcce15085",
 kithe_model_type: "work",
 import_id: 112,
 publication_state: "published",
 dct_title_s: "Moral statistics [France] {1833}",
 dct_alternative_sm: ["Guerry"],
 dct_description_sm: ["Moral statistics of France (Guerry, 1833)"],
 dct_language_sm: ["eng"],
 gbl_displayNote_sm: [],
 dct_creator_sm: [],
 dct_publisher_sm: [],
 schema_provider_s: "GeoDa Data and Lab",
 gbl_resourceClass_sm: ["Datasets"],
 gbl_resourceType_sm: [],
 dct_subject_sm: [],
 dcat_theme_sm: [],
 dcat_keyword_sm: [],
 dct_temporal_sm: ["1833"],
 dct_issued_s: "",
 gbl_indexYear_im: [1833],
 gbl_dateRange_drsim: ["1833-1833"],
 dct_spatial_sm: ["France"],
 locn_geometry: "POLYGON((-5.45 51.31, 9.83 51.31, 9.83 41.26, -5.45 41.26, -5.45 51.31))",
 dcat_bbox: "-5.45,41.26,9.83,51.31",
 dcat_centroid: "46.285,2.19",
 gbl_georeferenced_b: nil,
 dct_relation_sm: [],
 pcdm_memberOf_sm: ["b0153110-e455-4ced-9114-9b13250a7093"],
 dct_isPartOf_sm: ["12d-05"],
 dct_source_sm: [],
 dct_isVersionOf_sm: [],
 dct_replaces_sm: [],
 dct_isReplacedBy_sm: [],
 dct_rights_sm: [],
 dct_rightsHolder_sm: [],
 dct_license_sm: [],
 dct_accessRights_s: "Public",
 dct_format_s: "Shapefile",
 gbl_fileSize_s: "",
 b1g_creatorID_sm: [],
 b1g_geonames_sm: [],
 gbl_wxsIdentifier_s: "",
 geomg_id_s: "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e",
 dct_identifier_sm: [],
 gbl_suppressed_b: nil,
 date_created_dtsi: Thu, 29 Feb 2024 08:44:17.000000000 CST -06:00,
 date_modified_dtsi: nil,
 b1g_language_sm: [],
 b1g_image_ss: "",
 b1g_code_s: "12d-05",
 b1g_dct_accrualMethod_s: "Manual",
 b1g_dct_accrualPeriodicity_s: "",
 b1g_dateAccessioned_sm: ["2024-02-29"],
 b1g_dateRetired_s: "",
 b1g_status_s: "",
 b1g_publication_state_s: "published",
 b1g_child_record_b: nil,
 b1g_dct_mediator_sm: [],
 b1g_access_s: "",
 dct_references_s:
  [#<Document::Reference:0x000000013dd99b60
    @attributes={"value"=>"https://geo.btaa.org/uploads/asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html", "category"=>"documentation_download"}>,
   #<Document::Reference:0x000000013dd999a8 @attributes={"value"=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip", "category"=>"download"}>,
irb(main):006> d.save


Document#references > seeded: {"http://lccn.loc.gov/sh85035852"=>["https://geo.btaa.org/uploads/asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html"], "http://schema.org/downloadUrl"=>["https://geodacenter.github.io/data-and-lab/data/guerry.zip"], "http://schema.org/url"=>["https://geodacenter.github.io/data-and-lab/Guerry/"]}
Document#dct_downloads > init: ["https://geodacenter.github.io/data-and-lab/data/guerry.zip"]


Document#multiple_downloads > aardvark: [{:label=>"Original Shapefile", :url=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip"}]


  TRANSACTION (0.4ms)  BEGIN
  DocumentDownload Load (10.7ms)  SELECT "document_downloads".* FROM "document_downloads" WHERE "document_downloads"."friendlier_id" = $1  [["friendlier_id", "0745f15d-b3e9-4a3d-aee7-4dfc47ff2a6e"]]
Document#dct_downloads > document_downloads: [{:label=>"Original Shapefile", :url=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip"}]


  Kithe::Asset Load (0.7ms)  SELECT "kithe_models"."id", "kithe_models"."title", "kithe_models"."type", "kithe_models"."position", "kithe_models"."json_attributes", "kithe_models"."created_at", "kithe_models"."updated_at", "kithe_models"."parent_id", "kithe_models"."friendlier_id", "kithe_models"."file_data", "kithe_models"."kithe_model_type", "kithe_models"."import_id", "kithe_models"."publication_state" FROM "kithe_models" WHERE "kithe_models"."type" IN ($1, $2) AND "kithe_models"."parent_id" = $3  [["type", "Kithe::Asset"], ["type", "Asset"], ["parent_id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
  Kithe::Model Load (0.8ms)  SELECT "kithe_models".* FROM "kithe_models" WHERE "kithe_models"."id" = $1  [["id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
  TRANSACTION (0.2ms)  ROLLBACK
/Users/ewlarson/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/activemodel-7.0.8.6/lib/active_model/attribute_methods.rb:450:in `method_missing': undefined method `dct_references_uri_key' for an instance of Kithe::Asset (NoMethodError)
irb(main):007> d.dct_references_s
=> 
[#<Document::Reference:0x000000013dd99b60
  @attributes={"value"=>"https://geo.btaa.org/uploads/asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html", "category"=>"documentation_download"}>,
 #<Document::Reference:0x000000013dd999a8 @attributes={"value"=>"https://geodacenter.github.io/data-and-lab/data/guerry.zip", "category"=>"download"}>,
 #<Document::Reference:0x000000013dd997f0 @attributes={"value"=>"https://geodacenter.github.io/data-and-lab/Guerry/", "category"=>"documentation_external"}>]
irb(main):008> d.assets
/Users/ewlarson/.rbenv/versions/3.3.0/lib/ruby/gems/3.3.0/gems/activemodel-7.0.8.6/lib/active_model/attribute_methods.rb:450:in `method_missing': undefined method `assets' for an instance of Document (NoMethodError)
Did you mean?  asset!
               asset?
irb(main):009> d.document_assets
  Kithe::Asset Load (1.1ms)  SELECT "kithe_models"."id", "kithe_models"."title", "kithe_models"."type", "kithe_models"."position", "kithe_models"."json_attributes", "kithe_models"."created_at", "kithe_models"."updated_at", "kithe_models"."parent_id", "kithe_models"."friendlier_id", "kithe_models"."file_data", "kithe_models"."kithe_model_type", "kithe_models"."import_id", "kithe_models"."publication_state" FROM "kithe_models" WHERE "kithe_models"."type" IN ($1, $2) AND "kithe_models"."parent_id" = $3  [["type", "Kithe::Asset"], ["type", "Asset"], ["parent_id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
  Kithe::Model Load (0.5ms)  SELECT "kithe_models".* FROM "kithe_models" WHERE "kithe_models"."id" = $1  [["id", "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558"]]
=> 
[#<Kithe::Asset:0x000000013c698498
  id: "461ee342-dcf9-432e-b977-0f7dcce15085",
  title: "Guerry_documentation.html",
  type: "Kithe::Asset",
  position: 1,
  json_attributes: nil,
  created_at: Fri, 01 Mar 2024 13:56:27.378069000 CST -06:00,
  updated_at: Fri, 01 Mar 2024 13:56:27.480131000 CST -06:00,
  parent_id: "d06ba0b9-53e6-4c3e-a5ad-518c0d01f558",
  friendlier_id: "hd9mhb9ky",
  file_data:
   {"id"=>"asset/461ee342-dcf9-432e-b977-0f7dcce15085/d7fed7dd22c9dbcba0fd8a296c79ae02.html",
    "storage"=>"store",
    "metadata"=>{"size"=>17577, "width"=>nil, "height"=>nil, "filename"=>"Guerry_documentation.html", "mime_type"=>"text/html"}},
  kithe_model_type: "asset",
  import_id: nil,
  publication_state: "draft">]

@ewlarson
Contributor Author

Okay... so it turns out we had one unexpected model type in the database, perhaps from before our DocumentAssets work was fully baked.

{"count"=>35534, "type"=>"Asset"}
{"count"=>106455, "type"=>"Document"}
{"count"=>1, "type"=>"Kithe::Asset"}
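
A rough sketch of how a per-type tally like the one above can be produced. In the app this would presumably be a GROUP BY on the `type` column of `kithe_models`; the plain-Ruby version below works on in-memory rows instead, and the `count_by_type` helper and sample rows are made up for illustration.

```ruby
# Hypothetical helper: counts rows per "type" value and returns hashes
# shaped like the tally above, sorted by type name.
def count_by_type(rows)
  rows.map { |row| row["type"] }
      .tally
      .sort
      .map { |type, count| { "count" => count, "type" => type } }
end

# Illustrative rows only, not real data from the database.
rows = [
  { "type" => "Document" },
  { "type" => "Asset" },
  { "type" => "Kithe::Asset" },
  { "type" => "Document" }
]

count_by_type(rows).each { |entry| puts entry }
```

Any type that shows up with a tiny count next to the expected ones is a good candidate for a closer look.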

In our database we should only have Documents and Assets. Kithe::Asset is technically the superclass of our Asset model.

Removing the Kithe::Asset from the database resolves this issue.
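
To illustrate why a single stray row can trigger this kind of NoMethodError, here is a small plain-Ruby sketch. It assumes that `dct_references_uri_key` is defined on the app's own subclass and not on the superclass, which the thread does not confirm; the class names `SketchKithe::Asset` and `SketchAsset`, the `instantiate` helper, and the return value are all stand-ins.

```ruby
module SketchKithe
  # Stand-in for the library superclass (Kithe::Asset in the thread).
  class Asset; end
end

# Stand-in for the app's own Asset model.
class SketchAsset < SketchKithe::Asset
  # Assumption: the app-level subclass is where this method lives.
  # The return value here is a placeholder.
  def dct_references_uri_key
    "download"
  end
end

# Mimics type-column instantiation: the stored type string decides
# which class the row is loaded as.
def instantiate(type_name)
  Object.const_get(type_name).new
end

normal = instantiate("SketchAsset")
stray  = instantiate("SketchKithe::Asset")

normal.dct_references_uri_key # works on the subclass

begin
  stray.dct_references_uri_key
rescue NoMethodError => e
  # The superclass instance has no such method, mirroring the error above.
  puts e.message
end
```

Under that assumption, a row stored with the superclass name as its type is loaded as the superclass, so any code that expects the subclass's methods fails for that one record.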
