What is the reason to use the Postgres id column for the _id field in Elasticsearch? #1947

Closed Answered by obulat
sarayourfriend asked this question in Q&A
As far as I understand from the "How Indexing works" diagram [1], the _id was used to determine whether items in the database needed to be synced with the Elasticsearch index. We have since migrated to refreshing all data instead of trying to sync only the latest items.

In fact, we have a comment (now obsolete, I think) in ingestion_server/indexer.py describing this process:

"""
A utility for indexing data to Elasticsearch.
For each table to sync, find its largest ID in database. Find the corresponding largest
ID in Elasticsearch. If the database ID is greater than the largest corresponding
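
To make the old comparison concrete, here is a minimal sketch of the kind of check the comment describes. It is not the actual ingestion server code. The quoted docstring is cut off before it says what happens when the database ID is greater, so the sketch assumes the rows to index are those with ids above the largest one in Elasticsearch. The names pending_id_range and to_bulk_actions are hypothetical, and so is the assumption that ids are positive, increasing integers.

```python
from typing import Iterable, Iterator, Optional


def pending_id_range(db_max_id: Optional[int], es_max_id: Optional[int]) -> Optional[range]:
    """Return the ids assumed to be missing from the index, or None if in sync.

    Assumption: ids are positive integers that only grow, so anything above
    the largest id in Elasticsearch has not been indexed yet.
    """
    if db_max_id is None:
        return None  # empty table: nothing to index
    start = (es_max_id or 0) + 1  # empty index: start from the first id
    if db_max_id < start:
        return None  # the index already holds the largest database id
    return range(start, db_max_id + 1)


def to_bulk_actions(rows: Iterable[dict], index: str) -> Iterator[dict]:
    """Build bulk-style actions that reuse the Postgres id as the document _id."""
    for row in rows:
        # Reusing the database id keeps the two maxima directly comparable.
        yield {"_index": index, "_id": row["id"], "_source": row}
```

With a full refresh of all data, a comparison like this is no longer needed, since every row is reindexed regardless of the largest id already in the index.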

Answer selected by zackkrida