arran-standish/eth-elastic-clickhouse-migrator

About

A simple process that pulls data from the Elasticsearch -raw-* indices, repackages the hits as FHIR bundles, and pushes them to Kafka for consumption into ClickHouse.

HOW TO RUN IN QA / PROD

Step 1: Start migration / duplication process

  1. Scale down the reverse proxy to 0 (we need clickhouse + the clickhouse mapper running throughout the process, and we do not want new data to interfere with the migration).
docker service scale reverse-proxy_reverse-proxy-nginx=0
  2. Create the migration topic in kafka
docker exec kafka_kafka-01.1. ... /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic migration --partitions 3
  3. Deploy clickhouse
  4. Set KAFKA_2XX_TOPIC to migration in the relevant .env file
  5. Set RAW_CONSUMER_GROUP_ID to clickhouse-migration in the relevant .env file (both entries are sketched after the note below)

These two settings are required before deploying the kafka-mapper-consumer service so that it consumes from the migration topic instead of the default 2xx topic, and so that it does not touch the default clickhouse-2xx consumer group.
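A minimal sketch of the two .env entries during the migration (the file location depends on your deployment):

# migration values for the kafka-mapper-consumer .env file
KAFKA_2XX_TOPIC=migration
RAW_CONSUMER_GROUP_ID=clickhouse-migration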

  6. Deploy kafka-mapper-consumer
  7. Update ELASTIC_PASSWORD in docker/docker-compose.yml to the correct password
  8. Copy this folder to the QA / PROD server
GLOBIGNORE='.git:.vscode'; scp -r /path/to/folder/elastic-clickhouse-migrator/* user@ip-address:~/elastic-clickhouse-migrator
  9. Deploy this code base as a service on the server. This is necessary because the networks are not attachable; deploying as a swarm service lets the containers join those networks without creating a temporary attachable network (see the compose sketch after the deploy command below).
docker stack deploy -c docker/docker-compose.yml migration
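For reference, a hypothetical sketch of how docker/docker-compose.yml can join the existing overlay networks as a swarm service; the image, service, and network names here are illustrative assumptions, not the repo's actual values:

version: '3.8'
services:
  migrator:
    image: elastic-clickhouse-migrator:latest  # assumed image name
    environment:
      ELASTIC_PASSWORD: change-me  # step 7 above: set to the correct password
    networks:
      - elastic
      - kafka
networks:
  elastic:
    external: true  # pre-existing non-attachable overlay network, name assumed
  kafka:
    external: true  # name assumed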

Step 2: Cleanup

Once the deployed service has exited, double-check the logs to make sure it exited because it finished and not because of an error. If it exited with an error, investigate, resolve the issue, and retry from step 1. Before retrying, remove the clickhouse stack and the clickhouse volumes on all nodes (so you start from a fresh clickhouse instance rather than partial data), and also remove the migration stack, since its service won't restart if you simply redeploy the stack.
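A sketch of that reset between attempts, assuming the clickhouse stack is named clickhouse and its volume names contain "clickhouse" (run the volume removal on every node once the stack's containers have stopped):

docker stack rm migration
docker stack rm clickhouse
# on each node, after the stack is gone:
docker volume ls --quiet --filter name=clickhouse | xargs -r docker volume rm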

  1. Check that the kafka topic has drained, i.e. that all messages have been sent to clickhouse (the LAG column in the output should read 0 for every partition)
docker exec kafka_kafka-01.1. ... /opt/bitnami/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group clickhouse-migration --describe
  2. Remove the service we just deployed to migrate the data
docker stack rm migration
  3. Set KAFKA_2XX_TOPIC back to 2xx and RAW_CONSUMER_GROUP_ID back to clickhouse-2xx (a verification sketch follows this list).
docker service update kafka-mapper-consumer_kafka-mapper-consumer --env-add=KAFKA_2XX_TOPIC=2xx --env-add=RAW_CONSUMER_GROUP_ID=clickhouse-2xx
  4. Remove the migration topic
docker exec kafka_kafka-01.1. ... /opt/bitnami/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic migration
  5. Scale the reverse proxy back up
docker service scale reverse-proxy_reverse-proxy-nginx=1
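To confirm the env update in item 3 took effect, you can inspect the service spec (a sketch; the service name is taken from the update command above):

docker service inspect kafka-mapper-consumer_kafka-mapper-consumer --format '{{.Spec.TaskTemplate.ContainerSpec.Env}}'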

Step 3: Other

While this is not specific to the migration of data from elastic to clickhouse: under the cdr implementation, clickhouse requires the dbt ofelia job to be running, so you'll need to redeploy that service. Just make sure you set the SUBDOMAINS environment variable to include both the ndr-specific subdomains (clickhouse, superset) and the cdr-specific subdomains (kibana).
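For example, assuming SUBDOMAINS is a comma-separated list in that service's .env file (the separator format is an assumption; the subdomain names come from the note above):

# ndr subdomains (clickhouse, superset) plus cdr subdomain (kibana)
SUBDOMAINS=clickhouse,superset,kibana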
