# Comprehensive Guide to Setting Up the BIE Kafka Service
This document describes how to set up the BIE Kafka service following the decommissioning of the VRO platform.

BIE Kafka is a Java/Spring Boot service that subscribes to Kafka streams to collect, persist, and rebroadcast contention events over RabbitMQ. As of the decommissioning of the VRO platform, this service did not expose an API for querying events from the database; persisting events to a Postgres database was introduced to demonstrate what functionality this service offered as part of the VRO platform.

Standing up instances of the BIE Kafka service and creating applications to consume the contention event exchanges provides the opportunity to perform data analysis or validations on those events. Proposed examples of what this data could be used for include tracking and providing metrics on claims over their lifecycle, or analyzing the accuracy of suggested contention classifications.

BIE Kafka is a Java application that consumes events from a Kafka broker, persists them to a Postgres database, and then rebroadcasts the events over RabbitMQ. As part of the VRO platform, the app is built with Gradle and run in a Docker container alongside Postgres and RabbitMQ, but the application can be reconfigured to point to external resources instead. When deployed to LHDI (as described here), the service should have connectivity established with the Kafka brokers that broadcast BIE contention events.
The following software is required to run the BIE Kafka service:
- Java 17+
- Gradle 8.10.2 (newer versions may work but warrant additional testing)
- Docker with a running daemon, such as Docker Desktop
- RabbitMQ, or a pod running a RabbitMQ broker
- Postgres, or a pod running a Postgres instance
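Before starting, it can help to confirm the required tools are on your `PATH`. The sketch below is a generic check, not part of the VRO repository:

```shell
# Prerequisite check (sketch): report whether each required tool is installed.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}

check_tools java gradle docker
```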
The Kafka broker endpoints are defined in `src/main/resources/application-{env}.yaml` and point to resources that are only available on the VA intranet. The properties files also define the keystore and truststore configurations required for connecting to the Kafka brokers; more information about creating these certificates can be found here, and documentation on LHDI connectivity with Kafka can be found here.
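For orientation, a Spring Boot Kafka SSL configuration of this kind typically has the shape sketched below. The broker host and environment variable names here are placeholders, not the actual values from the VRO repository; consult the real `application-{env}.yaml` files for those:

```yaml
# Illustrative sketch only -- the real property values live in
# src/main/resources/application-{env}.yaml in the repository.
spring:
  kafka:
    bootstrap-servers: ${BIE_KAFKA_BROKERS}                    # hypothetical env var
    properties:
      security.protocol: SSL
    ssl:
      key-store-location: file:${BIE_KAFKA_KEYSTORE_FILE}      # hypothetical
      key-store-password: ${BIE_KAFKA_KEYSTORE_PASSWORD}       # hypothetical
      trust-store-location: file:${BIE_KAFKA_TRUSTSTORE_FILE}  # hypothetical
      trust-store-password: ${BIE_KAFKA_TRUSTSTORE_PASSWORD}   # hypothetical
```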
More information on running the VRO platform is available here.
### 1. Clone the platform and dev secrets repositories

```shell
git clone https://github.com/department-of-veterans-affairs/abd-vro.git
git clone https://github.com/department-of-veterans-affairs/abd-vro-dev-secrets.git
cd abd-vro
```

`svc-bie-kafka` was removed with this PR; uncomment the `svc-bie-kafka` configuration in `docker-compose.yml`, as well as `include :shared:lib-bie-kafka` and `include :svc-bie-kafka` in `settings.gradle`, if they are commented out.
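After uncommenting, `settings.gradle` should contain entries along these lines (a sketch; the exact quoting depends on whether the file uses Groovy or Kotlin DSL syntax):

```groovy
// settings.gradle -- modules required by the BIE Kafka service
include ':shared:lib-bie-kafka'
include ':svc-bie-kafka'
```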
### 2. Build the containers

```shell
./gradlew clean
./gradlew build check docker
```
### 3. Run the platform base

```shell
source scripts/setenv.sh
COMPOSE_PROFILES="platform,svc" ./gradlew dockerComposeUp
```
### 4. Run svc-bie-kafka

```shell
COMPOSE_PROFILES="kafka" ./gradlew dockerComposeUp
```

To verify the service is up, query the actuator health endpoint:

```shell
curl http://localhost:10301/actuator/health
```

You should expect a successful 200 JSON response indicating `"status": "UP"`.
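If you need to script this check, for example in a smoke test, the status field can be extracted from the JSON payload. The sketch below uses `sed` rather than assuming `jq` is installed:

```shell
# extract_status: pull the "status" value out of an actuator health JSON payload.
extract_status() {
  printf '%s' "$1" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example against the payload shape returned by Spring Boot's actuator:
extract_status '{"status":"UP"}'   # prints: UP
```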
Being a Spring Boot application, its configurations are defined under `src/main/resources`, including environment-specific configurations. Some properties are defined statically per deployment environment, while others are resolved via environment variables. The Helm charts are responsible for injecting certain environment variables into the pods, such as `ENV` or `POSTGRES_URL`; most of these are handled using Vault secrets. The Helm charts are set up to include secrets templates using the `argo-vault` plugin, which fetches Vault secrets for use in the Helm configuration.

Secrets are defined as part of abd-vro-dev-secrets and used when running VRO services locally. For deployed instances, the Helm charts are configured to fetch secrets from Vault and inject them as environment variables into the pods. More information on how the application interacts with Vault and how to configure Vault secrets can be found here.
As of VRO's decommissioning, no applications were connected to the BIE Kafka service. Previously, `domain-xample` was its only consumer and acted as an example application for partner teams to mimic if they required contention event information, but that service was shut down in favor of moving the functionality into the BIE Kafka service itself. Prior to that, `domain-xample` listened to the RabbitMQ exchanges created by BIE Kafka and persisted the contention events sent over those exchanges to a Postgres database.
This contention event queues user guide can be referenced when creating a new application to consume contention events rebroadcast by BIE Kafka. The list of all BIE contention events consumed over Kafka is also available here.
Urgent requests or questions should be submitted via Slack at #bip-support-ext-vro. For non-urgent requests, use this request form. See this link for more information.
- Certificates will need to be renewed occasionally, typically annually, which requires following the procedures described here.
- If SecRel is used in the pipeline to create images for deploying to LHDI, note that VRO has found that signed images expire after some time, after which unhealthy pods are unable to restart. Deploy regularly to ensure image signatures do not expire.
- Postgres/RDS instances should be monitored for disk usage, and excessive records should be archived periodically. Database tables should be backed up regularly to prevent loss of data.
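A periodic archive job along the following lines could keep the event table's disk usage in check. This is a sketch, not an existing VRO script: the table and `date_added` column match the queries shown below, but the archive table name and the one-year retention window are assumptions to adjust as needed:

```sql
-- Hypothetical archive job: move contention events older than one year
-- into an archive table, inside a single transaction.
BEGIN;

CREATE TABLE IF NOT EXISTS claims.bie_contention_event_archive
  (LIKE claims.bie_contention_event INCLUDING ALL);

INSERT INTO claims.bie_contention_event_archive
  SELECT * FROM claims.bie_contention_event
  WHERE date_added < now() - interval '1 year';

DELETE FROM claims.bie_contention_event
  WHERE date_added < now() - interval '1 year';

COMMIT;
```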
Below are some metrics on the number of records and the amount of storage used for contention events collected by BIE Kafka Service across VRO's deployments.
### develop

```
vro=> select event_type, count(1) from claims.bie_contention_event group by event_type;
      event_type       | count
-----------------------+--------
 CONTENTION_ASSOCIATED | 136246
 CONTENTION_CLASSIFIED |   2367
 CONTENTION_COMPLETED  |  10453
 CONTENTION_DELETED    |    543
 CONTENTION_UPDATED    | 964610
(5 rows)

vro=> SELECT pg_size_pretty(pg_total_relation_size('claims.bie_contention_event'));
 pg_size_pretty
----------------
 265 MB
(1 row)

vro=> select min(date_added), max(date_added) from claims.bie_contention_event where date_part('year', date_added) > 2000;
         min         |         max
---------------------+---------------------
 2016-10-22 15:09:14 | 2024-11-13 21:19:45
(1 row)
```
### qa

```
vro=> select event_type, count(1) from claims.bie_contention_event group by event_type;
      event_type       |  count
-----------------------+----------
 CONTENTION_ASSOCIATED |   614072
 CONTENTION_CLASSIFIED |     2694
 CONTENTION_COMPLETED  |    13737
 CONTENTION_DELETED    |     1831
 CONTENTION_UPDATED    | 40840882
(5 rows)

vro=> SELECT pg_size_pretty(pg_total_relation_size('claims.bie_contention_event'));
 pg_size_pretty
----------------
 11 GB
(1 row)

vro=> select min(date_added), max(date_added) from claims.bie_contention_event;
         min         |         max
---------------------+---------------------
 2021-06-26 05:00:00 | 2024-11-14 19:25:28
(1 row)
```
### sandbox

```
vro=> select event_type, count(1) from claims.bie_contention_event group by event_type;
      event_type       |  count
-----------------------+---------
 CONTENTION_ASSOCIATED |   36113
 CONTENTION_CLASSIFIED |     540
 CONTENTION_COMPLETED  |    2448
 CONTENTION_DELETED    |   14589
 CONTENTION_UPDATED    | 9112083
(5 rows)

vro=> SELECT pg_size_pretty(pg_total_relation_size('claims.bie_contention_event'));
 pg_size_pretty
----------------
 2076 MB
(1 row)

vro=> select min(date_added), max(date_added) from claims.bie_contention_event;
         min         |         max
---------------------+---------------------
 2021-06-26 05:00:00 | 2024-12-02 19:41:55
(1 row)
```
### prod-test

The prod-test `dev-tools` pod is currently down, so RDS is inaccessible; this region will be revisited once the deployment is healthy again.
### prod

```
vro=> select event_type, count(1) from claims.bie_contention_event group by event_type;
      event_type       |   count
-----------------------+-----------
 CONTENTION_ASSOCIATED | 256749914
 CONTENTION_CLASSIFIED |   3787794
 CONTENTION_COMPLETED  |   5326871
 CONTENTION_DELETED    |    810682
 CONTENTION_UPDATED    |  78112043
(5 rows)

vro=> SELECT pg_size_pretty(pg_total_relation_size('claims.bie_contention_event'));
 pg_size_pretty
----------------
 97 GB
(1 row)

vro=> select min(date_added), max(date_added) from claims.bie_contention_event where date_part('year', date_added) > 2000;
         min         |         max
---------------------+---------------------
 2022-05-15 12:29:01 | 2024-12-02 19:03:08
(1 row)
```