Improve catalog information synchronisation with GraphQL #507
1. Context
Currently, tags and a description can be added to a topic, and ns4kafka synchronizes this information with the Confluent Cloud catalog (tags and description). This synchronization is performed with the Stream Catalog API (Stream Catalog documentation).
However, this API is poorly suited to the synchronization: it returns at most 500 topics per call, and it offers no filter to query only topics that have a description (although it can query only topics with at least one tag). As a result, all topics in the cluster must be fetched in pages of 500 to perform the synchronization, which hurts performance.
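To give a rough idea of the cost, here is a minimal sketch of how many paged calls a full scan needs under the 500-topic limit. The class and method names (`CatalogPaging`, `callsNeeded`) are hypothetical and only illustrate the arithmetic; they are not part of ns4kafka.

```java
public final class CatalogPaging {
    /** Page size limit of the Stream Catalog API, as stated in the context above. */
    static final int PAGE_LIMIT = 500;

    /**
     * Number of paged calls needed to list every topic of a cluster.
     * At least one call is always made, even when the cluster has no topic.
     * Hypothetical helper, for illustration only.
     */
    static int callsNeeded(int topicCount, int pageSize) {
        if (topicCount < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("topicCount must be >= 0 and pageSize > 0");
        }
        int fullScan = (topicCount + pageSize - 1) / pageSize; // ceiling division
        return Math.max(1, fullScan);
    }

    public static void main(String[] args) {
        // A cluster with 1200 topics needs 3 calls per synchronization pass.
        System.out.println(callsNeeded(1200, PAGE_LIMIT));
    }
}
```

Every synchronization pass pays this cost again, regardless of how few topics actually carry a description.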
2. Proposed implementation
This PR adds GraphQL API calls to query the topic lists with their tags and descriptions. This API overcomes both problems above: it can query the list of topics with a description and the list of topics with at least one tag, without pagination.
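Since the two queries return separate lists, their results have to be combined per topic before synchronizing. The following is only a sketch of one way to do that, assuming the responses have already been parsed into plain maps; `CatalogMerge`, `CatalogInfo` and `merge` are hypothetical names and do not reflect the actual code of this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public final class CatalogMerge {
    /** Catalog information for one topic (hypothetical holder type). */
    static final class CatalogInfo {
        String description;                      // null when the topic has no description
        final List<String> tags = new ArrayList<>();
    }

    /**
     * Combines the result of the "topics with a description" query with the
     * result of the "topics with at least one tag" query, keyed by topic name.
     */
    static Map<String, CatalogInfo> merge(Map<String, String> descriptionsByTopic,
                                          Map<String, List<String>> tagsByTopic) {
        Map<String, CatalogInfo> merged = new TreeMap<>();
        descriptionsByTopic.forEach((topic, description) ->
                merged.computeIfAbsent(topic, k -> new CatalogInfo()).description = description);
        tagsByTopic.forEach((topic, tags) ->
                merged.computeIfAbsent(topic, k -> new CatalogInfo()).tags.addAll(tags));
        return merged;
    }
}
```

Topics that appear in neither result have no catalog information and never need to be fetched, which is where the gain over the full paged scan comes from.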
This API still has drawbacks:
3. Other
The sync-catalog property must be set to true in order to allow the synchronization of catalog information in Confluent Cloud. It is set to false by default.