Add option to make routes exclusive #17040
Comments
Discussed this briefly today. A couple of thoughts:
Another use case is migrating exclusion filters from Datadog. We create routes with filters and add a sample transform with a different rate on each individual route. When doing exclusion filtering this way, we need a way to ensure events aren't duplicated; one possible way would be something like exclusive routes. @jszwedko any updates on this? Is it likely to get prioritised in the near future? If not, are you open to contributions for this issue (since the issue is tagged as needs approval)?
Hey! We are open to contributions here, but given there is some ambiguity about what the configuration should look like, I think it'd be good to settle on that before implementation. Seemingly we'd need to switch the routes from a map of key/value pairs to a list to support ordering, so that the transform knows which route to pick if multiple match. The proposal could just be a comment on this issue. You might already have considered this, but it is possible to work around the lack of this using a
@jszwedko thanks, we will go with the suggested workaround for now. Do you think this workaround will have any performance implications?
I don't anticipate that there would be much overhead with the additional
Can we use something like indexmap to preserve insertion order? This way, we can avoid having to introduce any breaking changes or extra configurations such as weight or priority.
I like the idea of introducing a var called
Let me know your thoughts.
It is possible, but I hesitate to add ordering guarantees to maps, since the configuration formats we have, in particular YAML and JSON, don't currently have a notion of ordering when it comes to maps. It could end up confusing users or causing subtle bugs if they use other tools to generate or transform Vector configs that don't preserve ordering. So I think it'd be better to use an array to define the routes or add a
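To illustrate the ordering concern, here is a minimal Rust sketch. It is not Vector's code: `BTreeMap` merely stands in for any map type that does not keep insertion order, and the `DECLARED` table, `list_order`, and `map_order` are hypothetical names. The route names come from the example in the issue description.

```rust
use std::collections::BTreeMap;

// Routes in the order a user would want them tried: most specific first.
// The second tuple element is just a descriptive label for this sketch.
const DECLARED: [(&str, &str); 3] = [
    ("vpc_flow", "most specific"),
    ("gcp_json_event", "less specific"),
    ("generic_json", "catch-all"),
];

/// A list keeps the order in which the routes were declared.
fn list_order() -> Vec<&'static str> {
    DECLARED.iter().map(|(name, _)| *name).collect()
}

/// A sorted map iterates by key, so the declared order is lost.
fn map_order() -> Vec<&'static str> {
    let routes: BTreeMap<&'static str, &'static str> = DECLARED.iter().copied().collect();
    routes.keys().copied().collect()
}

fn main() {
    println!("list order: {:?}", list_order());
    println!("map order:  {:?}", map_order());
}
```

In the map version the catch-all `generic_json` ends up ahead of `vpc_flow`, which is exactly the kind of reordering that would change which route wins under first-match semantics.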
All of this would be extremely helpful in my current use case, which involves dynamically updating thousands of transform files every ~5m based on per-service volume for dynamic sampling. Currently I have to have one router per service, so logs needlessly pass through many routers before reaching the one they match, instead of going through a single top-level list on a main router.
A note for the community
I will first start by saying that I think the route documentation needs to be updated to mention that routes are not exclusive and will send events to all matching routes. This is not documented at all. In trying to debug an issue, I ran across #9014 that informed me that the routes are not exclusive.
I do believe there is a need for an option to make routes exclusive. I think it would be fairly easy to add a config flag, exclusive_routes, that defaults to false but, when set to true, stops looking for more routes after the first match.
Use Cases
I often have similar events with slight differences that I route to different transforms for remapping. I also have generic event types that act as a catch-all for select event formats, as well as a default transform for things like JSON or text events. I usually try to write the routes so that the first match is the correct route, but the generic routes can also match the same event and send it on as a duplicate.
For example, let's say I'm getting GCP logs and I have a transform for vpc_flow, another one for gcp_json_event, and one for generic_json. Right now I would either have to make three routers, each taking the unmatched events from the previous one and feeding them into the next, or write match conditions that exclude the already matched events. This isn't bad when there are only a few routes, but with 50+ routes those match conditions get rather lengthy, or the result is many separate routers that are hard to follow.
Adding an exclusive_routes option would let users take responsibility for putting their routes in the correct order. The route transform would stop after sending the event to the first match in the list of routes, and would not send it to other transforms lower down that also match and would otherwise produce similar or duplicate events.
Attempted Solutions
No response
Proposal
I don't know if this is accurate as I am still new to Rust, but I think this solution may work:
https://github.com/vectordotdev/vector/blob/master/src/transforms/route.rs#L35
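The first-match behaviour described above can be sketched in a few lines of Rust. This is a simplified, self-contained illustration and not the code in route.rs: `Event`, `Route`, `select_routes`, and `example_routes` are hypothetical stand-ins, plain function pointers replace Vector's real condition types, and `exclusive_routes` is only the flag name proposed in this issue.

```rust
/// Hypothetical, simplified stand-in for a log event (not Vector's event type).
struct Event {
    provider: &'static str,
    kind: &'static str,
    is_json: bool,
}

/// A named route with a match condition (function pointer used for illustration).
struct Route {
    name: &'static str,
    matches: fn(&Event) -> bool,
}

/// Returns the names of the routes that would receive `event`.
/// With `exclusive_routes == false` every matching route is returned;
/// with `exclusive_routes == true` only the first match in list order is.
fn select_routes(routes: &[Route], event: &Event, exclusive_routes: bool) -> Vec<&'static str> {
    let mut selected = Vec::new();
    for route in routes {
        if (route.matches)(event) {
            selected.push(route.name);
            if exclusive_routes {
                break; // stop looking after the first match
            }
        }
    }
    selected
}

/// Routes from the GCP example, ordered from most to least specific.
fn example_routes() -> Vec<Route> {
    vec![
        Route { name: "vpc_flow", matches: |e: &Event| e.provider == "gcp" && e.kind == "vpc_flow" },
        Route { name: "gcp_json_event", matches: |e: &Event| e.provider == "gcp" && e.is_json },
        Route { name: "generic_json", matches: |e: &Event| e.is_json },
    ]
}

fn main() {
    let routes = example_routes();
    let event = Event { provider: "gcp", kind: "vpc_flow", is_json: true };
    println!("non-exclusive: {:?}", select_routes(&routes, &event, false));
    println!("exclusive:     {:?}", select_routes(&routes, &event, true));
}
```

An empty result corresponds to an event that no route matched. Because the outcome depends on list order, the sketch takes the routes as a slice rather than a map.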
References
No response
Version
0.28.1