Is your feature request related to a problem?
While working on aggregation performance, I encountered a gap in the workloads.
Presently, none of the workloads include a single term aggregation request as part of their runs. This is a common use case that we should definitely be benchmarking.
What solution would you like?
Identify the workloads where it makes sense to include term aggregations. Two obvious candidates I see are geonames and http_logs.
Existing workload tasks that have term in their name should be renamed to term_query, to distinguish term queries from term aggregations.
Include single term aggregations in the identified workloads.
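As a rough sketch of what such an operation might look like, the snippet below builds a search operation whose body is a single terms aggregation with no sub-aggregations. The operation name `terms_agg_country_code`, the aggregation name `terms_agg`, and the helper itself are hypothetical, and the `name` / `operation-type` / `body` layout is an assumption about how workload operations are defined, so check it against the actual workload files before using it.

```python
import json


def terms_agg_operation(name, field, agg_name="terms_agg"):
    """Build a search operation containing one terms aggregation.

    The layout (name / operation-type / body) is assumed, not taken
    from an existing workload; all names here are placeholders.
    """
    return {
        "name": name,
        "operation-type": "search",
        "body": {
            # size 0: return only aggregation buckets, no search hits
            "size": 0,
            "aggs": {agg_name: {"terms": {"field": field}}},
        },
    }


# Hypothetical operation for the geonames workload.
operation = terms_agg_operation("terms_agg_country_code", "country_code.raw")
print(json.dumps(operation, indent=2))
```

The printed JSON could then be pasted into a workload's operations file and referenced from a test procedure.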
What alternatives have you considered?
None.
Do you have any additional context?
Terms aggregations on fields with and without indexed fielddata should be accounted for separately. For example, with the geonames workload, running the terms aggregation with "field": "country_code.raw" triggers the low cardinality workflow, whereas running it with "field": "country_code" triggers the regular dense cardinality workflow.
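To make the two cases concrete, here is a minimal sketch that generates one request body per field, so each can be benchmarked as its own task. The variant names and the aggregation name `by_country` are placeholders I made up; only the two field names come from the description above.

```python
import json


def terms_agg_body(field, agg_name="by_country"):
    """Return a search body with a single terms aggregation on `field`."""
    return {"size": 0, "aggs": {agg_name: {"terms": {"field": field}}}}


# One body per field, so the two code paths are measured separately.
variants = {
    "terms_agg_country_code_raw": terms_agg_body("country_code.raw"),
    "terms_agg_country_code": terms_agg_body("country_code"),
}

for task_name, body in variants.items():
    print(task_name, json.dumps(body))
```

The two bodies differ only in the `field` value, which keeps any latency difference attributable to the field rather than to the shape of the request.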
This will certainly improve and flesh out the functionality of the current workloads. However, a discussion is warranted on how the term query should be renamed.
Let's also add a cardinality aggregation operation to the BIG5 workload on a low cardinality field, if it makes sense.
related to opensearch-project/OpenSearch#11959