[Workload Improvement] Adding single term aggregation task in workloads #165

sandeshkr419 · 2024-01-29T21:28:00Z

Is your feature request related to a problem?

While working on aggregation performance, I encountered a gap in the workloads.
Currently, none of the workloads include a single term aggregation request as part of their runs. This is a common use case that we should definitely be benchmarking.

Example of a search request that is missing:

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "response_codes": {
      "terms": {
        "field" : "response_code"
      }
    }
  }
}

Task I used temporarily in a custom workload:

{
  "name": "country_term_aggregation",
  "operation-type": "search",
  "body": {
    "size": 0,
    "aggs": {
      "country_population": {
        "terms": {
          "field": "country_code.raw"
        }
      }
    }
  }
}

What solution would you like?

Identify the workloads for which it would make sense to include term aggregations. Two obvious candidates I see are geonames and http_logs.
Rename the existing workload tasks that have "term" in their name to term_query, to distinguish term queries from term aggregations.
Include single term aggregations in the identified workloads.

What alternatives have you considered?

None.

Do you have any additional context?

Terms aggregations on fields where the fielddata is indexed and on fields where it is not should be accounted for separately. For example, with the geonames workload, if you trigger the above query with "field": "country_code.raw", the low cardinality workflow is triggered; if you run it with "field": "country_code", the regular dense cardinality workflow is triggered.
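
To make the two cases concrete, here is a minimal sketch of how both variants could be generated as separate tasks so that each is reported on its own. The helper `terms_agg_task`, the aggregation name `by_term`, and the two task names are hypothetical and only for illustration; the field names are the ones mentioned above, and the task shape mirrors the temporary task shown earlier in this issue.

```python
import json


def terms_agg_task(name, field):
    """Build a search task containing a single terms aggregation on `field`."""
    return {
        "name": name,
        "operation-type": "search",
        "body": {
            "size": 0,  # return no hits; only the aggregation result matters
            "aggs": {
                "by_term": {  # hypothetical aggregation name
                    "terms": {"field": field}
                }
            },
        },
    }


# Hypothetical task names; one task per field so the two cases
# described above are measured separately.
geonames_tasks = [
    terms_agg_task("country_code_raw_term_aggregation", "country_code.raw"),
    terms_agg_task("country_code_term_aggregation", "country_code"),
]

if __name__ == "__main__":
    # Print the tasks as JSON so they can be pasted into a custom workload.
    print(json.dumps(geonames_tasks, indent=2))
```

Keeping the two variants as distinct tasks would let their latencies show up as separate rows in the benchmark results instead of being averaged together.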


gkamat · 2024-01-30T19:53:14Z

This will certainly improve and flesh out the functionality of the current workloads. However, a discussion is warranted on how the term query should be renamed.

rishabhmaurya · 2024-03-04T19:19:58Z

Let's also add a cardinality aggregation operation to the BIG5 workload on a low cardinality field, if it makes sense.
Related to opensearch-project/OpenSearch#11959
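
As a rough sketch of what such an operation could look like, the snippet below builds a search task with a single cardinality aggregation, following the same task shape used earlier in this issue. The helper `cardinality_agg_task`, the task name, the aggregation name `distinct_values`, and the field `my_low_cardinality_field` are all placeholders; the actual field would have to be chosen from the workload's mappings.

```python
import json


def cardinality_agg_task(name, field):
    """Build a search task containing a single cardinality aggregation on `field`."""
    return {
        "name": name,
        "operation-type": "search",
        "body": {
            "size": 0,  # return no hits; only the aggregation result matters
            "aggs": {
                "distinct_values": {  # hypothetical aggregation name
                    "cardinality": {"field": field}
                }
            },
        },
    }


# Placeholder field name: replace it with a real low cardinality field
# from the target workload's index mappings.
low_cardinality_task = cardinality_agg_task(
    "cardinality_agg_low", "my_low_cardinality_field"
)

if __name__ == "__main__":
    print(json.dumps(low_cardinality_task, indent=2))
```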

sandeshkr419 added the enhancement and untriaged labels Jan 29, 2024

gkamat removed the untriaged label Jan 30, 2024

IanHoang added the good first issue label Nov 12, 2024
