[Feature Request] relax max Clauses Count limitation of termS query over IP field #16200

Closed
mkhludnev opened this issue Oct 5, 2024 · 2 comments · Fixed by #16391
Labels
enhancement Enhancement or improvement to existing feature or request Search:Query Capabilities v2.19.0 Issues and PRs related to version 2.19.0 v3.0.0 Issues and PRs related to version 3.0.0

Comments

@mkhludnev
Contributor

mkhludnev commented Oct 5, 2024

Is your feature request related to a problem? Please describe

Querying an IP field with a terms query can hit the max clause count limit.
https://forum.opensearch.org/t/terms-search-gives-error-failed-to-create-query-maxclausecount-is-set-to-1024/21729/8

Describe the solution you'd like

Plain IP addresses can be handled efficiently by rewriting them into a bitset. IP masks with slashes are a problem, though, since they can only be handled with a boolean query (and combining disjunctions over many field types is really complex).

I propose splitting the IP terms into two lists, masks and concrete IPs, and handling them separately. The terms query would then apply the max clause count limit only to the number of mask values. UPD: that limit applies only to doc-values-only (DV-only) fields; there is no explicit limit for indexed fields. We could also nest boolean queries over the masks deeply to overcome it.
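To illustrate the proposed split, here is a minimal Python sketch. It is not the OpenSearch implementation: the function names `split_ip_terms` and `check_clause_budget` are hypothetical, and the default limit of 1024 is assumed from the error message quoted in the linked forum thread. Terms containing a slash are treated as masks, everything else as a concrete address, and only the masks are counted against the limit.

```python
import ipaddress

# Assumed default, taken from the "maxClauseCount is set to 1024" forum error.
MAX_CLAUSE_COUNT = 1024


def split_ip_terms(terms):
    """Split raw terms into (concrete addresses, CIDR masks)."""
    concrete, masks = [], []
    for term in terms:
        if "/" in term:
            # strict=False tolerates host bits being set, e.g. "10.0.0.7/8".
            masks.append(ipaddress.ip_network(term, strict=False))
        else:
            concrete.append(ipaddress.ip_address(term))
    return concrete, masks


def check_clause_budget(terms, limit=MAX_CLAUSE_COUNT):
    """Count only masks against the limit; concrete IPs are assumed to go
    to a set-style lookup that needs no boolean clauses."""
    concrete, masks = split_ip_terms(terms)
    if len(masks) > limit:
        raise ValueError(f"{len(masks)} masks exceed the limit of {limit}")
    return len(concrete), len(masks)
```

With 2000 concrete addresses and two masks, `check_clause_budget` would accept the input, because only the two masks count toward the limit.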

Related component

Search:Query Capabilities

Describe alternatives you've considered

No response

Additional context

No response

@sandeshkr419
Contributor

[Search Triage] Yes, we should review the max clause count limits, and not just for IP fields.

@mkhludnev Do you have any further recommendations on this?

@mkhludnev
Contributor Author

mkhludnev commented Oct 9, 2024

Here are the approaches:

  1. PR "use terms in set for concrete IPs keep disjunctions over ranges. fix #16200" (#16202) avoids the max clause count limit for concrete IPs, but /masks are still limited.
  2. PR "Support more than 1024 IP/masks with indexed field" (#16391) avoids the limit for both IPs and /masks, but doesn't work for DV-only fields. UPD: I prefer this best-effort option, with further improvement via the next one.
  3. Lucene PR "SortedSet DV Multi Range query" (apache/lucene#13974) would handle many masks for DV-only fields as well.
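
The idea shared by these approaches, exact lookup for concrete IPs and range checks for /masks, can be sketched in plain Python. This is only an illustration under stated assumptions, not how OpenSearch or Lucene implement it: `mask_to_range` and `build_matcher` are hypothetical names, and the sketch handles IPv4 only. Each mask covers one contiguous address range, so many masks can be merged into sorted, non-overlapping ranges and checked with a binary search instead of one clause per mask.

```python
import bisect
import ipaddress


def mask_to_range(mask):
    """Return the (first, last) address of a CIDR mask as integers."""
    net = ipaddress.ip_network(mask, strict=False)
    return int(net.network_address), int(net.broadcast_address)


def build_matcher(concrete_ips, masks):
    """Build a predicate: set lookup for concrete IPs, merged ranges for masks.

    Assumes IPv4 input only, so integer values are directly comparable.
    """
    exact = {int(ipaddress.ip_address(ip)) for ip in concrete_ips}

    # Merge overlapping or adjacent ranges into a sorted, disjoint list.
    merged = []
    for lo, hi in sorted(mask_to_range(m) for m in masks):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], hi)
        else:
            merged.append([lo, hi])
    starts = [lo for lo, _ in merged]

    def matches(ip):
        value = int(ipaddress.ip_address(ip))
        if value in exact:
            return True
        # Find the last range starting at or before the value.
        i = bisect.bisect_right(starts, value) - 1
        return i >= 0 and value <= merged[i][1]

    return matches
```

The cost of a lookup grows with the logarithm of the number of merged ranges rather than with the number of masks, which is why a dedicated multi-range structure is attractive when there are thousands of masks.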

I'm not sure which of them to pursue. WDYT?

@reta added the v3.0.0 and v2.19.0 labels Nov 22, 2024
@github-project-automation bot moved this from 🆕 New to ✅ Done in Search Project Board Nov 27, 2024