ESQL - Add K mandatory param for KNN function #129763

Open
wants to merge 9 commits into
base: main
from

Conversation

carlosdelest
Member

@carlosdelest carlosdelest commented Jun 20, 2025

Right now, the KNN function takes k as an option:

| WHERE KNN(vector, [0, 1, 2], {"k": 10})

This is a problem because it makes k optional, and a default value makes no sense until we can use LIMIT to set k (#129353).

Until we use LIMIT, this PR makes k a mandatory parameter of the KNN function:

  • Users must explicitly set the value of k they use.
  • k becomes a first-class citizen in KNN. This makes sense, as it will be the option users tweak most often.

The new function format will be:

| WHERE KNN(vector, [0, 1, 2], 10)

If users want to set num_candidates, they can do so via an option:

| WHERE KNN(vector, [0, 1, 2], 10, {"num_candidates": 50})
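
As a rough before/after sketch of how an existing query would migrate under this change: the `colors` index, the `rgb_vector` field, and the query vector below are illustrative assumptions, not names taken from this description. Only the KNN argument shapes come from the examples above.

```esql
// Before this PR: k is passed through the options map
FROM colors METADATA _score
| WHERE KNN(rgb_vector, [0, 120, 0], {"k": 10})
| SORT _score DESC

// After this PR: k is the third positional argument,
// and remaining options such as num_candidates stay in the map
FROM colors METADATA _score
| WHERE KNN(rgb_vector, [0, 120, 0], 10, {"num_candidates": 50})
| SORT _score DESC
```

The two queries are meant to be run separately. A query that still passes k inside the options map would presumably be rejected once k is removed as an option, though the exact error is not shown here.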

After #129353 is done, we can make this parameter optional again and check whether a LIMIT can be used to set k on behalf of the user. Users will still be able to override that default k by setting it explicitly.

This PR removes the test for the k option, which no longer makes sense, and closes the following related issues for that test:

Closes #129447
Closes #129512

@carlosdelest carlosdelest added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.19.0 Team:Search - Relevance The Search organization Search Relevance team labels Jun 20, 2025
Contributor

github-actions bot commented Jun 20, 2025

🔍 Preview links for changed docs:

🔔 The preview site may take up to 3 minutes to finish building. These links will become live once it completes.

@@ -511,9 +511,6 @@ tests:
- class: org.elasticsearch.entitlement.runtime.policy.FileAccessTreeTests
method: testWindowsAbsolutPathAccess
issue: https://github.com/elastic/elasticsearch/issues/129168
- class: org.elasticsearch.xpack.esql.qa.multi_node.EsqlSpecIT
Member Author

This test is removed in this PR, as k is already exercised by all the other tests.

@@ -29,31 +29,12 @@ chartreuse | [127.0, 255.0, 0.0]
// end::knn-function-result[]
;

knnSearchWithKOption
Member Author

Removed this test: k is now passed in all the other tests.

@@ -100,14 +109,6 @@ public Knn(
description = "Floating point number used to decrease or increase the relevance scores of the query."
+ "Defaults to 1.0."
),
@MapParam.MapParamEntry(
Member Author

k is removed as an option

@@ -2172,24 +2173,25 @@ private void checkFullTextFunctionNullArgs(String functionInvocation, String arg
);
}

public void testFullTextFunctionsConstantQuery() throws Exception {
Member Author

Renamed, as we now also check that other params are not null.

@carlosdelest carlosdelest marked this pull request as ready for review June 20, 2025 13:43
@elasticsearchmachine elasticsearchmachine removed Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch Team:Search - Relevance The Search organization Search Relevance team labels Jun 20, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the serverless-linked Added by automation, don't add manually label Jun 20, 2025
Contributor

@ioanatia ioanatia left a comment

does what it says 👍

Member

@kderusso kderusso left a comment

LGTM, one question

@ioanatia ioanatia removed auto-backport Automatically create backport pull requests when merged v8.19.0 labels Jun 23, 2025
Labels
:Analytics/ES|QL AKA ESQL >non-issue serverless-linked Added by automation, don't add manually Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.0
Development

Successfully merging this pull request may close these issues.

[CI] EsqlSpecIT test {knn-function.KnnSearchWithKOption SYNC} failing
[CI] EsqlSpecIT test {knn-function.KnnSearchWithKOption ASYNC} failing
5 participants