sltr queries with minimum_should_match features #476

Open
jhinch-at-atlassian-com opened this issue Nov 3, 2023 · 0 comments

One way to think about how an sltr query functions is that it is a bool query with a custom scoring function.

For example, given the following featureset definition:

{
  "featureset": {
    "features": [
      {
        "name": "title_text_match",
        "params": [
          "query_text"
        ],
        "template_language": "mustache",
        "template": {
          "match": {
            "title": "{{query_text}}"
          }
        }
      },
      {
        "name": "description_text_match",
        "params": [
          "query_text"
        ],
        "template_language": "mustache",
        "template": {
          "match": {
            "description": "{{query_text}}"
          }
        }
      },
      {
        "name": "description_knn_match",
        "params": [
          "query_embedding"
        ],
        "template_language": "mustache",
        "template": "{\"knn\":{\"field\":\"description_vector\",\"k\":10,\"query_vector\":{{#toJson}}query_embedding{{/toJson}}}}"
      }
    ]
  }
}

and a model, example_model, created from the above featureset, the following sltr query:

{
  "sltr": {
    "model": "example_model",
    "params": {
      "query_text": "the text query",
      "query_embedding": [1.0, 0.4, ...]
    }
  }
}

can be thought of conceptually as:

{
  "bool": {
    "filter": {
      "match_all": {}
    },
    "should": [
      {
        "match": {
          "title": "the text query"
        }
      },
      {
        "match": {
          "description": "the text query"
        }
      },
      {
        "knn": {
          "field": "description_vector",
          "k": 10,
          "query_vector": [1.0, 0.4, ...]
        }
      }
    ],
    "minimum_should_match": 0,
    // plus also use a special scoring function defined by example_model
  }
}
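
As a rough illustration of this mental model, here is a minimal Python sketch that expands feature templates into the equivalent bool query. The helper names `render` and `conceptual_bool` are hypothetical, and the placeholder substitution is a simplified stand-in for mustache that only handles values that are exactly `{{name}}`. It does not cover the string-templated knn feature above, and it is not how the plugin is actually implemented.

```python
def render(template, params):
    """Recursively replace string values of the form '{{name}}' with params[name].

    Simplified stand-in for mustache rendering (assumption, illustration only).
    """
    if isinstance(template, dict):
        return {key: render(value, params) for key, value in template.items()}
    if isinstance(template, list):
        return [render(value, params) for value in template]
    if (isinstance(template, str)
            and template.startswith("{{") and template.endswith("}}")):
        return params[template[2:-2].strip()]
    return template


def conceptual_bool(features, params, minimum_should_match=0):
    """Build the bool query that an sltr query conceptually corresponds to."""
    should = [render(feature["template"], params) for feature in features]
    query = {"bool": {"should": should,
                      "minimum_should_match": minimum_should_match}}
    if minimum_should_match == 0:
        # With no required features, every document matches.
        query["bool"]["filter"] = {"match_all": {}}
    return query


# The two text features from the featureset above.
features = [
    {"name": "title_text_match",
     "template": {"match": {"title": "{{query_text}}"}}},
    {"name": "description_text_match",
     "template": {"match": {"description": "{{query_text}}"}}},
]

query = conceptual_bool(features, {"query_text": "the text query"})
```

The model's scoring function is deliberately left out here; the sketch only covers which clauses end up in the query.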

It would be great if an sltr query could require a minimum number of the model's features to match, so that an sltr query such as:

{
  "sltr": {
    "model": "example_model",
    "params": {
      "query_text": "the text query",
      "query_embedding": [1.0, 0.4, ...]
    },
    "minimum_should_match": 1
  }
}

would translate to roughly the following:

{
  "bool": {
    "should": [
      {
        "match": {
          "title": "the text query"
        }
      },
      {
        "match": {
          "description": "the text query"
        }
      },
      {
        "knn": {
          "field": "description_vector",
          "k": 10,
          "query_vector": [1.0, 0.4, ...]
        }
      }
    ],
    "minimum_should_match": 1,
    // plus also use a special scoring function defined by example_model
  }
}
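
To make the intended matching behaviour concrete, here is a small Python sketch of the proposed semantics. It is only an assumption about how the option could behave, mirroring bool's minimum_should_match; the function name and the per-document match flags are hypothetical illustration data.

```python
def passes_minimum_should_match(feature_matched, minimum_should_match):
    """Return True if at least `minimum_should_match` feature queries matched."""
    return sum(feature_matched.values()) >= minimum_should_match


# Hypothetical documents with a flag per feature query: did it match?
docs = {
    "doc_a": {"title_text_match": True,
              "description_text_match": False,
              "description_knn_match": False},
    "doc_b": {"title_text_match": False,
              "description_text_match": False,
              "description_knn_match": False},
}

# Current behaviour (conceptually minimum_should_match = 0): everything matches.
kept_today = [doc_id for doc_id, hits in docs.items()
              if passes_minimum_should_match(hits, 0)]

# Proposed behaviour with minimum_should_match = 1: doc_b is dropped.
kept_proposed = [doc_id for doc_id, hits in docs.items()
                 if passes_minimum_should_match(hits, 1)]

print(kept_today)     # ['doc_a', 'doc_b']
print(kept_proposed)  # ['doc_a']
```

Documents that pass would still be scored by the model as they are today; only the set of matching documents changes.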

This would make sltr queries more viable as part of the initial query, rather than needing to be part of a rescore phase. The use case is to use non-linear models (such as a LambdaMART model) to deal with query clauses whose scoring distributions differ, which makes them difficult to combine using a linear combination.
