PPL command expression implementation for geoip #3228

Open
wants to merge 48 commits into base: main

@andy-k-improving commented Jan 1, 2025

Description

Introduce a new PPL command expression, geoip, that performs a geospatial lookup for a given IPv4 or IPv6 address. The lookup result is formatted as a tuple, with attribute names as keys and the corresponding location details as values.

In this setup, the SQL plugin acts as a thin client: it relays the IPEnrichment request to the OpenSearch Geospatial plugin within the same cluster.
The detailed implementation and the interface exposed on the Geospatial side can be found in:
opensearch-project/geospatial#700

Internally this functionality is achieved by:

  • Adding a no-op OpenSearchFunctionExpression marker to identify an expression that has no default implementation on other runtimes (e.g. Prometheus).
  • Updating OpenSearchIndex to provide an OpenSearch-specific handler for the eval operator and its expressions when OpenSearch is used as the storage engine.

At runtime, all eval expressions are passed to OpenSearchIndex.visitEval(). The OpenSearchEvalOperator class then picks up the call: it evaluates ordinary eval expressions as they are and handles each occurrence of OpenSearchFunctionExpression separately, reading the function name and arguments and executing the appropriate business logic.

The marker class OpenSearchFunctionExpression is used here because the actual implementation requires runtime OpenSearch client connectivity, whereas the core module is meant to be generic. As a workaround, the expression is tagged as OpenSearchFunctionExpression in core and handled only in the opensearch Gradle module.
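
The marker approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: `Expression`, `StorageFunctionMarker`, `EngineEvalOperator`, and the `ipLookup` function are hypothetical names standing in for the core expression type, OpenSearchFunctionExpression, OpenSearchEvalOperator, and the Geospatial client call, and the location data returned by the stub is made up.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class MarkerDispatchSketch {

    /** Hypothetical generic expression, as a storage-agnostic core module might define it. */
    interface Expression {
        Object valueOf(Map<String, Object> row);
    }

    /** Hypothetical marker: carries only a function name and argument fields, with no real logic. */
    static final class StorageFunctionMarker implements Expression {
        final String functionName;
        final List<String> argumentFields;

        StorageFunctionMarker(String functionName, List<String> argumentFields) {
            this.functionName = functionName;
            this.argumentFields = argumentFields;
        }

        @Override
        public Object valueOf(Map<String, Object> row) {
            // No-op: core has no default implementation for this function.
            return null;
        }
    }

    /** Hypothetical engine-side operator that owns the client connection (stubbed as a function). */
    static final class EngineEvalOperator {
        private final Function<String, Map<String, Object>> ipLookup;

        EngineEvalOperator(Function<String, Map<String, Object>> ipLookup) {
            this.ipLookup = ipLookup;
        }

        Map<String, Object> eval(Map<String, Object> row, Map<String, Expression> assignments) {
            Map<String, Object> out = new LinkedHashMap<>(row);
            for (Map.Entry<String, Expression> entry : assignments.entrySet()) {
                Expression expr = entry.getValue();
                if (expr instanceof StorageFunctionMarker) {
                    // Marker expressions are dispatched by function name on the engine side.
                    StorageFunctionMarker marker = (StorageFunctionMarker) expr;
                    if (!"geoip".equals(marker.functionName)) {
                        throw new IllegalArgumentException("Unsupported function: " + marker.functionName);
                    }
                    String ip = String.valueOf(out.get(marker.argumentFields.get(0)));
                    out.put(entry.getKey(), ipLookup.apply(ip));
                } else {
                    // Ordinary expressions are evaluated as they are.
                    out.put(entry.getKey(), expr.valueOf(out));
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        // Stubbed lookup; a real operator would call the Geospatial plugin here.
        EngineEvalOperator operator =
            new EngineEvalOperator(ip -> Map.of("ip", ip, "country", "Exampleland"));

        Map<String, Expression> assignments = new LinkedHashMap<>();
        assignments.put("upper_host", row -> String.valueOf(row.get("host")).toUpperCase());
        assignments.put("location", new StorageFunctionMarker("geoip", List.of("client_ip")));

        Map<String, Object> row = Map.of("host", "web-1", "client_ip", "192.0.2.10");
        System.out.println(operator.eval(row, assignments));
    }
}
```

The point of the split is that the generic side only needs to know the marker exists; everything that requires a live client stays in the engine-specific operator.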

Related Issues

Resolves: #3037

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Andy Kwok <[email protected]>
@andy-k-improving (Contributor, Author)

Per the offline discussion, I have split the integration-related changes out into #3244 to minimise the diff.

with:
product: opensearch

security-it-linux:
Collaborator

Suggested change
- security-it-linux:
+ geospatial-it-linux:

integ-test/build/testclusters/*/logs/*
integ-test/build/testclusters/*/config/*
security-it-windows-macos:
Collaborator

Suggested change
- security-it-windows-macos:
+ geospatial-it-windows-macos:

matrix:
java: [21]
runs-on: ubuntu-latest
container:
Collaborator

It seems the container parameter is the only difference between the Linux run and the Windows/macOS run. It might be nice to combine the two and add an if condition around the container so that it is only set when running on ubuntu-latest.

remoteIntegTestWithSecurity {
    testDistribution = 'archive'
    plugin(getJobSchedulerPlugin())
    plugin ":opensearch-sql-plugin"
}

integTestWithGeo {
Collaborator

Suggested change
- integTestWithGeo {
+ integTestWithGeospatial {

Collaborator

Does it make sense to have a separate integration test for Geospatial, or can we combine everything into the existing ITs? I see that the ITs already include the job scheduler plugin. Or is the Geospatial test really long?

@@ -256,11 +289,18 @@ testClusters {
plugin(getJobSchedulerPlugin())
plugin ":opensearch-sql-plugin"
}

Collaborator

revert?

@@ -419,6 +419,7 @@ evalFunctionName
| flowControlFunctionName
| systemFunctionName
| positionFunctionName
| goeipFunctionName
Collaborator

Suggested change
- | goeipFunctionName
+ | geoipFunctionName

Successfully merging this pull request may close these issues.

[FEATURE]Add iplocation function to PPL for IP address geolocation