PPL command expression implementation for geoip #3228

Open
wants to merge 48 commits into base: main

Commits
1196e14
Update grammar structure
andy-k-improving Dec 24, 2024
2296c85
Inteception points
andy-k-improving Dec 24, 2024
30cb69e
Update dummy map values
andy-k-improving Dec 25, 2024
83b2853
Test
andy-k-improving Dec 25, 2024
71f246c
Geo integration
andy-k-improving Dec 25, 2024
c061765
Read ipString from user
andy-k-improving Dec 30, 2024
e0c23a8
Read option from user input
andy-k-improving Dec 30, 2024
781caf5
Update changes
andy-k-improving Dec 31, 2024
8a46b14
Support multiple method signature
andy-k-improving Dec 31, 2024
4a486c5
Update result map format
andy-k-improving Dec 31, 2024
6cf166b
Code cleanup
andy-k-improving Dec 31, 2024
7dc34ca
Spotless
andy-k-improving Dec 31, 2024
51e6a7f
JavaDoc
andy-k-improving Dec 31, 2024
36bb4b7
Spotless
andy-k-improving Dec 31, 2024
f4a3e91
Update unit test
andy-k-improving Dec 31, 2024
727362f
Update unit-test wording
andy-k-improving Dec 31, 2024
559e4ed
Update unit test for eval
andy-k-improving Jan 1, 2025
91d7540
Update doc
andy-k-improving Jan 1, 2025
2f119a3
Unit test
andy-k-improving Jan 3, 2025
b424a64
Unit-test 1
andy-k-improving Jan 3, 2025
2e0accf
Update ipenrichment test-case
andy-k-improving Jan 4, 2025
264cee5
Update test-cases
andy-k-improving Jan 4, 2025
d9bc40a
Update gradle
andy-k-improving Jan 6, 2025
0e3bbad
Geo plugin test case
andy-k-improving Jan 7, 2025
643c8a5
Comment out security plugin
andy-k-improving Jan 8, 2025
f2ec4ff
Geo ip index load
andy-k-improving Jan 9, 2025
ed9f965
Update integ with data content assert
andy-k-improving Jan 9, 2025
fcf2003
Integ test for geoip
andy-k-improving Jan 9, 2025
8f1d8b7
Remove debug
andy-k-improving Jan 9, 2025
71c1bc7
Gradle task refactor
andy-k-improving Jan 9, 2025
65aa555
Remove duplicated urlDownload
andy-k-improving Jan 9, 2025
ae56c1a
Refactor gradle script
andy-k-improving Jan 9, 2025
3aaf656
Refactor integ test
andy-k-improving Jan 9, 2025
0ba723b
Separate sscurity plugin test
andy-k-improving Jan 9, 2025
efe9e47
Refactor gradle file
andy-k-improving Jan 9, 2025
c26806f
Infer type
andy-k-improving Jan 10, 2025
086ed12
Update java doc
andy-k-improving Jan 10, 2025
12905e3
Simplifier provider
andy-k-improving Jan 10, 2025
73082e8
Update test cases
andy-k-improving Jan 10, 2025
59060d3
Spotless check
andy-k-improving Jan 10, 2025
8682abb
Github action for GeoSpatial build
andy-k-improving Jan 10, 2025
b7acfe1
Execlude geoip doc test
andy-k-improving Jan 10, 2025
302ed85
Fix test
andy-k-improving Jan 10, 2025
da7af99
Test coverage
andy-k-improving Jan 11, 2025
c302b13
Style fix
andy-k-improving Jan 11, 2025
e95e2b0
Update test caess
andy-k-improving Jan 13, 2025
52be18b
Minimise diff
andy-k-improving Jan 13, 2025
af264c2
Fix spot
andy-k-improving Jan 13, 2025
89 changes: 89 additions & 0 deletions .github/workflows/integ-tests-with-geo.yml
@@ -0,0 +1,89 @@
name: GeoSpatial Plugin IT

on:
  pull_request:
  push:
    branches-ignore:
      - 'dependabot/**'
    paths:
      - 'integ-test/**'
      - '.github/workflows/integ-tests-with-geo.yml'

jobs:
  Get-CI-Image-Tag:
    uses: opensearch-project/opensearch-build/.github/workflows/get-ci-image-tag.yml@main
    with:
      product: opensearch

  security-it-linux:
Collaborator

Suggested change
- security-it-linux:
+ geospatial-it-linux:

    needs: Get-CI-Image-Tag
    strategy:
      fail-fast: false
      matrix:
        java: [21]
    runs-on: ubuntu-latest
    container:
Collaborator

It seems the container parameter is the only difference between the Linux run and the Windows/macOS run. It might be nice to combine the two jobs and put an if condition around the container so that it is set only when running on ubuntu-latest.

      # using the same image which is used by opensearch-build team to build the OpenSearch Distribution
      # this image tag is subject to change as more dependencies and updates will arrive over time
      image: ${{ needs.Get-CI-Image-Tag.outputs.ci-image-version-linux }}
      options: ${{ needs.Get-CI-Image-Tag.outputs.ci-image-start-options }}

    steps:
      - name: Run start commands
        run: ${{ needs.Get-CI-Image-Tag.outputs.ci-image-start-command }}

      - uses: actions/checkout@v4

      - name: Set up JDK ${{ matrix.java }}
        uses: actions/setup-java@v4
        with:
          distribution: 'temurin'
          java-version: ${{ matrix.java }}

      - name: Build with Gradle
        run: |
          chown -R 1000:1000 `pwd`
          su `id -un 1000` -c "./gradlew integTestWithGeo"

      - name: Upload test reports
        if: ${{ always() }}
        uses: actions/upload-artifact@v4
        continue-on-error: true
        with:
          name: test-reports-${{ matrix.os }}-${{ matrix.java }}
          path: |
            integ-test/build/reports/**
            integ-test/build/testclusters/*/logs/*
            integ-test/build/testclusters/*/config/*

  security-it-windows-macos:
Collaborator

Suggested change
- security-it-windows-macos:
+ geospatial-it-windows-macos:

    strategy:
      fail-fast: false
      matrix:
        os: [ windows-latest, macos-13 ]
        java: [21]

    runs-on: ${{ matrix.os }}

    steps:
      - uses: actions/checkout@v4

      - name: Set up JDK ${{ matrix.java }}
        uses: actions/setup-java@v4
        with:
          distribution: 'temurin'
          java-version: ${{ matrix.java }}

      - name: Build with Gradle
        run: ./gradlew integTestWithGeo

      - name: Upload test reports
        if: ${{ always() }}
        uses: actions/upload-artifact@v4
        continue-on-error: true
        with:
          name: test-reports-${{ matrix.os }}-${{ matrix.java }}
          path: |
            integ-test/build/reports/**
            integ-test/build/testclusters/*/logs/*
            integ-test/build/testclusters/*/config/*
4 changes: 4 additions & 0 deletions core/src/main/java/org/opensearch/sql/expression/DSL.java
@@ -969,6 +969,10 @@ public static FunctionExpression utc_timestamp(
    return compile(functionProperties, BuiltinFunctionName.UTC_TIMESTAMP, args);
  }

  public static FunctionExpression geoip(Expression... args) {
    return compile(FunctionProperties.None, BuiltinFunctionName.GEOIP, args);
  }

  @SuppressWarnings("unchecked")
  private static <T extends FunctionImplementation> T compile(
      FunctionProperties functionProperties, BuiltinFunctionName bfn, Expression... args) {
@@ -204,6 +204,9 @@ public enum BuiltinFunctionName {
  TRIM(FunctionName.of("trim")),
  UPPER(FunctionName.of("upper")),

  /** GEOSPATIAL Functions. */
  GEOIP(FunctionName.of("geoip")),

  /** NULL Test. */
  IS_NULL(FunctionName.of("is null")),
  IS_NOT_NULL(FunctionName.of("is not null")),
@@ -27,6 +27,7 @@
import org.opensearch.sql.expression.aggregation.AggregatorFunctions;
import org.opensearch.sql.expression.datetime.DateTimeFunctions;
import org.opensearch.sql.expression.datetime.IntervalClause;
import org.opensearch.sql.expression.ip.GeoIPFunctions;
import org.opensearch.sql.expression.ip.IPFunctions;
import org.opensearch.sql.expression.operator.arthmetic.ArithmeticFunctions;
import org.opensearch.sql.expression.operator.arthmetic.MathematicalFunctions;
@@ -83,6 +84,7 @@ public static synchronized BuiltinFunctionRepository getInstance() {
      SystemFunctions.register(instance);
      OpenSearchFunctions.register(instance);
      IPFunctions.register(instance);
      GeoIPFunctions.register(instance);
    }
    return instance;
  }
@@ -0,0 +1,65 @@
/*
 * Copyright OpenSearch Contributors
 * SPDX-License-Identifier: Apache-2.0
 */

package org.opensearch.sql.expression.ip;

import static org.opensearch.sql.data.type.ExprCoreType.BOOLEAN;
import static org.opensearch.sql.data.type.ExprCoreType.STRING;
import static org.opensearch.sql.expression.function.FunctionDSL.define;

import java.util.Arrays;
import java.util.List;
import lombok.experimental.UtilityClass;
import org.apache.commons.lang3.tuple.Pair;
import org.opensearch.sql.data.type.ExprType;
import org.opensearch.sql.expression.function.BuiltinFunctionName;
import org.opensearch.sql.expression.function.BuiltinFunctionRepository;
import org.opensearch.sql.expression.function.DefaultFunctionResolver;
import org.opensearch.sql.expression.function.FunctionBuilder;
import org.opensearch.sql.expression.function.FunctionName;
import org.opensearch.sql.expression.function.FunctionSignature;
import org.opensearch.sql.expression.function.SerializableFunction;

/**
 * Registers the method signatures for the geoip( ) expression. The concrete implementation lives in
 * the `opensearch` module, because IP location lookup requires GeoSpatial plugin runtime support.
 */
@UtilityClass
public class GeoIPFunctions {

  public void register(BuiltinFunctionRepository repository) {
    repository.register(geoIp());
  }

  /**
   * Registers all method signatures related to the geoip( ) expression under eval.
   *
   * @return Resolver for geoip( ) expression.
   */
  private DefaultFunctionResolver geoIp() {
    return define(
        BuiltinFunctionName.GEOIP.getName(),
        openSearchImpl(BOOLEAN, Arrays.asList(STRING, STRING)),
        openSearchImpl(BOOLEAN, Arrays.asList(STRING, STRING, STRING)));
  }

  /**
   * Generates a probe implementation for the given list of argument types, using the marker class
   * `OpenSearchFunctionExpression` to flag the expression as OpenSearch-specific.
   *
   * @param returnType return type.
   * @return Binary Function Implementation.
   */
  public static SerializableFunction<FunctionName, Pair<FunctionSignature, FunctionBuilder>>
      openSearchImpl(ExprType returnType, List<ExprType> args) {
    return functionName -> {
      FunctionSignature functionSignature = new FunctionSignature(functionName, args);
      FunctionBuilder functionBuilder =
          (functionProperties, arguments) ->
              new OpenSearchFunctionExpression(functionName, arguments, returnType);
      return Pair.of(functionSignature, functionBuilder);
    };
  }
}
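
The registration above pairs each accepted argument-type list with a builder, so `geoip` can be called with two or three STRING arguments. A minimal, self-contained sketch of that overload lookup follows. It is not the project's `FunctionDSL` or `DefaultFunctionResolver`: the class `SignatureRegistrySketch`, its `Type` enum, and the `define`/`resolve` methods are hypothetical names introduced only for illustration, and matching here is by exact argument types.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a function resolver; not part of the OpenSearch SQL plugin. */
final class SignatureRegistrySketch {

  /** Simplified types (assumption: the real code uses ExprType values). */
  enum Type { STRING, BOOLEAN }

  // Maps an exact argument-type list to the declared return type of that overload.
  private final Map<List<Type>, Type> signatures = new LinkedHashMap<>();

  /** Registers one overload: its return type and the argument types it accepts. */
  SignatureRegistrySketch define(Type returnType, Type... argTypes) {
    signatures.put(List.of(argTypes), returnType);
    return this;
  }

  /** Returns the declared return type if an overload matches the arguments exactly. */
  Optional<Type> resolve(Type... argTypes) {
    return Optional.ofNullable(signatures.get(List.of(argTypes)));
  }

  /** Mirrors the two geoip overloads registered above: two or three STRING arguments. */
  static SignatureRegistrySketch geoip() {
    return new SignatureRegistrySketch()
        .define(Type.BOOLEAN, Type.STRING, Type.STRING)
        .define(Type.BOOLEAN, Type.STRING, Type.STRING, Type.STRING);
  }

  public static void main(String[] args) {
    SignatureRegistrySketch geoip = geoip();
    // A two-argument call matches a registered overload; a one-argument call does not.
    System.out.println(geoip.resolve(Type.STRING, Type.STRING).isPresent());
    System.out.println(geoip.resolve(Type.STRING).isPresent());
  }
}
```

Keying the map on the full argument list keeps each overload independent, which is roughly why adding the three-argument form needs only one more `define` entry.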
@@ -0,0 +1,47 @@
/*
 *
 * Copyright OpenSearch Contributors
 * SPDX-License-Identifier: Apache-2.0
 *
 */

package org.opensearch.sql.expression.ip;

import java.util.List;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.type.ExprType;
import org.opensearch.sql.expression.Expression;
import org.opensearch.sql.expression.FunctionExpression;
import org.opensearch.sql.expression.env.Environment;
import org.opensearch.sql.expression.function.FunctionName;

/**
 * Marker class identifying functions that are only compatible with the OpenSearch storage engine.
 * Evaluating such a function outside OpenSearch results in an UnsupportedOperationException.
 */
public class OpenSearchFunctionExpression extends FunctionExpression {

  private final ExprType returnType;

  public OpenSearchFunctionExpression(
      FunctionName functionName, List<Expression> arguments, ExprType returnType) {
    super(functionName, arguments);
    this.returnType = returnType;
  }

  @Override
  public ExprValue valueOf() {
    return null;
  }

  @Override
  public ExprValue valueOf(Environment<Expression, ExprValue> valueEnv) {
    throw new UnsupportedOperationException(
        "OpenSearch runtime specific function, no default implementation available");
  }

  @Override
  public ExprType type() {
    return returnType;
  }
}
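
The marker class above deliberately cannot be evaluated in the core module. One way an engine-specific layer could make use of such a marker is to detect it and swap in an executable expression. The sketch below illustrates that idea only; it is an assumption about the general pattern, not the code in this pull request. `MarkerExpressionSketch`, `Expr`, `EngineOnlyExpr`, and `bind` are hypothetical names, and the lookup is a stub returning made-up data.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the marker-expression pattern; not the plugin's actual classes. */
final class MarkerExpressionSketch {

  /** Minimal expression abstraction (assumption: stands in for FunctionExpression). */
  interface Expr {
    Object evaluate();
  }

  /** Marker: records the call but offers no engine-independent evaluation. */
  static final class EngineOnlyExpr implements Expr {
    final String functionName;
    final String ipAddress;

    EngineOnlyExpr(String functionName, String ipAddress) {
      this.functionName = functionName;
      this.ipAddress = ipAddress;
    }

    @Override
    public Object evaluate() {
      throw new UnsupportedOperationException(
          functionName + ": engine specific function, no default implementation available");
    }
  }

  /** Hypothetical engine-side step: replaces a marker with a lookup-backed expression. */
  static Expr bind(Expr expr, Function<String, Map<String, String>> lookup) {
    if (expr instanceof EngineOnlyExpr) {
      EngineOnlyExpr marker = (EngineOnlyExpr) expr;
      return () -> lookup.apply(marker.ipAddress);
    }
    return expr; // ordinary expressions pass through unchanged
  }

  public static void main(String[] args) {
    Expr marker = new EngineOnlyExpr("geoip", "192.0.2.1");
    // Stub lookup with made-up data; a real engine would query its geo data source here.
    Expr bound = bind(marker, ip -> Map.of("queried_ip", ip));
    System.out.println(bound.evaluate());
  }
}
```

Throwing from the marker's own evaluation keeps the failure loud if a query reaches an engine that never performs the swap.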
@@ -0,0 +1,53 @@
/*
 * Copyright OpenSearch Contributors
 * SPDX-License-Identifier: Apache-2.0
 */

package org.opensearch.sql.expression.ip;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNull;
import static org.junit.jupiter.api.Assertions.assertThrows;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.opensearch.sql.data.type.ExprCoreType.BOOLEAN;
import static org.opensearch.sql.data.type.ExprCoreType.STRING;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.expression.DSL;
import org.opensearch.sql.expression.Expression;
import org.opensearch.sql.expression.env.Environment;

@ExtendWith(MockitoExtension.class)
public class GeoIPFunctionTest {

  // Mock value environment for testing.
  @Mock private Environment<Expression, ExprValue> env;

  @Test
  public void geoIpDefaultImplementation() {
    UnsupportedOperationException exception =
        assertThrows(
            UnsupportedOperationException.class,
            () ->
                DSL.geoip(DSL.literal("HARDCODED_DATASOURCE_NAME"), DSL.ref("ip_address", STRING))
                    .valueOf(env));
    assertTrue(exception.getMessage().matches(".*no default implementation available"));
  }

  @Test
  public void testGeoipFunctionSignature() {
    var geoip = DSL.geoip(DSL.literal("HARDCODED_DATASOURCE_NAME"), DSL.ref("ip_address", STRING));
    assertEquals(BOOLEAN, geoip.type());
  }

  /** Ensures that no logic is evaluated when no environment is passed. */
  @Test
  public void testDefaultValueOf() {
    var geoip = DSL.geoip(DSL.literal("HARDCODED_DATASOURCE_NAME"), DSL.ref("ip_address", STRING));
    assertNull(geoip.valueOf());
  }
}
38 changes: 38 additions & 0 deletions docs/user/ppl/functions/geoip.rst
@@ -0,0 +1,38 @@
========================
Geo IP Address Functions
========================

.. rubric:: Table of contents

.. contents::
:local:
:depth: 1

GEOIP
---------

Description
>>>>>>>>>>>

Usage: `geoip(dataSourceName, ipAddress, options)` looks up location information for the given IP address via the OpenSearch GeoSpatial plugin API.

Argument type: STRING, STRING, STRING

Return type: Tuple

Example:

    os> source=weblogs | eval LookupResult = geoip("dataSourceName", "50.68.18.229", "country_iso_code,city_name")
    fetched rows / total rows = 1/1
    +------------------------------------------------------+
    | LookupResult                                         |
    |------------------------------------------------------|
    | {'city_name': 'Vancouver', 'country_iso_code': 'CA'} |
    +------------------------------------------------------+


Note:

- `dataSourceName` must be an existing data source in the OpenSearch GeoSpatial plugin. For configuration details, see https://opensearch.org/docs/latest/ingest-pipelines/processors/ip2geo/
- `ipAddress` can be an IPv4 or an IPv6 address.
- `options` is a comma-separated string that specifies which fields to output. The available fields depend on the data source provider's schema. For example, the geolite2-city dataset provides the fields "country_iso_code", "country_name", "continent_name", "region_iso_code", "region_name", "city_name", "time_zone", "location".
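
The `options` argument described above is a comma-separated list of field names. A minimal sketch of how such a string could be split and applied to a lookup record is shown here. It is illustrative only: `GeoIpOptionsSketch`, `parseOptions`, and `select` are hypothetical names, and treating an empty selection as "keep every field" is an assumption, not documented behaviour of the plugin.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for the comma-separated options string; not the plugin's code. */
final class GeoIpOptionsSketch {

  /** Splits a comma-separated option string into trimmed, non-empty field names. */
  static List<String> parseOptions(String options) {
    List<String> fields = new ArrayList<>();
    if (options == null) {
      return fields;
    }
    for (String part : options.split(",")) {
      String field = part.trim();
      if (!field.isEmpty()) {
        fields.add(field);
      }
    }
    return fields;
  }

  /** Keeps only the requested fields; an empty request keeps everything (assumption). */
  static Map<String, Object> select(Map<String, Object> record, List<String> fields) {
    if (fields.isEmpty()) {
      return new LinkedHashMap<>(record);
    }
    Map<String, Object> selected = new LinkedHashMap<>();
    for (String field : fields) {
      if (record.containsKey(field)) {
        selected.put(field, record.get(field));
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    // Record values taken from the documentation example above, plus one extra field.
    Map<String, Object> record = new LinkedHashMap<>();
    record.put("country_iso_code", "CA");
    record.put("city_name", "Vancouver");
    record.put("time_zone", "America/Vancouver");
    System.out.println(select(record, parseOptions("country_iso_code,city_name")));
  }
}
```

Unknown field names are silently skipped in this sketch; a stricter variant could reject them instead.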

2 changes: 2 additions & 0 deletions docs/user/ppl/index.rst
@@ -106,6 +106,8 @@ The query start with search command and then flowing a set of command delimited

  - `IP Address Functions <functions/ip.rst>`_

  - `Geo IP Address Functions <functions/geoip.rst>`_

* **Optimization**

  - `Optimization <../../user/optimization/optimization.rst>`_