Add Data Source APIs #9903

148 changes: 148 additions & 0 deletions _api-reference/data-source-apis/create-data-source.md
@@ -0,0 +1,148 @@
---
layout: default
title: Create Data Source API
parent: Data Source APIs
nav_order: 10
---

# Create data source API

**Introduced 2.4**
{: .label .label-purple }

The Create Data Source API allows you to register and configure external data sources that can be queried through OpenSearch's query engine. Data sources represent connections to external databases or services, enabling cross-database querying and data federation capabilities.

<!-- spec_insert_start
api: query.datasources_create
component: endpoints
-->
## Endpoints
```json
POST /_plugins/_query/_datasources
```
<!-- spec_insert_end -->

## Request body fields

The request body is required. It is a JSON object with the following fields:

| Property | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `connector` | **Required** | String | The connector type for the data source, such as `s3glue`, `prometheus`, or `mysql`. |
| `name` | **Required** | String | The name of the data source. This must be unique within the cluster. |
| `properties` | **Required** | Object | The configuration properties for the data source, specific to each connector type. |
| `resultIndex` | **Required** | String | The index where query results from this data source are stored. |
| `status` | **Required** | String | The current status of the data source. Set to `ACTIVE` for a new data source. |
| `allowedRoles` | Optional | Array of strings | The list of roles that are allowed to access this data source. |
| `configuration` | Optional | Object | Additional configuration settings for the data source connection. |
| `description` | Optional | String | A human-readable description of the data source. |

<details markdown="block">
<summary>
Request body fields: <code>configuration</code>
</summary>
{: .text-delta }

`configuration` is a JSON object with the following fields:

| Property | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `credentials` | **Required** | Object | The authentication credentials for the data source. |
| `endpoint` | **Required** | String | The connection endpoint for the data source, such as a URL or hostname. |

</details>

<details markdown="block">
<summary>
Request body fields: <code>configuration</code> > <code>credentials</code>
</summary>
{: .text-delta }

`credentials` is a JSON object with the following fields:

| Property | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `username` | **Required** | String | The username for authentication. |
| `password` | **Required** | String | The password for authentication. |

</details>

<details markdown="block">
<summary>
Request body fields: <code>properties</code>
</summary>
{: .text-delta }

The `properties` object contains fields specific to each connector type. Each connector requires different properties for establishing connections and executing queries. Refer to the specific connector documentation for details about required properties.

</details>
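
The required fields listed in the preceding tables can be checked on the client before a request is sent. The following Python sketch is illustrative only: the `missing_fields` helper is hypothetical and not part of any OpenSearch client, and it assumes the nested `configuration` and `credentials` requirements apply only when those objects are present.

```python
from __future__ import annotations

# Required top-level fields, taken from the request body table above.
REQUIRED_FIELDS = ("connector", "name", "properties", "resultIndex", "status")
# Required nested fields, checked only when the parent object is supplied.
REQUIRED_CONFIGURATION_FIELDS = ("credentials", "endpoint")
REQUIRED_CREDENTIAL_FIELDS = ("username", "password")


def missing_fields(body: dict) -> list[str]:
    """Return the dotted paths of required fields absent from a request body.

    Hypothetical helper: it mirrors the documentation tables and does not
    replace the validation performed by the server.
    """
    missing = [name for name in REQUIRED_FIELDS if name not in body]

    configuration = body.get("configuration")
    if configuration is not None:
        missing += [
            f"configuration.{name}"
            for name in REQUIRED_CONFIGURATION_FIELDS
            if name not in configuration
        ]
        credentials = configuration.get("credentials")
        if credentials is not None:
            missing += [
                f"configuration.credentials.{name}"
                for name in REQUIRED_CREDENTIAL_FIELDS
                if name not in credentials
            ]
    return missing
```

An empty list means that the body contains every field marked as required. Connector-specific `properties` are not checked because they differ for each connector type.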

## Example request

The following example creates a MySQL data source:

```json
POST /_plugins/_query/_datasources
{
  "name": "mysql-customer-database",
  "description": "Connection to customer database",
  "connector": "mysql",
  "resultIndex": "customer_data_results",
  "status": "ACTIVE",
  "allowedRoles": ["analyst_role", "admin_role"],
  "configuration": {
    "endpoint": "jdbc:mysql://mysql-server:3306/customers",
    "credentials": {
      "username": "db_user",
      "password": "db_password"
    }
  },
  "properties": {
    "database": "customers",
    "port": "3306",
    "host": "mysql-server",
    "encryption": true,
    "connectionTimeout": 30000
  }
}
```
{% include copy-curl.html %}

## Example response

```json
{
  "dataSourceId": "gxUYL4QB-QllXHAp5pF4",
  "name": "mysql-customer-database",
  "status": "ACTIVE",
  "message": "Data source created successfully."
}
```

## Response body fields

The response body is a JSON object with the following fields:

| Property | Data type | Description |
| :--- | :--- | :--- |
| `dataSourceId` | String | The unique identifier for the created data source. |
| `name` | String | The name of the data source. |
| `status` | String | The status of the data source. |
| `message` | String | A message describing the result of the operation. |

## Usage notes

When you create a data source, consider both security and performance. Data sources connect your OpenSearch cluster to external data systems, enabling cross-database queries and data federation, but a misconfigured data source can become a security risk or a performance bottleneck. The following guidelines help you configure data sources securely and efficiently:

- **Security**: Store credentials securely. Consider using a secrets manager or environment variables instead of hard-coding credentials.

- **Connection testing**: After creating a data source, use the test connection API to verify connectivity.

- **Role-based access**: Use `allowedRoles` to restrict access to the data source to specific user roles.

- **Result index**: The `resultIndex` parameter specifies where query results will be stored. Ensure appropriate index permissions are configured.

- **Connector types**: Different connector types have specific property requirements. Consult the documentation for your specific connector.

- **Status management**: A newly created data source typically has the status `ACTIVE`. Other possible statuses include `DISABLED` or `ERROR`.

- **Permissions**: Users need appropriate permissions to create data sources. Typically, this requires administrative privileges.
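
As one way to follow the security guideline above, the `credentials` object can be assembled from environment variables at request time instead of being written into a script. This is a minimal Python sketch under stated assumptions: the `credentials_from_env` helper and the variable names `DATASOURCE_USERNAME` and `DATASOURCE_PASSWORD` are hypothetical and are not defined by OpenSearch.

```python
import os


def credentials_from_env(
    env=None,
    username_var="DATASOURCE_USERNAME",  # hypothetical variable name
    password_var="DATASOURCE_PASSWORD",  # hypothetical variable name
) -> dict:
    """Build the `credentials` object from environment variables.

    `env` defaults to the process environment; passing a plain dict makes
    the function easy to exercise without touching real variables.
    """
    env = os.environ if env is None else env
    try:
        return {"username": env[username_var], "password": env[password_var]}
    except KeyError as exc:
        # Report which variable is missing without echoing any secret value.
        raise RuntimeError(
            f"Environment variable {exc.args[0]} is not set"
        ) from None
```

The returned dictionary can then be placed under `configuration.credentials` in the request body. A secrets manager could be substituted by replacing the lookup with a call to that service.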
79 changes: 79 additions & 0 deletions _api-reference/data-source-apis/delete-data-source.md
@@ -0,0 +1,79 @@
---
layout: default
title: Delete Data Source API
parent: Data Source APIs
nav_order: 50
---

# Delete data source API

**Introduced 2.4**
{: .label .label-purple }

The Delete Data Source API allows you to remove an existing data source from your OpenSearch cluster. This API is useful when you need to decommission a data source that is no longer needed, remove a misconfigured data source, or clean up unused resources.

<!-- spec_insert_start
api: query.datasource_delete
component: endpoints
-->
## Endpoints
```json
DELETE /_plugins/_query/_datasources/{datasource_name}
```
<!-- spec_insert_end -->

<!-- spec_insert_start
api: query.datasource_delete
component: path_parameters
-->
## Path parameters

The following table lists the available path parameters.

| Parameter | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `datasource_name` | **Required** | String | The name of the data source to delete. |

<!-- spec_insert_end -->

## Example request

```json
DELETE /_plugins/_query/_datasources/mysql-customer-database
```
{% include copy-curl.html %}

## Example response

```json
{
  "dataSourceId": "gxUYL4QB-QllXHAp5pF4",
  "name": "mysql-customer-database",
  "message": "Data source deleted successfully."
}
```

## Response body fields

The response body is a JSON object with the following fields:

| Property | Data type | Description |
| :--- | :--- | :--- |
| `dataSourceId` | String | The unique identifier of the deleted data source. |
| `name` | String | The name of the deleted data source. |
| `message` | String | A message describing the result of the operation. |

## Usage notes

When deleting data sources, consider these important implications:

- **Permanent operation**: Deleting a data source is a permanent operation and cannot be undone. If you need the data source again in the future, you will need to recreate it.

- **Active queries**: Deleting a data source fails if active queries are using it. Make sure that no queries are running against the data source before you delete it.

- **Result index**: The `resultIndex` associated with the data source is not automatically deleted. If you want to remove the stored query results, you'll need to delete that index separately.

- **Dependent objects**: Any saved queries, visualizations, or dashboards that reference the deleted data source may stop functioning. Ensure that you update or remove any dependent objects before or after deleting a data source.

- **Permissions**: Deleting data sources typically requires administrative privileges.

- **Cleanup considerations**: Consider backing up the data source configuration before deletion if there's a possibility you might need to recreate it in the future.
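
The backup suggested in the last item could be taken from the body returned by the Get Data Source API. The following Python sketch assumes that response shape; the `backup_snapshot` helper is hypothetical. Because the Get Data Source API masks passwords, the snapshot drops the `password` field, so the password must be supplied again when the data source is recreated.

```python
import json


def backup_snapshot(get_response: dict) -> str:
    """Serialize a data source definition for safekeeping before deletion.

    Hypothetical helper. It omits the server-assigned `id` and the masked
    `password`, neither of which can be reused in a new create request.
    The input dictionary is left unmodified.
    """
    snapshot = {key: value for key, value in get_response.items() if key != "id"}

    configuration = snapshot.get("configuration")
    if configuration and "credentials" in configuration:
        credentials = {
            key: value
            for key, value in configuration["credentials"].items()
            if key != "password"
        }
        snapshot["configuration"] = {**configuration, "credentials": credentials}

    return json.dumps(snapshot, indent=2, sort_keys=True)
```

The resulting JSON string can be written to a file or stored in version control alongside other cluster configuration.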
141 changes: 141 additions & 0 deletions _api-reference/data-source-apis/get-data-source.md
@@ -0,0 +1,141 @@
---
layout: default
title: Get Data Source API
parent: Data Source APIs
nav_order: 30
---

# Get data source API

**Introduced 2.4**
{: .label .label-purple }

The Get Data Source API retrieves detailed information about a specific data source by name. This API is useful when you need to inspect the configuration, connection details, or status of an individual data source in your OpenSearch environment.

<!-- spec_insert_start
api: query.datasource_retrieve
component: endpoints
-->
## Endpoints
```json
GET /_plugins/_query/_datasources/{datasource_name}
```
<!-- spec_insert_end -->

<!-- spec_insert_start
api: query.datasource_retrieve
component: path_parameters
-->
## Path parameters

The following table lists the available path parameters.

| Parameter | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `datasource_name` | **Required** | String | The name of the data source to retrieve. |

<!-- spec_insert_end -->

## Example request

```json
GET /_plugins/_query/_datasources/mysql-customer-database
```
{% include copy-curl.html %}

## Example response

```json
{
  "id": "gxUYL4QB-QllXHAp5pF4",
  "name": "mysql-customer-database",
  "description": "Connection to customer database",
  "connector": "mysql",
  "resultIndex": "customer_data_results",
  "status": "ACTIVE",
  "allowedRoles": ["analyst_role", "admin_role"],
  "configuration": {
    "endpoint": "jdbc:mysql://mysql-server:3306/customers",
    "credentials": {
      "username": "db_user",
      "password": "******"
    }
  },
  "properties": {
    "database": "customers",
    "port": "3306",
    "host": "mysql-server",
    "encryption": true,
    "connectionTimeout": 30000
  }
}
```

## Response body fields

The response body is a JSON object with the following fields:

| Property | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `connector` | **Required** | String | The connector type for the data source. |
| `name` | **Required** | String | The name of the data source. |
| `properties` | **Required** | Object | The configuration properties for the data source. |
| `resultIndex` | **Required** | String | The index where query results are stored. |
| `status` | **Required** | String | The current status of the data source. |
| `allowedRoles` | Optional | Array of strings | The roles allowed to access this data source. |
| `configuration` | Optional | Object | Additional configuration settings for the data source connection. |
| `description` | Optional | String | The description of the data source. |
| `id` | **Required** | String | The unique identifier for the data source. |

<details markdown="block">
<summary>
Response body fields: <code>configuration</code>
</summary>
{: .text-delta }

`configuration` is a JSON object with the following fields:

| Property | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `credentials` | **Required** | Object | Authentication credentials for the data source. |
| `endpoint` | **Required** | String | The connection endpoint for the data source. |

</details>

<details markdown="block">
<summary>
Response body fields: <code>configuration</code> > <code>credentials</code>
</summary>
{: .text-delta }

`credentials` is a JSON object with the following fields:

| Property | Required | Data type | Description |
| :--- | :--- | :--- | :--- |
| `username` | **Required** | String | The username for authentication. |
| `password` | **Required** | String | The password for authentication (masked in the response). |

</details>

<details markdown="block">
<summary>
Response body fields: <code>properties</code>
</summary>
{: .text-delta }

The `properties` object contains fields specific to each connector type. Each connector includes different properties depending on its type and configuration.

</details>

## Usage notes

The Get Data Source API is particularly useful in these scenarios:

- **Configuration verification**: Confirm that a data source is configured correctly before attempting to query it.

- **Troubleshooting**: When experiencing issues with a specific data source, retrieve its configuration to diagnose the problem.

- **Documentation**: Generate detailed documentation for a specific data source.

- **Auditing**: Review the configuration of a specific data source for security or compliance purposes.

If the specified data source doesn't exist, the API returns a `404 Not Found` error. For security reasons, sensitive information such as passwords is masked in the response. Restrict access to this API to users who need to view data source configurations.
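
A client might handle these outcomes as in the following Python sketch. The `describe_get_result` helper is hypothetical, and it assumes two things not stated by the API itself: that a successful call returns HTTP status `200`, and that masked passwords consist only of asterisks, as in the example response.

```python
def describe_get_result(status_code: int, body: dict) -> str:
    """Summarize the outcome of a Get Data Source call.

    Hypothetical helper: `status_code` and `body` are the HTTP status and
    parsed JSON body obtained from whichever HTTP client is in use.
    """
    if status_code == 404:
        return "not found"
    if status_code != 200:  # assumption: 200 indicates success
        return f"unexpected status {status_code}"

    credentials = body.get("configuration", {}).get("credentials", {})
    password = credentials.get("password", "")
    # Assumption: a masked (or absent) password contains only asterisks.
    masked = set(password) <= {"*"}

    state = "masked" if masked else "NOT masked"
    return f"{body.get('name')}: {body.get('status')}, password {state}"
```

Treating an unmasked password as a distinct outcome makes it easier to notice if a response is ever logged or stored with a readable secret.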