
[Logs+] Add API to create application logs integration during on-boarding #159991

Closed
weltenwort opened this issue Jun 20, 2023 · 13 comments · Fixed by #160777
Labels
Feature:Logs Onboarding Logs Onboarding feature Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services

Comments


weltenwort commented Jun 20, 2023

📓 Summary

Currently, the log entries ingested after executing the "custom logs" on-boarding flow rely on the default logs-*-* templates shipped with Elasticsearch. That means the user can't customize the ingest pipeline or mappings for the specific dataset they're on-boarding. We want to provide an API that the workflow can use to create an integration for the newly on-boarded dataset.

✔️ Acceptance criteria

  • There is a fleet API that creates all the saved objects to bring a new integration into existence.
  • The API respects the fleet plugin's conventions.
  • The API handler re-uses the integration installation mechanism that already exists as much as possible.
  • The API blocks until the installation is complete.
  • The API reports the failure or success with appropriate status codes and response payloads.
  • The API receives the following parameters:
    • integrationName: string
    • datasetType: 'logs' | 'metrics' | 'traces' | 'synthetics' | 'profiling'
    • datasetName: string
  • The API creates the following assets:
    • package manifest with...
      • format_version: 2.9.0
      • name: integration name
      • title: capitalized integration name
      • description: "Collect logs for the dataset {dataset}."
      • version: 1.0.0
      • owner: {"github": "$currentKibanaUser"}
      • type: "integration"
      • conditions: {"kibana.version": ">=$currentKibanaVersion"}
    • datastream with...
      • datastream manifest
        • format_version: 2.8.1
        • dataset: dataset name
        • title: capitalized dataset name
        • type: dataset type
      • index template
        • derived from fields.yml
        • with basic fields
        • with agent fields
        • with reference to the common logs ILM policy
      • ingest pipeline
  • If the API call completes successfully the new integration is listed in the list of installed integrations.
  • If the API call fails no orphaned assets are left behind.
  • There are integration tests that cover this new API.
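The request parameters and manifest fields listed above can be sketched as a small TypeScript helper. This is an illustrative assumption, not the actual Fleet implementation; the names `buildManifest` and `capitalize` are hypothetical, and the field values follow the acceptance criteria above.

```typescript
// Hypothetical sketch of the request body and the generated package manifest,
// following the acceptance criteria above. Not the actual Fleet code.
type DatasetType = 'logs' | 'metrics' | 'traces' | 'synthetics' | 'profiling';

interface CustomIntegrationRequest {
  integrationName: string;
  datasetType: DatasetType;
  datasetName: string;
}

const capitalize = (s: string) => s.charAt(0).toUpperCase() + s.slice(1);

// Builds the package manifest fields described in the acceptance criteria.
function buildManifest(
  { integrationName, datasetName }: CustomIntegrationRequest,
  kibanaVersion: string,
  kibanaUser: string
) {
  return {
    format_version: '2.9.0',
    name: integrationName,
    title: capitalize(integrationName),
    description: `Collect logs for the dataset ${datasetName}.`,
    version: '1.0.0',
    owner: { github: kibanaUser },
    type: 'integration',
    conditions: { 'kibana.version': `>=${kibanaVersion}` },
  };
}
```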
@weltenwort weltenwort added Feature:Logs UI Logs UI feature Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services labels Jun 20, 2023
elasticmachine commented:

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)


weltenwort commented Jun 20, 2023

Open Questions

Is the manifest content accurate?

  • User can't yet enter a title, a description, nor an icon during the on-boarding
  • What are the semantics of the owner?
  • Should we include a license?
  • Should we set a type?
  • Should we set conditions?
  • Should we set categories?

Verification

  • The integrations UI warns about the integration not being signed. Is that acceptable or do we want Kibana to sign them?

DLM policy

  • What should the DLM policy be?

weltenwort commented:

@ruflin, your feedback would be welcome, especially about the manifest content

@yngrdyn, @ogupte, do you think this would work for the on-boarding workflow?


ruflin commented Jun 21, 2023

Some quick notes:

  • description: "Collect logs for the dataset {dataset} with Elastic Agent."
    • Can we leave out "with Elastic Agent" here?
  • {"github": "elastic"}
    • I think ideally we should set a user here that matches the logged-in user. If we can't, let's not make it elastic, as it might be confusing when a user later exports the integration and elastic is indicated as the owner.
  • manifest version: We need to find a way to keep this up-to-date with new versions of Kibana.
  • Verification: Yes, it is acceptable / expected that it is not signed. But we might change the UI for these slightly to make it clear this is ok for your own integrations.
  • DLM: Doesn't exist yet. The ILM policy by default should be the global logs ILM policy.

weltenwort commented:

@ruflin thanks for the feedback!

Can we leave out "with Elastic Agent" here?

Sure, I just copied the package spec recommendation.

{"github": "elastic"}

The current user would work, I guess. When we add a publishing workflow at some point that should probably ask for the "real" value.

manifest version: We need to find a way to keep this up-to-date with new versions of Kibana

Isn't this the package spec version? Wouldn't we only have to update it when we use new features from the package spec?

The ILM policy by default should be the global logs ILM policy.

Do we copy it or refer to it?


ruflin commented Jun 26, 2023

Isn't this the package spec version? Wouldn't we only have to update it when we use new features from the package spec?

Yes, it is the package spec version. You are right that we only have to update it when we use a new feature. At the same time, it would be nice if newer versions of Kibana always created the "current" version of the package spec.

Do we copy it or refer to it?

Good question. So far, all integrations refer to it. We should likely discuss whether to keep this. As soon as a user makes some modifications, we would copy it. Eventually this will hopefully be DLM rather than ILM, which removes the problem.
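Referring to the shared policy (rather than copying it) would mean the generated index template only carries a settings pointer to the policy name. A minimal sketch, assuming the default `logs` ILM policy name shipped with Elasticsearch:

```typescript
// Sketch: the generated index template references the shared `logs` ILM
// policy by name instead of embedding a copy of the policy body.
const indexTemplateSettings = {
  index: {
    lifecycle: { name: 'logs' }, // reference to the global logs ILM policy
  },
};
```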

weltenwort commented:

Yes, it is the package spec version. You are right that we only have to update it when we use a new feature. At the same time, it would be nice if newer versions of Kibana always created the "current" version of the package spec.

As far as I can tell, Kibana right now doesn't even know which package version it supports. @juliaElastic @jsoriano is there a mapping available to the code that represents the association between package spec and Kibana version?

weltenwort commented:

Given that the packages have to specify the Kibana version, I've added a condition entry to the manifest ACs above. Does anyone see a problem with >=$currentKibanaVersion?


ruflin commented Jun 27, 2023

Does anyone see a problem with >=$currentKibanaVersion

This LGTM. With serverless, the Kibana version will become irrelevant and the package version will be important for compatibility. So we will need both to be up-to-date. For now, let's do it manually. @jsoriano or @juliaElastic might have ideas on how to keep it in sync. My assumption would be that the supported version will eventually be coded somewhere in the package manager in Kibana / Fleet because of the above change.


jsoriano commented Jun 30, 2023

That means the user can't customize the ingest pipeline or mapping for the specific dataset they're on-boarding. We want to provide an API that the workflow can use to create a integration for the newly on-boarded dataset.

I think that the custom logs input package (available since Kibana 8.8.0) already covers this.

When you create a policy with a custom dataset, Fleet already creates a logs-<dataset>@custom component template that can be used for customization.
(screenshot of the `logs-<dataset>@custom` component template)

It also creates a logs-<dataset>-<version> pipeline that references a logs-<dataset>@custom one:
(screenshot of the pipeline referencing the `@custom` pipeline)

It is true that this is not supported by current integration packages intended to be used as inputs. But I would rather consider this an area of improvement (I opened an issue about this #160775).
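The naming convention described above (a Fleet-created `@custom` component template and a versioned pipeline that references it) can be illustrated with two small helpers. The function names are hypothetical; only the name patterns come from the comment above.

```typescript
// Illustrates the Fleet naming convention described above: for a custom
// dataset, Fleet creates a `logs-<dataset>@custom` component template and a
// `logs-<dataset>-<version>` ingest pipeline that references the @custom one.
// Helper names are hypothetical.
function customComponentTemplateName(dataset: string): string {
  return `logs-${dataset}@custom`;
}

function versionedPipelineName(dataset: string, packageVersion: string): string {
  return `logs-${dataset}-${packageVersion}`;
}
```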

Is this new API still necessary if input packages already provide this functionality?


ruflin commented Jun 30, 2023

Is this new API still necessary if input packages already provide this functionality?

Yes. The goal is to create a new integration. What currently happens with input packages is that the templates and pipelines are created, but I don't think they then belong to an integration. In your example above, this would be the foo integration. Also important: a Fleet policy is not part of our workflow.


jsoriano commented Jul 4, 2023

Also important, a Fleet policy is not part of our workflow.

Ok, this is an important difference 👍

The use case is then for users that don't need (or want) to use either Fleet or pre-built integrations?

It may be good in any case to have alignment between how Fleet creates the dataset-specific templates and pipelines and how this new API will do it.

type: "integration"

Given that this is going to be more similar to an input package, should we use type: "input" instead? This will probably be simpler for an integration with a single data stream.

name: integration name
version: 1.0.0

What kind of integration names do we expect here? I wonder if there can be conflicts if this API is called for example with integrationName=apache and the user also tries to use the Apache package.


ruflin commented Jul 5, 2023

The use case is then for users that don't need (or want) to use either Fleet or pre-built integrations?

Correct, they build their own.

Given that this is going to be more similar to an input package, should we use type: "input" instead? This will probably be simpler for an integration with a single data stream.

I think the opposite is true. Let's assume the nginx integration did not exist. The user would ship data to logs-nginx.access-default and then build templates / pipelines / dashboards on top of it. They just built an integration. Eventually, a user should be able to add more data streams to their integration.

What kind of integration names do we expect here? I wonder if there can be conflicts if this API is called for example with integrationName=apache and the user also tries to use the Apache package.

It is a problem we will hit. Eventually we will need something like a "namespace" prefix for integrations. To get started, I would ignore it. A user just can't create an integration with a name that is already installed. Ideally, Fleet would also not install an integration that already exists locally ...
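The name-conflict rule described above ("a user just can't create an integration with a name that is already installed") could look like the following sketch. `assertNameAvailable` and `installedPackageNames` are illustrative assumptions; in practice the list would come from Fleet's installed-package records.

```typescript
// Sketch of the name-conflict rule discussed above: creating a custom
// integration fails if a package with that name is already installed.
// This is an illustration, not the actual Fleet implementation.
function assertNameAvailable(name: string, installedPackageNames: Set<string>): void {
  if (installedPackageNames.has(name)) {
    throw new Error(`An integration named "${name}" is already installed.`);
  }
}
```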

Kerry350 added a commit that referenced this issue Jul 13, 2023
## Summary

Closes #159991

The fields used to fulfil the `basic` and `agent` field sets can easily be amended if they turn out to be incorrect.

Multiple datasets are supported, and these can contain more than one
type.

## Testing

A curl command similar to the following should allow you to hit the API
(check the credentials etc):

```shell
curl -XPOST -u 'elastic:changeme' -H 'kbn-xsrf: something' -d '{
    "integrationName": "web_custom_nginx",
    "datasets": [{"name": "access", "type": "logs"}, {"name": "error", "type": "metrics"}, {"name": "warning", "type":"logs"}]
}' 'http://localhost:5601/<BASE_PATH>/api/fleet/epm/custom_integrations'
```

## History / context

- Prototype learnings: #158552 (comment)
- Prototype PR: #160003

## Results / expectations

API response (with installed assets):

![Screenshot 2023-07-05 at 16 56 33](https://github.com/elastic/kibana/assets/471693/fc4a0bab-7057-430a-8c03-18dd4ee17ab7)

We see the custom integration in "installed integrations" (albeit with a
verification warning):

![Screenshot 2023-07-05 at 16 57 14](https://github.com/elastic/kibana/assets/471693/0c9177d2-2871-490f-9b5c-f338e96484c4)

We see the custom integration in Discover with the logs explorer
profile:

![Screenshot 2023-07-05 at 16 58 20](https://github.com/elastic/kibana/assets/471693/30c556f2-9fcd-416e-8047-5976fc11ffa2)

The assets are installed correctly:

![Screenshot 2023-07-05 at 16 59 06](https://github.com/elastic/kibana/assets/471693/abb82632-f619-4fc3-be93-dc6ce97abedd)

![Screenshot 2023-07-05 at 16 59 20](https://github.com/elastic/kibana/assets/471693/ca1c1da5-1e4b-422c-9edb-0f56e0ed3f98)

![Screenshot 2023-07-05 at 16 59 36](https://github.com/elastic/kibana/assets/471693/8bd60d7e-aebc-4833-b423-eba3336fb42c)
@gbamparop gbamparop added Feature:Logs Onboarding Logs Onboarding feature and removed Feature:Logs UI Logs UI feature labels Nov 10, 2023