out_es: add cloud_apikey configuration #7935

Open: wants to merge 1 commit into base: master


@soedar soedar commented Sep 18, 2023

Adds Elastic Cloud API Key support to the out_es plugin. This patch adds a new config option, cloud_apikey, whose value is sent with each HTTP request in the Authorization: Apikey <cloud_apikey> header.
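
As a rough illustration of the header described above, here is a minimal standalone sketch. It is not the code from this patch: format_apikey_header is a hypothetical helper that uses only the C standard library instead of the flb_http_client API, and it spells the scheme "ApiKey" as the Elasticsearch documentation does (the description above writes it as "Apikey").

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Fluent Bit: build the value of the
 * Authorization header for an Elastic Cloud API key.
 *
 * Returns the number of bytes written (excluding the trailing NUL), or -1
 * if an argument is invalid or the buffer is too small.
 */
static int format_apikey_header(char *buf, size_t size, const char *apikey)
{
    int len;

    if (buf == NULL || apikey == NULL || size == 0) {
        return -1;
    }

    /* The value is the scheme name, a single space, then the key as given */
    len = snprintf(buf, size, "ApiKey %s", apikey);
    if (len < 0 || (size_t) len >= size) {
        /* Truncated or failed: never send a partial credential */
        return -1;
    }

    return len;
}
```

A caller would pass the resulting string as the value of the Authorization header when building the /_bulk request, and skip the request if the helper returns -1.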

Addresses #6727. While we could reuse the cloud_auth config option, we would have to make additional assumptions about the API key to identify it properly (i.e. it does not contain :, is base64 encoded, etc.).
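
To make that trade-off concrete, here is a sketch of the kind of guess that reusing cloud_auth would require. looks_like_apikey is a hypothetical function, not something in Fluent Bit, and it assumes cloud_auth otherwise holds user:password style credentials.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical heuristic, not part of Fluent Bit: guess whether a
 * credential string is an API key rather than a user:password pair.
 *
 * Returns 1 if the value is non-empty, has no ':' and uses only
 * characters from the standard base64 alphabet; returns 0 otherwise.
 */
static int looks_like_apikey(const char *value)
{
    size_t i;

    if (value == NULL || value[0] == '\0') {
        return 0;
    }

    for (i = 0; value[i] != '\0'; i++) {
        unsigned char ch = (unsigned char) value[i];

        /* A ':' suggests user:password credentials */
        if (ch == ':') {
            return 0;
        }

        /* Anything outside A-Z, a-z, 0-9, '+', '/', '=' is not base64 */
        if (!isalnum(ch) && ch != '+' && ch != '/' && ch != '=') {
            return 0;
        }
    }

    return 1;
}
```

The check is fragile: a plain alphanumeric string such as a bare username also passes it, which is part of the reason for a dedicated cloud_apikey option.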


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

fluent/fluent-bit-docs#1213

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@soedar
Author

soedar commented Sep 18, 2023

Example configuration

[SERVICE]
    Flush     1
    Daemon    off
    Log_Level debug

[INPUT]
    Name      cpu

[OUTPUT]
    Name      stdout
    Match     *

[OUTPUT]
    Name                es
    Match               *
    tls                 On
    tls.verify          Off
    Cloud_Id            <redacted>
    Cloud_Apikey        <redacted>
    Suppress_Type_Name  On

Debug output and Valgrind

$ valgrind ./bin/fluent-bit -c es.conf
==70111== Memcheck, a memory error detector
==70111== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==70111== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==70111== Command: ./bin/fluent-bit -c es.conf
==70111==
Fluent Bit v2.1.10
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/09/18 02:55:15] [ info] Configuration:
[2023/09/18 02:55:15] [ info]  flush time     | 1.000000 seconds
[2023/09/18 02:55:15] [ info]  grace          | 5 seconds
[2023/09/18 02:55:15] [ info]  daemon         | 0
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info]  inputs:
[2023/09/18 02:55:15] [ info]      cpu
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info]  filters:
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info]  outputs:
[2023/09/18 02:55:15] [ info]      stdout.0
[2023/09/18 02:55:15] [ info]      es.1
[2023/09/18 02:55:15] [ info] ___________
[2023/09/18 02:55:15] [ info]  collectors:
[2023/09/18 02:55:15] [ info] [fluent bit] version=2.1.10, commit=b777d90050, pid=70111
[2023/09/18 02:55:15] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2023/09/18 02:55:15] [ info] [storage] ver=1.4.0, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2023/09/18 02:55:15] [ info] [cmetrics] version=0.6.3
[2023/09/18 02:55:15] [ info] [output:stdout:stdout.0] worker #0 started
[2023/09/18 02:55:15] [ info] [ctraces ] version=0.3.1
[2023/09/18 02:55:15] [ info] [input:cpu:cpu.0] initializing
[2023/09/18 02:55:15] [ info] [input:cpu:cpu.0] storage_strategy='memory' (memory only)
[2023/09/18 02:55:15] [debug] [cpu:cpu.0] created event channels: read=21 write=22
[2023/09/18 02:55:15] [debug] [stdout:stdout.0] created event channels: read=23 write=24
[2023/09/18 02:55:15] [debug] [es:es.1] created event channels: read=30 write=31
[2023/09/18 02:55:16] [debug] [output:es:es.1] extracted cloud_host: '<redacted>'
[2023/09/18 02:55:16] [debug] [output:es:es.1] cloud_host: '<redacted>' does not contain a port: '<redacted>'
[2023/09/18 02:55:16] [ info] [output:es:es.1] worker #1 started
[2023/09/18 02:55:16] [ info] [output:es:es.1] worker #0 started
[2023/09/18 02:55:16] [debug] [output:es:es.1] checked whether extracted port was null and set it to default https port or not. Outcome: '443' and cloud_host: '<redacted>'.
[2023/09/18 02:55:16] [debug] [output:es:es.1] host=<redacted> port=443 uri=/_bulk index=fluent-bit type=_doc
[2023/09/18 02:55:16] [debug] [router] match rule cpu.0:stdout.0
[2023/09/18 02:55:16] [debug] [router] match rule cpu.0:es.1
[2023/09/18 02:55:16] [ info] [sp] stream processor started
[2023/09/18 02:55:17] [debug] [input chunk] update output instances with new chunk size diff=207, records=1, input=cpu.0
^C[2023/09/18 02:55:17] [engine] caught signal (SIGINT)
[2023/09/18 02:55:17] [debug] [task] created task=0x5322eb0 id=0 OK
[0] cpu.0: [[1695005716.948405812, {}], {"cpu_p"=>34.000000, "user_p"=>33.000000, "system_p"=>1.000000, "cpu0.p_cpu"=>7.000000, "cpu0.p_user"=>6.000000, "cpu0.p_system"=>1.000000, "cpu1.p_cpu"=>62.000000, "cpu1.p_user"=>61.000000, "cpu1.p_system"=>1.000000}]
[2023/09/18 02:55:17] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2023/09/18 02:55:17] [debug] [out flush] cb_destroy coro_id=0
[2023/09/18 02:55:17] [debug] [output:es:es.1] task_id=0 assigned to thread #0
[2023/09/18 02:55:17] [ warn] [engine] service will shutdown in max 5 seconds
[2023/09/18 02:55:17] [ info] [input] pausing cpu.0
[2023/09/18 02:55:18] [ info] [task] cpu/cpu.0 has 1 pending task(s):
[2023/09/18 02:55:18] [ info] [task]   task_id=0 still running on route(s): stdout/stdout.0 es/es.1
[2023/09/18 02:55:18] [ info] [input] pausing cpu.0
[2023/09/18 02:55:18] [debug] [upstream] KA connection #60 to <redacted>:443 is connected
[2023/09/18 02:55:18] [debug] [http_client] not using http_proxy for header
[2023/09/18 02:55:18] [debug] [output:es:es.1] using elastic cloud apikey
[2023/09/18 02:55:18] [debug] [output:es:es.1] HTTP Status=200 URI=/_bulk
[2023/09/18 02:55:18] [debug] [output:es:es.1] Elasticsearch response
{"took":31,"errors":false,"items":[{"create":{"_index":"fluent-bit","_id":"fko2pooBqbJxt1RguQ8l","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":31,"_primary_term":1,"status":201}}]}
[2023/09/18 02:55:18] [debug] [upstream] KA connection #60 to <redacted>:443 is now available
[2023/09/18 02:55:18] [debug] [task] destroy task=0x5322eb0 (task_id=0)
[2023/09/18 02:55:18] [debug] [out flush] cb_destroy coro_id=0
[2023/09/18 02:55:18] [ info] [input] pausing cpu.0
[2023/09/18 02:55:19] [ info] [engine] service has stopped (0 pending tasks)
[2023/09/18 02:55:19] [ info] [input] pausing cpu.0
[2023/09/18 02:55:19] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2023/09/18 02:55:19] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #0 stopping...
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #0 stopped
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #1 stopping...
[2023/09/18 02:55:20] [ info] [output:es:es.1] thread worker #1 stopped
==70111==
==70111== HEAP SUMMARY:
==70111==     in use at exit: 0 bytes in 0 blocks
==70111==   total heap usage: 18,885 allocs, 18,885 frees, 2,765,200 bytes allocated
==70111==
==70111== All heap blocks were freed -- no leaks are possible
==70111==
==70111== For lists of detected and suppressed errors, rerun with: -s
==70111== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@@ -75,6 +75,27 @@ static flb_sds_t add_aws_auth(struct flb_http_client *c,
}
#endif /* FLB_HAVE_AWS */

static int es_http_add_cloud_apikey(struct flb_http_client *c,
Author

There seem to be some opportunities for refactoring HTTP auth in the flb_http_client library, but that should probably be tackled in another PR.

@soedar soedar force-pushed the out-es-add-cloud-apikey branch from 44e3dba to 38b7710 Compare September 29, 2023 15:41
@soedar soedar temporarily deployed to pr October 5, 2023 12:07 — with GitHub Actions Inactive
@soedar soedar temporarily deployed to pr October 5, 2023 12:35 — with GitHub Actions Inactive
@soedar soedar force-pushed the out-es-add-cloud-apikey branch 2 times, most recently from 90d77d6 to cd0fbdf Compare October 18, 2023 07:17
@soedar
Author

soedar commented Oct 18, 2023

@patrick-stephens could you help re-run the integration tests? I'm not quite sure why the integration runs failed the first time round. I've rebased onto master.

@soedar soedar temporarily deployed to pr October 19, 2023 11:33 — with GitHub Actions Inactive
@patrick-stephens
Contributor

@patrick-stephens could you help re-run the integration tests? I'm not quite sure why the integration runs failed the first time round. I've rebased onto master.

Do you mean unit tests? Integration tests are not run unless this is labelled.
macOS unit tests are flaky at the moment I believe so can be ignored as long as Linux passes.

@soedar soedar temporarily deployed to pr October 19, 2023 12:02 — with GitHub Actions Inactive
@soedar
Author

soedar commented Oct 20, 2023

Do you mean unit tests? Integration tests are not run unless this is labelled.
macOS unit tests are flaky at the moment I believe so can be ignored as long as Linux passes.

Ah, that's what I meant. I noticed the failing macOS test and wasn't sure if that was the blocker for the PR.

What would be the next steps to move this PR forward?

@patrick-stephens
Contributor

It's on the code owners to review, so it will be in the queue.

@alexku7

alexku7 commented Dec 7, 2023

Can we extend this feature?

  1. The header name should be dynamic. There are many cases when other headers are used, for example Bearer header instead of apiKey header.
  2. The header value should be taken dynamically from a file instead of static value. The file can be dynamically updated when a value/token is updated/refreshed.
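
The second point could look roughly like the sketch below. It is only an illustration: read_token_file is a hypothetical helper, not part of Fluent Bit or this PR, and it assumes the token is stored on the first line of a plain-text file that is re-read whenever a request is built.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of Fluent Bit: read a token from the
 * first line of a file so that a rotated value is picked up on the
 * next call.
 *
 * Returns 0 on success, or -1 if the file cannot be read or the first
 * line is empty.
 */
static int read_token_file(const char *path, char *buf, size_t size)
{
    FILE *fp;
    size_t len;

    if (path == NULL || buf == NULL || size == 0) {
        return -1;
    }

    fp = fopen(path, "r");
    if (fp == NULL) {
        return -1;
    }

    /* Only the first line is used; anything after it is ignored */
    if (fgets(buf, (int) size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);

    /* Strip the trailing newline (LF or CRLF) */
    len = strcspn(buf, "\r\n");
    buf[len] = '\0';

    return len > 0 ? 0 : -1;
}
```

Re-reading the file on every flush keeps the sketch simple; caching the value and checking the file's modification time would be another option.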

@soedar
Author

soedar commented Dec 14, 2023

The header name should be dynamic. There are many cases when other headers are used, for example Bearer header instead of apiKey header.

Could you elaborate on a use case where the Bearer header is used in the context of Elasticsearch? This change in particular is to support integration with Elastic Cloud via API keys (see https://www.elastic.co/guide/en/cloud/current/ec-api-authentication.html).

Regardless, I'm not sure that allowing users to specify arbitrary authorization headers is ideal, especially if the set of allowable authorization types for the plugin can be well defined.

The header value should be taken dynamically from a file instead of static value. The file can be dynamically updated when a value/token is updated/refreshed.

Looking at other Fluent Bit output plugins, this does not appear to be a common pattern. (The only exception seems to be the Google Cloud credentials JSON, which seems to contain quite a bit of auth information and would probably not be the norm.) I would be hesitant to make this change in this PR without the maintainers' input, since it feels like a config design change that would also apply to other plugins.

@patrick-stephens
Contributor

Personally, I would keep things simple in a PR: land one feature before adding more.

@alexku7

alexku7 commented Dec 14, 2023

Hi
On the other hand, Elasticsearch supports a JWT as a bearer authorization header, and probably other methods.
So why not universally support any HTTP header set by the user, either as an environment variable or as a file containing the header (for security reasons)?

We do a similar thing with Prometheus sending metrics to a remote store.
See the authorization section here

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write

@yanbutan

yanbutan commented Jan 10, 2024

Hi, any updates on merging this? This feature will be of great help ❤️

@ICUMD

ICUMD commented Jan 25, 2024

That's a shame. I won't be able to use Fluent Bit because it does not support sending logs using Elastic API keys.

Contributor

github-actions bot commented May 1, 2024

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label May 1, 2024
@soedar soedar force-pushed the out-es-add-cloud-apikey branch from cd0fbdf to b2f5248 Compare May 1, 2024 04:40
@soedar
Author

soedar commented May 1, 2024

rebased

@patrick-stephens @edsiper @PettitWesley could you remove the stale label?

This patch adds support for Elastic Cloud API keys. It updates the
Elasticsearch plugin to add a cloud_apikey configuration option. Usage:

[OUTPUT]
    name            es
    Cloud_Id        elastic_cloud_id
    Cloud_Apikey    elastic_apikey

Signed-off-by: Soedarsono <[email protected]>
@soedar soedar force-pushed the out-es-add-cloud-apikey branch from 599f48d to 1c6df25 Compare July 20, 2024 16:27
@dariusvalaitis

Adding a comment to move this out of the stale state.

What are the current blockers? This is a long-awaited feature. Thank you @soedar for making this.

Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

@github-actions github-actions bot added the Stale label Dec 14, 2024
@raphael-imperai

Still waiting on this PR, is there anything to be done to move it along?

@patrick-stephens
Contributor

It looks like it needs a rebase to build for CentOS 7; it cannot be merged if it breaks builds.

@github-actions github-actions bot removed the Stale label Dec 25, 2024
Labels: docs-required, ok-package-test (Run PR packaging tests)

7 participants