-```
-
-Make sure to replace your API key in the configuration.\
-\
-After a few seconds upon restart your Fluent Bit agent, the Calyptia Cloud Dashboard will list your agent. Metrics will take around 30 seconds to shows up.
-![](../.gitbook/assets/agent.png)
+- If this equation evaluates to `TRUE`, then Fluent Bit is unhealthy.
+- If this equation evaluates to `FALSE`, then Fluent Bit is healthy.
-### Contact Calyptia
+## Telemetry Pipeline
-If want to get in touch with Calyptia team, just send an email to [hello@calyptia.com](mailto:hello@calyptia.com)
+[Telemetry Pipeline](https://chronosphere.io/platform/telemetry-pipeline/) is a
+hosted service that allows you to monitor your Fluent Bit agents, including data flow,
+metrics, and configurations.
diff --git a/administration/multithreading.md b/administration/multithreading.md
new file mode 100644
index 000000000..57d792d9c
--- /dev/null
+++ b/administration/multithreading.md
@@ -0,0 +1,50 @@
+---
+description: Learn how to run Fluent Bit in multiple threads for improved scalability.
+---
+
+# Multithreading
+
+Fluent Bit has one event loop to handle critical operations, like managing
+timers, receiving internal messages, scheduling flushes, and handling retries.
+This event loop runs in Fluent Bit's main thread.
+
+To free up resources in the main thread, you can configure
+[inputs](../pipeline/inputs/README.md) and [outputs](../pipeline/outputs/README.md)
+to run in their own self-contained threads. However, inputs and outputs implement
+multithreading in distinct ways: inputs can run in **threaded** mode, and outputs
+can use one or more **workers**.
+
+Threading also affects certain processes related to inputs and outputs. For example,
+[filters](../pipeline/filters/README.md) always run in the main thread, but
+[processors](../pipeline/processors/README.md) run in the self-contained threads of
+their respective inputs or outputs, if applicable.
+
+## Inputs
+
+When inputs collect telemetry data, they can either perform this process
+inside Fluent Bit's main thread or inside a separate dedicated thread. You can
+configure this behavior by enabling or disabling the `threaded` setting.
+
+All inputs are capable of running in threaded mode, but certain inputs always
+run in threaded mode regardless of configuration. These always-threaded inputs are:
+
+- [Kubernetes Events](../pipeline/inputs/kubernetes-events.md)
+- [Node Exporter Metrics](../pipeline/inputs/node-exporter-metrics.md)
+- [Process Exporter Metrics](../pipeline/inputs/process-exporter-metrics.md)
+- [Windows Exporter Metrics](../pipeline/inputs/windows-exporter-metrics.md)
+
+Inputs are not internally aware of multithreading. If an input runs in threaded
+mode, Fluent Bit manages the logistics of that input's thread.
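+
+As a minimal sketch in the classic configuration format (the path and tag below are
+placeholders), threaded mode is enabled per input with the `threaded` setting:
+
+```text
+[INPUT]
+    Name     tail
+    Path     /var/log/app/*.log
+    Tag      app.logs
+    # Run this input in its own thread instead of the main event loop
+    threaded true
+```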
+
+## Outputs
+
+When outputs flush data, they can either perform this operation inside Fluent Bit's
+main thread or inside a separate dedicated thread called a _worker_. Each output
+can have one or more workers running in parallel, and each worker can handle multiple
+concurrent flushes. You can configure this behavior by changing the value of the
+`workers` setting.
+
+All outputs are capable of running in multiple workers, and each output has
+a default value of `0`, `1`, or `2` workers. However, even if an output uses
+workers by default, you can safely reduce the number of workers below the default
+or disable workers entirely.
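+
+As a similar sketch (again with placeholder values), the `workers` setting is applied
+per output:
+
+```text
+[OUTPUT]
+    Name    stdout
+    Match   *
+    # Flush data through two dedicated worker threads
+    Workers 2
+```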
diff --git a/administration/scheduling-and-retries.md b/administration/scheduling-and-retries.md
index 67100acc7..eb865096d 100644
--- a/administration/scheduling-and-retries.md
+++ b/administration/scheduling-and-retries.md
@@ -1,5 +1,7 @@
# Scheduling and Retries
+
+
[Fluent Bit](https://fluentbit.io) has an Engine that helps to coordinate the data ingestion from input plugins and calls the _Scheduler_ to decide when it is time to flush the data through one or multiple output plugins. The Scheduler flushes new data at a fixed time of seconds and the _Scheduler_ retries when asked.
Once an output plugin gets called to flush some data, after processing that data it can notify the Engine three possible return statuses:
diff --git a/administration/transport-security.md b/administration/transport-security.md
index f2a644c8a..4443cfd70 100644
--- a/administration/transport-security.md
+++ b/administration/transport-security.md
@@ -9,6 +9,7 @@ Both input and output plugins that perform Network I/O can optionally enable TLS
| :--- | :--- | :--- |
| tls | enable or disable TLS support | Off |
| tls.verify | force certificate validation | On |
+| tls.verify\_hostname | force TLS verification of hostnames | Off |
| tls.debug | Set TLS debug verbosity level. It accept the following values: 0 \(No debug\), 1 \(Error\), 2 \(State change\), 3 \(Informational\) and 4 Verbose | 1 |
| tls.ca\_file | absolute path to CA certificate file | |
| tls.ca\_path | absolute path to scan for certificate files | |
@@ -171,3 +172,42 @@ Fluent Bit supports [TLS server name indication](https://en.wikipedia.org/wiki/S
tls.ca_file /etc/certs/fluent.crt
tls.vhost fluent.example.com
```
+
+### Verify subjectAltName
+
+By default, hostname verification for TLS connections is disabled.
+As an example, we can extract the X509v3 Subject Alternative Name from a certificate:
+
+```text
+X509v3 Subject Alternative Name:
+ DNS:my.fluent-aggregator.net
+```
+
+This certificate covers only `my.fluent-aggregator.net`, so connecting with a different hostname should fail.
+
+To fully verify the alternative name and demonstrate the failure, enable `tls.verify_hostname`:
+
+```text
+[INPUT]
+ Name cpu
+ Tag cpu
+
+[OUTPUT]
+ Name forward
+ Match *
+ Host other.fluent-aggregator.net
+ Port 24224
+ tls On
+ tls.verify On
+ tls.verify_hostname on
+ tls.ca_file /path/to/fluent-x509v3-alt-name.crt
+```
+
+This outgoing connection fails and is disconnected:
+
+```text
+[2024/06/17 16:51:31] [error] [tls] error: unexpected EOF with reason: certificate verify failed
+[2024/06/17 16:51:31] [debug] [upstream] connection #50 failed to other.fluent-aggregator.net:24224
+[2024/06/17 16:51:31] [error] [output:forward:forward.0] no upstream connections available
+```
diff --git a/administration/troubleshooting.md b/administration/troubleshooting.md
index cae5ad2ce..033ad23d1 100644
--- a/administration/troubleshooting.md
+++ b/administration/troubleshooting.md
@@ -1,5 +1,7 @@
# Troubleshooting
+
+
* [Tap Functionality: generate events or records](troubleshooting.md#tap-functionality)
* [Dump Internals Signal](troubleshooting#dump-internals-signal)
diff --git a/concepts/data-pipeline/buffer.md b/concepts/data-pipeline/buffer.md
index c13f904e4..928a3b7f6 100644
--- a/concepts/data-pipeline/buffer.md
+++ b/concepts/data-pipeline/buffer.md
@@ -6,9 +6,21 @@ description: Data processing with reliability
Previously defined in the [Buffering](../buffering.md) concept section, the `buffer` phase in the pipeline aims to provide a unified and persistent mechanism to store your data, either using the primary in-memory model or using the filesystem based mode.
-The `buffer` phase already contains the data in an immutable state, meaning, no other filter can be applied.
+The `buffer` phase already contains the data in an immutable state, meaning that no other filter can be applied.
-![](<../../.gitbook/assets/logging\_pipeline\_buffer (1) (1) (2) (2) (2) (2) (2) (2) (2) (1).png>)
+```mermaid
+graph LR
+ accTitle: Fluent Bit data pipeline
+ accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
+ A[Input] --> B[Parser]
+ B --> C[Filter]
+ C --> D[Buffer]
+ D --> E((Routing))
+ E --> F[Output 1]
+ E --> G[Output 2]
+ E --> H[Output 3]
+ style D stroke:darkred,stroke-width:2px;
+```
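+
+In practice, the buffering mechanism is configured through the storage settings. As a
+rough sketch (the paths and input are placeholders; see the Buffering & Storage
+administration section for the complete set of options), filesystem buffering can be
+enabled like this:
+
+```text
+[SERVICE]
+    # Directory where filesystem chunks are stored
+    storage.path  /var/log/flb-storage/
+
+[INPUT]
+    Name          tail
+    Path          /var/log/app/*.log
+    # Buffer this input's data on the filesystem instead of only in memory
+    storage.type  filesystem
+```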
{% hint style="info" %}
Note that buffered data is not raw text, it's in Fluent Bit's internal binary representation.
diff --git a/concepts/data-pipeline/filter.md b/concepts/data-pipeline/filter.md
index 2323d165e..c903b7c0a 100644
--- a/concepts/data-pipeline/filter.md
+++ b/concepts/data-pipeline/filter.md
@@ -6,7 +6,19 @@ description: Modify, Enrich or Drop your records
In production environments we want to have full control of the data we are collecting, filtering is an important feature that allows us to **alter** the data before delivering it to some destination.
-![](<../../.gitbook/assets/logging\_pipeline\_filter (1) (2) (2) (2) (2) (2) (2) (1).png>)
+```mermaid
+graph LR
+ accTitle: Fluent Bit data pipeline
+ accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
+ A[Input] --> B[Parser]
+ B --> C[Filter]
+ C --> D[Buffer]
+ D --> E((Routing))
+ E --> F[Output 1]
+ E --> G[Output 2]
+ E --> H[Output 3]
+ style C stroke:darkred,stroke-width:2px;
+```
Filtering is implemented through plugins, so each filter available could be used to match, exclude or enrich your logs with some specific metadata.
diff --git a/concepts/data-pipeline/input.md b/concepts/data-pipeline/input.md
index ca8500b0b..0779c0985 100644
--- a/concepts/data-pipeline/input.md
+++ b/concepts/data-pipeline/input.md
@@ -6,7 +6,19 @@ description: The way to gather data from your sources
[Fluent Bit](http://fluentbit.io) provides different _Input Plugins_ to gather information from different sources, some of them just collect data from log files while others can gather metrics information from the operating system. There are many plugins for different needs.
-![](<../../.gitbook/assets/logging\_pipeline\_input (1) (2) (2) (2) (2) (2) (2) (2) (1).png>)
+```mermaid
+graph LR
+ accTitle: Fluent Bit data pipeline
+ accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
+ A[Input] --> B[Parser]
+ B --> C[Filter]
+ C --> D[Buffer]
+ D --> E((Routing))
+ E --> F[Output 1]
+ E --> G[Output 2]
+ E --> H[Output 3]
+ style A stroke:darkred,stroke-width:2px;
+```
When an input plugin is loaded, an internal _instance_ is created. Every instance has its own and independent configuration. Configuration keys are often called **properties**.
diff --git a/concepts/data-pipeline/output.md b/concepts/data-pipeline/output.md
index 5a96f7ee6..2fad550bd 100644
--- a/concepts/data-pipeline/output.md
+++ b/concepts/data-pipeline/output.md
@@ -6,7 +6,21 @@ description: 'Destinations for your data: databases, cloud services and more!'
The output interface allows us to define destinations for the data. Common destinations are remote services, local file system or standard interface with others. Outputs are implemented as plugins and there are many available.
-![](<../../.gitbook/assets/logging\_pipeline\_output (1) (1).png>)
+```mermaid
+graph LR
+ accTitle: Fluent Bit data pipeline
+ accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
+ A[Input] --> B[Parser]
+ B --> C[Filter]
+ C --> D[Buffer]
+ D --> E((Routing))
+ E --> F[Output 1]
+ E --> G[Output 2]
+ E --> H[Output 3]
+ style F stroke:darkred,stroke-width:2px;
+ style G stroke:darkred,stroke-width:2px;
+ style H stroke:darkred,stroke-width:2px;
+```
When an output plugin is loaded, an internal _instance_ is created. Every instance has its own independent configuration. Configuration keys are often called **properties**.
diff --git a/concepts/data-pipeline/parser.md b/concepts/data-pipeline/parser.md
index 034376606..b3d6e05b5 100644
--- a/concepts/data-pipeline/parser.md
+++ b/concepts/data-pipeline/parser.md
@@ -6,8 +6,19 @@ description: Convert Unstructured to Structured messages
Dealing with raw strings or unstructured messages is a constant pain; having a structure is highly desired. Ideally we want to set a structure to the incoming data by the Input Plugins as soon as they are collected:
-![](<../../.gitbook/assets/logging\_pipeline\_parser (1) (1) (1) (1) (2) (2) (2) (3) (3) (3) (3) (3) (1).png>)
-
+```mermaid
+graph LR
+ accTitle: Fluent Bit data pipeline
+ accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
+ A[Input] --> B[Parser]
+ B --> C[Filter]
+ C --> D[Buffer]
+ D --> E((Routing))
+ E --> F[Output 1]
+ E --> G[Output 2]
+ E --> H[Output 3]
+ style B stroke:darkred,stroke-width:2px;
+```
The Parser allows you to convert from unstructured to structured data. As a demonstrative example consider the following Apache (HTTP Server) log entry:
```
diff --git a/concepts/data-pipeline/router.md b/concepts/data-pipeline/router.md
index 0041c992e..d267f3607 100644
--- a/concepts/data-pipeline/router.md
+++ b/concepts/data-pipeline/router.md
@@ -6,7 +6,19 @@ description: Create flexible routing rules
Routing is a core feature that allows to **route** your data through Filters and finally to one or multiple destinations. The router relies on the concept of [Tags](../key-concepts.md) and [Matching](../key-concepts.md) rules
-![](<../../.gitbook/assets/logging\_pipeline\_routing (1) (1) (2) (2) (2) (2) (2) (2) (2) (1) (1).png>)
+```mermaid
+graph LR
+ accTitle: Fluent Bit data pipeline
+ accDescr: A diagram of the Fluent Bit data pipeline, which includes input, a parser, a filter, a buffer, routing, and various outputs.
+ A[Input] --> B[Parser]
+ B --> C[Filter]
+ C --> D[Buffer]
+ D --> E((Routing))
+ E --> F[Output 1]
+ E --> G[Output 2]
+ E --> H[Output 3]
+ style E stroke:darkred,stroke-width:2px;
+```
There are two important concepts in Routing:
@@ -77,7 +89,7 @@ The following example demonstrates how to route data from sources based on a reg
[OUTPUT]
Name stdout
- Match_regex .*_sensor_[AB]
+ Match_regex .*_sensor_[AB]
```
In this configuration, the **Match_regex** rule is set to `.*_sensor_[AB]`. This regular expression will match any Tag that ends with "_sensor_A" or "_sensor_B", regardless of what precedes it.
diff --git a/imgs/processor_opentelemetry_envelope.png b/imgs/processor_opentelemetry_envelope.png
new file mode 100644
index 000000000..d44920a14
Binary files /dev/null and b/imgs/processor_opentelemetry_envelope.png differ
diff --git a/installation/docker.md b/installation/docker.md
index 55d011c76..c4a468876 100644
--- a/installation/docker.md
+++ b/installation/docker.md
@@ -17,6 +17,28 @@ The following table describes the Linux container tags that are available on Doc
| Tag(s) | Manifest Architectures | Description |
| ------------ | ------------------------- | -------------------------------------------------------------- |
+| 3.1.9-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.9 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.9](https://fluentbit.io/announcements/v3.1.9/) |
+| 3.1.8-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.8 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.8](https://fluentbit.io/announcements/v3.1.8/) |
+| 3.1.7-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.7 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.7](https://fluentbit.io/announcements/v3.1.7/) |
+| 3.1.6-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.6 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.6](https://fluentbit.io/announcements/v3.1.6/) |
+| 3.1.5-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.5 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.5](https://fluentbit.io/announcements/v3.1.5/) |
+| 3.1.4-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.4 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.4](https://fluentbit.io/announcements/v3.1.4/) |
+| 3.1.3-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.3 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.3](https://fluentbit.io/announcements/v3.1.3/) |
+| 3.1.2-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.2 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.2](https://fluentbit.io/announcements/v3.1.2/) |
+| 3.1.1-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.1 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.1](https://fluentbit.io/announcements/v3.1.1/) |
+| 3.1.0-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.1.0 | x86_64, arm64v8, arm32v7, s390x | Release [v3.1.0](https://fluentbit.io/announcements/v3.1.0/) |
+| 3.0.7-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
+| 3.0.7 | x86_64, arm64v8, arm32v7, s390x | Release [v3.0.7](https://fluentbit.io/announcements/v3.0.7/) |
| 3.0.6-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
| 3.0.6 | x86_64, arm64v8, arm32v7, s390x | Release [v3.0.6](https://fluentbit.io/announcements/v3.0.6/) |
| 3.0.5-debug | x86_64, arm64v8, arm32v7, s390x | Debug images |
diff --git a/installation/getting-started-with-fluent-bit.md b/installation/getting-started-with-fluent-bit.md
index 04354dcff..5b1e4223c 100644
--- a/installation/getting-started-with-fluent-bit.md
+++ b/installation/getting-started-with-fluent-bit.md
@@ -2,6 +2,8 @@
description: The following serves as a guide on how to install/deploy/upgrade Fluent Bit
---
+
+
# Getting Started with Fluent Bit
## Container Deployment
diff --git a/installation/sources/build-and-install.md b/installation/sources/build-and-install.md
index 568130cb4..99674abf8 100644
--- a/installation/sources/build-and-install.md
+++ b/installation/sources/build-and-install.md
@@ -220,3 +220,4 @@ The following table describes the processors available on this version:
| option | description | default |
| :--- | :--- | :--- |
| [FLB\_PROCESSOR\_METRICS\_SELECTOR](../../pipeline/processors/metrics-selector.md) | Enable metrics selector processor | On |
+| [FLB\_PROCESSOR\_LABELS](../../pipeline/processors/labels.md) | Enable metrics label manipulation processor | On |
diff --git a/installation/windows.md b/installation/windows.md
index 1dd7d0a88..0b42b3dfa 100644
--- a/installation/windows.md
+++ b/installation/windows.md
@@ -79,30 +79,30 @@ From version 1.9, `td-agent-bit` is a deprecated package and was removed after 1
## Installation Packages
-The latest stable version is 3.0.6.
+The latest stable version is 3.1.9.
Each version is available via the following download URLs.
| INSTALLERS | SHA256 CHECKSUMS |
| ------------------------------------------------------------------------------------------------ | ---------------------------------------------------------------- |
-| [fluent-bit-3.0.6-win32.exe](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win32.exe) | [9ea39f5efda5b013e9135a5b67b60fa94d27dcacf5c43094dfffb693ada11f2b](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win32.exe.sha256) |
-| [fluent-bit-3.0.6-win32.zip](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win32.zip) | [97b6465ed324dd9060dcf10b28341fcfeb3d30ec3abc02583e45902a80dece5e](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win32.zip.sha256) |
-| [fluent-bit-3.0.6-win64.exe](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win64.exe) | [4f4e02bca5adac697a8d9c0477b9060c038c9ca94f4b9c31f5ca65c746a4ddfc](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win64.exe.sha256) |
-| [fluent-bit-3.0.6-win64.zip](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win64.zip) | [00701ab3260a7f9347dcaeeb55d5183dd6ba1cd51cde92625c05b0e643f2b5c1](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win64.zip.sha256) |
-| [fluent-bit-3.0.6-winarm64.exe](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-winarm64.exe) | [fc90682c155e1288af81742c50740d3be0bff2ab211b8785c7c7e85196f121cb](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-winarm64.exe.sha256) |
-| [fluent-bit-3.0.6-winarm64.zip](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-winarm64.zip) | [fda4788899de83e68d9abf79f19ea5ced3e324568551fc1543891747ba241832](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-winarm64.zip.sha256) |
+| [fluent-bit-3.1.9-win32.exe](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win32.exe) | [a42458b6275cd08bbede45ffc9f3d7abd3145e7a82bd806226b86e5cf67793bb](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win32.exe.sha256) |
+| [fluent-bit-3.1.9-win32.zip](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win32.zip) | [3eb8bcdbb394bed326b19386ba95c932819aaa6ea0418adbce6675dd98656b41](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win32.zip.sha256) |
+| [fluent-bit-3.1.9-win64.exe](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win64.exe) | [ccc12e5c01e9e87b88d431ecae34c014d2b227d2d72282359795605e1efeea3d](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win64.exe.sha256) |
+| [fluent-bit-3.1.9-win64.zip](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win64.zip) | [e83b9f1d8c91ebdbaec1b95426c54d1df55f09c84b5446ca1476662ea72f3a36](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win64.zip.sha256) |
+| [fluent-bit-3.1.9-winarm64.exe](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-winarm64.exe) | [f87aaa56ed3e0dfca39947f0d46e2a7721d89c08efd58fd3f7bf0968522ffc13](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-winarm64.exe.sha256) |
+| [fluent-bit-3.1.9-winarm64.zip](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-winarm64.zip) | [4be74c9696836fd802d79a00e65e53e13f901a2e3adb6c7786e74ffde89b072b](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-winarm64.zip.sha256) |
**Note these are now using the Github Actions built versions, the legacy AppVeyor builds are still available (AMD 32/64 only) at releases.fluentbit.io but are deprecated.**
MSI installers are also available:
-- [fluent-bit-3.0.6-win32.msi](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win32.msi)
-- [fluent-bit-3.0.6-win64.msi](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-win64.msi)
-- [fluent-bit-3.0.6-winarm64.msi](https://packages.fluentbit.io/windows/fluent-bit-3.0.6-winarm64.msi)
+- [fluent-bit-3.1.9-win32.msi](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win32.msi)
+- [fluent-bit-3.1.9-win64.msi](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-win64.msi)
+- [fluent-bit-3.1.9-winarm64.msi](https://packages.fluentbit.io/windows/fluent-bit-3.1.9-winarm64.msi)
To check the integrity, use `Get-FileHash` cmdlet on PowerShell.
```powershell
-PS> Get-FileHash fluent-bit-3.0.6-win32.exe
+PS> Get-FileHash fluent-bit-3.1.9-win32.exe
```
## Installing from ZIP archive
@@ -112,7 +112,7 @@ Download a ZIP archive from above. There are installers for 32-bit and 64-bit en
Then you need to expand the ZIP archive. You can do this by clicking "Extract All" on Explorer, or if you're using PowerShell, you can use `Expand-Archive` cmdlet.
```powershell
-PS> Expand-Archive fluent-bit-3.0.6-win64.zip
+PS> Expand-Archive fluent-bit-3.1.9-win64.zip
```
The ZIP package contains the following set of files.
diff --git a/local-testing/validating-your-data-and-structure.md b/local-testing/validating-your-data-and-structure.md
index 81e3d2e63..aa72c6af7 100644
--- a/local-testing/validating-your-data-and-structure.md
+++ b/local-testing/validating-your-data-and-structure.md
@@ -1,47 +1,80 @@
-# Validating your Data and Structure
+# Validating your data and structure
-Fluent Bit is a powerful log processing tool that can deal with different sources and formats, in addition it provides several filters that can be used to perform custom modifications. This flexibility is really good but while your pipeline grows, it's strongly recommended to validate your data and structure.
+Fluent Bit is a powerful log processing tool that supports multiple sources and
+formats. In addition, it provides filters that can be used to perform custom
+modifications. As your pipeline grows, it's important to validate your data and
+structure.
-> We encourage Fluent Bit users to integrate data validation in their CI systems
+Fluent Bit users are encouraged to integrate data validation in their continuous
+integration (CI) systems.
-A simplified view of our data processing pipeline is as follows:
+In a normal production environment, inputs, filters, and outputs are defined in the
+configuration. Fluent Bit provides the [Expect](../pipeline/filters/expect.md) filter,
+which can be used to validate `keys` and `values` from your records and take action
+when an exception is found.
-![](../.gitbook/assets/flb_pipeline_simplified.png)
+A simplified view of the data processing pipeline is as follows:
-In a normal production environment, many Inputs, Filters, and Outputs are defined in the configuration, so integrating a continuous validation of your configuration against expected results is a must. For this requirement, Fluent Bit provides a specific Filter called **Expect** which can be used to validate expected Keys and Values from your records and takes some action when an exception is found.
+```mermaid
+flowchart LR
+IS[Inputs / Sources]
+Fil[Filters]
+OD[Outputs / Destinations]
+IS --> Fil --> OD
+```
-## How it Works
+## Understand structure and configuration
-As an example, consider the following pipeline where your source of data is a normal file with JSON content on it and then two filters: [grep](../pipeline/filters/grep.md) to exclude certain records and [record\_modifier](../pipeline/filters/record-modifier.md) to alter the record content adding and removing specific keys.
+Consider the following pipeline, where your source of data is a file with JSON
+content and two filters:
-![](../.gitbook/assets/flb_pipeline_simplified_example_01.png)
+- [grep](../pipeline/filters/grep.md) to exclude certain records.
+- [record_modifier](../pipeline/filters/record-modifier.md) to alter the record
+ content by adding and removing specific keys.
-Ideally you want to add checkpoints of validation of your data between each step so you can know if your data structure is correct, we do this by using **expect** filter.
+```mermaid
+flowchart LR
+tail["tail (input)"]
+grep["grep (filter)"]
+record["record_modifier (filter)"]
+stdout["stdout (output)"]
-![](../.gitbook/assets/flb_pipeline_simplified_expect.png)
+tail --> grep
+grep --> record
+record --> stdout
+```
-Expect filter sets rules that aims to validate certain criteria like:
+Add data validation between each step to ensure your data structure is correct.
+
+This example uses the `expect` filter.
+
+```mermaid
+flowchart LR
+tail["tail (input)"]
+grep["grep (filter)"]
+record["record_modifier (filter)"]
+stdout["stdout (output)"]
+E1["expect (filter)"]
+E2["expect (filter)"]
+E3["expect (filter)"]
+tail --> E1 --> grep
+grep --> E2 --> record --> E3 --> stdout
+```
-* does the record contain a key A ?
-* does the record not contains key A?
-* does the record key A value equals NULL ?
-* does the record key A value a different value than NULL ?
-* does the record key A value equals B ?
+`Expect` filters set rules aiming to validate criteria like:
-Every expect filter configuration can expose specific rules to validate the content of your records, it supports the following configuration properties:
+- Does the record contain a key `A`?
+- Does the record not contain key `A`?
+- Does the record key `A` value equal `NULL`?
+- Is the record key `A` value not `NULL`?
+- Does the record key `A` value equal `B`?
-| Property | Description |
-| :--- | :--- |
-| key\_exists | Check if a key with a given name exists in the record. |
-| key\_not\_exists | Check if a key does not exist in the record. |
-| key\_val\_is\_null | check that the value of the key is NULL. |
-| key\_val\_is\_not\_null | check that the value of the key is NOT NULL. |
-| key\_val\_eq | check that the value of the key equals the given value in the configuration. |
-| action | action to take when a rule does not match. The available options are `warn` or `exit`. On `warn`, a warning message is sent to the logging layer when a mismatch of the rules above is found; using `exit` makes Fluent Bit abort with status code `255`. |
+Every `expect` filter configuration exposes rules to validate the content of your
+records using [configuration properties](../pipeline/filters/expect.md#configuration-parameters).
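+
+As a brief sketch (using the `color` key from the example that follows), an `expect`
+filter that validates a single key and aborts on a mismatch looks like this:
+
+```text
+[FILTER]
+    name       expect
+    match      *
+    # Validate that each record contains the key 'color'
+    key_exists color
+    # Abort Fluent Bit with status code 255 when the rule doesn't match
+    action     exit
+```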
-## Start Testing
+## Test the configuration
-Consider the following JSON file called `data.log` with the following content:
+Consider a JSON file `data.log` with the following content:
```javascript
{"color": "blue", "label": {"name": null}}
@@ -49,7 +82,9 @@ Consider the following JSON file called `data.log` with the following content:
{"color": "green", "label": {"name": "abc"}, "meta": null}
```
-The following Fluent Bit configuration file will configure a pipeline to consume the log above apply an expect filter to validate that keys `color` and `label` exists:
+The following Fluent Bit configuration file configures a pipeline to consume the
+log, while applying an `expect` filter to validate that the keys `color` and `label`
+exist:
```python
[SERVICE]
@@ -76,9 +111,12 @@ The following Fluent Bit configuration file will configure a pipeline to consume
match *
```
-note that if for some reason the JSON parser failed or is missing in the `tail` input \(line 9\), the `expect` filter will trigger the `exit` action. As a test, go ahead and comment out or remove line 9.
+If the JSON parser fails or is missing in the `tail` input
+(`parser json`), the `expect` filter triggers the `exit` action.
-As a second step, we will extend our pipeline and we will add a grep filter to match records that map `label` contains a key called `name` with value `abc`, then an expect filter to re-validate that condition:
+To extend the pipeline, add a grep filter to match records that map `label`
+containing a key called `name` with the value `abc`, and add an `expect` filter
+to re-validate that condition:
```python
[SERVICE]
@@ -131,7 +169,8 @@ As a second step, we will extend our pipeline and we will add a grep filter to m
match *
```
-## Deploying in Production
-
-When deploying your configuration in production, you might want to remove the expect filters from your configuration since it's an unnecessary _extra work_ unless you want to have a 100% coverage of checks at runtime.
+## Production deployment
+
+When deploying in production, consider removing the `expect` filters from your
+configuration. These filters are unnecessary unless you need 100% coverage of
+checks at runtime.
diff --git a/pipeline/filters/grep.md b/pipeline/filters/grep.md
index c3aeb609d..019b007c5 100644
--- a/pipeline/filters/grep.md
+++ b/pipeline/filters/grep.md
@@ -1,28 +1,33 @@
---
-description: Select or exclude records per patterns
+description: Select or exclude records using patterns
---
# Grep
-The _Grep Filter_ plugin allows you to match or exclude specific records based on regular expression patterns for values or nested values.
+The _Grep Filter_ plugin lets you match or exclude specific records based on
+regular expression patterns for values or nested values.
-## Configuration Parameters
+## Configuration parameters
The plugin supports the following configuration parameters:
-| Key | Value Format | Description |
-| :--- | :--- | :--- |
-| Regex | KEY REGEX | Keep records in which the content of KEY matches the regular expression. |
-| Exclude | KEY REGEX | Exclude records in which the content of KEY matches the regular expression. |
-| Logical_Op| Operation | Specify which logical operator to use. `AND` , `OR` and `legacy` are allowed as an Operation. Default is `legacy` for backward compatibility. In `legacy` mode the behaviour is either AND or OR depending whether the `grep` is including (uses AND) or excluding (uses OR). Only available from 2.1+. |
+| Key | Value Format | Description |
+| ------------ | ------------ | ----------- |
+| `Regex` | KEY REGEX | Keep records where the content of KEY matches the regular expression. |
+| `Exclude` | KEY REGEX | Exclude records where the content of KEY matches the regular expression. |
+| `Logical_Op` | Operation | Specify a logical operator: `AND`, `OR` or `legacy` (default). In `legacy` mode the behaviour is either `AND` or `OR` depending on whether the `grep` is including (uses AND) or excluding (uses OR). Available from 2.1 or higher. |
-#### Record Accessor Enabled
+### Record Accessor Enabled
-This plugin enables the [Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md) feature to specify the KEY. Using the _record accessor_ is suggested if you want to match values against nested values.
+Enable the [Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)
+feature to specify the KEY. Use the record accessor to match values against nested
+values.
-## Getting Started
+## Filter records
-In order to start filtering records, you can run the filter from the command line or through the configuration file. The following example assumes that you have a file called `lines.txt` with the following content:
+To start filtering records, run the filter from the command line or through the
+configuration file. The following example assumes that you have a file named
+`lines.txt` with the following content:
```text
{"log": "aaa"}
@@ -35,20 +40,25 @@ In order to start filtering records, you can run the filter from the command lin
{"log": "ggg"}
```
-### Command Line
+### Command line
-> Note: using the command line mode need special attention to quote the regular expressions properly. It's suggested to use a configuration file.
+When using the command line, take care to quote the regular expressions properly.
+Using a configuration file might be easier.
-The following command will load the _tail_ plugin and read the content of `lines.txt` file. Then the _grep_ filter will apply a regular expression rule over the _log_ field \(created by tail plugin\) and only _pass_ the records which field value starts with _aa_:
+The following command loads the [tail](../../pipeline/inputs/tail) plugin and
+reads the content of `lines.txt`. Then the `grep` filter applies a regular
+expression rule over the `log` field created by the `tail` plugin and only passes
+records with a field value starting with `aa`:
```text
$ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o stdout
```
-### Configuration File
+### Configuration file
{% tabs %}
{% tab title="fluent-bit.conf" %}
+
```python
[SERVICE]
parsers_file /path/to/parsers.conf
@@ -67,9 +77,11 @@ $ bin/fluent-bit -i tail -p 'path=lines.txt' -F grep -p 'regex=log aa' -m '*' -o
name stdout
match *
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
+
```yaml
service:
parsers_file: /path/to/parsers.conf
@@ -87,14 +99,21 @@ pipeline:
match: '*'
```
+
{% endtab %}
{% endtabs %}
-The filter allows to use multiple rules which are applied in order, you can have many _Regex_ and _Exclude_ entries as required.
+The filter lets you use multiple rules which are applied in order. You can
+have as many `Regex` and `Exclude` entries as required.
### Nested fields example
-If you want to match or exclude records based on nested values, you can use a [Record Accessor ](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)format as the KEY name. Consider the following record example:
+To match or exclude records based on nested values, you can use
+[Record Accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md)
+format as the `KEY` name.
+
+Consider the following record example:
```javascript
{
@@ -113,40 +132,45 @@ If you want to match or exclude records based on nested values, you can use a [R
}
```
-if you want to exclude records that match given nested field \(for example `kubernetes.labels.app`\), you can use the following rule:
+For example, to exclude records that match the nested field `kubernetes.labels.app`,
+use the following rule:
{% tabs %}
{% tab title="fluent-bit.conf" %}
+
```python
[FILTER]
Name grep
Match *
Exclude $kubernetes['labels']['app'] myapp
```
-{% endtab %}
+{% endtab %}
{% tab title="fluent-bit.yaml" %}
+
```yaml
filters:
- name: grep
match: '*'
exclude: $kubernetes['labels']['app'] myapp
```
+
{% endtab %}
{% endtabs %}
-### Excluding records missing/invalid fields
-
-It may be that in your processing pipeline you want to drop records that are missing certain keys.
+### Excluding records with missing or invalid fields
-A simple way to do this is just to `exclude` with a regex that matches anything, a missing key will fail this check.
+You might want to drop records that are missing certain keys.
-Here is an example that checks for a specific valid value for the key as well:
+One way to do this is to `exclude` with a regex that matches anything. A missing
+key fails this check.
+
+The following example checks for a specific valid value for the key:
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```
+
+```text
# Use Grep to verify the contents of the iot_timestamp value.
# If the iot_timestamp key does not exist, this will fail
# and exclude the row.
@@ -156,9 +180,10 @@ Here is an example that checks for a specific valid value for the key as well:
Match iots_thread.*
Regex iot_timestamp ^\d{4}-\d{2}-\d{2}
```
-{% endtab %}
+{% endtab %}
{% tab title="fluent-bit.yaml" %}
+
```yaml
filters:
- name: grep
@@ -166,20 +191,23 @@ Here is an example that checks for a specific valid value for the key as well:
match: iots_thread.*
regex: iot_timestamp ^\d{4}-\d{2}-\d{2}
```
+
{% endtab %}
{% endtabs %}
-The specified key `iot_timestamp` must match the expected expression - if it does not or is missing/empty then it will be excluded.
+The specified key `iot_timestamp` must match the expected expression. If it doesn't
+match, or the key is missing or empty, the record is excluded.
### Multiple conditions
-If you want to set multiple `Regex` or `Exclude`, you can use `Logical_Op` property to use logical conjuction or disjunction.
-
-Note: If `Logical_Op` is set, setting both 'Regex' and `Exclude` results in an error.
+If you want to set multiple `Regex` or `Exclude`, use the `Logical_Op` property
+to use a logical conjunction or disjunction.
+
+If `Logical_Op` is set, setting both `Regex` and `Exclude` results in an error.
{% tabs %}
{% tab title="fluent-bit.conf" %}
+
```python
[INPUT]
Name dummy
@@ -196,9 +224,11 @@ Note: If `Logical_Op` is set, setting both 'Regex' and `Exclude` results in an e
[OUTPUT]
Name stdout
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
+
```yaml
pipeline:
inputs:
@@ -215,11 +245,13 @@ pipeline:
outputs:
- name: stdout
```
+
{% endtab %}
{% endtabs %}
-Output will be
-```
+The output looks similar to:
+
+```text
Fluent Bit v2.0.9
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
@@ -236,4 +268,4 @@ Fluent Bit v2.0.9
[2023/01/22 09:46:49] [ info] [output:stdout:stdout.0] worker #0 started
[0] dummy: [1674348410.558341857, {"endpoint"=>"localhost", "value"=>"something"}]
[0] dummy: [1674348411.546425499, {"endpoint"=>"localhost", "value"=>"something"}]
-```
\ No newline at end of file
+```
diff --git a/pipeline/filters/kubernetes.md b/pipeline/filters/kubernetes.md
index 3a352eadb..492c86862 100644
--- a/pipeline/filters/kubernetes.md
+++ b/pipeline/filters/kubernetes.md
@@ -37,6 +37,7 @@ The plugin supports the following configuration parameters:
| Keep\_Log | When `Keep_Log` is disabled, the `log` field is removed from the incoming message once it has been successfully merged \(`Merge_Log` must be enabled as well\). | On |
| tls.debug | Debug level between 0 \(nothing\) and 4 \(every detail\). | -1 |
| tls.verify | When enabled, turns on certificate validation when connecting to the Kubernetes API server. | On |
+| tls.verify\_hostname | When enabled, turns on hostname validation for certificates. | Off |
| Use\_Journal | When enabled, the filter reads logs coming in Journald format. | Off |
| Cache\_Use\_Docker\_Id | When enabled, metadata will be fetched from K8s when docker\_id is changed. | Off |
| Regex\_Parser | Set an alternative Parser to process record Tag and extract pod\_name, namespace\_name, container\_name and docker\_id. The parser must be registered in a [parsers file](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf) \(refer to parser _filter-kube-test_ as an example\). | |
@@ -270,7 +271,7 @@ There are some configuration setup needed for this feature.
Role Configuration for Fluent Bit DaemonSet Example:
-```text
+```yaml
---
apiVersion: v1
kind: ServiceAccount
@@ -313,34 +314,34 @@ The difference is that kubelet need a special permission for resource `nodes/pro
Fluent Bit Configuration Example:
```text
- [INPUT]
- Name tail
- Tag kube.*
- Path /var/log/containers/*.log
- DB /var/log/flb_kube.db
- Parser docker
- Docker_Mode On
- Mem_Buf_Limit 50MB
- Skip_Long_Lines On
- Refresh_Interval 10
-
- [FILTER]
- Name kubernetes
- Match kube.*
- Kube_URL https://kubernetes.default.svc.cluster.local:443
- Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
- Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
- Merge_Log On
- Buffer_Size 0
- Use_Kubelet true
- Kubelet_Port 10250
+[INPUT]
+ Name tail
+ Tag kube.*
+ Path /var/log/containers/*.log
+ DB /var/log/flb_kube.db
+ Parser docker
+ Docker_Mode On
+ Mem_Buf_Limit 50MB
+ Skip_Long_Lines On
+ Refresh_Interval 10
+
+[FILTER]
+ Name kubernetes
+ Match kube.*
+ Kube_URL https://kubernetes.default.svc.cluster.local:443
+ Kube_CA_File /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
+ Kube_Token_File /var/run/secrets/kubernetes.io/serviceaccount/token
+ Merge_Log On
+ Buffer_Size 0
+ Use_Kubelet true
+ Kubelet_Port 10250
```
So for fluent bit configuration, you need to set the `Use_Kubelet` to true to enable this feature.
DaemonSet config Example:
-```text
+```yaml
---
apiVersion: apps/v1
kind: DaemonSet
diff --git a/pipeline/filters/log_to_metrics.md b/pipeline/filters/log_to_metrics.md
index 5dbc017d7..6e1630fbf 100644
--- a/pipeline/filters/log_to_metrics.md
+++ b/pipeline/filters/log_to_metrics.md
@@ -2,6 +2,8 @@
description: Generate metrics from logs
---
+
+
# Log To Metrics
The _Log To Metrics Filter_ plugin allows you to generate log-derived metrics. It currently supports modes to count records, provide a gauge for field values or create a histogram. You can also match or exclude specific records based on regular expression patterns for values or nested values. This filter plugin does not actually act as a record filter and does not change or drop records. All records will pass this filter untouched and generated metrics will be emitted into a seperate metric pipeline.
diff --git a/pipeline/filters/lua.md b/pipeline/filters/lua.md
index 09d56808a..4f1bab092 100644
--- a/pipeline/filters/lua.md
+++ b/pipeline/filters/lua.md
@@ -2,6 +2,8 @@
The **Lua** filter allows you to modify the incoming records (even split one record into multiple records) using custom [Lua](https://www.lua.org/) scripts.
+
+
Due to the necessity to have a flexible filtering mechanism, it is now possible to extend Fluent Bit capabilities by writing custom filters using Lua programming language. A Lua-based filter takes two steps:
1. Configure the Filter in the main configuration
@@ -194,12 +196,12 @@ We want to extract the `sandboxbsh` name and add it to our record as a special k
{% tabs %}
{% tab title="fluent-bit.conf" %}
```
- [FILTER]
- Name lua
- Alias filter-iots-lua
- Match iots_thread.*
- Script filters.lua
- Call set_landscape_deployment
+[FILTER]
+ Name lua
+ Alias filter-iots-lua
+ Match iots_thread.*
+ Script filters.lua
+ Call set_landscape_deployment
```
{% endtab %}
@@ -356,23 +358,23 @@ Configuration to get istio logs and apply response code filter to them.
{% tabs %}
{% tab title="fluent-bit.conf" %}
```ini
- [INPUT]
- Name tail
- Path /var/log/containers/*_istio-proxy-*.log
- multiline.parser docker, cri
- Tag istio.*
- Mem_Buf_Limit 64MB
- Skip_Long_Lines Off
-
- [FILTER]
- Name lua
- Match istio.*
- Script response_code_filter.lua
- call cb_response_code_filter
-
- [Output]
- Name stdout
- Match *
+[INPUT]
+ Name tail
+ Path /var/log/containers/*_istio-proxy-*.log
+ multiline.parser docker, cri
+ Tag istio.*
+ Mem_Buf_Limit 64MB
+ Skip_Long_Lines Off
+
+[FILTER]
+ Name lua
+ Match istio.*
+ Script response_code_filter.lua
+ call cb_response_code_filter
+
+[Output]
+ Name stdout
+ Match *
```
{% endtab %}
@@ -436,3 +438,134 @@ pipeline:
#### Output
In the output only the messages with response code 0 or greater than 399 are shown.
+
+
+### Timeformat Conversion
+
+The following example converts a field's `datetime` value from a specific format to
+UTC ISO 8601 format.
+
+#### Lua script
+
+Script `custom_datetime_format.lua`
+
+```lua
+-- Convert the `pub_date` field from a format like "Tue, 30 Jul 2024 18:01:06 +0000"
+-- to UTC ISO 8601 (for example "2024-07-30T18:01:06Z").
+function convert_to_utc(tag, timestamp, record)
+    local date_time = record["pub_date"]
+    local new_record = record
+    if date_time then
+        if string.find(date_time, ",") then
+            -- Split the date/time portion from the numeric timezone offset
+            local pattern = "(%a+, %d+ %a+ %d+ %d+:%d+:%d+) ([+-]%d%d%d%d)"
+            local date_part, zone_part = date_time:match(pattern)
+
+            if date_part and zone_part then
+                -- Shell out to the `date` command to perform the conversion to UTC
+                local command = string.format("date -u -d '%s %s' +%%Y-%%m-%%dT%%H:%%M:%%SZ", date_part, zone_part)
+                local handle = io.popen(command)
+                local result = handle:read("*a")
+                handle:close()
+                -- Keep only the converted value, stripping the trailing newline
+                new_record["pub_date"] = result:match("%S+")
+            end
+        end
+    end
+    return 1, timestamp, new_record
+end
+```
+
+#### Configuration
+
+Use this configuration to generate records that contain a `datetime` key, and then
+convert its value to another format.
+
+{% tabs %}
+{% tab title="fluent-bit.conf" %}
+```ini
+[INPUT]
+ Name dummy
+ Dummy {"event": "Restock", "pub_date": "Tue, 30 Jul 2024 18:01:06 +0000"}
+ Tag event_category_a
+
+[INPUT]
+ Name dummy
+ Dummy {"event": "Soldout", "pub_date": "Mon, 29 Jul 2024 10:15:00 +0600"}
+ Tag event_category_b
+
+
+[FILTER]
+ Name lua
+ Match *
+ Script custom_datetime_format.lua
+ call convert_to_utc
+
+[Output]
+ Name stdout
+ Match *
+```
+{% endtab %}
+
+{% tab title="fluent-bit.yaml" %}
+```yaml
+pipeline:
+ inputs:
+ - name: dummy
+ dummy: '{"event": "Restock", "pub_date": "Tue, 30 Jul 2024 18:01:06 +0000"}'
+ tag: event_category_a
+
+ - name: dummy
+ dummy: '{"event": "Soldout", "pub_date": "Mon, 29 Jul 2024 10:15:00 +0600"}'
+ tag: event_category_b
+
+ filters:
+ - name: lua
+ match: '*'
+ code: |
+ function convert_to_utc(tag, timestamp, record)
+ local date_time = record["pub_date"]
+ local new_record = record
+ if date_time then
+ if string.find(date_time, ",") then
+ local pattern = "(%a+, %d+ %a+ %d+ %d+:%d+:%d+) ([+-]%d%d%d%d)"
+ local date_part, zone_part = date_time:match(pattern)
+ if date_part and zone_part then
+ local command = string.format("date -u -d '%s %s' +%%Y-%%m-%%dT%%H:%%M:%%SZ", date_part, zone_part)
+ local handle = io.popen(command)
+ local result = handle:read("*a")
+ handle:close()
+ new_record["pub_date"] = result:match("%S+")
+ end
+ end
+ end
+ return 1, timestamp, new_record
+ end
+ call: convert_to_utc
+
+ outputs:
+ - name: stdout
+ match: '*'
+```
+{% endtab %}
+{% endtabs %}
+
+#### Input
+
+```json
+{"event": "Restock", "pub_date": "Tue, 30 Jul 2024 18:01:06 +0000"}
+```
+
+and
+
+```json
+{"event": "Soldout", "pub_date": "Mon, 29 Jul 2024 10:15:00 +0600"}
+```
+
+These records are handled by the `dummy` inputs in this example.
+
+#### Output
+
+The output shows the `datetime` values from two timezones converted to ISO 8601
+format in UTC.
+
+```text
+...
+[2024/08/01 00:56:25] [ info] [output:stdout:stdout.0] worker #0 started
+[0] event_category_a: [[1722452186.727104902, {}], {"event"=>"Restock", "pub_date"=>"2024-07-30T18:01:06Z"}]
+[0] event_category_b: [[1722452186.730255842, {}], {"event"=>"Soldout", "pub_date"=>"2024-07-29T04:15:00Z"}]
+...
+```
\ No newline at end of file
diff --git a/pipeline/filters/nest.md b/pipeline/filters/nest.md
index 96990ca81..b0262d324 100644
--- a/pipeline/filters/nest.md
+++ b/pipeline/filters/nest.md
@@ -1,15 +1,16 @@
# Nest
-The _Nest Filter_ plugin allows you to operate on or with nested data. Its modes of operation are
+The _Nest Filter_ plugin lets you operate on or with nested data. Its modes of operation are:
-* `nest` - Take a set of records and place them in a map
-* `lift` - Take a map by key and lift its records up
+- `nest` - Take a set of records and place them in a map.
+- `lift` - Take a map by key and lift its records up.
-## Example usage \(nest\)
+## Example usage for `nest`
-As an example using JSON notation, to nest keys matching the `Wildcard` value `Key*` under a new key `NestKey` the transformation becomes,
+As an example using JSON notation, to nest keys matching the `Wildcard` value `Key*`
+under a new key `NestKey` the transformation becomes:
-_Example \(input\)_
+Input:
```text
{
@@ -19,7 +20,7 @@ _Example \(input\)_
}
```
-_Example \(output\)_
+Output:
```text
{
@@ -31,11 +32,12 @@ _Example \(output\)_
}
```
-## Example usage \(lift\)
+## Example usage for `lift`
-As an example using JSON notation, to lift keys nested under the `Nested_under` value `NestKey*` the transformation becomes,
+As an example using JSON notation, to lift keys nested under the `Nested_under` value
+`NestKey*` the transformation becomes:
-_Example \(input\)_
+Input:
```text
{
@@ -47,7 +49,7 @@ _Example \(input\)_
}
```
-_Example \(output\)_
+Output:
```text
{
@@ -61,40 +63,47 @@ _Example \(output\)_
The plugin supports the following configuration parameters:
-| Key | Value Format | Operation | Description |
+| Key | Value format | Operation | Description |
| :--- | :--- | :--- | :--- |
-| Operation | ENUM \[`nest` or `lift`\] | | Select the operation `nest` or `lift` |
-| Wildcard | FIELD WILDCARD | `nest` | Nest records which field matches the wildcard |
-| Nest\_under | FIELD STRING | `nest` | Nest records matching the `Wildcard` under this key |
-| Nested\_under | FIELD STRING | `lift` | Lift records nested under the `Nested_under` key |
-| Add\_prefix | FIELD STRING | ANY | Prefix affected keys with this string |
-| Remove\_prefix | FIELD STRING | ANY | Remove prefix from affected keys if it matches this string |
+| `Operation` | ENUM [`nest` or `lift`] | | Select the operation `nest` or `lift` |
+| `Wildcard` | FIELD WILDCARD | `nest` | Nest records which field matches the wildcard |
+| `Nest_under` | FIELD STRING | `nest` | Nest records matching the `Wildcard` under this key |
+| `Nested_under` | FIELD STRING | `lift` | Lift records nested under the `Nested_under` key |
+| `Add_prefix` | FIELD STRING | ANY | Prefix affected keys with this string |
+| `Remove_prefix` | FIELD STRING | ANY | Remove prefix from affected keys if it matches this string |
## Getting Started
-In order to start filtering records, you can run the filter from the command line or through the configuration file. The following invokes the [Memory Usage Input Plugin](../inputs/memory-metrics.md), which outputs the following \(example\),
+To start filtering records, run the filter from the command line or through the
+configuration file. The following example invokes the
+[Memory Usage Input Plugin](../inputs/memory-metrics.md), which outputs the
+following:
```text
[0] memory: [1488543156, {"Mem.total"=>1016044, "Mem.used"=>841388, "Mem.free"=>174656, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
```
-## Example \#1 - nest
+## Example 1 - nest
### Command Line
-> Note: Using the command line mode requires quotes parse the wildcard properly. The use of a configuration file is recommended.
+Using command line mode requires quotes to parse the wildcard properly. The use
+of a configuration file is recommended.
-The following command will load the _mem_ plugin. Then the _nest_ filter will match the wildcard rule to the keys and nest the keys matching `Mem.*` under the new key `NEST`.
+The following command loads the _mem_ plugin. Then the _nest_ filter matches the
+wildcard rule to the keys and nests the keys matching `Mem.*` under the new key
+`NEST`.
-```text
-$ bin/fluent-bit -i mem -p 'tag=mem.local' -F nest -p 'Operation=nest' -p 'Wildcard=Mem.*' -p 'Nest_under=Memstats' -p 'Remove_prefix=Mem.' -m '*' -o stdout
+```shell copy
+bin/fluent-bit -i mem -p 'tag=mem.local' -F nest -p 'Operation=nest' -p 'Wildcard=Mem.*' -p 'Nest_under=Memstats' -p 'Remove_prefix=Mem.' -m '*' -o stdout
```
### Configuration File
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -111,10 +120,12 @@ $ bin/fluent-bit -i mem -p 'tag=mem.local' -F nest -p 'Operation=nest' -p 'Wildc
Nest_under Memstats
Remove_prefix Mem.
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -130,6 +141,7 @@ pipeline:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
@@ -142,15 +154,17 @@ The output of both the command line and configuration invocations should be iden
[0] mem.local: [1522978514.007359767, {"Swap.total"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Memstats"=>{"total"=>4050908, "used"=>714984, "free"=>3335924}}]
```
-## Example \#2 - nest and lift undo
+## Example 2 - nest and lift undo
-This example nests all `Mem.*` and `Swap,*` items under the `Stats` key and then reverses these actions with a `lift` operation. The output appears unchanged.
+This example nests all `Mem.*` and `Swap.*` items under the `Stats` key and then
+reverses these actions with a `lift` operation. The output appears unchanged.
-### Configuration File
+### Example 2 Configuration File
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -175,10 +189,11 @@ This example nests all `Mem.*` and `Swap,*` items under the `Stats` key and then
Nested_under Stats
Remove_prefix NESTED
```
-{% endtab %}
+{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -201,6 +216,7 @@ pipeline:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
@@ -211,15 +227,17 @@ pipeline:
[0] mem.local: [1529566958.000940636, {"Mem.total"=>8053656, "Mem.used"=>6940380, "Mem.free"=>1113276, "Swap.total"=>16532988, "Swap.used"=>1286772, "Swap.free"=>15246216}]
```
-## Example \#3 - nest 3 levels deep
+## Example 3 - nest 3 levels deep
-This example takes the keys starting with `Mem.*` and nests them under `LAYER1`, which itself is then nested under `LAYER2`, which is nested under `LAYER3`.
+This example takes the keys starting with `Mem.*` and nests them under `LAYER1`,
+which is then nested under `LAYER2`, which is nested under `LAYER3`.
-### Configuration File
+### Example 3 Configuration File
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -249,10 +267,11 @@ This example takes the keys starting with `Mem.*` and nests them under `LAYER1`,
Wildcard LAYER2*
Nest_under LAYER3
```
-{% endtab %}
+{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -277,6 +296,7 @@ pipeline:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
@@ -302,15 +322,19 @@ pipeline:
}
```
-## Example \#4 - multiple nest and lift filters with prefix
+## Example 4 - multiple nest and lift filters with prefix
-This example starts with the 3-level deep nesting of _Example 2_ and applies the `lift` filter three times to reverse the operations. The end result is that all records are at the top level, without nesting, again. One prefix is added for each level that is lifted.
+This example uses the 3-level deep nesting of _Example 3_ and applies the
+`lift` filter three times to reverse the operations. The end result is that all
+records are at the top level, without nesting, again. One prefix is added for each
+level that's lifted.
### Configuration file
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -361,10 +385,12 @@ This example starts with the 3-level deep nesting of _Example 2_ and applies the
Nested_under Lifted3_Lifted2_LAYER1
Add_prefix Lifted3_Lifted2_Lifted1_
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -404,23 +430,21 @@ pipeline:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
-
### Result
```text
[0] mem.local: [1524862951.013414798, {"Swap.total"=>1046524, "Swap.used"=>0, "Swap.free"=>1046524, "Lifted3_Lifted2_Lifted1_Mem.total"=>4050908, "Lifted3_Lifted2_Lifted1_Mem.used"=>1253912, "Lifted3_Lifted2_Lifted1_Mem.free"=>2796996}]
-
{
- "Swap.total"=>1046524,
- "Swap.used"=>0,
- "Swap.free"=>1046524,
- "Lifted3_Lifted2_Lifted1_Mem.total"=>4050908,
- "Lifted3_Lifted2_Lifted1_Mem.used"=>1253912,
+ "Swap.total"=>1046524,
+ "Swap.used"=>0,
+ "Swap.free"=>1046524,
+ "Lifted3_Lifted2_Lifted1_Mem.total"=>4050908,
+ "Lifted3_Lifted2_Lifted1_Mem.used"=>1253912,
"Lifted3_Lifted2_Lifted1_Mem.free"=>2796996
}
```
-
diff --git a/pipeline/filters/record-modifier.md b/pipeline/filters/record-modifier.md
index 0a1f74572..dc64aea7c 100644
--- a/pipeline/filters/record-modifier.md
+++ b/pipeline/filters/record-modifier.md
@@ -1,24 +1,26 @@
# Record Modifier
-The _Record Modifier Filter_ plugin allows to append fields or to exclude specific fields.
+The _Record Modifier_ [filter](pipeline/filters.md) plugin lets you append
+fields to a record, or exclude specific fields.
-## Configuration Parameters
+## Configuration parameters
-The plugin supports the following configuration parameters: _Remove\_key_ and _Allowlist\_key_ are exclusive.
+The plugin supports the following configuration parameters:
| Key | Description |
| :--- | :--- |
-| Record | Append fields. This parameter needs key and value pair. |
-| Remove\_key | If the key is matched, that field is removed. |
-| Allowlist\_key | If the key is **not** matched, that field is removed. |
-| Whitelist\_key | An alias of `Allowlist_key` for backwards compatibility. |
-| Uuid\_key| If set, the plugin appends uuid to each record. The value assigned becomes the key in the map.|
+| `Record` | Append fields. This parameter needs a key/value pair. |
+| `Remove_key` | If the key is matched, that field is removed. You can use this parameter or `Allowlist_key`, but not both. |
+| `Allowlist_key` | If the key isn't matched, that field is removed. You can use this parameter or `Remove_key`, but not both. |
+| `Whitelist_key` | An alias of `Allowlist_key` for backwards compatibility. |
+| `Uuid_key` | If set, the plugin appends a UUID to each record. The value assigned becomes the key in the map. |
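+
+As a sketch, the following YAML configuration appends a UUID to each record with the
+`Uuid_key` parameter. The key name `record_uuid` is illustrative:
+
+```yaml copy
+pipeline:
+  inputs:
+    - name: mem
+      tag: mem.local
+  filters:
+    - name: record_modifier
+      match: '*'
+      uuid_key: record_uuid
+  outputs:
+    - name: stdout
+      match: '*'
+```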
-## Getting Started
+## Get started
-In order to start filtering records, you can run the filter from the command line or through the configuration file.
+To start filtering records, run the filter from the command line or through a
+configuration file.
-This is a sample in\_mem record to filter.
+This is a sample `in_mem` record to filter.
```text
{"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724}
@@ -26,11 +28,13 @@ This is a sample in\_mem record to filter.
### Append fields
-The following configuration file is to append product name and hostname \(via environment variable\) to record.
+The following configuration file appends a product name and a hostname (taken from
+an environment variable) to each record:
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -45,10 +49,12 @@ The following configuration file is to append product name and hostname \(via en
Record hostname ${HOSTNAME}
Record product Awesome_Tool
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -56,37 +62,37 @@ pipeline:
filters:
- name: record_modifier
match: '*'
- record:
+ record:
- hostname ${HOSTNAME}
- product Awesome_Tool
outputs:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
+You can also run the filter from the command line:
-You can also run the filter from command line.
-
-```text
-$ fluent-bit -i mem -o stdout -F record_modifier -p 'Record=hostname ${HOSTNAME}' -p 'Record=product Awesome_Tool' -m '*'
+```shell copy
+fluent-bit -i mem -o stdout -F record_modifier -p 'Record=hostname ${HOSTNAME}' -p 'Record=product Awesome_Tool' -m '*'
```
-The output will be
+The output looks something like:
-```python
+```python copy
[0] mem.local: [1492436882.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>299352, "Swap.total"=>2064380, "Swap.used"=>32656, "Swap.free"=>2031724, "hostname"=>"localhost.localdomain", "product"=>"Awesome_Tool"}]
```
-### Remove fields with Remove\_key
-
-The following configuration file is to remove 'Swap.\*' fields.
+### Remove fields with `Remove_key`
+
+The following configuration file removes `Swap.*` fields:
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -102,10 +108,12 @@ The following configuration file is to remove 'Swap.\*' fields.
Remove_key Swap.used
Remove_key Swap.free
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -113,7 +121,7 @@ pipeline:
filters:
- name: record_modifier
match: '*'
- remove_key:
+ remove_key:
- Swap.total
- Swap.used
- Swap.free
@@ -121,28 +129,30 @@ pipeline:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
You can also run the filter from command line.
-```text
-$ fluent-bit -i mem -o stdout -F record_modifier -p 'Remove_key=Swap.total' -p 'Remove_key=Swap.free' -p 'Remove_key=Swap.used' -m '*'
+```shell copy
+fluent-bit -i mem -o stdout -F record_modifier -p 'Remove_key=Swap.total' -p 'Remove_key=Swap.free' -p 'Remove_key=Swap.used' -m '*'
```
-The output will be
+The output looks something like:
```python
[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]
```
-### Remove fields with Allowlist\_key
+### Retain fields with `Allowlist_key`
-The following configuration file is to remain 'Mem.\*' fields.
+The following configuration file retains `Mem.*` fields.
{% tabs %}
{% tab title="fluent-bit.conf" %}
-```python
+
+```python copy
[INPUT]
Name mem
Tag mem.local
@@ -158,10 +168,12 @@ The following configuration file is to remain 'Mem.\*' fields.
Allowlist_key Mem.used
Allowlist_key Mem.free
```
+
{% endtab %}
{% tab title="fluent-bit.yaml" %}
-```yaml
+
+```yaml copy
pipeline:
inputs:
- name: mem
@@ -169,7 +181,7 @@ pipeline:
filters:
- name: record_modifier
match: '*'
- Allowlist_key:
+ Allowlist_key:
- Mem.total
- Mem.used
- Mem.free
@@ -177,18 +189,18 @@ pipeline:
- name: stdout
match: '*'
```
+
{% endtab %}
{% endtabs %}
-You can also run the filter from command line.
+You can also run the filter from the command line:
-```text
-$ fluent-bit -i mem -o stdout -F record_modifier -p 'Allowlist_key=Mem.total' -p 'Allowlist_key=Mem.free' -p 'Allowlist_key=Mem.used' -m '*'
+```shell copy
+fluent-bit -i mem -o stdout -F record_modifier -p 'Allowlist_key=Mem.total' -p 'Allowlist_key=Mem.free' -p 'Allowlist_key=Mem.used' -m '*'
```
-The output will be
+The output looks something like:
```python
[0] mem.local: [1492436998.000000000, {"Mem.total"=>1016024, "Mem.used"=>716672, "Mem.free"=>295332}]
```
-
diff --git a/pipeline/filters/rewrite-tag.md b/pipeline/filters/rewrite-tag.md
index 17574ad2a..72a991c1b 100644
--- a/pipeline/filters/rewrite-tag.md
+++ b/pipeline/filters/rewrite-tag.md
@@ -59,7 +59,7 @@ If we wanted to match against the value of the key `name` we must use `$name`. T
* `$name` = "abc-123"
* `$ss['s1']['s2']` = "flb"
-Note that a key must point a value that contains a string, it's **not valid** for numbers, booleans, maps or arrays.
+Note that a key must point to a value that contains a string; it's **not valid** for numbers, booleans, maps, or arrays.
### Regex
diff --git a/pipeline/filters/type-converter.md b/pipeline/filters/type-converter.md
index b185467f8..d55d5be40 100644
--- a/pipeline/filters/type-converter.md
+++ b/pipeline/filters/type-converter.md
@@ -2,6 +2,8 @@
The _Type Converter Filter_ plugin allows to convert data type and append new key value pair.
+
This plugin is useful in combination with plugins which expect incoming string value.
e.g. [filter_grep](grep.md), [filter_modify](modify.md)
diff --git a/pipeline/filters/wasm.md b/pipeline/filters/wasm.md
index c311447ba..0140bc28d 100644
--- a/pipeline/filters/wasm.md
+++ b/pipeline/filters/wasm.md
@@ -21,7 +21,9 @@ The plugin supports the following configuration parameters:
| Wasm\_Path | Path to the built Wasm program that will be used. This can be a relative path against the main configuration file. |
| Event\_Format | Define event format to interact with Wasm programs: msgpack or json. Default: json |
| Function\_Name | Wasm function name that will be triggered to do filtering. It's assumed that the function is built inside the Wasm program specified above. |
-| Accessible\_Paths | Specify the whilelist of paths to be able to access paths from WASM programs. |
+| Accessible\_Paths | Specify the whitelist of paths that WASM programs are allowed to access. |
+| Wasm\_Heap\_Size | Size of the heap for Wasm execution. Review [unit sizes](../../administration/configuring-fluent-bit/unit-sizes.md) for allowed values. |
+| Wasm\_Stack\_Size | Size of the stack for Wasm execution. Review [unit sizes](../../administration/configuring-fluent-bit/unit-sizes.md) for allowed values. |
## Configuration Examples
diff --git a/pipeline/inputs/collectd.md b/pipeline/inputs/collectd.md
index acbf773c1..b10ea73fd 100644
--- a/pipeline/inputs/collectd.md
+++ b/pipeline/inputs/collectd.md
@@ -11,6 +11,7 @@ The plugin supports the following configuration parameters:
| Listen | Set the address to listen to | 0.0.0.0 |
| Port | Set the port to listen to | 25826 |
| TypesDB | Set the data specification file | /usr/share/collectd/types.db |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Configuration Examples
@@ -31,4 +32,3 @@ Here is a basic configuration example.
With this configuration, Fluent Bit listens to `0.0.0.0:25826`, and outputs incoming datagram packets to stdout.
You must set the same types.db files that your collectd server uses. Otherwise, Fluent Bit may not be able to interpret the payload properly.
-
diff --git a/pipeline/inputs/cpu-metrics.md b/pipeline/inputs/cpu-metrics.md
index 3c296f0cd..c54558cbf 100644
--- a/pipeline/inputs/cpu-metrics.md
+++ b/pipeline/inputs/cpu-metrics.md
@@ -4,13 +4,15 @@ The **cpu** input plugin, measures the CPU usage of a process or the whole syste
The following tables describes the information generated by the plugin. The keys below represent the data used by the overall system, all values associated to the keys are in a percentage unit \(0 to 100%\):
-The CPU metrics plugin creates metrics that are log-based \(I.e. JSON payload\). If you are looking for Prometheus-based metrics please see the Node Exporter Metrics input plugin.
+The CPU metrics plugin creates metrics that are log-based, such as a JSON payload.
+For Prometheus-based metrics, see the Node Exporter Metrics input plugin.
| key | description |
| :--- | :--- |
| cpu\_p | CPU usage of the overall system, this value is the summation of time spent on user and kernel space. The result takes in consideration the numbers of CPU cores in the system. |
| user\_p | CPU usage in User mode, for short it means the CPU usage by user space programs. The result of this value takes in consideration the numbers of CPU cores in the system. |
| system\_p | CPU usage in Kernel mode, for short it means the CPU usage by the Kernel. The result of this value takes in consideration the numbers of CPU cores in the system. |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
In addition to the keys reported in the above table, a similar content is created **per** CPU core. The cores are listed from _0_ to _N_ as the Kernel reports:
diff --git a/pipeline/inputs/disk-io-metrics.md b/pipeline/inputs/disk-io-metrics.md
index 024399314..c28cc4acf 100644
--- a/pipeline/inputs/disk-io-metrics.md
+++ b/pipeline/inputs/disk-io-metrics.md
@@ -2,7 +2,8 @@
The **disk** input plugin, gathers the information about the disk throughput of the running system every certain interval of time and reports them.
-The Disk I/O metrics plugin creates metrics that are log-based \(I.e. JSON payload\). If you are looking for Prometheus-based metrics please see the Node Exporter Metrics input plugin.
+The Disk I/O metrics plugin creates metrics that are log-based, such as a JSON
+payload. For Prometheus-based metrics, see the Node Exporter Metrics input plugin.
## Configuration Parameters
@@ -13,6 +14,7 @@ The plugin supports the following configuration parameters:
| Interval\_Sec | Polling interval \(seconds\). | 1 |
| Interval\_NSec | Polling interval \(nanosecond\). | 0 |
| Dev\_Name | Device name to limit the target. \(e.g. sda\). If not set, _in\_disk_ gathers information from all of disks and partitions. | all disks |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -72,4 +74,3 @@ pipeline:
Note: Total interval \(sec\) = Interval\_Sec + \(Interval\_Nsec / 1000000000\).
e.g. 1.5s = 1s + 500000000ns
-
diff --git a/pipeline/inputs/docker-events.md b/pipeline/inputs/docker-events.md
index 6e850c1ee..40d85ec5c 100644
--- a/pipeline/inputs/docker-events.md
+++ b/pipeline/inputs/docker-events.md
@@ -14,6 +14,7 @@ This plugin supports the following configuration parameters:
| Key | When a message is unstructured \(no parser applied\), it's appended as a string under the key name _message_. | message |
| Reconnect.Retry_limits| The maximum number of retries allowed. The plugin tries to reconnect with docker socket when EOF is detected. | 5 |
| Reconnect.Retry_interval| The retrying interval. Unit is second. | 1 |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
### Command Line
diff --git a/pipeline/inputs/docker-metrics.md b/pipeline/inputs/docker-metrics.md
index 1102e576f..ce044604e 100644
--- a/pipeline/inputs/docker-metrics.md
+++ b/pipeline/inputs/docker-metrics.md
@@ -6,12 +6,7 @@ description: >-
# Docker Metrics
-Content:
-
-* [Configuration Parameters](https://app.gitbook.com/s/-LKKSx-3LBTCtaHbg0gl-887967055/pipeline/inputs/docker.md#configuration-parameters)
-* [Configuration File](https://app.gitbook.com/s/-LKKSx-3LBTCtaHbg0gl-887967055/pipeline/inputs/docker.md#configuration-file)
-
-### Configuration Parameters
+## Configuration Parameters
The plugin supports the following configuration parameters:
@@ -20,10 +15,11 @@ The plugin supports the following configuration parameters:
| Interval_Sec | Polling interval in seconds | 1 |
| Include | A space-separated list of containers to include | |
| Exclude | A space-separated list of containers to exclude | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
If you set neither `Include` nor `Exclude`, the plugin will try to get metrics from _all_ the running containers.
-### Configuration File
+## Configuration File
Here is an example configuration that collects metrics from two docker instances (`6bab19c3a0f9` and `14159be4ca2c`).
diff --git a/pipeline/inputs/dummy.md b/pipeline/inputs/dummy.md
index 745bacaa2..48177ac10 100644
--- a/pipeline/inputs/dummy.md
+++ b/pipeline/inputs/dummy.md
@@ -6,18 +6,19 @@ The **dummy** input plugin, generates dummy events. It is useful for testing, de
The plugin supports the following configuration parameters:
-| Key | Description |
-| :--- | :--- |
-| Dummy | Dummy JSON record. Default: `{"message":"dummy"}` |
-| Metadata | Dummy JSON metadata. Default: `{}` |
-| Start\_time\_sec | Dummy base timestamp in seconds. Default: 0 |
-| Start\_time\_nsec | Dummy base timestamp in nanoseconds. Default: 0 |
-| Rate | Rate at which messages are generated expressed in how many times per second. Default: 1 |
-| Interval\_sec | Set seconds of time interval at which every message is generated. If set, `Rate` configuration will be ignored. Default: 0 |
-| Interval\_nsec | Set nanoseconds of time interval at which every message is generated. If set, `Rate` configuration will be ignored. Default: 0 |
-| Samples | If set, the events number will be limited. e.g. If Samples=3, the plugin only generates three events and stops. |
-| Copies | Number of messages to generate each time they are generated. Defaults to 1. |
-| Flush\_on\_startup | If set to `true`, the first dummy event is generated at startup. Default: `false` |
+| Key | Description | Default |
+| :----------------- | :---------- | :------ |
+| Dummy | Dummy JSON record. | `{"message":"dummy"}` |
+| Metadata | Dummy JSON metadata. | `{}` |
+| Start\_time\_sec | Dummy base timestamp, in seconds. | `0` |
+| Start\_time\_nsec | Dummy base timestamp, in nanoseconds. | `0` |
+| Rate | Rate at which messages are generated expressed in how many times per second. | `1` |
+| Interval\_sec | Set time interval, in seconds, at which every message is generated. If set, `Rate` configuration is ignored. | `0` |
+| Interval\_nsec | Set time interval, in nanoseconds, at which every message is generated. If set, `Rate` configuration is ignored. | `0` |
+| Samples | If set, the number of events is limited. For example, if `Samples=3`, the plugin generates only three events and stops. | _none_ |
+| Copies | Number of messages to generate each time they are generated. | `1` |
+| Flush\_on\_startup | If set to `true`, the first dummy event is generated at startup. | `false` |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
diff --git a/pipeline/inputs/elasticsearch.md b/pipeline/inputs/elasticsearch.md
index 27f659095..08a67d795 100644
--- a/pipeline/inputs/elasticsearch.md
+++ b/pipeline/inputs/elasticsearch.md
@@ -14,6 +14,7 @@ The plugin supports the following configuration parameters:
| meta\_key | Specify a key name for meta information. | "@meta" |
| hostname | Specify hostname or FQDN. This parameter can be used for "sniffing" (auto-discovery of) cluster node information. | "localhost" |
| version | Specify Elasticsearch server version. This parameter is effective for checking a version of Elasticsearch/OpenSearch server version. | "8.0.0" |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
**Note:** The Elasticsearch cluster uses "sniffing" to optimize the connections between its cluster and clients.
Elasticsearch can build its cluster and dynamically generate a connection list which is called "sniffing".
diff --git a/pipeline/inputs/exec-wasi.md b/pipeline/inputs/exec-wasi.md
index b094f3a01..dabd0a065 100644
--- a/pipeline/inputs/exec-wasi.md
+++ b/pipeline/inputs/exec-wasi.md
@@ -10,11 +10,14 @@ The plugin supports the following configuration parameters:
| :--- | :--- |
| WASI\_Path | The place of a WASM program file. |
| Parser | Specify the name of a parser to interpret the entry as a structured message. |
-| Accessible\_Paths | Specify the whilelist of paths to be able to access paths from WASM programs. |
+| Accessible\_Paths | Specify the whitelist of paths that WASM programs are allowed to access. |
| Interval\_Sec | Polling interval \(seconds\). |
| Interval\_NSec | Polling interval \(nanosecond\). |
-| Buf\_Size | Size of the buffer \(check [unit sizes](https://docs.fluentbit.io/manual/configuration/unit_sizes) for allowed values\) |
+| Wasm\_Heap\_Size | Size of the heap for Wasm execution. Review [unit sizes](../../administration/configuring-fluent-bit/unit-sizes.md) for allowed values. |
+| Wasm\_Stack\_Size | Size of the stack for Wasm execution. Review [unit sizes](../../administration/configuring-fluent-bit/unit-sizes.md) for allowed values. |
+| Buf\_Size | Size of the buffer \(check [unit sizes](../../administration/configuring-fluent-bit/unit-sizes.md) for allowed values\) |
| Oneshot | Only run once at startup. This allows collection of data precedent to fluent-bit's startup (bool, default: false) |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Configuration Examples
diff --git a/pipeline/inputs/exec.md b/pipeline/inputs/exec.md
index 2f7a32416..73b4c3450 100644
--- a/pipeline/inputs/exec.md
+++ b/pipeline/inputs/exec.md
@@ -21,10 +21,11 @@ The plugin supports the following configuration parameters:
| Parser | Specify the name of a parser to interpret the entry as a structured message. |
| Interval\_Sec | Polling interval \(seconds\). |
| Interval\_NSec | Polling interval \(nanosecond\). |
-| Buf\_Size | Size of the buffer \(check [unit sizes](https://docs.fluentbit.io/manual/configuration/unit_sizes) for allowed values\) |
+| Buf\_Size | Size of the buffer \(check [unit sizes](../../administration/configuring-fluent-bit/unit-sizes.md) for allowed values\) |
| Oneshot | Only run once at startup. This allows collection of data precedent to fluent-bit's startup (bool, default: false) |
| Exit\_After\_Oneshot | Exit as soon as the one-shot command exits. This allows the exec plugin to be used as a wrapper for another command, sending the target command's output to any fluent-bit sink(s) then exiting. (bool, default: false) |
| Propagate\_Exit\_Code | When exiting due to Exit\_After\_Oneshot, cause fluent-bit to exit with the exit code of the command exited by this plugin. Follows [shell conventions for exit code propagation](https://www.gnu.org/software/bash/manual/html_node/Exit-Status.html). (bool, default: false) |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Getting Started
diff --git a/pipeline/inputs/fluentbit-metrics.md b/pipeline/inputs/fluentbit-metrics.md
index 9ef8c604e..358ac92b3 100644
--- a/pipeline/inputs/fluentbit-metrics.md
+++ b/pipeline/inputs/fluentbit-metrics.md
@@ -12,12 +12,13 @@ They can be sent to output plugins including [Prometheus Exporter](../outputs/pr
**Important note:** Metrics collected with Node Exporter Metrics flow through a separate pipeline from logs and current filters do not operate on top of metrics.
-## Configuration
+## Configuration
| Key | Description | Default |
| --------------- | --------------------------------------------------------------------------------------------------------- | --------- |
| scrape_interval | The rate at which metrics are collected from the host operating system | 2 seconds |
| scrape_on_start | Scrape metrics upon start, useful to avoid waiting for 'scrape_interval' for the first round of metrics. | false |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
diff --git a/pipeline/inputs/forward.md b/pipeline/inputs/forward.md
index 2878f5b8b..9e7a61599 100644
--- a/pipeline/inputs/forward.md
+++ b/pipeline/inputs/forward.md
@@ -20,6 +20,7 @@ The plugin supports the following configuration parameters:
| Shared\_Key | Shared key for secure forward authentication. | |
| Self\_Hostname | Hostname for secure forward authentication. | |
| Security.Users | Specify the username and password pairs for secure forward authentication. | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -81,7 +82,7 @@ pipeline:
Since Fluent Bit v3, in\_forward can handle secure forward protocol.
-For using user-password authentication, it needs to specify `secutiry.users` at least an one-pair.
+To use user-password authentication, specify at least one username and password pair in `security.users`.
For using shared key, it needs to specify `shared_key` in both of forward output and forward input.
`self_hostname` is not able to specify with the same hostname between fluent servers and clients.
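+
+As a sketch, a forward input that combines these settings might look like the
+following YAML configuration, where the username, password, shared key, and
+hostname values are placeholders:
+
+```yaml copy
+pipeline:
+  inputs:
+    - name: forward
+      listen: 0.0.0.0
+      port: 24224
+      shared_key: my_shared_key
+      self_hostname: flb.server.local
+      security.users: fluentd changeme
+  outputs:
+    - name: stdout
+      match: '*'
+```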
@@ -142,4 +143,3 @@ Copyright (C) Treasure Data
[2016/10/07 21:49:40] [ info] [in_fw] binding 0.0.0.0:24224
[0] my_tag: [1475898594, {"key 1"=>123456789, "key 2"=>"abcdefg"}]
```
-
diff --git a/pipeline/inputs/head.md b/pipeline/inputs/head.md
index b5f4496b4..6537f0ee2 100644
--- a/pipeline/inputs/head.md
+++ b/pipeline/inputs/head.md
@@ -16,6 +16,7 @@ The plugin supports the following configuration parameters:
| Key | Rename a key. Default: head. |
| Lines | Line number to read. If the number N is set, in\_head reads first N lines like head\(1\) -n. |
| Split\_line | If enabled, in\_head generates key-value pair per line. |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
### Split Line Mode
@@ -84,7 +85,7 @@ pipeline:
Output is
```bash
-$ bin/fluent-bit -c head.conf
+$ bin/fluent-bit -c head.conf
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
@@ -162,4 +163,3 @@ pipeline:
Note: Total interval \(sec\) = Interval\_Sec + \(Interval\_Nsec / 1000000000\).
e.g. 1.5s = 1s + 500000000ns
-
diff --git a/pipeline/inputs/health.md b/pipeline/inputs/health.md
index 0e5694338..51028e735 100644
--- a/pipeline/inputs/health.md
+++ b/pipeline/inputs/health.md
@@ -15,6 +15,7 @@ The plugin supports the following configuration parameters:
| Alert | If enabled, it will only generate messages if the target TCP service is down. By default this option is disabled. |
| Add\_Host | If enabled, hostname is appended to each records. Default value is _false_. |
| Add\_Port | If enabled, port number is appended to each records. Default value is _false_. |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Getting Started
@@ -87,4 +88,3 @@ Fluent Bit v1.8.0
[2] health.0: [1624145990.306498573, {"alive"=>true}]
[3] health.0: [1624145991.305595498, {"alive"=>true}]
```
-
diff --git a/pipeline/inputs/http.md b/pipeline/inputs/http.md
index 2e8afc5aa..52150a24b 100644
--- a/pipeline/inputs/http.md
+++ b/pipeline/inputs/http.md
@@ -15,6 +15,7 @@ description: The HTTP input plugin allows you to send custom records to an HTTP
| buffer_chunk_size | This sets the chunk size for incoming incoming JSON messages. These chunks are then stored/managed in the space available by buffer_max_size. | 512K |
| successful_response_code | It allows to set successful response code. `200`, `201` and `204` are supported. | 201 |
| success_header | Add an HTTP header key/value pair on success. Multiple headers can be set. Example: `X-Custom custom-answer` | |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
### TLS / SSL
diff --git a/pipeline/inputs/kafka.md b/pipeline/inputs/kafka.md
index e779b6f07..a0c83b97d 100644
--- a/pipeline/inputs/kafka.md
+++ b/pipeline/inputs/kafka.md
@@ -14,8 +14,8 @@ This plugin uses the official [librdkafka C library](https://github.com/edenhill
| group\_id | Group id passed to librdkafka. | fluent-bit |
| poll\_ms | Kafka brokers polling interval in milliseconds. | 500 |
| Buffer\_Max\_Size | Specify the maximum size of buffer per cycle to poll kafka messages from subscribed topics. To increase throughput, specify larger size. | 4M |
-| poll\_ms | Kafka brokers polling interval in milliseconds. | 500 |
| rdkafka.{property} | `{property}` can be any [librdkafka properties](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) | |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -46,7 +46,8 @@ In your main configuration file append the following _Input_ & _Output_ sections
#### Example of using kafka input/output plugins
-The fluent-bit source repository contains a full example of using fluent-bit to process kafka records:
+The Fluent Bit source repository contains a full example of using Fluent Bit to
+process Kafka records:
```text
[INPUT]
diff --git a/pipeline/inputs/kernel-logs.md b/pipeline/inputs/kernel-logs.md
index 7fa9cf143..9614391a5 100644
--- a/pipeline/inputs/kernel-logs.md
+++ b/pipeline/inputs/kernel-logs.md
@@ -7,6 +7,7 @@ The **kmsg** input plugin reads the Linux Kernel log buffer since the beginning,
| Key | Description | Default |
| :--- | :--- | :--- |
| Prio_Level | The log level to filter. The kernel log is dropped if its priority is more than prio_level. Allowed values are 0-8. Default is 8. 8 means all logs are saved. | 8 |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -60,4 +61,3 @@ pipeline:
```
{% endtab %}
{% endtabs %}
-
diff --git a/pipeline/inputs/kubernetes-events.md b/pipeline/inputs/kubernetes-events.md
index 164c4eb60..1bfb7d134 100644
--- a/pipeline/inputs/kubernetes-events.md
+++ b/pipeline/inputs/kubernetes-events.md
@@ -14,8 +14,8 @@ Kubernetes exports it events through the API server. This input plugin allows to
|---------------------|---------------------------------------------------------------------------------------|------------------------------------------------------|
| db | Set a database file to keep track of recorded Kubernetes events | |
| db.sync | Set a database sync method. values: extra, full, normal and off | normal |
-| interval_sec | Set the polling interval for each channel. | 0 |
-| interval_nsec | Set the polling interval for each channel (sub seconds: nanoseconds) | 500000000 |
+| interval_sec | Set the reconnect interval (seconds)* | 0 |
+| interval_nsec | Set the reconnect interval (sub seconds: nanoseconds)* | 500000000 |
| kube_url | API Server end-point | https://kubernetes.default.svc |
| kube_ca_file | Kubernetes TLS CA file | /var/run/secrets/kubernetes.io/serviceaccount/ca.crt |
| kube_ca_path | Kubernetes TLS ca path | |
@@ -28,26 +28,45 @@ Kubernetes exports it events through the API server. This input plugin allows to
| tls.verify | Enable or disable verification of TLS peer certificate. | On |
| tls.vhost | Set optional TLS virtual host. | |
+
+- _* As of Fluent Bit 3.1, this plugin uses a Kubernetes watch stream instead of polling. In Fluent Bit 3.1 and later, the interval parameters set the reconnect interval for the Kubernetes watch stream; in earlier versions, they set the polling interval._
+
+## Threading
+
+This input always runs in its own [thread](../../administration/multithreading.md#inputs).
+
## Getting Started
+### Kubernetes Service Account
+
+The Kubernetes service account used by Fluent Bit must have `get`, `list`, and `watch`
+permissions to `namespaces` and `pods` for the namespaces watched in the
+`kube_namespace` configuration parameter. If you're using the Helm chart to configure
+Fluent Bit, this role is included.
+
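+If you manage the role yourself, a minimal sketch of a `ClusterRole` granting these
+permissions could look like the following. The metadata name is illustrative, and the
+role still needs to be bound to the Fluent Bit service account:
+
+```yaml copy
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRole
+metadata:
+  name: fluent-bit-events-reader
+rules:
+  - apiGroups: [""]
+    resources: ["namespaces", "pods"]
+    verbs: ["get", "list", "watch"]
+```
+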
### Simple Configuration File
In the following configuration file, the input plugin *kubernetes_events* collects events every 5 seconds (default for *interval_nsec*) and exposes them through the [standard output plugin](../outputs/standard-output.md) on the console.
```text
[SERVICE]
-flush 1
-log_level info
+ flush 1
+ log_level info
[INPUT]
-name kubernetes_events
-tag k8s_events
-kube_url https://kubernetes.default.svc
+ name kubernetes_events
+ tag k8s_events
+ kube_url https://kubernetes.default.svc
[OUTPUT]
-name stdout
-match *
+ name stdout
+ match *
```
### Event Timestamp
-Event timestamp will be created from the first existing field in the following order of precendence: lastTimestamp, firstTimestamp, metadata.creationTimestamp
+
+Event timestamps are created from the first existing field, based on the following
+order of precedence:
+
+1. `lastTimestamp`
+1. `firstTimestamp`
+1. `metadata.creationTimestamp`
diff --git a/pipeline/inputs/memory-metrics.md b/pipeline/inputs/memory-metrics.md
index 0f380c614..de04a3719 100644
--- a/pipeline/inputs/memory-metrics.md
+++ b/pipeline/inputs/memory-metrics.md
@@ -23,6 +23,11 @@ Fluent Bit v1.x.x
[3] memory: [1488543159, {"Mem.total"=>1016044, "Mem.used"=>841420, "Mem.free"=>174624, "Swap.total"=>2064380, "Swap.used"=>139888, "Swap.free"=>1924492}]
```
+## Threading
+
+You can enable the `threaded` setting to run this input in its own
+[thread](../../administration/multithreading.md#inputs).
+
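+For example, a minimal sketch that enables threading for this input, shown in the
+YAML configuration format:
+
+```yaml copy
+pipeline:
+  inputs:
+    - name: mem
+      threaded: true
+  outputs:
+    - name: stdout
+      match: '*'
+```
+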
### Configuration File
In your main configuration file append the following _Input_ & _Output_ sections:
diff --git a/pipeline/inputs/mqtt.md b/pipeline/inputs/mqtt.md
index 5ed4295fc..9d67bdbea 100644
--- a/pipeline/inputs/mqtt.md
+++ b/pipeline/inputs/mqtt.md
@@ -6,11 +6,12 @@ The **MQTT** input plugin, allows to retrieve messages/data from MQTT control pa
The plugin supports the following configuration parameters:
-| Key | Description |
-| :--- | :--- |
-| Listen | Listener network interface, default: 0.0.0.0 |
-| Port | TCP port where listening for connections, default: 1883 |
-| Payload_Key | Specify the key where the payload key/value will be preserved. |
+| Key | Description | Default |
+| :---------- | :------------------------------------------------------------- | :------ |
+| Listen | Listener network interface. | `0.0.0.0` |
+| Port | TCP port where listening for connections. | `1883` |
+| Payload_Key | Specify the key where the payload key/value will be preserved. | _none_ |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -53,4 +54,3 @@ In your main configuration file append the following _Input_ & _Output_ sections
Name stdout
Match *
```
-
diff --git a/pipeline/inputs/network-io-metrics.md b/pipeline/inputs/network-io-metrics.md
index 64fb21f26..bbfbcc5de 100644
--- a/pipeline/inputs/network-io-metrics.md
+++ b/pipeline/inputs/network-io-metrics.md
@@ -2,7 +2,8 @@
The **netif** input plugin gathers network traffic information of the running system every certain interval of time, and reports them.
-The Network I/O Metrics plugin creates metrics that are log-based \(I.e. JSON payload\). If you are looking for Prometheus-based metrics please see the Node Exporter Metrics input plugin.
+The Network I/O Metrics plugin creates metrics that are log-based, such as a JSON
+payload. For Prometheus-based metrics, see the Node Exporter Metrics input plugin.
## Configuration Parameters
@@ -15,6 +16,7 @@ The plugin supports the following configuration parameters:
| Interval\_NSec | Polling interval \(nanosecond\). | 0 |
| Verbose | If true, gather metrics precisely. | false |
| Test\_At\_Init | If true, testing if the network interface is valid at initialization. | false |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -75,4 +77,3 @@ pipeline:
Note: Total interval \(sec\) = Interval\_Sec + \(Interval\_Nsec / 1000000000\).
e.g. 1.5s = 1s + 500000000ns
-
diff --git a/pipeline/inputs/nginx.md b/pipeline/inputs/nginx.md
index d56fc375b..1735e5ef3 100644
--- a/pipeline/inputs/nginx.md
+++ b/pipeline/inputs/nginx.md
@@ -12,6 +12,7 @@ The plugin supports the following configuration parameters:
| Port | Port of the target nginx service to connect to. | 80 |
| Status_URL | The URL of the Stub Status Handler. | /status |
| Nginx_Plus | Turn on NGINX plus mode. | true |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
@@ -35,8 +36,8 @@ server {
### Configuration with NGINX Plus REST API
-A much more powerful and flexible metrics API is available with NGINX Plus. A path needs to be configured
-in NGINX Plus first.
+Another metrics API is available with NGINX Plus. You must first configure a path in
+NGINX Plus.
```
server {
@@ -130,8 +131,9 @@ Fluent Bit v2.x.x
## Exported Metrics
-This documentation is copied from the nginx prometheus exporter metrics documentation:
-[https://github.com/nginxinc/nginx-prometheus-exporter/blob/master/README.md].
+This documentation is copied from the
+[NGINX Prometheus Exporter metrics documentation](https://github.com/nginxinc/nginx-prometheus-exporter/blob/main/README.md)
+on GitHub.
### Common metrics:
Name | Type | Description | Labels
diff --git a/pipeline/inputs/node-exporter-metrics.md b/pipeline/inputs/node-exporter-metrics.md
index 097f1ac9c..2ac3eff5c 100644
--- a/pipeline/inputs/node-exporter-metrics.md
+++ b/pipeline/inputs/node-exporter-metrics.md
@@ -81,6 +81,10 @@ The following table describes the available collectors as part of this plugin. A
| nvme | Exposes nvme statistics from `/proc`. | Linux | v2.2.0 |
| processes | Exposes processes statistics from `/proc`. | Linux | v2.2.0 |
+## Threading
+
+This input always runs in its own [thread](../../administration/multithreading.md#inputs).
+
## Getting Started
### Simple Configuration File
@@ -114,7 +118,7 @@ In the following configuration file, the input plugin _node_exporter_metrics col
host 0.0.0.0
port 2021
-
+
```
{% endtab %}
@@ -201,4 +205,3 @@ docker-compose down
Our current plugin implements a sub-set of the available collectors in the original Prometheus Node Exporter, if you would like that we prioritize a specific collector please open a Github issue by using the following template:\
\
\- [in_node_exporter_metrics](https://github.com/fluent/fluent-bit/issues/new?assignees=\&labels=\&template=feature_request.md\&title=in_node_exporter_metrics:%20add%20ABC%20collector)
-
diff --git a/pipeline/inputs/opentelemetry.md b/pipeline/inputs/opentelemetry.md
index b6dd0ad78..05e0bad92 100644
--- a/pipeline/inputs/opentelemetry.md
+++ b/pipeline/inputs/opentelemetry.md
@@ -20,6 +20,7 @@ Our compliant implementation fully supports OTLP/HTTP and OTLP/GRPC. Note that t
| buffer_chunk_size | Initial size and allocation strategy to store the payload (advanced users only) | 512K |
|successful_response_code | It allows to set successful response code. `200`, `201` and `204` are supported.| 201 |
| tag_from_uri | If true, tag will be created from uri. e.g. v1_metrics from /v1/metrics . | true |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
Important note: Raw traces means that any data forwarded to the traces endpoint (`/v1/traces`) will be packed and forwarded as a log message, and will NOT be processed by Fluent Bit. The traces endpoint by default expects a valid protobuf encoded payload, but you can set the `raw_traces` option in case you want to get trace telemetry data to any of Fluent Bit's supported outputs.
@@ -50,13 +51,13 @@ __OTLP/GRPC__
## Getting started
-The OpenTelemetry plugin currently supports the following telemetry data types:
+The OpenTelemetry input plugin supports the following telemetry data types:
-| Type | HTTP/JSON | HTTP/Protobuf |
-| ----------- | ------------- | --------------- |
-| Logs | Stable | Stable |
-| Metrics | Unimplemented | Stable |
-| Traces | Unimplemented | Stable |
+| Type | HTTP1/JSON | HTTP1/Protobuf | HTTP2/GRPC |
+| ------- | ---------- | -------------- | ---------- |
+| Logs | Stable | Stable | Stable |
+| Metrics | Unimplemented | Stable | Stable |
+| Traces | Unimplemented | Stable | Stable |
A sample config file to get started will look something like the following:
@@ -79,13 +80,13 @@ pipeline:
{% tab title="fluent-bit.conf" %}
```
[INPUT]
- name opentelemetry
- listen 127.0.0.1
- port 4318
+ name opentelemetry
+ listen 127.0.0.1
+ port 4318
[OUTPUT]
- name stdout
- match *
+ name stdout
+ match *
```
{% endtab %}
diff --git a/pipeline/inputs/podman-metrics.md b/pipeline/inputs/podman-metrics.md
index fb51e3328..4d6181eb9 100644
--- a/pipeline/inputs/podman-metrics.md
+++ b/pipeline/inputs/podman-metrics.md
@@ -13,6 +13,7 @@ description: The Podman Metrics input plugin allows you to collect metrics from
| path.config | Custom path to podman containers configuration file | /var/lib/containers/storage/overlay-containers/containers.json |
| path.sysfs | Custom path to sysfs subsystem directory | /sys/fs/cgroup |
| path.procfs | Custom path to proc subsystem directory | /proc |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
diff --git a/pipeline/inputs/process-exporter-metrics.md b/pipeline/inputs/process-exporter-metrics.md
index 5c933bce2..f74c0996c 100644
--- a/pipeline/inputs/process-exporter-metrics.md
+++ b/pipeline/inputs/process-exporter-metrics.md
@@ -42,6 +42,10 @@ macOS does not have the `proc` filesystem so this plugin will not work for it.
| thread\_wchan | Exposes thread\_wchan from `/proc`. |
| thread | Exposes thread statistics from `/proc`. |
+## Threading
+
+This input always runs in its own [thread](../../administration/multithreading.md#inputs).
+
## Getting Started
### Simple Configuration File
@@ -83,7 +87,8 @@ curl http://127.0.0.1:2021/metrics
### Container to Collect Host Metrics
When deploying Fluent Bit in a container you will need to specify additional settings to ensure that Fluent Bit has access to the process details.
-The following `docker` command deploys Fluent Bit with a specific mount path for `procfs` and settings enabled to ensure that Fluent Bit can collect from the host.
+The following `docker` command deploys Fluent Bit with a specific mount path for
+`procfs` and settings enabled to ensure that Fluent Bit can collect from the host.
These are then exposed over port 2021.
```
diff --git a/pipeline/inputs/process.md b/pipeline/inputs/process.md
index f4a618466..06ba33913 100644
--- a/pipeline/inputs/process.md
+++ b/pipeline/inputs/process.md
@@ -1,8 +1,10 @@
# Process Metrics
+
_Process_ input plugin allows you to check how healthy a process is. It does so by performing a service check at every certain interval of time specified by the user.
-The Process metrics plugin creates metrics that are log-based \(I.e. JSON payload\). If you are looking for Prometheus-based metrics please see the Node Exporter Metrics input plugin.
+The Process metrics plugin creates metrics that are log-based, such as a JSON
+payload. For Prometheus-based metrics, see the Node Exporter Metrics input plugin.
## Configuration Parameters
@@ -16,6 +18,7 @@ The plugin supports the following configuration parameters:
| Alert | If enabled, it will only generate messages if the target process is down. By default this option is disabled. |
| Fd | If enabled, a number of fd is appended to each records. Default value is true. |
| Mem | If enabled, memory usage of the process is appended to each records. Default value is true. |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Getting Started
@@ -63,4 +66,3 @@ Fluent Bit v1.x.x
[2] proc.0: [1485780299, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1152000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
[3] proc.0: [1485780300, {"alive"=>true, "proc_name"=>"fluent-bit", "pid"=>10964, "mem.VmPeak"=>14740000, "mem.VmSize"=>14740000, "mem.VmLck"=>0, "mem.VmHWM"=>1152000, "mem.VmRSS"=>1148000, "mem.VmData"=>2276000, "mem.VmStk"=>88000, "mem.VmExe"=>1768000, "mem.VmLib"=>2328000, "mem.VmPTE"=>68000, "mem.VmSwap"=>0, "fd"=>18}]
```
-
diff --git a/pipeline/inputs/prometheus-remote-write.md b/pipeline/inputs/prometheus-remote-write.md
index f23ecc40b..b149977b7 100644
--- a/pipeline/inputs/prometheus-remote-write.md
+++ b/pipeline/inputs/prometheus-remote-write.md
@@ -17,6 +17,7 @@ This input plugin allows you to ingest a payload in the Prometheus remote-write
|successful\_response\_code | It allows to set successful response code. `200`, `201` and `204` are supported.| 201 |
| tag\_from\_uri | If true, tag will be created from uri, e.g. api\_prom\_push from /api/prom/push, and any tag specified in the config will be ignored. If false then a tag must be provided in the config for this input. | true |
| uri | Specify an optional HTTP URI for the target web server listening for prometheus remote write payloads, e.g: /api/prom/push | |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
A sample config file to get started will look something like the following:
@@ -26,14 +27,14 @@ A sample config file to get started will look something like the following:
{% tab title="fluent-bit.conf" %}
```
[INPUT]
- name prometheus_remote_write
- listen 127.0.0.1
- port 8080
- uri /api/prom/push
+ name prometheus_remote_write
+ listen 127.0.0.1
+ port 8080
+ uri /api/prom/push
[OUTPUT]
- name stdout
- match *
+ name stdout
+ match *
```
{% endtab %}
@@ -65,13 +66,13 @@ Communicating with TLS, you will need to use the tls related parameters:
```
[INPUT]
- Name prometheus_remote_write
- Listen 127.0.0.1
- Port 8080
- Uri /api/prom/push
- Tls On
- tls.crt_file /path/to/certificate.crt
- tls.key_file /path/to/certificate.key
+ Name prometheus_remote_write
+ Listen 127.0.0.1
+ Port 8080
+ Uri /api/prom/push
+ Tls On
+ tls.crt_file /path/to/certificate.crt
+ tls.key_file /path/to/certificate.key
```
Now, you should be able to send data over TLS to the remote write input.
diff --git a/pipeline/inputs/prometheus-scrape-metrics.md b/pipeline/inputs/prometheus-scrape-metrics.md
index d068de3ac..5f3305b44 100644
--- a/pipeline/inputs/prometheus-scrape-metrics.md
+++ b/pipeline/inputs/prometheus-scrape-metrics.md
@@ -12,6 +12,7 @@ The initial release of the Prometheus Scrape metric allows you to collect metric
| port | The port of the prometheus metric endpoint that you want to scrape | |
| scrape\_interval | The interval to scrape metrics | 10s |
| metrics\_path | The metrics URI endpoint, that must start with a forward slash. Note: Parameters can also be added to the path by using ? | /metrics |
+| threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Example
@@ -20,10 +21,10 @@ If an endpoint exposes Prometheus Metrics we can specify the configuration to sc
```
[INPUT]
name prometheus_scrape
- host 0.0.0.0
+ host 0.0.0.0
port 8201
- tag vault
- metrics_path /v1/sys/metrics?format=prometheus
+ tag vault
+ metrics_path /v1/sys/metrics?format=prometheus
scrape_interval 10s
[OUTPUT]
@@ -78,6 +79,3 @@ If an endpoint exposes Prometheus Metrics we can specify the configuration to sc
2022-03-26T23:01:29.836663788Z vault_runtime_total_gc_pause_ns = 1917611
2022-03-26T23:01:29.836663788Z vault_runtime_total_gc_runs = 19
```
-
-
-
diff --git a/pipeline/inputs/random.md b/pipeline/inputs/random.md
index 73be22a82..3cb055f17 100644
--- a/pipeline/inputs/random.md
+++ b/pipeline/inputs/random.md
@@ -11,6 +11,7 @@ The plugin supports the following configuration parameters:
| Samples | If set, it will only generate a specific number of samples. By default this value is set to _-1_, which will generate unlimited samples. |
| Interval\_Sec | Interval in seconds between samples generation. Default value is _1_. |
| Interval\_Nsec | Specify a nanoseconds interval for samples generation, it works in conjunction with the Interval\_Sec configuration key. Default value is _0_. |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Getting Started
@@ -78,4 +79,3 @@ Fluent Bit v1.x.x
[3] random.0: [1475893657, {"rand_value"=>1501010137543905482}]
[4] random.0: [1475893658, {"rand_value"=>16238242822364375212}]
```
-
diff --git a/pipeline/inputs/serial-interface.md b/pipeline/inputs/serial-interface.md
index 01c4451a6..9da195bd8 100644
--- a/pipeline/inputs/serial-interface.md
+++ b/pipeline/inputs/serial-interface.md
@@ -11,6 +11,7 @@ The **serial** input plugin, allows to retrieve messages/data from a _Serial_ in
| Min\_Bytes | The serial interface will expect at least _Min\_Bytes_ to be available before to process the message \(default: 1\) |
| Separator | Allows to specify a _separator_ string that's used to determinate when a message ends. |
| Format | Specify the format of the incoming data stream. The only option available is 'json'. Note that _Format_ and _Separator_ cannot be used at the same time. |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Getting Started
@@ -125,4 +126,3 @@ When the module is loaded, it will interconnect the following virtual interfaces
/dev/tnt4 <=> /dev/tnt5
/dev/tnt6 <=> /dev/tnt7
```
-
diff --git a/pipeline/inputs/splunk.md b/pipeline/inputs/splunk.md
index ae23faebb..38a7fcd75 100644
--- a/pipeline/inputs/splunk.md
+++ b/pipeline/inputs/splunk.md
@@ -12,7 +12,10 @@ The **splunk** input plugin handles [Splunk HTTP HEC](https://docs.splunk.com/Do
| buffer_max_size | Specify the maximum buffer size in KB to receive a JSON message. | 4M |
| buffer_chunk_size | This sets the chunk size for incoming incoming JSON messages. These chunks are then stored/managed in the space available by buffer_max_size. | 512K |
| successful_response_code | It allows to set successful response code. `200`, `201` and `204` are supported. | 201 |
-| splunk\_token | Add an Splunk token for HTTP HEC.` | |
+| splunk\_token | Specify a Splunk token for HTTP HEC authentication. If multiple tokens are specified (with commas and no spaces), usage is divided across the tokens. | |
+| store\_token\_in\_metadata | Store Splunk HEC tokens in the Fluent Bit metadata. If set to `false`, they're stored as normal key-value pairs in the record data. | true |
+| splunk\_token\_key | Use the specified key for storing the Splunk token for HTTP HEC. This is only effective when `store_token_in_metadata` is `false`. | @splunk_token |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
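+
+As a sketch, the following YAML configuration accepts two HEC tokens and stores the
+received token in the record under a custom key. The token values, listener address,
+and port are placeholders:
+
+```yaml copy
+pipeline:
+  inputs:
+    - name: splunk
+      listen: 0.0.0.0
+      port: 8088
+      splunk_token: 11111111-1111-1111-1111-111111111111,22222222-2222-2222-2222-222222222222
+      store_token_in_metadata: false
+      splunk_token_key: hec_token
+  outputs:
+    - name: stdout
+      match: '*'
+```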
## Getting Started
diff --git a/pipeline/inputs/standard-input.md b/pipeline/inputs/standard-input.md
index 1259efd7f..9715ff685 100644
--- a/pipeline/inputs/standard-input.md
+++ b/pipeline/inputs/standard-input.md
@@ -204,3 +204,4 @@ The plugin supports the following configuration parameters:
| :--- | :--- | :--- |
| Buffer\_Size | Set the buffer size to read data. This value is used to increase buffer size. The value must be according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | 16k |
| Parser | The name of the parser to invoke instead of the default JSON input parser | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
diff --git a/pipeline/inputs/statsd.md b/pipeline/inputs/statsd.md
index 80356a3a4..403a94cdc 100644
--- a/pipeline/inputs/statsd.md
+++ b/pipeline/inputs/statsd.md
@@ -15,6 +15,7 @@ The plugin supports the following configuration parameters:
| :--- | :--- | :--- |
| Listen | Listener network interface. | 0.0.0.0 |
| Port | UDP port where listening for connections | 8125 |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Configuration Examples
@@ -61,4 +62,3 @@ Fluent Bit will produce the following records:
[0] statsd.0: [1574905088.971380537, {"type"=>"counter", "bucket"=>"click", "value"=>10.000000, "sample_rate"=>0.100000}]
[0] statsd.0: [1574905141.863344517, {"type"=>"gauge", "bucket"=>"active", "value"=>99.000000, "incremental"=>0}]
```
-
diff --git a/pipeline/inputs/syslog.md b/pipeline/inputs/syslog.md
index cf2a8a0cc..af5471a85 100644
--- a/pipeline/inputs/syslog.md
+++ b/pipeline/inputs/syslog.md
@@ -18,6 +18,7 @@ The plugin supports the following configuration parameters:
| Buffer\_Max\_Size | Specify the maximum buffer size to receive a Syslog message. If not set, the default size will be the value of _Buffer\_Chunk\_Size_. | |
| Receive\_Buffer\_Size | Specify the maximum socket receive buffer size. If not set, the default value is OS-dependant, but generally too low to accept thousands of syslog messages per second without loss on _udp_ or _unix\_udp_ sockets. Note that on Linux the value is capped by `sysctl net.core.rmem_max`.| |
| Source\_Address\_Key| Specify the key where the source address will be injected. | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
### Considerations
@@ -228,4 +229,3 @@ $OMUxSockSocket /tmp/fluent-bit.sock
```
Make sure that the socket file is readable by rsyslog \(tweak the `Unix_Perm` option shown above\).
-
diff --git a/pipeline/inputs/systemd.md b/pipeline/inputs/systemd.md
index b48554d2a..36066b090 100644
--- a/pipeline/inputs/systemd.md
+++ b/pipeline/inputs/systemd.md
@@ -19,6 +19,7 @@ The plugin supports the following configuration parameters:
| Read\_From\_Tail | Start reading new entries. Skip entries already stored in Journald. | Off |
| Lowercase | Lowercase the Journald field \(key\). | Off |
| Strip\_Underscores | Remove the leading underscore of the Journald field \(key\). For example the Journald field _\_PID_ becomes the key _PID_. | Off |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
diff --git a/pipeline/inputs/tail.md b/pipeline/inputs/tail.md
index 5df9c9424..c5019a75c 100644
--- a/pipeline/inputs/tail.md
+++ b/pipeline/inputs/tail.md
@@ -32,10 +32,11 @@ The plugin supports the following configuration parameters:
| Parser | Specify the name of a parser to interpret the entry as a structured message. | |
| Key | When a message is unstructured \(no parser applied\), it's appended as a string under the key name _log_. This option allows to define an alternative name for that key. | log |
| Inotify_Watcher | Set to false to use file stat watcher instead of inotify. | true |
-| Tag | Set a tag \(with regex-extract fields\) that will be placed on lines read. E.g. `kube....`. Note that "tag expansion" is supported: if the tag includes an asterisk \(\*\), that asterisk will be replaced with the absolute path of the monitored file, with slashes replaced by dots \(also see [Workflow of Tail + Kubernetes Filter](../filters/kubernetes.md#workflow-of-tail-kubernetes-filter)\). | |
+| Tag | Set a tag \(with regex-extract fields\) that will be placed on lines read. E.g. `kube....`. Note that "tag expansion" is supported: if the tag includes an asterisk \(\*\), that asterisk will be replaced with the absolute path of the monitored file, with slashes replaced by dots \(also see [Workflow of Tail + Kubernetes Filter](../filters/kubernetes.md#workflow-of-tail--kubernetes-filter)\). | |
| Tag\_Regex | Set a regex to extract fields from the file name. E.g. `(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$` | |
| Static\_Batch\_Size | Set the maximum number of bytes to process per iteration for the monitored static files (files that already exists upon Fluent Bit start). | 50M |
-
+| File\_Cache\_Advise | Set `posix_fadvise` to `POSIX_FADV_DONTNEED` mode. This reduces the use of the kernel file cache. This option is ignored if Fluent Bit isn't running on Linux. | On |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
Note that if the database parameter `DB` is **not** specified, by default the plugin will start reading each target file from the beginning. This also might cause some unwanted behavior, for example when a line is bigger that `Buffer_Chunk_Size` and `Skip_Long_Lines` is not turned on, the file will be read from the beginning of each `Refresh_Interval` until the file is rotated.
@@ -81,7 +82,7 @@ If you are running Fluent Bit to process logs coming from containers like Docker
```yaml
pipeline:
inputs:
- - tail:
+ - name: tail
path: /var/log/containers/*.log
multiline.parser: docker, cri
```
@@ -127,7 +128,7 @@ $ fluent-bit -i tail -p path=/var/log/syslog -o stdout
### Configuration File
-In your main configuration file append the following _Input_ & _Output_ sections.
+In your main configuration file, append the following `Input` and `Output` sections:
{% tabs %}
{% tab title="fluent-bit.conf" %}
@@ -146,9 +147,9 @@ In your main configuration file append the following _Input_ & _Output_ sections
```yaml
pipeline:
inputs:
- - tail:
+ - name: tail
path: /var/log/syslog
-
+
outputs:
- stdout:
match: *
diff --git a/pipeline/inputs/tcp.md b/pipeline/inputs/tcp.md
index ac3375c90..67dba0eb5 100644
--- a/pipeline/inputs/tcp.md
+++ b/pipeline/inputs/tcp.md
@@ -6,8 +6,8 @@ The **tcp** input plugin allows to retrieve structured JSON or raw messages over
The plugin supports the following configuration parameters:
-| Key | Description | Default |
-| ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------- |
+| Key | Description | Default |
+| ------------ | ----------- | ------- |
| Listen | Listener network interface. | 0.0.0.0 |
| Port | TCP port where listening for connections | 5170 |
| Buffer\_Size | Specify the maximum buffer size in KB to receive a JSON message. If not set, the default size will be the value of _Chunk\_Size_. | |
@@ -15,6 +15,7 @@ The plugin supports the following configuration parameters:
| Format | Specify the expected payload format. It support the options _json_ and _none_. When using _json_, it expects JSON maps, when is set to _none_, it will split every record using the defined _Separator_ (option below). | json |
| Separator | When the expected _Format_ is set to _none_, Fluent Bit needs a separator string to split the records. By default it uses the breakline character (LF or 0x10). | |
| Source\_Address\_Key| Specify the key where the source address will be injected. | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
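+
+As a minimal sketch, the following enables threaded mode for this input using the
+default listener address and port:
+
+```text
+[INPUT]
+    Name      tcp
+    Listen    0.0.0.0
+    Port      5170
+    # Run this input in a dedicated thread instead of the main event loop
+    Threaded  On
+```
+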
## Getting Started
diff --git a/pipeline/inputs/thermal.md b/pipeline/inputs/thermal.md
index 4c6447e2a..56af07975 100644
--- a/pipeline/inputs/thermal.md
+++ b/pipeline/inputs/thermal.md
@@ -20,6 +20,7 @@ The plugin supports the following configuration parameters:
| Interval\_NSec | Polling interval \(nanoseconds\). default: 0 |
| name\_regex | Optional name filter regex. default: None |
| type\_regex | Optional type filter regex. default: None |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). Default: `false`. |
## Getting Started
@@ -83,4 +84,4 @@ pipeline:
match: '*'
```
{% endtab %}
-{% endtabs %}
\ No newline at end of file
+{% endtabs %}
diff --git a/pipeline/inputs/udp.md b/pipeline/inputs/udp.md
index 888e5d257..e95faf9f2 100644
--- a/pipeline/inputs/udp.md
+++ b/pipeline/inputs/udp.md
@@ -15,6 +15,7 @@ The plugin supports the following configuration parameters:
| Format | Specify the expected payload format. It support the options _json_ and _none_. When using _json_, it expects JSON maps, when is set to _none_, it will split every record using the defined _Separator_ (option below). | json |
| Separator | When the expected _Format_ is set to _none_, Fluent Bit needs a separator string to split the records. By default it uses the breakline character (LF or 0x10). | |
| Source\_Address\_Key| Specify the key where the source address will be injected. | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
## Getting Started
diff --git a/pipeline/inputs/windows-event-log-winevtlog.md b/pipeline/inputs/windows-event-log-winevtlog.md
index 2edaaee4b..c570009d2 100644
--- a/pipeline/inputs/windows-event-log-winevtlog.md
+++ b/pipeline/inputs/windows-event-log-winevtlog.md
@@ -18,6 +18,7 @@ The plugin supports the following configuration parameters:
| Use\_ANSI | Use ANSI encoding on eventlog messages. If you have issues receiving blank strings with old Windows versions (Server 2012 R2), setting this to True may solve the problem. \(optional\) | False |
| Event\_Query | Specify XML query for filtering events. | `*` |
| Read\_Limit\_Per\_Cycle | Specify read limit per cycle. | 512KiB |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
Note that if you do not set _db_, the plugin will tail channels on each startup.
diff --git a/pipeline/inputs/windows-event-log.md b/pipeline/inputs/windows-event-log.md
index 3cac1ccdb..4a6941a40 100644
--- a/pipeline/inputs/windows-event-log.md
+++ b/pipeline/inputs/windows-event-log.md
@@ -11,6 +11,7 @@ The plugin supports the following configuration parameters:
| Channels | A comma-separated list of channels to read from. | |
| Interval_Sec | Set the polling interval for each channel. (optional) | 1 |
| DB | Set the path to save the read offsets. (optional) | |
+| Threaded | Indicates whether to run this input in its own [thread](../../administration/multithreading.md#inputs). | `false` |
Note that if you do not set _db_, the plugin will read channels from the beginning on each startup.
diff --git a/pipeline/inputs/windows-exporter-metrics.md b/pipeline/inputs/windows-exporter-metrics.md
index 61713457e..f1b5ed126 100644
--- a/pipeline/inputs/windows-exporter-metrics.md
+++ b/pipeline/inputs/windows-exporter-metrics.md
@@ -63,6 +63,10 @@ The following table describes the available collectors as part of this plugin. A
| paging\_file | Exposes paging\_file statistics. | Windows | v2.1.9 |
| process | Exposes process statistics. | Windows | v2.1.9 |
+## Threading
+
+This input always runs in its own [thread](../../administration/multithreading.md#inputs).
+
## Getting Started
### Simple Configuration File
diff --git a/pipeline/outputs/azure.md b/pipeline/outputs/azure.md
index eda87d29d..3e4bf7b04 100644
--- a/pipeline/outputs/azure.md
+++ b/pipeline/outputs/azure.md
@@ -20,6 +20,7 @@ To get more details about how to setup Azure Log Analytics, please refer to the
| Log_Type_Key | If included, the value for this key will be looked upon in the record and if present, will over-write the `log_type`. If not found then the `log_type` value will be used. | |
| Time\_Key | Optional parameter to specify the key name where the timestamp will be stored. | @timestamp |
| Time\_Generated | If enabled, the HTTP request header 'time-generated-field' will be included so Azure can override the timestamp with the key specified by 'time_key' option. | off |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
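+
+For example, a sketch that enables two dedicated workers for this output (the
+credentials shown are placeholders):
+
+```text
+[OUTPUT]
+    Name         azure
+    Match        *
+    Customer_ID  abc
+    Shared_Key   def
+    # Flush in two dedicated worker threads instead of the main thread
+    Workers      2
+```
+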
## Getting Started
@@ -61,4 +62,3 @@ Another example using the `Log_Type_Key` with [record-accessor](https://docs.flu
Customer_ID abc
Shared_Key def
```
-
diff --git a/pipeline/outputs/azure_blob.md b/pipeline/outputs/azure_blob.md
index c775379aa..1c23806ff 100644
--- a/pipeline/outputs/azure_blob.md
+++ b/pipeline/outputs/azure_blob.md
@@ -31,6 +31,7 @@ We expose different configuration properties. The following table lists all the
| emulator\_mode | If you want to send data to an Azure emulator service like [Azurite](https://github.com/Azure/Azurite), enable this option so the plugin will format the requests to the expected format. | off |
| endpoint | If you are using an emulator, this option allows you to specify the absolute HTTP address of such service. e.g: [http://127.0.0.1:10000](http://127.0.0.1:10000). | |
| tls | Enable or disable TLS encryption. Note that Azure service requires this to be turned on. | off |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Getting Started
@@ -128,4 +129,3 @@ Azurite Queue service is successfully listening at http://127.0.0.1:10001
127.0.0.1 - - [03/Sep/2020:17:40:03 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log HTTP/1.1" 201 -
127.0.0.1 - - [03/Sep/2020:17:40:04 +0000] "PUT /devstoreaccount1/logs/kubernetes/var.log.containers.app-default-96cbdef2340.log?comp=appendblock HTTP/1.1" 201 -
```
-
diff --git a/pipeline/outputs/azure_kusto.md b/pipeline/outputs/azure_kusto.md
index 5fd4075fc..19cf72157 100644
--- a/pipeline/outputs/azure_kusto.md
+++ b/pipeline/outputs/azure_kusto.md
@@ -63,6 +63,7 @@ By default, Kusto will insert incoming ingestions into a table by inferring the
| tag_key | The key name of tag. If `include_tag_key` is false, This property is ignored. | `tag` |
| include_time_key | If enabled, a timestamp is appended to output. The key name is used `time_key` property. | `On` |
| time_key | The key name of time. If `include_time_key` is false, This property is ignored. | `timestamp` |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### Configuration File
diff --git a/pipeline/outputs/azure_logs_ingestion.md b/pipeline/outputs/azure_logs_ingestion.md
index e008ac4da..dbf7678b9 100644
--- a/pipeline/outputs/azure_logs_ingestion.md
+++ b/pipeline/outputs/azure_logs_ingestion.md
@@ -37,6 +37,7 @@ To get more details about how to setup these components, please refer to the fol
| time\_key | _Optional_ - Specify the key name where the timestamp will be stored. | `@timestamp` |
| time\_generated | _Optional_ - If enabled, will generate a timestamp and append it to JSON. The key name is set by the 'time_key' parameter. | `true` |
| compress | _Optional_ - Enable HTTP payload gzip compression. | `true` |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Getting Started
@@ -58,7 +59,7 @@ Use this configuration to quickly get started:
Name tail
Path /path/to/your/sample.log
Tag sample
- Key RawData
+ Key RawData
# Or use other plugins Plugin
# [INPUT]
# Name cpu
diff --git a/pipeline/outputs/bigquery.md b/pipeline/outputs/bigquery.md
index 8ef7a469f..dd2c278a9 100644
--- a/pipeline/outputs/bigquery.md
+++ b/pipeline/outputs/bigquery.md
@@ -59,6 +59,7 @@ You must configure workload identity federation in GCP before using it with Flue
| pool\_id | GCP workload identity pool where the identity provider was created. Used to construct the full resource name of the identity provider. | |
| provider\_id | GCP workload identity provider. Used to construct the full resource name of the identity provider. Currently only AWS accounts are supported. | |
| google\_service\_account | Email address of the Google service account to impersonate. The workload identity provider must have permissions to impersonate this service account, and the service account must have permissions to access Google BigQuery resources (e.g. `write` access to tables) | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
See Google's [official documentation](https://cloud.google.com/bigquery/docs/reference/rest/v2/tabledata/insertAll) for further details.
@@ -77,4 +78,3 @@ If you are using a _Google Cloud Credentials File_, the following configuration
dataset_id my_dataset
table_id dummy_table
```
-
diff --git a/pipeline/outputs/chronicle.md b/pipeline/outputs/chronicle.md
index d2935fc00..5298ec584 100644
--- a/pipeline/outputs/chronicle.md
+++ b/pipeline/outputs/chronicle.md
@@ -1,5 +1,3 @@
----
-
# Chronicle
The Chronicle output plugin allows ingesting security logs into [Google Chronicle](https://chronicle.security/) service. This connector is designed to send unstructured security logs.
@@ -36,6 +34,7 @@ Fluent Bit's Chronicle output plugin uses a JSON credentials file for authentica
| log\_type | The log type to parse logs as. Google Chronicle supports parsing for [specific log types only](https://cloud.google.com/chronicle/docs/ingestion/parser-list/supported-default-parsers). | |
| region | The GCP region in which to store security logs. Currently, there are several supported regions: `US`, `EU`, `UK`, `ASIA`. Blank is handled as `US`. | |
| log\_key | By default, the whole log record will be sent to Google Chronicle. If you specify a key name with this option, then only the value of that key will be sent to Google Chronicle. | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
See Google's [official documentation](https://cloud.google.com/chronicle/docs/reference/ingestion-api) for further details.
diff --git a/pipeline/outputs/cloudwatch.md b/pipeline/outputs/cloudwatch.md
index 74a17c673..bfcd2ba2d 100644
--- a/pipeline/outputs/cloudwatch.md
+++ b/pipeline/outputs/cloudwatch.md
@@ -34,6 +34,7 @@ See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b
| profile | Option to specify an AWS Profile for credentials. Defaults to `default` |
| auto\_retry\_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. This option defaults to `true`. |
| external\_id | Specify an external ID for the STS API, can be used with the role\_arn parameter if your role requires an external ID. |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. Default: `1`. |
## Getting Started
@@ -60,6 +61,16 @@ In your main configuration file append the following _Output_ section:
log_stream_prefix from-fluent-bit-
auto_create_group On
```
+#### Integration with LocalStack (CloudWatch Logs)
+
+For an instance of LocalStack running at `http://localhost:4566`, add the following to the `[OUTPUT]` section:
+
+```text
+endpoint localhost
+port 4566
+```
+
+You can export any testing credentials as environment variables, such as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
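+
+Putting this together, a sketch of a complete `[OUTPUT]` section pointing at a local
+LocalStack instance (the group and prefix names are only examples):
+
+```text
+[OUTPUT]
+    Name               cloudwatch_logs
+    Match              *
+    region             us-east-1
+    log_group_name     fluent-bit-cloudwatch
+    log_stream_prefix  from-fluent-bit-
+    auto_create_group  On
+    # Point the plugin at LocalStack instead of the real AWS endpoint
+    endpoint           localhost
+    port               4566
+```
+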
### Permissions
@@ -80,28 +91,6 @@ The following AWS IAM permissions are required to use this plugin:
}
```
-### Worker support
-
-Fluent Bit 1.7 adds a new feature called `workers` which enables outputs to have dedicated threads. This `cloudwatch_logs` plugin has partial support for workers in Fluent Bit 2.1.11 and prior. **2.1.11 and prior, the plugin can support a single worker; enabling multiple workers will lead to errors/indeterminate behavior.**
-Starting from Fluent Bit 2.1.12, the `cloudwatch_logs` plugin added full support for workers, meaning that more than one worker can be configured.
-
-Example:
-
-```
-[OUTPUT]
- Name cloudwatch_logs
- Match *
- region us-east-1
- log_group_name fluent-bit-cloudwatch
- log_stream_prefix from-fluent-bit-
- auto_create_group On
- workers 1
-```
-
-If you enable workers, you are enabling one or more dedicated threads for your CloudWatch output.
-We recommend starting with 1 worker, evaluating the performance, and then enabling more workers if needed.
-For most users, the plugin can provide sufficient throughput with 0 or 1 workers.
-
### Log Stream and Group Name templating using record\_accessor syntax
Sometimes, you may want the log group or stream name to be based on the contents of the log record itself. This plugin supports templating log group and stream names using Fluent Bit [record\_accessor](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor) syntax.
@@ -271,4 +260,4 @@ You can use our SSM Public Parameters to find the Amazon ECR image URI in your r
aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
```
-For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images).
+For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images).
\ No newline at end of file
diff --git a/pipeline/outputs/datadog.md b/pipeline/outputs/datadog.md
index a89649a35..0d6a0fb49 100644
--- a/pipeline/outputs/datadog.md
+++ b/pipeline/outputs/datadog.md
@@ -23,8 +23,9 @@ Before you begin, you need a [Datadog account](https://app.datadoghq.com/signup)
| tag_key | The key name of tag. If `include_tag_key` is false, This property is ignored. | `tagkey` |
| dd_service | _Recommended_ - The human readable name for your service generating the logs (e.g. the name of your application or database). If unset, Datadog will look for the service using [Service Remapper](https://docs.datadoghq.com/logs/log_configuration/pipelines/?tab=service#service-attribute)." | |
| dd_source | _Recommended_ - A human readable name for the underlying technology of your service (e.g. `postgres` or `nginx`). If unset, Datadog will look for the source in the [`ddsource` attribute](https://docs.datadoghq.com/logs/log_configuration/pipelines/?tab=source#source-attribute). | |
-| dd_tags | _Optional_ - The [tags](https://docs.datadoghq.com/tagging/) you want to assign to your logs in Datadog. If unset, Datadog will look for the tags in the [`ddtags' attribute](https://docs.datadoghq.com/api/latest/logs/#send-logs). | |
+| dd_tags | _Optional_ - The [tags](https://docs.datadoghq.com/tagging/) you want to assign to your logs in Datadog. If unset, Datadog will look for the tags in the [`ddtags` attribute](https://docs.datadoghq.com/api/latest/logs/#send-logs). | |
| dd_message_key | By default, the plugin searches for the key 'log' and remap the value to the key 'message'. If the property is set, the plugin will search the property name key. | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### Configuration File
diff --git a/pipeline/outputs/elasticsearch.md b/pipeline/outputs/elasticsearch.md
index c015d09fa..c18f6b720 100644
--- a/pipeline/outputs/elasticsearch.md
+++ b/pipeline/outputs/elasticsearch.md
@@ -4,79 +4,92 @@ description: Send logs to Elasticsearch (including Amazon OpenSearch Service)
# Elasticsearch
-The **es** output plugin, allows to ingest your records into an [Elasticsearch](http://www.elastic.co) database. The following instructions assumes that you have a fully operational Elasticsearch service running in your environment.
+The **es** output plugin lets you ingest your records into an
+[Elasticsearch](http://www.elastic.co) database. To use this plugin, you must have an
+operational Elasticsearch service running in your environment.
## Configuration Parameters
-| Key | Description | default |
+| Key | Description | Default |
| :--- | :--- | :--- |
-| Host | IP address or hostname of the target Elasticsearch instance | 127.0.0.1 |
-| Port | TCP port of the target Elasticsearch instance | 9200 |
-| Path | Elasticsearch accepts new data on HTTP query path "/\_bulk". But it is also possible to serve Elasticsearch behind a reverse proxy on a subpath. This option defines such path on the fluent-bit side. It simply adds a path prefix in the indexing HTTP POST URI. | Empty string |
-| header | Add additional arbitrary HTTP header key/value pair. Multiple headers can be set. | |
-| compress | Set payload compression mechanism. Option available is 'gzip' | |
-| Buffer\_Size | Specify the buffer size used to read the response from the Elasticsearch HTTP service. This option is useful for debugging purposes where is required to read full responses, note that response size grows depending of the number of records inserted. To set an _unlimited_ amount of memory set this value to **False**, otherwise the value must be according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | 512KB |
-| Pipeline | Newer versions of Elasticsearch allows to setup filters called pipelines. This option allows to define which pipeline the database should use. For performance reasons is strongly suggested to do parsing and filtering on Fluent Bit side, avoid pipelines. | |
-| AWS\_Auth | Enable AWS Sigv4 Authentication for Amazon OpenSearch Service | Off |
-| AWS\_Region | Specify the AWS region for Amazon OpenSearch Service | |
-| AWS\_STS\_Endpoint | Specify the custom sts endpoint to be used with STS API for Amazon OpenSearch Service | |
-| AWS\_Role\_ARN | AWS IAM Role to assume to put records to your Amazon cluster | |
-| AWS\_External\_ID | External ID for the AWS IAM Role specified with `aws_role_arn` | |
-| AWS\_Service\_Name | Service name to be used in AWS Sigv4 signature. For integration with Amazon OpenSearch Serverless, set to `aoss`. See the [FAQ](opensearch.md#faq) section on Amazon OpenSearch Serverless for more information. | es |
-| AWS\_Profile | AWS profile name | default |
-| Cloud\_ID | If you are using Elastic's Elasticsearch Service you can specify the cloud\_id of the cluster running. The Cloud ID string has the format `:`. Once decoded, the `base64_info` string has the format `$$`.
- | |
-| Cloud\_Auth | Specify the credentials to use to connect to Elastic's Elasticsearch Service running on Elastic Cloud | |
-| HTTP\_User | Optional username credential for Elastic X-Pack access | |
-| HTTP\_Passwd | Password for user defined in HTTP\_User | |
-| Index | Index name | fluent-bit |
-| Type | Type name | \_doc |
-| Logstash\_Format | Enable Logstash format compatibility. This option takes a boolean value: True/False, On/Off | Off |
-| Logstash\_Prefix | When Logstash\_Format is enabled, the Index name is composed using a prefix and the date, e.g: If Logstash\_Prefix is equals to 'mydata' your index will become 'mydata-YYYY.MM.DD'. The last string appended belongs to the date when the data is being generated. | logstash |
-| Logstash\_Prefix\_Key | When included: the value of the key in the record will be evaluated as key reference and overrides Logstash\_Prefix for index generation. If the key/value is not found in the record then the Logstash\_Prefix option will act as a fallback. The parameter is expected to be a [record accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md). | |
-| Logstash\_Prefix\_Separator | Set a separator between logstash_prefix and date.| - |
-| Logstash\_DateFormat | Time format \(based on [strftime](http://man7.org/linux/man-pages/man3/strftime.3.html)\) to generate the second part of the Index name. | %Y.%m.%d |
-| Time\_Key | When Logstash\_Format is enabled, each record will get a new timestamp field. The Time\_Key property defines the name of that field. | @timestamp |
-| Time\_Key\_Format | When Logstash\_Format is enabled, this property defines the format of the timestamp. | %Y-%m-%dT%H:%M:%S |
-| Time\_Key\_Nanos | When Logstash\_Format is enabled, enabling this property sends nanosecond precision timestamps. | Off |
-| Include\_Tag\_Key | When enabled, it append the Tag name to the record. | Off |
-| Tag\_Key | When Include\_Tag\_Key is enabled, this property defines the key name for the tag. | \_flb-key |
-| Generate\_ID | When enabled, generate `_id` for outgoing records. This prevents duplicate records when retrying ES. | Off |
-| Id\_Key | If set, `_id` will be the value of the key from incoming record and `Generate_ID` option is ignored. | |
-| Write\_Operation | The write\_operation can be any of: create (default), index, update, upsert. | create |
-| Replace\_Dots | When enabled, replace field name dots with underscore, required by Elasticsearch 2.0-2.3. | Off |
-| Trace\_Output | Print all elasticsearch API request payloads to stdout \(for diag only\) | Off |
-| Trace\_Error | If elasticsearch return an error, print the elasticsearch API request and response \(for diag only\) | Off |
-| Current\_Time\_Index | Use current time for index generation instead of message record | Off |
-| Suppress\_Type\_Name | When enabled, mapping types is removed and `Type` option is ignored. If using Elasticsearch 8.0.0 or higher - it [no longer supports mapping types](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html), so it shall be set to On. | Off |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
-
-> The parameters _index_ and _type_ can be confusing if you are new to Elastic, if you have used a common relational database before, they can be compared to the _database_ and _table_ concepts. Also see [the FAQ below](elasticsearch.md#faq)
+| `Host` | IP address or hostname of the target Elasticsearch instance | `127.0.0.1` |
+| `Port` | TCP port of the target Elasticsearch instance | `9200` |
+| `Path` | Elasticsearch accepts new data on HTTP query path `/_bulk`. You can also serve Elasticsearch behind a reverse proxy on a sub-path. Define the path by adding a path prefix in the indexing HTTP POST URI. | Empty string |
+| `header` | Add additional arbitrary HTTP header key/value pair. Multiple headers can be set. | _none_ |
+| `compress` | Set payload compression mechanism. Option available is `gzip`. | _none_ |
+| `Buffer_Size` | Specify the buffer size used to read the response from the Elasticsearch HTTP service. This is useful for debugging purposes when you need to read full responses. The response size grows depending on the number of records inserted. To use an unlimited amount of memory, set this value to `False`. Otherwise, set the value according to the [Unit Size](../../administration/configuring-fluent-bit/unit-sizes.md) specification. | `512KB` |
+| `Pipeline` | Define which pipeline the database should use. For performance reasons, it's strongly suggested to do parsing and filtering on the Fluent Bit side and avoid pipelines. | _none_ |
+| `AWS_Auth` | Enable AWS Sigv4 Authentication for Amazon OpenSearch Service. | `Off` |
+| `AWS_Region` | Specify the AWS region for Amazon OpenSearch Service. | _none_ |
+| `AWS_STS_Endpoint` | Specify the custom STS endpoint to be used with STS API for Amazon OpenSearch Service | _none_ |
+| `AWS_Role_ARN` | AWS IAM Role to assume to put records to your Amazon cluster | _none_ |
+| `AWS_External_ID` | External ID for the AWS IAM Role specified with `aws_role_arn` | _none_ |
+| `AWS_Service_Name` | Service name to use in AWS Sigv4 signature. For integration with Amazon OpenSearch Serverless, set to `aoss`. See [Amazon OpenSearch Serverless](opensearch.md) for more information. | `es` |
+| `AWS_Profile` | AWS profile name | `default` |
+| `Cloud_ID` | If using Elastic's Elasticsearch Service you can specify the `cloud_id` of the cluster running. The string has the format `<deployment_name>:<base64_info>`. Once decoded, the `base64_info` string has the format `<deployment_region>$<elasticsearch_hostname>$<kibana_hostname>`. | _none_ |
+| `Cloud_Auth` | Specify the credentials to use to connect to Elastic's Elasticsearch Service running on Elastic Cloud | _none_ |
+| `HTTP_User` | Optional username credential for Elastic X-Pack access | _none_ |
+| `HTTP_Passwd` | Password for user defined in `HTTP_User` | _none_ |
+| `Index` | Index name | `fluent-bit` |
+| `Type` | Type name | `_doc` |
+| `Logstash_Format` | Enable Logstash format compatibility. This option takes a Boolean value: `True/False`, `On/Off` | `Off` |
+| `Logstash_Prefix` | When `Logstash_Format` is enabled, the Index name is composed using a prefix and the date. For example, if `Logstash_Prefix` is equal to `mydata`, your index becomes `mydata-YYYY.MM.DD`. The last string appended belongs to the date when the data is being generated. | `logstash` |
+| `Logstash_Prefix_Key` | When included: the value of the key in the record will be evaluated as key reference and overrides `Logstash_Prefix` for index generation. If the key/value isn't found in the record then the `Logstash_Prefix` option will act as a fallback. The parameter is expected to be a [record accessor](../../administration/configuring-fluent-bit/classic-mode/record-accessor.md). | _none_ |
+| `Logstash_Prefix_Separator` | Set a separator between `Logstash_Prefix` and date.| `-` |
+| `Logstash_DateFormat` | Time format based on [strftime](http://man7.org/linux/man-pages/man3/strftime.3.html) to generate the second part of the Index name. | `%Y.%m.%d` |
+| `Time_Key` | When `Logstash_Format` is enabled, each record will get a new timestamp field. The `Time_Key` property defines the name of that field. | `@timestamp` |
+| `Time_Key_Format` | When `Logstash_Format` is enabled, this property defines the format of the timestamp. | `%Y-%m-%dT%H:%M:%S` |
+| `Time_Key_Nanos` | When `Logstash_Format` is enabled, enabling this property sends nanosecond precision timestamps. | `Off` |
+| `Include_Tag_Key` | When enabled, it appends the Tag name to the record. | `Off` |
+| `Tag_Key` | When `Include_Tag_Key` is enabled, this property defines the key name for the tag. | `_flb-key` |
+| `Generate_ID` | When enabled, generate `_id` for outgoing records. This prevents duplicate records when retrying ES. | `Off` |
+| `Id_Key` | If set, `_id` will be the value of the key from incoming record and `Generate_ID` option is ignored. | _none_ |
+| `Write_Operation` | The write operation can be any of: `create`, `index`, `update`, `upsert`. | `create` |
+| `Replace_Dots` | When enabled, replace field name dots with underscore. Required by Elasticsearch 2.0-2.3. | `Off` |
+| `Trace_Output` | Print all Elasticsearch API request payloads to `stdout` for diagnostics. | `Off` |
+| `Trace_Error` | If Elasticsearch returns an error, print the Elasticsearch API request and response for diagnostics. | `Off` |
+| `Current_Time_Index` | Use current time for index generation instead of message record. | `Off` |
+| `Suppress_Type_Name` | When enabled, mapping types are removed and the `Type` option is ignored. Elasticsearch 8.0.0 and higher [no longer supports mapping types](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html), so this option must be set to `On` for those versions. | `Off` |
+| `Workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |
+
+If you have used a common relational database, the parameters `index` and `type` can
+be compared to the `database` and `table` concepts.
### TLS / SSL
-Elasticsearch output plugin supports TLS/SSL, for more details about the properties available and general configuration, please refer to the [TLS/SSL](../../administration/transport-security.md) section.
+The Elasticsearch output plugin supports TLS/SSL. For more details about the
+properties available and general configuration, refer to
+[TLS/SSL](../../administration/transport-security.md).
-### write\_operation
+### `write_operation`
-The write\_operation can be any of:
+The `write_operation` can be any of:
-| Operation | Description |
-| ------------- | ----------- |
-| create (default) | adds new data - if the data already exists (based on its id), the op is skipped.|
-| index | new data is added while existing data (based on its id) is replaced (reindexed).|
-| update | updates existing data (based on its id). If no data is found, the op is skipped.|
-| upsert | known as merge or insert if the data does not exist, updates if the data exists (based on its id).|
+| Operation | Description |
+| ----------- | ----------- |
+| `create` | Adds new data. If the data already exists (based on its id), the op is skipped.|
+| `index` | New data is added while existing data (based on its id) is replaced (reindexed).|
+| `update` | Updates existing data (based on its id). If no data is found, the op is skipped. |
+| `upsert` | Merge or insert if the data doesn't exist, updates if the data exists (based on its id).|
-**Please note, `Id_Key` or `Generate_ID` is required in update, and upsert scenario.**
+{% hint style="info" %}
-## Getting Started
+`Id_Key` or `Generate_ID` is required for `update` and `upsert`.
-In order to insert records into a Elasticsearch service, you can run the plugin from the command line or through the configuration file:
+{% endhint %}
+
+## Get started
+
+To insert records into an Elasticsearch service, you can run the plugin from the
+command line or through the configuration file:
### Command Line
-The **es** plugin, can read the parameters from the command line in two ways, through the **-p** argument \(property\) or setting them directly through the service URI. The URI format is the following:
+The **es** plugin can read the parameters from the command line in two ways:
+
+- Through the `-p` argument (property).
+- Setting them directly through the service URI.
+
+The URI format is the following:
```text
es://host:port/index/type
@@ -84,21 +97,21 @@ es://host:port/index/type
Using the format specified, you could start Fluent Bit through:
-```text
-$ fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type \
+```shell copy
+fluent-bit -i cpu -t cpu -o es://192.168.2.3:9200/my_index/my_type \
-o stdout -m '*'
```
-which is similar to do:
+This is similar to the following command:
-```text
-$ fluent-bit -i cpu -t cpu -o es -p Host=192.168.2.3 -p Port=9200 \
+```shell copy
+fluent-bit -i cpu -t cpu -o es -p Host=192.168.2.3 -p Port=9200 \
-p Index=my_index -p Type=my_type -o stdout -m '*'
```
### Configuration File
-In your main configuration file append the following _Input_ & _Output_ sections. You can visualize this configuration [here](https://link.calyptia.com/qhq)
+In your main configuration file, append the following `Input` and `Output` sections:
```python
[INPUT]
@@ -114,11 +127,13 @@ In your main configuration file append the following _Input_ & _Output_ sections
Type my_type
```
-![example configuration visualization from calyptia](../../.gitbook/assets/image%20%282%29.png)
+![example configuration visualization from Calyptia](../../.gitbook/assets/image%20%282%29.png)
## About Elasticsearch field names
-Some input plugins may generate messages where the field names contains dots, since Elasticsearch 2.0 this is not longer allowed, so the current **es** plugin replaces them with an underscore, e.g:
+Some input plugins can generate messages where the field names contain dots. Since
+Elasticsearch 2.0, this is no longer allowed, so the **es** plugin replaces them
+with an underscore:
```text
{"cpu0.p_cpu"=>17.000000}
@@ -130,58 +145,21 @@ becomes
{"cpu0_p_cpu"=>17.000000}
```
-## FAQ
-
-### Elasticsearch rejects requests saying "the final mapping would have more than 1 type"
-
-Since Elasticsearch 6.0, you cannot create multiple types in a single index. This means that you cannot set up your configuration as below anymore.
-
-```text
-[OUTPUT]
- Name es
- Match foo.*
- Index search
- Type type1
-
-[OUTPUT]
- Name es
- Match bar.*
- Index search
- Type type2
-```
-
-If you see an error message like below, you'll need to fix your configuration to use a single type on each index.
+## Use the Fluent Bit Elasticsearch plugin with other services
-> Rejecting mapping update to \[search\] as the final mapping would have more than 1 type
+Connect to Amazon OpenSearch or Elastic Cloud with the Elasticsearch plugin.
-For details, please read [the official blog post on that issue](https://www.elastic.co/guide/en/elasticsearch/reference/6.7/removal-of-types.html).
+### Amazon OpenSearch Service
-### Elasticsearch rejects requests saying "Document mapping type name can't start with '\_'"
+The Amazon OpenSearch Service adds an extra security layer where HTTP requests must
+be signed with AWS Sigv4. Fluent Bit v1.5 introduced full support for Amazon
+OpenSearch Service with IAM Authentication.
-Fluent Bit v1.5 changed the default mapping type from `flb_type` to `_doc`, which matches the recommendation from Elasticsearch from version 6.2 forwards \([see commit with rationale](https://github.com/fluent/fluent-bit/commit/04ed3d8104ca8a2f491453777ae6e38e5377817e#diff-c9ae115d3acaceac5efb949edbb21196)\). This doesn't work in Elasticsearch versions 5.6 through 6.1 \([see Elasticsearch discussion and fix](https://discuss.elastic.co/t/cant-use-doc-as-type-despite-it-being-declared-the-preferred-method/113837/9)\). Ensure you set an explicit map \(such as `doc` or `flb_type`\) in the configuration, as seen on the last line:
-
-```text
-[OUTPUT]
- Name es
- Match *
- Host vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
- Port 443
- Index my_index
- AWS_Auth On
- AWS_Region us-west-2
- tls On
- Type doc
-```
-
-### Fluent Bit + Amazon OpenSearch Service
-
-The Amazon OpenSearch Service adds an extra security layer where HTTP requests must be signed with AWS Sigv4. Fluent Bit v1.5 introduced full support for Amazon OpenSearch Service with IAM Authentication.
-
-See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b0edb2f9acd7cdfdbc3/administration/aws-credentials.md) for details on how AWS credentials are fetched.
+See [details](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b0edb2f9acd7cdfdbc3/administration/aws-credentials.md) on how AWS credentials are fetched.
Example configuration:
-```text
+```text copy
[OUTPUT]
Name es
Match *
@@ -194,16 +172,20 @@ Example configuration:
tls On
```
-Notice that the `Port` is set to `443`, `tls` is enabled, and `AWS_Region` is set.
+Be aware that the `Port` is set to `443`, `tls` is enabled, and `AWS_Region` is set.
-### Fluent Bit + Elastic Cloud
+### Use Fluent Bit with Elastic Cloud
-Fluent Bit supports connecting to [Elastic Cloud](https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html) providing just the `cloud_id` and the `cloud_auth` settings.
-`cloud_auth` uses the `elastic` user and password provided when the cluster was created, for details refer to the [Cloud ID usage page](https://www.elastic.co/guide/en/cloud/current/ec-cloud-id.html).
+Fluent Bit supports connecting to
+[Elastic Cloud](https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html)
+by providing the `cloud_id` and the `cloud_auth` settings. `cloud_auth` uses the
+`elastic` user and password provided when the cluster was created. For details refer
+to the
+[Cloud ID usage page](https://www.elastic.co/guide/en/cloud/current/ec-cloud-id.html).
Example configuration:
-```text
+```text copy
[OUTPUT]
Name es
Include_Tag_Key true
@@ -215,35 +197,99 @@ Example configuration:
cloud_auth elastic:2vxxxxxxxxYV
```
-### Validation Failed: 1: an id must be provided if version type or value are set
+In Elastic Cloud version 8 and greater, the type option must be removed by setting
+`Suppress_Type_Name On`.
+
+Without this, you will see errors like:
+
+```text
+{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]"}],"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]"},"status":400}
+```
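+
+As a sketch, the Elastic Cloud example above with the type suppressed (the Cloud ID
+and password are placeholders):
+
+```text
+[OUTPUT]
+    Name                es
+    Match               *
+    cloud_id            <your_cloud_id>
+    cloud_auth          elastic:<password>
+    tls                 On
+    # Required for Elastic Cloud / Elasticsearch 8 and greater
+    Suppress_Type_Name  On
+```
+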
-Since v1.8.2, Fluent Bit started using `create` method (instead of `index`) for data submission.
-This makes Fluent Bit compatible with Datastream introduced in Elasticsearch 7.9.
+## Troubleshooting
-If you see `action_request_validation_exception` errors on your pipeline with Fluent Bit >= v1.8.2, you can fix it up by turning on `Generate_ID` as follows:
+Use the following information to help resolve errors using the Elasticsearch plugin.
+
+### Using multiple types in a single index
+
+Since Elasticsearch 6.0, you can't create multiple types in a single index. An error
+message like the following indicates you need to update your configuration to use a
+single type on each index.
+
+```text
+Rejecting mapping update to [products] as the final mapping would have more than 1 type:
+```
+
+This means you can't set up your configuration like the following:
```text
[OUTPUT]
- Name es
- Match *
- Host 192.168.12.1
- Generate_ID on
+ Name es
+ Match foo.*
+ Index search
+ Type type1
+
+[OUTPUT]
+ Name es
+ Match bar.*
+ Index search
+ Type type2
```
-### Action/metadata contains an unknown parameter type
+For details, read [the official blog post on that issue](https://www.elastic.co/guide/en/elasticsearch/reference/6.7/removal-of-types.html).
-Elastic Cloud is now on version 8 so the type option must be removed by setting `Suppress_Type_Name On` as indicated above.
+### Mapping type names can't start with underscores (`_`)
-Without this you will see errors like:
+Fluent Bit v1.5 changed the default mapping type from `flb_type` to `_doc`, matching
+the recommendation from Elasticsearch for version 6.2 and greater
+([see commit with
+rationale](https://github.com/fluent/fluent-bit/commit/04ed3d8104ca8a2f491453777ae6e38e5377817e#diff-c9ae115d3acaceac5efb949edbb21196)).
+
+This doesn't work in Elasticsearch versions 5.6 through 6.1
+([discussion and fix](https://discuss.elastic.co/t/cant-use-doc-as-type-despite-it-being-declared-the-preferred-method/113837/9)).
+
+Ensure you set an explicit map such as `doc` or `flb_type` in the configuration,
+as seen on the last line:
+
+```text copy
+[OUTPUT]
+ Name es
+ Match *
+ Host vpc-test-domain-ke7thhzoo7jawsrhmm6mb7ite7y.us-west-2.es.amazonaws.com
+ Port 443
+ Index my_index
+ AWS_Auth On
+ AWS_Region us-west-2
+ tls On
+ Type doc
+```
+
+### Validation failures
+
+In Fluent Bit v1.8.2 and greater, Fluent Bit uses the `create` method (instead of
+`index`) for data submission. This makes Fluent Bit compatible with Datastream,
+introduced in Elasticsearch 7.9. You might see errors like:
```text
-{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]"}],"type":"illegal_argument_exception","reason":"Action/metadata line [1] contains an unknown parameter [_type]"},"status":400}
+Validation Failed: 1: an id must be provided if version type or value are set
+```
+
+If you see `action_request_validation_exception` errors on your pipeline with
+Fluent Bit v1.8.2 or greater, correct them by turning on `Generate_ID`
+as follows:
+
+```text copy
+[OUTPUT]
+ Name es
+ Match *
+ Host 192.168.12.1
+ Generate_ID on
```
-### Logstash_Prefix_Key
+### `Logstash_Prefix_Key`
The following snippet demonstrates using the namespace name as extracted by the
-`kubernetes` filter as logstash prefix:
+`kubernetes` filter as the Logstash prefix:
```text
[OUTPUT]
@@ -255,4 +301,5 @@ The following snippet demonstrates using the namespace name as extracted by the
# ...
```
-For records that do nor have the field `kubernetes.namespace_name`, the default prefix, `logstash` will be used.
+For records that don't have the field `kubernetes.namespace_name`, the default prefix
+`logstash` will be used.
diff --git a/pipeline/outputs/file.md b/pipeline/outputs/file.md
index 5dde1b862..475609aec 100644
--- a/pipeline/outputs/file.md
+++ b/pipeline/outputs/file.md
@@ -12,7 +12,7 @@ The plugin supports the following configuration parameters:
| File | Set file name to store the records. If not set, the file name will be the _tag_ associated with the records. |
| Format | The format of the file content. See also Format section. Default: out\_file. |
| Mkdir | Recursively create output directory if it does not exist. Permissions set to 0755. |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 1 |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
## Format
@@ -111,4 +111,3 @@ In your main configuration file append the following Input & Output sections:
Match *
Path output_dir
```
-
diff --git a/pipeline/outputs/firehose.md b/pipeline/outputs/firehose.md
index e896610c9..d4a8d831a 100644
--- a/pipeline/outputs/firehose.md
+++ b/pipeline/outputs/firehose.md
@@ -28,6 +28,7 @@ See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b
| auto\_retry\_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. This option defaults to `true`. |
| external\_id | Specify an external ID for the STS API, can be used with the role_arn parameter if your role requires an external ID. |
| profile | AWS profile name to use. Defaults to `default`. |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. Default: `1`. |
## Getting Started
@@ -132,4 +133,3 @@ aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
```
For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images).
-
diff --git a/pipeline/outputs/flowcounter.md b/pipeline/outputs/flowcounter.md
index 69bc75ebd..a6b12e462 100644
--- a/pipeline/outputs/flowcounter.md
+++ b/pipeline/outputs/flowcounter.md
@@ -9,6 +9,7 @@ The plugin supports the following configuration parameters:
| Key | Description | Default |
| :--- | :--- | :--- |
| Unit | The unit of duration. \(second/minute/hour/day\) | minute |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Getting Started
@@ -42,7 +43,7 @@ In your main configuration file append the following Input & Output sections:
Once Fluent Bit is running, you will see the reports in the output interface similar to this:
```bash
-$ fluent-bit -i cpu -o flowcounter
+$ fluent-bit -i cpu -o flowcounter
Fluent Bit v1.x.x
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
@@ -52,4 +53,3 @@ Fluent Bit v1.x.x
[2016/12/23 11:01:20] [ info] [engine] started
[out_flowcounter] cpu.0:[1482458540, {"counts":60, "bytes":7560, "counts/minute":1, "bytes/minute":126 }]
```
-
diff --git a/pipeline/outputs/forward.md b/pipeline/outputs/forward.md
index 4e6d297f3..df861c52a 100644
--- a/pipeline/outputs/forward.md
+++ b/pipeline/outputs/forward.md
@@ -22,8 +22,8 @@ The following parameters are mandatory for either Forward for Secure Forward mod
| Tag | Overwrite the tag as we transmit. This allows the receiving pipeline start fresh, or to attribute source. | |
| Send_options | Always send options (with "size"=count of messages) | False |
| Require_ack_response | Send "chunk"-option and wait for "ack" response from server. Enables at-least-once and receiving server can control rate of traffic. (Requires Fluentd v0.14.0+ server) | False |
-| Compress | Set to "gzip" to enable gzip compression. Incompatible with Time_as_Integer=True and tags set dynamically using the [Rewrite Tag](https://app.gitbook.com/s/-LKKSx-3LBTCtaHbg0gl-887967055/pipeline/filters/rewrite-tag.md) filter. (Requires Fluentd v0.14.7+ server) | |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
+| Compress | Set to 'gzip' to enable gzip compression. Incompatible with `Time_as_Integer=True` and tags set dynamically using the [Rewrite Tag](../filters/rewrite-tag.md) filter. Requires Fluentd server v0.14.7 or later. | _none_ |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |
## Secure Forward Mode Configuration Parameters
diff --git a/pipeline/outputs/gelf.md b/pipeline/outputs/gelf.md
index ee115ec10..0aad41bff 100644
--- a/pipeline/outputs/gelf.md
+++ b/pipeline/outputs/gelf.md
@@ -14,6 +14,7 @@ According to [GELF Payload Specification](https://go2docs.graylog.org/5-0/gettin
| Host | IP address or hostname of the target Graylog server | 127.0.0.1 |
| Port | The port that your Graylog GELF input is listening on | 12201 |
| Mode | The protocol to use (`tls`, `tcp` or `udp`) | udp |
+| Gelf\_Tag\_Key | Key to be used for tag. (_Optional in GELF_) | |
| Gelf_Short_Message_Key | A short descriptive message (**MUST be set in GELF**) | short_message |
| Gelf_Timestamp_Key | Your log timestamp (_SHOULD be set in GELF_) | timestamp |
| Gelf_Host_Key | Key which its value is used as the name of the host, source or application that sent this message. (**MUST be set in GELF**) | host |
@@ -21,6 +22,7 @@ According to [GELF Payload Specification](https://go2docs.graylog.org/5-0/gettin
| Gelf_Level_Key | Key to be used as the log level. Its value must be in [standard syslog levels](https://en.wikipedia.org/wiki/Syslog#Severity_level) (between 0 and 7). (_Optional in GELF_) | level |
| Packet_Size | If transport protocol is `udp`, you can set the size of packets to be sent. | 1420 |
| Compress | If transport protocol is `udp`, you can set this if you want your UDP packets to be compressed. | true |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
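+
+For example, a sketch that sets the optional tag key and enables two dedicated
+workers (the host and key name are placeholders):
+
+```text
+[OUTPUT]
+    Name          gelf
+    Match         *
+    Host          192.168.1.3
+    Port          12201
+    Mode          udp
+    # Optional: key to be used for the tag
+    Gelf_Tag_Key  source_tag
+    Workers       2
+```
+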
### TLS / SSL
diff --git a/pipeline/outputs/http.md b/pipeline/outputs/http.md
index 59ed7b5f5..bbbdd8e79 100644
--- a/pipeline/outputs/http.md
+++ b/pipeline/outputs/http.md
@@ -33,7 +33,7 @@ The **http** output plugin allows to flush your records into a HTTP endpoint. Fo
| gelf\_level\_key | Specify the key to use for the `level` in _gelf_ format | |
| body\_key | Specify the key to use as the body of the request (must prefix with "$"). The key must contain either a binary or raw string, and the content type can be specified using headers\_key (which must be passed whenever body\_key is present). When this option is present, each msgpack record will create a separate request. | |
| headers\_key | Specify the key to use as the headers of the request (must prefix with "$"). The key must contain a map, which will have the contents merged on the request headers. This can be used for many purposes, such as specifying the content-type of the data contained in body\_key. | |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |
### TLS / SSL
diff --git a/pipeline/outputs/influxdb.md b/pipeline/outputs/influxdb.md
index 53a8fe41b..2b59703f4 100644
--- a/pipeline/outputs/influxdb.md
+++ b/pipeline/outputs/influxdb.md
@@ -19,6 +19,7 @@ The **influxdb** output plugin, allows to flush your records into a [InfluxDB](h
| Tag\_Keys | Space separated list of keys that needs to be tagged | |
| Auto\_Tags | Automatically tag keys where value is _string_. This option takes a boolean value: True/False, On/Off. | Off |
| Uri | Custom URI endpoint | |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### TLS / SSL
@@ -207,4 +208,3 @@ key value
method "MATCH"
method "POST"
```
-
diff --git a/pipeline/outputs/kafka-rest-proxy.md b/pipeline/outputs/kafka-rest-proxy.md
index 399d57108..b03d49e9d 100644
--- a/pipeline/outputs/kafka-rest-proxy.md
+++ b/pipeline/outputs/kafka-rest-proxy.md
@@ -15,6 +15,7 @@ The **kafka-rest** output plugin, allows to flush your records into a [Kafka RES
| Time\_Key\_Format | Defines the format of the timestamp. | %Y-%m-%dT%H:%M:%S |
| Include\_Tag\_Key | Append the Tag name to the final record. | Off |
| Tag\_Key | If Include\_Tag\_Key is enabled, this property defines the key name for the tag. | \_flb-key |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### TLS / SSL
@@ -49,4 +50,3 @@ In your main configuration file append the following _Input_ & _Output_ sections
Topic fluent-bit
Message_Key my_key
```
-
diff --git a/pipeline/outputs/kafka.md b/pipeline/outputs/kafka.md
index cfe3e4f75..4599b62da 100644
--- a/pipeline/outputs/kafka.md
+++ b/pipeline/outputs/kafka.md
@@ -6,7 +6,7 @@ Kafka output plugin allows to ingest your records into an [Apache Kafka](https:/
| Key | Description | default |
| :--- | :--- | :--- |
-| format | Specify data format, options available: json, msgpack. | json |
+| format | Specify data format, options available: json, msgpack, raw. | json |
| message\_key | Optional key to store the message | |
| message\_key\_field | If set, the value of Message\_Key\_Field in the record will indicate the message key. If not set nor found in the record, Message\_Key will be used \(if set\). | |
| timestamp\_key | Set the key to store the record timestamp | @timestamp |
@@ -17,6 +17,8 @@ Kafka output plugin allows to ingest your records into an [Apache Kafka](https:/
| dynamic\_topic | adds unknown topics \(found in Topic\_Key\) to Topics. So in Topics only a default topic needs to be configured | Off |
| queue\_full\_retries | Fluent Bit queues data into rdkafka library, if for some reason the underlying library cannot flush the records the queue might fills up blocking new addition of records. The `queue_full_retries` option set the number of local retries to enqueue the data. The default value is 10 times, the interval between each retry is 1 second. Setting the `queue_full_retries` value to `0` set's an unlimited number of retries. | 10 |
| rdkafka.{property} | `{property}` can be any [librdkafka properties](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md) | |
+| raw\_log\_key | When using the raw format, if set, the value of raw\_log\_key in the record will be sent to Kafka as the payload. | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
> Setting `rdkafka.log.connection.close` to `false` and `rdkafka.request.required.acks` to 1 are examples of recommended settings of librdfkafka properties.
@@ -114,3 +116,28 @@ specific avro schema.
rdkafka.log_level 7
rdkafka.metadata.broker.list 192.168.1.3:9092
```
+
+#### Kafka Configuration File with Raw format
+
+This example Fluent Bit configuration file creates example records with the
+_payloadkey_ and _msgkey_ keys. The _msgkey_ value is used as the Kafka message
+key, and the _payloadkey_ value as the payload.
+
+
+```text
+[INPUT]
+ Name example
+ Tag example.data
+ Dummy {"payloadkey":"Data to send to kafka", "msgkey": "Key to use in the message"}
+
+
+[OUTPUT]
+ Name kafka
+ Match *
+ Brokers 192.168.1.3:9092
+ Topics test
+ Format raw
+
+ Raw_Log_Key payloadkey
+ Message_Key_Field msgkey
+```
diff --git a/pipeline/outputs/kinesis.md b/pipeline/outputs/kinesis.md
index b21766678..14c8d0aa7 100644
--- a/pipeline/outputs/kinesis.md
+++ b/pipeline/outputs/kinesis.md
@@ -29,6 +29,7 @@ See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b
| auto\_retry\_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. This option defaults to `true`. |
| external\_id | Specify an external ID for the STS API, can be used with the role_arn parameter if your role requires an external ID. |
| profile | AWS profile name to use. Defaults to `default`. |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. Default: `1`. |
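+
+For example, a sketch that enables two dedicated workers (the region and stream name
+are placeholders):
+
+```text
+[OUTPUT]
+    Name     kinesis_streams
+    Match    *
+    region   us-east-1
+    stream   my-stream
+    # Flush in two dedicated worker threads instead of the main thread
+    workers  2
+```
+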
## Getting Started
@@ -71,23 +72,6 @@ The following AWS IAM permissions are required to use this plugin:
}
```
-### Worker support
-
-Fluent Bit 1.7 adds a new feature called `workers` which enables outputs to have dedicated threads. This `kinesis_streams` plugin fully supports workers.
-
-Example:
-
-```text
-[OUTPUT]
- Name kinesis_streams
- Match *
- region us-east-1
- stream my-stream
- workers 2
-```
-
-If you enable a single worker, you are enabling a dedicated thread for your Kinesis output. We recommend starting with without workers, evaluating the performance, and then adding workers one at a time until you reach your desired/needed throughput. For most users, no workers or a single worker will be sufficient.
-
### AWS for Fluent Bit
Amazon distributes a container image with Fluent Bit and these plugins.
@@ -133,4 +117,3 @@ aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
```
For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images).
-
diff --git a/pipeline/outputs/logdna.md b/pipeline/outputs/logdna.md
index 3416dff2a..96026d7c7 100644
--- a/pipeline/outputs/logdna.md
+++ b/pipeline/outputs/logdna.md
@@ -78,6 +78,11 @@ Before to get started with the plugin configuration, make sure to obtain the pro
if not found, the default value is used.
Fluent Bit |
+
+ workers |
+ The number of workers to perform flush operations for this output. |
+ `0` |
+
@@ -150,4 +155,3 @@ Your record will be available and visible in your LogDNA dashboard after a few s
In your LogDNA dashboard, go to the top filters and mark the Tags `aa` and `bb`, then you will be able to see your records as the example below:
![](../../.gitbook/assets/logdna.png)
-
diff --git a/pipeline/outputs/loki.md b/pipeline/outputs/loki.md
index 09c40b9d9..ec68f1bb2 100644
--- a/pipeline/outputs/loki.md
+++ b/pipeline/outputs/loki.md
@@ -24,12 +24,14 @@ Be aware there is a separate Golang output plugin provided by [Grafana](https://
| labels | Stream labels for API request. It can be multiple comma separated of strings specifying `key=value` pairs. In addition to fixed parameters, it also allows to add custom record keys \(similar to `label_keys` property\). More details in the Labels section. | job=fluent-bit |
| label\_keys | Optional list of record keys that will be placed as stream labels. This configuration property is for records key only. More details in the Labels section. | |
| label\_map\_path | Specify the label map file path. The file defines how to extract labels from each record. More details in the Labels section. | |
+| structured\_metadata | Optional comma-separated list of `key=value` strings specifying structured metadata for the log line. Like the `labels` parameter, values can reference record keys using record accessors. See [Structured metadata](#structured-metadata) for more information. | |
| remove\_keys | Optional list of keys to remove. | |
-| drop\_single\_key | If set to true and after extracting labels only a single key remains, the log line sent to Loki will be the value of that key in line\_format. | off |
+| drop\_single\_key | If set to `true` and only a single key remains after extracting labels, the log line sent to Loki will be the value of that key, formatted according to `line_format`. If set to `raw` and the log line is a string, the log line will be sent unquoted. | off |
| line\_format | Format to use when flattening the record to a log line. Valid values are `json` or `key_value`. If set to `json`, the log line sent to Loki will be the Fluent Bit record dumped as JSON. If set to `key_value`, the log line will be each item in the record concatenated together \(separated by a single space\) in the format. | json |
| auto\_kubernetes\_labels | If set to true, it will add all Kubernetes labels to the Stream labels | off |
| tenant\_id\_key | Specify the name of the key from the original record that contains the Tenant ID. The value of the key is set as `X-Scope-OrgID` of HTTP header. It is useful to set Tenant ID dynamically. ||
| compress | Set payload compression mechanism. The only available option is gzip. Default = "", which means no compression. ||
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Labels
@@ -176,6 +178,88 @@ Based in the JSON example provided above, the internal stream labels will be:
job="fluentbit", team="Santiago Wanderers"
```
+## Drop Single Key
+
+If there is only one key remaining after removing keys, you can use the `drop_single_key` property to send its value to Loki, rather than a single key=value pair.
+
+Consider this simple JSON example:
+
+```json
+{"key":"value"}
+```
+
+If the value is a string, `line_format` is `json`, and `drop_single_key` is `true`, it will be sent as a quoted string.
+
+```python
+[OUTPUT]
+ name loki
+ match *
+ drop_single_key on
+ line_format json
+```
+
+The resulting line would show in Loki as:
+
+```json
+"value"
+```
+
+If `drop_single_key` is `raw`, or `line_format` is `key_value`, it will show in Loki as:
+
+```text
+value
+```
+
+If you want both structured JSON and plain-text logs in Loki, you should set `drop_single_key` to `raw` and `line_format` to `json`.
+Loki does not interpret a quoted string as valid JSON. To remove the quotes without setting `drop_single_key` to `raw`, you would need to use a query like this:
+
+```C
+{"job"="fluent-bit"} | regexp `^"?(?P<log>.*?)"?$` | line_format "{{.log}}"
+```
+
+If `drop_single_key` is `off`, it will show in Loki as:
+
+```json
+{"key":"value"}
+```
+
+With `drop_single_key` set to `off`, you can get the same behavior this flag provides by using this query:
+
+```C
+{"job"="fluent-bit"} | json | line_format "{{.log}}"
+```
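+
+As a reference for the recommendation above, a minimal sketch combining `drop_single_key raw` with `line_format json` could look like this (the Loki `host` and other connection settings are omitted):
+
+```python
+[OUTPUT]
+    name            loki
+    match           *
+    # send the remaining single value unquoted
+    drop_single_key raw
+    line_format     json
+```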
+
+### Structured metadata
+
+[Structured metadata](https://grafana.com/docs/loki/latest/get-started/labels/structured-metadata/)
+lets you attach custom fields to individual log lines without embedding the
+information in the content of the log line. This capability works well for high
+cardinality data that isn't suited for using labels. While not a label, the
+`structured_metadata` configuration parameter operates similarly to the `labels`
+parameter. Both parameters are comma-delimited `key=value` lists, and both can use
+record accessors to reference keys within the record being processed.
+
+The following configuration:
+
+- Defines fixed values for the cluster and region labels.
+- Uses the record accessor pattern to set the namespace label to the namespace name as
+ determined by the Kubernetes metadata filter (not shown).
+- Uses a structured metadata field to hold the Kubernetes pod name.
+
+```python
+[OUTPUT]
+ name loki
+ match *
+ labels cluster=my-k8s-cluster, region=us-east-1, namespace=$kubernetes['namespace_name']
+ structured_metadata pod=$kubernetes['pod_name']
+```
+
+Other common uses for structured metadata include trace and span IDs, process and thread IDs, and log levels.
+
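+For instance, assuming your records contain `trace_id` and `level` keys (hypothetical names used only for this sketch), you could forward them as structured metadata:
+
+```python
+[OUTPUT]
+    name                loki
+    match               *
+    labels              job=fluent-bit
+    # hypothetical record keys used for illustration
+    structured_metadata trace_id=$trace_id, level=$level
+```
+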
+Structured metadata is officially supported starting with Loki 3.0 and shouldn't be used
+with earlier Loki versions.
+
## Networking and TLS Configuration
This plugin inherit core Fluent Bit features to customize the network behavior and optionally enable TLS in the communication channel. For more details about the specific options available refer to the following articles:
@@ -252,4 +336,3 @@ Fluent Bit v1.7.0
[2020/10/14 20:57:46] [debug] [http] request payload (272 bytes)
[2020/10/14 20:57:46] [ info] [output:loki:loki.0] 127.0.0.1:3100, HTTP status=204
```
-
diff --git a/pipeline/outputs/nats.md b/pipeline/outputs/nats.md
index c2586e45a..10d17a004 100644
--- a/pipeline/outputs/nats.md
+++ b/pipeline/outputs/nats.md
@@ -2,12 +2,13 @@
The **nats** output plugin, allows to flush your records into a [NATS Server](https://docs.nats.io/nats-concepts/intro) end point. The following instructions assumes that you have a fully operational NATS Server in place.
-In order to flush records, the **nats** plugin requires to know two parameters:
+## Configuration parameters
| parameter | description | default |
| :--- | :--- | :--- |
| host | IP address or hostname of the NATS Server | 127.0.0.1 |
| port | TCP port of the target NATS Server | 4222 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
In order to override the default configuration values, the plugin uses the optional Fluent Bit network address format, e.g:
@@ -64,4 +65,3 @@ Each record is an individual entity represented in a JSON array that contains a
[1457108506,{"tag":"fluentbit","cpu_p":6.500000,"user_p":4.500000,"system_p":2}]
]
```
-
diff --git a/pipeline/outputs/new-relic.md b/pipeline/outputs/new-relic.md
index 29219f6c8..074acce00 100644
--- a/pipeline/outputs/new-relic.md
+++ b/pipeline/outputs/new-relic.md
@@ -72,6 +72,7 @@ Before to get started with the plugin configuration, make sure to obtain the pro
| compress | Set the compression mechanism for the payload. This option allows two values: `gzip` \(enabled by default\) or `false` to disable compression. | gzip |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
| :--- | :--- | :--- |
@@ -117,4 +118,3 @@ Fluent Bit v1.5.0
[2020/04/10 10:58:35] [ info] [output:nrlogs:nrlogs.0] log-api.newrelic.com:443, HTTP status=202
{"requestId":"feb312fe-004e-b000-0000-0171650764ac"}
```
-
diff --git a/pipeline/outputs/observe.md b/pipeline/outputs/observe.md
index 2e722422e..47be2503f 100644
--- a/pipeline/outputs/observe.md
+++ b/pipeline/outputs/observe.md
@@ -2,7 +2,7 @@
Observe employs the **http** output plugin, allowing you to flush your records [into Observe](https://docs.observeinc.com/en/latest/content/data-ingestion/forwarders/fluentbit.html).
-For now the functionality is pretty basic and it issues a POST request with the data records in [MessagePack](http://msgpack.org) (or JSON) format.
+The functionality is currently basic: it issues a POST request with the data records in [MessagePack](http://msgpack.org) (or JSON) format.
The following are the specific HTTP parameters to employ:
@@ -19,6 +19,7 @@ The following are the specific HTTP parameters to employ:
| header | The specific header to instructs Observe how to decode incoming payloads | X-Observe-Decoder fluent |
| compress | Set payload compression mechanism. Option available is 'gzip' | gzip |
| tls.ca_file | **For use with Windows**: provide path to root cert | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### Configuration File
@@ -41,5 +42,5 @@ In your main configuration file, append the following _Input_ & _Output_ section
# For Windows: provide path to root cert
#tls.ca_file C:\fluent-bit\isrgrootx1.pem
-
+
```
diff --git a/pipeline/outputs/oci-logging-analytics.md b/pipeline/outputs/oci-logging-analytics.md
index 54abb039a..4f8246ceb 100644
--- a/pipeline/outputs/oci-logging-analytics.md
+++ b/pipeline/outputs/oci-logging-analytics.md
@@ -20,6 +20,7 @@ Following are the top level configuration properties of the plugin:
| profile_name | OCI Config Profile Name to be used from the configuration file | DEFAULT |
| namespace | OCI Tenancy Namespace in which the collected log data is to be uploaded | |
| proxy | define proxy if required, in http://host:port format, supports only http protocol | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
The following parameters are to set the Logging Analytics resources that must be used to process your logs by OCI Logging Analytics.
@@ -28,7 +29,7 @@ The following parameters are to set the Logging Analytics resources that must be
| oci_config_in_record | If set to true, the following oci_la_* will be read from the record itself instead of the output plugin configuration. | false |
| oci_la_log_group_id | The OCID of the Logging Analytics Log Group where the logs must be stored. This is a mandatory parameter | |
| oci_la_log_source_name | The Logging Analytics Source that must be used to process the log records. This is a mandatory parameter | |
-| oci_la_entity_id | The OCID of the Logging Analytics Entity | |
+| oci_la_entity_id | The OCID of the Logging Analytics Entity | |
| oci_la_entity_type | The entity type of the Logging Analytics Entity | |
| oci_la_log_path | Specify the original location of the log files | |
| oci_la_global_metadata | Use this parameter to specify additional global metadata along with original log content to Logging Analytics. The format is 'key_name value'. This option can be set multiple times | |
@@ -86,11 +87,13 @@ In case of multiple inputs, where oci_la_* properties can differ, you can add th
[INPUT]
Name dummy
Tag dummy
+
[Filter]
Name modify
Match *
Add oci_la_log_source_name
Add oci_la_log_group_id
+
[Output]
Name oracle_log_analytics
Match *
@@ -109,6 +112,7 @@ You can attach certain metadata to the log events collected from various inputs.
[INPUT]
Name dummy
Tag dummy
+
[Output]
Name oracle_log_analytics
Match *
@@ -138,12 +142,12 @@ The above configuration will generate a payload that looks like this
"metadata": {
"key1": "value1",
"key2": "value2"
- },
- "logSourceName": "example_log_source",
- "logRecords": [
- "dummy"
- ]
- }
+ },
+ "logSourceName": "example_log_source",
+ "logRecords": [
+ "dummy"
+ ]
+ }
]
}
```
@@ -156,11 +160,13 @@ With oci_config_in_record option set to true, the metadata key-value pairs will
[INPUT]
Name dummy
Tag dummy
+
[FILTER]
Name Modify
Match *
Add olgm.key1 val1
Add olgm.key2 val2
+
[FILTER]
Name nest
Match *
@@ -168,11 +174,13 @@ With oci_config_in_record option set to true, the metadata key-value pairs will
Wildcard olgm.*
Nest_under oci_la_global_metadata
Remove_prefix olgm.
+
[Filter]
Name modify
Match *
Add oci_la_log_source_name
Add oci_la_log_group_id
+
[Output]
Name oracle_log_analytics
Match *
@@ -184,4 +192,4 @@ With oci_config_in_record option set to true, the metadata key-value pairs will
tls.verify Off
```
-The above configuration first injects the necessary metadata keys and values in the record directly, with a prefix olgm. attached to the keys in order to segregate the metadata keys from rest of the record keys. Then, using a nest filter only the metadata keys are selected by the filter and nested under oci_la_global_metadata key in the record, and the prefix olgm. is removed from the metadata keys.
\ No newline at end of file
+The above configuration first injects the necessary metadata keys and values directly into the record, with the prefix `olgm.` attached to the keys to segregate the metadata keys from the rest of the record keys. Then, a nest filter selects only the metadata keys, nests them under the `oci_la_global_metadata` key in the record, and removes the `olgm.` prefix from the metadata keys.
diff --git a/pipeline/outputs/openobserve.md b/pipeline/outputs/openobserve.md
new file mode 100644
index 000000000..94a10c91a
--- /dev/null
+++ b/pipeline/outputs/openobserve.md
@@ -0,0 +1,49 @@
+---
+title: OpenObserve
+description: Send logs to OpenObserve using Fluent Bit
+---
+
+# OpenObserve
+
+Use the OpenObserve output plugin to ingest logs into [OpenObserve](https://openobserve.ai/).
+
+Before you begin, you need an [OpenObserve account](https://cloud.openobserve.ai/), an
+`HTTP_User`, and an `HTTP_Passwd`. You can find these fields under **Ingestion** in
+OpenObserve Cloud. Alternatively, you can achieve this with various installation
+types as mentioned in the
+[OpenObserve documentation](https://openobserve.ai/docs/quickstart/).
+
+## Configuration Parameters
+
+| Key | Description | Default |
+| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------- |
+| Host | Required. The OpenObserve server where you are sending logs. | `localhost` |
+| TLS | Required: Enable end-to-end security using TLS. Set to `on` to enable TLS communication with OpenObserve. | `on` |
+| compress | Recommended: Compresses the payload in GZIP format. OpenObserve supports and recommends setting this to `gzip` for optimized log ingestion. | _none_ |
+| HTTP_User | Required: Username for HTTP authentication. | _none_ |
+| HTTP_Passwd | Required: Password for HTTP authentication. | _none_ |
+| URI | Required: The API path used to send logs. | `/api/default/default/_json` |
+| Format | Required: The format of the log payload. OpenObserve expects JSON. | `json` |
+| json_date_key | Optional: The JSON key used for timestamps in the logs. | `timestamp` |
+| json_date_format | Optional: The format of the date in logs. OpenObserve supports ISO 8601. | `iso8601` |
+| include_tag_key | If `true`, a tag is appended to the output. The key name is specified by the `tag_key` property. | `false` |
+
+### Configuration File
+
+Use this configuration file to get started:
+
+```
+[OUTPUT]
+ Name http
+ Match *
+ URI /api/default/default/_json
+ Host localhost
+ Port 5080
+ tls on
+ Format json
+ Json_date_key timestamp
+ Json_date_format iso8601
+ HTTP_User
+ HTTP_Passwd
+ compress gzip
+```
\ No newline at end of file
diff --git a/pipeline/outputs/opensearch.md b/pipeline/outputs/opensearch.md
index e238486e0..0b0142d3d 100644
--- a/pipeline/outputs/opensearch.md
+++ b/pipeline/outputs/opensearch.md
@@ -45,7 +45,7 @@ The following instructions assumes that you have a fully operational OpenSearch
| Trace\_Error | When enabled print the OpenSearch API calls to stdout when OpenSearch returns an error \(for diag only\) | Off |
| Current\_Time\_Index | Use current time for index generation instead of message record | Off |
| Suppress\_Type\_Name | When enabled, mapping types is removed and `Type` option is ignored. | Off |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
| Compress | Set payload compression mechanism. The only available option is `gzip`. Default = "", which means no compression. | |
> The parameters _index_ and _type_ can be confusing if you are new to OpenSearch, if you have used a common relational database before, they can be compared to the _database_ and _table_ concepts. Also see [the FAQ below](opensearch.md#faq)
@@ -199,7 +199,7 @@ With data access permissions, IAM policies are not needed to access the collecti
### Issues with the OpenSearch cluster
-Occasionally the Fluent Bit service may generate errors without any additional detail in the logs to explain the source of the issue, even with the service's log_level attribute set to [Debug](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file).
+Occasionally the Fluent Bit service may generate errors without any additional detail in the logs to explain the source of the issue, even with the service's log_level attribute set to [Debug](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file).
For example, in this scenario the logs show that a connection was successfully established with the OpenSearch domain, and yet an error is still returned:
```
@@ -218,9 +218,9 @@ This behavior could be indicative of a hard-to-detect issue with index shard usa
While OpenSearch index shards and disk space are related, they are not directly tied to one another.
-OpenSearch domains are limited to 1000 index shards per data node, regardless of the size of the nodes. And, importantly, shard usage is not proportional to disk usage: an individual index shard can hold anywhere from a few kilobytes to dozens of gigabytes of data.
+OpenSearch domains are limited to 1000 index shards per data node, regardless of the size of the nodes. And, importantly, shard usage is not proportional to disk usage: an individual index shard can hold anywhere from a few kilobytes to dozens of gigabytes of data.
-In other words, depending on the way index creation and shard allocation are configured in the OpenSearch domain, all of the available index shards could be used long before the data nodes run out of disk space and begin exhibiting disk-related performance issues (e.g. nodes crashing, data corruption, or the dashboard going offline).
+In other words, depending on the way index creation and shard allocation are configured in the OpenSearch domain, all of the available index shards could be used long before the data nodes run out of disk space and begin exhibiting disk-related performance issues (e.g. nodes crashing, data corruption, or the dashboard going offline).
The primary issue that arises when a domain is out of available index shards is that new indexes can no longer be created (though logs can still be added to existing indexes).
@@ -231,7 +231,7 @@ When that happens, the Fluent Bit OpenSearch output may begin showing confusing
If any of those symptoms are present, consider using the OpenSearch domain's API endpoints to troubleshoot possible shard issues.
-Running this command will show both the shard count and disk usage on all of the nodes in the domain.
+Running this command will show both the shard count and disk usage on all of the nodes in the domain.
```
GET _cat/allocation?v
```
diff --git a/pipeline/outputs/opentelemetry.md b/pipeline/outputs/opentelemetry.md
index a70d84396..c41fe95dd 100644
--- a/pipeline/outputs/opentelemetry.md
+++ b/pipeline/outputs/opentelemetry.md
@@ -35,6 +35,7 @@ Important Note: At the moment only HTTP endpoints are supported.
| logs_span_id_metadata_key |Specify a SpanId key to look up in the metadata.| $SpanId |
| logs_trace_id_metadata_key |Specify a TraceId key to look up in the metadata.| $TraceId |
| logs_attributes_metadata_key |Specify an Attributes key to look up in the metadata.| $Attributes |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Getting Started
diff --git a/pipeline/outputs/postgresql.md b/pipeline/outputs/postgresql.md
index 6bb581ed8..16eac7ffc 100644
--- a/pipeline/outputs/postgresql.md
+++ b/pipeline/outputs/postgresql.md
@@ -62,6 +62,7 @@ Make sure that the `fluentbit` user can connect to the `fluentbit` database on t
| `min_pool_size` | Minimum number of connection in async mode | 1 |
| `max_pool_size` | Maximum amount of connections in async mode | 4 |
| `cockroachdb` | Set to `true` if you will connect the plugin with a CockroachDB | false |
+| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### Libpq
@@ -129,4 +130,3 @@ Here follows a list of useful resources from the PostgreSQL documentation:
* [libpq - Environment variables](https://www.postgresql.org/docs/current/libpq-envars.html)
* [libpq - password file](https://www.postgresql.org/docs/current/libpq-pgpass.html)
* [Trigger functions](https://www.postgresql.org/docs/current/plpgsql-trigger.html)
-
diff --git a/pipeline/outputs/prometheus-exporter.md b/pipeline/outputs/prometheus-exporter.md
index 7db7c6d2d..feac59d76 100644
--- a/pipeline/outputs/prometheus-exporter.md
+++ b/pipeline/outputs/prometheus-exporter.md
@@ -4,7 +4,7 @@ description: An output plugin to expose Prometheus Metrics
# Prometheus Exporter
-The prometheus exporter allows you to take metrics from Fluent Bit and expose them such that a Prometheus instance can scrape them.
+The prometheus exporter allows you to take metrics from Fluent Bit and expose them such that a Prometheus instance can scrape them.
Important Note: The prometheus exporter only works with metric plugins, such as Node Exporter Metrics
@@ -13,6 +13,7 @@ Important Note: The prometheus exporter only works with metric plugins, such as
| host | This is address Fluent Bit will bind to when hosting prometheus metrics. Note: `listen` parameter is deprecated from v1.9.0. | 0.0.0.0 |
| port | This is the port Fluent Bit will bind to when hosting prometheus metrics | 2021 |
| add\_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
## Getting Started
diff --git a/pipeline/outputs/prometheus-remote-write.md b/pipeline/outputs/prometheus-remote-write.md
index 0d430457e..b866f7193 100644
--- a/pipeline/outputs/prometheus-remote-write.md
+++ b/pipeline/outputs/prometheus-remote-write.md
@@ -25,7 +25,7 @@ Important Note: The prometheus exporter only works with metric plugins, such as
| header | Add a HTTP header key/value pair. Multiple headers can be set. | |
| log_response_payload | Log the response payload within the Fluent Bit log | false |
| add_label | This allows you to add custom labels to all metrics exposed through the prometheus exporter. You may have multiple of these fields | |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |
## Getting Started
@@ -93,7 +93,7 @@ With Logz.io [hosted prometheus](https://logz.io/solutions/infrastructure-monito
[OUTPUT]
name prometheus_remote_write
host listener.logz.io
- port 8053
+ port 8053
match *
header Authorization Bearer
tls on
@@ -109,7 +109,7 @@ With [Coralogix Metrics](https://coralogix.com/platform/metrics/) you may need t
[OUTPUT]
name prometheus_remote_write
host metrics-api.coralogix.com
- uri prometheus/api/v1/write?appLabelName=path&subSystemLabelName=path&severityLabelName=severity
+ uri prometheus/api/v1/write?appLabelName=path&subSystemLabelName=path&severityLabelName=severity
match *
port 443
tls on
@@ -133,3 +133,25 @@ With [Levitate](https://last9.io/levitate-tsdb), you must use the Levitate clust
http_user
http_passwd
```
+
+### Add Prometheus-like Labels
+
+Ordinary Prometheus clients automatically add labels such as `instance` and `job`. You can emulate them with the `add_label` setting, as in the following example:
+
+```
+[OUTPUT]
+ Name prometheus_remote_write
+ Match your.metric
+ Host xxxxxxx.yyyyy.zzzz
+ Port 443
+ Uri /api/v1/write
+ Header Authorization Bearer YOUR_LICENSE_KEY
+ Log_response_payload True
+ Tls On
+ Tls.verify On
+ # add user-defined labels
+ add_label instance ${HOSTNAME}
+ add_label job fluent-bit
+```
+
+The `instance` label can be emulated with `add_label instance ${HOSTNAME}`. Other labels can be added with additional `add_label` entries.
diff --git a/pipeline/outputs/s3.md b/pipeline/outputs/s3.md
index 469123d87..28f020fba 100644
--- a/pipeline/outputs/s3.md
+++ b/pipeline/outputs/s3.md
@@ -1,105 +1,157 @@
---
-description: Send logs, data, metrics to Amazon S3
+description: Send logs, data, and metrics to Amazon S3
---
# Amazon S3
-![](<../../.gitbook/assets/image (9).png>)
+![AWS logo](<../../.gitbook/assets/image (9).png>)
-The Amazon S3 output plugin allows you to ingest your records into the [S3](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html) cloud object store.
+The Amazon S3 output plugin lets you ingest records into the
+[S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html)
+cloud object store.
-The plugin can upload data to S3 using the [multipart upload API](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html) or using S3 [PutObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API\_PutObject.html). Multipart is the default and is recommended; Fluent Bit will stream data in a series of 'parts'. This limits the amount of data it has to buffer on disk at any point in time. By default, every time 5 MiB of data have been received, a new 'part' will be uploaded. The plugin can create files up to gigabytes in size from many small chunks/parts using the multipart API. All aspects of the upload process are configurable using the configuration options.
+The plugin can upload data to S3 using the
+[multipart upload API](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html)
+or [`PutObject`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html).
+Multipart is the default and is recommended. Fluent Bit will stream data in a series
+of _parts_. This limits the amount of data buffered on disk at any point in time.
+By default, every time 5 MiB of data have been received, a new part will be uploaded.
+The plugin can create files up to gigabytes in size from many small chunks or parts
+using the multipart API. All aspects of the upload process are configurable.
-The plugin allows you to specify a maximum file size, and a timeout for uploads. A file will be created in S3 when the max size is reached, or the timeout is reached- whichever comes first.
+The plugin lets you specify a maximum file size, and a timeout for uploads. A
+file will be created in S3 when the maximum size or the timeout is reached, whichever
+comes first.
Records are stored in files in S3 as newline delimited JSON.
-See [here](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b0edb2f9acd7cdfdbc3/administration/aws-credentials.md) for details on how AWS credentials are fetched.
+See [AWS
+Credentials](https://github.com/fluent/fluent-bit-docs/tree/43c4fe134611da471e706b0edb2f9acd7cdfdbc3/administration/aws-credentials.md)
+for details about fetching AWS credentials.
-**NOTE**: _The_ [_Prometheus success/retry/error metrics values_](administration/monitoring.md) _outputted by Fluent Bit's built-in http server are meaningless for the S3 output_. This is because S3 has its own buffering and retry mechanisms. The Fluent Bit AWS S3 maintainers apologize for this feature gap; you can [track our progress fixing it on GitHub](https://github.com/fluent/fluent-bit/issues/6141).
+{% hint style="info" %}
+The [Prometheus success/retry/error metrics values](administration/monitoring.md)
+output by the built-in http server in Fluent Bit are meaningless for S3 output. S3 has
+its own buffering and retry mechanisms. The Fluent Bit AWS S3 maintainers apologize
+for this feature gap; you can [track our progress fixing it on GitHub](https://github.com/fluent/fluent-bit/issues/6141).
+{% endhint %}
## Configuration Parameters
-| Key | Description | Default |
-|----------------------------------| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------- |
-| region | The AWS region of your S3 bucket | us-east-1 |
-| bucket | S3 Bucket name | None |
-| json\_date\_key | Specify the name of the time key in the output record. To disable the time key just set the value to `false`. | date |
-| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java\_sql\_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | iso8601 |
-| total\_file\_size | Specifies the size of files in S3. Minimum size is 1M. With `use_put_object On` the maximum size is 1G. With multipart upload mode, the maximum size is 50G. | 100M |
-| upload\_chunk\_size | The size of each 'part' for multipart uploads. Max: 50M | 5,242,880 bytes |
-| upload\_timeout | Whenever this amount of time has elapsed, Fluent Bit will complete an upload and create a new file in S3. For example, set this value to 60m and you will get a new file every hour. | 10m |
-| store\_dir | Directory to locally buffer data before sending. When multipart uploads are used, data will only be buffered until the `upload_chunk_size` is reached. S3 will also store metadata about in progress multipart uploads in this directory; this allows pending uploads to be completed even if Fluent Bit stops and restarts. It will also store the current $INDEX value if enabled in the S3 key format so that the $INDEX can keep incrementing from its previous value after Fluent Bit restarts. | /tmp/fluent-bit/s3 |
-| store\_dir\_limit\_size | The size of the limitation for disk usage in S3. Limit the amount of s3 buffers in the `store_dir` to limit disk usage. Note: Use `store_dir_limit_size` instead of `storage.total_limit_size` which can be used to other plugins, because S3 has its own buffering system. | 0, which means unlimited |
-| s3\_key\_format | Format string for keys in S3. This option supports a UUID, strftime time formatters, a syntax for selecting parts of the Fluent log tag using a syntax inspired by the rewrite\_tag filter. Add $UUID in the format string to insert a random string. Add $INDEX in the format string to insert an integer that increments each upload. The $INDEX value will be saved in the store\_dir so that if Fluent Bit restarts the value will keep incrementing from the previous run. Add $TAG in the format string to insert the full log tag; add $TAG\[0] to insert the first part of the tag in the s3 key. The tag is split into “parts” using the characters specified with the `s3_key_format_tag_delimiters` option. Add extension directly after the last piece of the format string to insert a key suffix. If you want to specify a key suffix and you are in `use_put_object` mode, you must specify $UUID as well. More explanations can be found in the S3 Key Format explainer section further down in this document. See the in depth examples and tutorial in the documentation. Time in s3\_key is the timestamp of the first record in the S3 file. | /fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S |
-| s3\_key\_format\_tag\_delimiters | A series of characters which will be used to split the tag into 'parts' for use with the s3\_key\_format option. See the in depth examples and tutorial in the documentation. | . |
-| static\_file\_path | Disables behavior where UUID string is automatically appended to end of S3 key name when $UUID is not provided in s3\_key\_format. $UUID, time formatters, $TAG, and other dynamic key formatters all work as expected while this feature is set to true. | false |
-| use\_put\_object | Use the S3 PutObject API, instead of the multipart upload API. When this option is on, key extension is only available when $UUID is specified in `s3_key_format`. If $UUID is not included, a random string will be appended at the end of the format string and the key extension cannot be customized in this case. | false |
-| role\_arn | ARN of an IAM role to assume (ex. for cross account access). | None |
-| endpoint | Custom endpoint for the S3 API. An endpoint can contain scheme and port. | None |
-| sts\_endpoint | Custom endpoint for the STS API. | None |
-| profile | Option to specify an AWS Profile for credentials. | default |
-| canned\_acl | [Predefined Canned ACL policy](https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl) for S3 objects. | None |
-| compression | Compression type for S3 objects. 'gzip' is currently the only supported value by default. If Apache Arrow support was enabled at compile time, you can also use 'arrow'. For gzip compression, the Content-Encoding HTTP Header will be set to 'gzip'. Gzip compression can be enabled when `use_put_object` is 'on' or 'off' (PutObject and Multipart). Arrow compression can only be enabled with `use_put_object On`. | None |
-| content\_type | A standard MIME type for the S3 object; this will be set as the Content-Type HTTP header. | None |
-| send\_content\_md5 | Send the Content-MD5 header with PutObject and UploadPart requests, as is required when Object Lock is enabled. | false |
-| auto\_retry\_requests | Immediately retry failed requests to AWS services once. This option does not affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput when there are transient/random networking issues. | true |
-| log\_key | By default, the whole log record will be sent to S3. If you specify a key name with this option, then only the value of that key will be sent to S3. For example, if you are using Docker, you can specify log\_key log and only the log message will be sent to S3. | None |
-| preserve\_data\_ordering | Normally, when an upload request fails, there is a high chance for the last received chunk to be swapped with a later chunk, resulting in data shuffling. This feature prevents this shuffling by using a queue logic for uploads. | true |
-| storage\_class | Specify the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/API/API\_PutObject.html#AmazonS3-PutObject-request-header-StorageClass) for S3 objects. If this option is not specified, objects will be stored with the default 'STANDARD' storage class. | None |
-| retry\_limit | Integer value to set the maximum number of retries allowed. Note: this configuration is released since version 1.9.10 and 2.0.1. For previous version, the number of retries is 5 and is not configurable. | 1 |
-| external\_id | Specify an external ID for the STS API, can be used with the role\_arn parameter if your role requires an external ID. | None |
+| Key | Description | Default |
+|--------------------| --------------------------------- | ----------- |
+| `region` | The AWS region of your S3 bucket. | `us-east-1` |
+| `bucket` | S3 Bucket name | _none_ |
+| `json_date_key` | Specify the time key name in the output record. To disable the time key, set the value to `false`. | `date` |
+| `json_date_format` | Specify the format of the date. Accepted values: `double`, `epoch`, `iso8601` (2018-05-30T09:39:52.000681Z), `java_sql_timestamp` (2018-05-30 09:39:52.000681). | `iso8601` |
+| `total_file_size` | Specify file size in S3. Minimum size is `1M`. With `use_put_object On` the maximum size is `1G`. With multipart uploads, the maximum size is `50G`. | `100M` |
+| `upload_chunk_size` | The size of each part for multipart uploads. Max: 50M | 5,242,880 bytes |
+| `upload_timeout` | When this amount of time elapses, Fluent Bit uploads and creates a new file in S3. Set to `60m` to upload a new file every hour. | `10m`|
+| `store_dir` | Directory to locally buffer data before sending. When using multipart uploads, data buffers until reaching the `upload_chunk_size`. S3 stores metadata about in progress multipart uploads in this directory, allowing pending uploads to be completed if Fluent Bit stops and restarts. It stores the current `$INDEX` value if enabled in the S3 key format so the `$INDEX` keeps incrementing from its previous value after Fluent Bit restarts. | `/tmp/fluent-bit/s3` |
+| `store_dir_limit_size` | Size limit for disk usage in S3. Limit the S3 buffers in the `store_dir` to limit disk usage. Use `store_dir_limit_size` instead of `storage.total_limit_size`, which can be used for other plugins. | `0` (unlimited) |
+| `s3_key_format` | Format string for keys in S3. This option supports a UUID, strftime time formatters, and a syntax for selecting parts of the Fluent log tag, inspired by the `rewrite_tag` filter. Add `$UUID` in the format string to insert a random string. Add `$INDEX` in the format string to insert an integer that increments each upload. The `$INDEX` value is saved in the `store_dir`. Add `$TAG` in the format string to insert the full log tag. Add `$TAG[0]` to insert the first part of the tag in the S3 key. The tag is split into parts using the characters specified with the `s3_key_format_tag_delimiters` option. Add the extension directly after the last piece of the format string to insert a key suffix. To specify a key suffix in `use_put_object` mode, you must specify `$UUID`. See [S3 Key Format](#allowing-a-file-extension-in-the-s3-key-format-with-usduuid). Time in `s3_key` is the timestamp of the first record in the S3 file. | `/fluent-bit-logs/$TAG/%Y/%m/%d/%H/%M/%S` |
+| `s3_key_format_tag_delimiters` | A series of characters used to split the tag into parts for use with the `s3_key_format` option. | `.` |
+| `static_file_path` | Disables the behavior where a UUID string is appended to the end of the S3 key name when `$UUID` is not provided in `s3_key_format`. `$UUID`, time formatters, `$TAG`, and other dynamic key formatters all work as expected while this feature is set to `true`. | `false` |
+| `use_put_object` | Use the S3 `PutObject` API instead of the multipart upload API. When enabled, the key extension is only available when `$UUID` is specified in `s3_key_format`. If `$UUID` isn't included, a random string is appended to the end of the format string and the key extension can't be customized. | `false` |
+| `role_arn` | ARN of an IAM role to assume (for example, for cross-account access). | _none_ |
+| `endpoint` | Custom endpoint for the S3 API. Endpoints can contain scheme and port. | _none_ |
+| `sts_endpoint` | Custom endpoint for the STS API. | _none_ |
+| `profile` | Option to specify an AWS Profile for credentials. | `default` |
+| `canned_acl` | [Predefined Canned ACL policy](https://docs.aws.amazon.com/AmazonS3/latest/dev/acl-overview.html#canned-acl) for S3 objects. | _none_ |
+| `compression` | Compression type for S3 objects. `gzip` is currently the only supported value by default. If Apache Arrow support was enabled at compile time, you can use `arrow`. For gzip compression, the Content-Encoding HTTP Header will be set to `gzip`. Gzip compression can be enabled when `use_put_object` is `on` or `off` (`PutObject` and Multipart). Arrow compression can only be enabled with `use_put_object On`. | _none_ |
+| `content_type` | A standard MIME type for the S3 object, set as the Content-Type HTTP header. | _none_ |
+| `send_content_md5` | Send the Content-MD5 header with `PutObject` and UploadPart requests, as is required when Object Lock is enabled. | `false` |
+| `auto_retry_requests` | Immediately retry failed requests to AWS services once. This option doesn't affect the normal Fluent Bit retry mechanism with backoff. Instead, it enables an immediate retry with no delay for networking errors, which may help improve throughput during transient network issues. | `true` |
+| `log_key` | By default, the whole log record will be sent to S3. When specifying a key name with this option, only the value of that key is sent to S3. For example, when using Docker you can specify `log_key log` and only the log message is sent to S3. | _none_ |
+| `preserve_data_ordering` | When an upload request fails, the last received chunk might be swapped with a later chunk, resulting in data shuffling. This feature prevents shuffling by using a queue logic for uploads. | `true` |
+| `storage_class` | Specify the [storage class](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html#AmazonS3-PutObject-request-header-StorageClass) for S3 objects. If this option isn't specified, objects are stored with the default `STANDARD` storage class. | _none_ |
+| `retry_limit` | Integer value to set the maximum number of retries allowed. Requires versions 1.9.10 and 2.0.1 or later. For previous versions, the number of retries is `5` and isn't configurable. | `1` |
+| `external_id` | Specify an external ID for the STS API. Can be used with the `role_arn` parameter if your role requires an external ID. | _none_ |
+| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
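+
+As a quick reference, a minimal sketch using a few of the parameters above might look like the following; the bucket name and region are placeholders, and the size and timeout values shown are the defaults:
+
+```python
+[OUTPUT]
+    Name            s3
+    Match           *
+    # placeholders: use your own bucket and region
+    bucket          my-bucket
+    region          us-west-2
+    # defaults shown explicitly
+    total_file_size 100M
+    upload_timeout  10m
+```
+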
## TLS / SSL
-To skip TLS verification, set `tls.verify` as `false`. For more details about the properties available and general configuration, please refer to the [TLS/SSL](../../administration/transport-security.md) section.
+To skip TLS verification, set `tls.verify` as `false`. For more details about the
+properties available and general configuration, refer to
+[TLS/SSL](../../administration/transport-security.md).
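+
+For example, a sketch that points this output at an S3-compatible endpoint and skips certificate verification; the endpoint is a placeholder, and skipping verification is only advisable for testing:
+
+```python
+[OUTPUT]
+    Name       s3
+    Match      *
+    bucket     my-bucket
+    region     us-east-1
+    # placeholder S3-compatible endpoint
+    endpoint   https://s3.example.internal:9000
+    tls        On
+    # testing only: do not skip verification in production
+    tls.verify Off
+```
+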
## Permissions
The plugin requires the following AWS IAM permissions:
-```
+```text
{
- "Version": "2012-10-17",
- "Statement": [{
- "Effect": "Allow",
- "Action": [
- "s3:PutObject"
- ],
- "Resource": "*"
- }]
+ "Version": "2012-10-17",
+ "Statement": [{
+ "Effect": "Allow",
+ "Action": [
+ "s3:PutObject"
+ ],
+ "Resource": "*"
+ }]
}
```
## Differences between S3 and other Fluent Bit outputs
-The s3 output plugin is special because its use case is to upload files of non-trivial size to an Amazon S3 bucket. This is in contrast to most other outputs which send many requests to upload data in batches of a few Megabytes or less.
-
-When Fluent Bit recieves logs, it stores them in chunks, either in memory or the filesystem depending on your settings. A chunk is usually around 2 MB in size. Fluent Bit sends the chunks in order to each output that matches their tag. Most outputs then send the chunk immediately to their destination. A chunk is sent to the output's "flush callback function", which must return one of `FLB_OK`, `FLB_RETRY`, or `FLB_ERROR`. Fluent Bit keeps count of the return values from each outputs "flush callback function"; these counters are the data source for Fluent Bit's error, retry, and success metrics available in prometheus format via its monitoring interface.
-
-The S3 output plugin is a Fluent Bit output plugin and thus it conforms to the Fluent Bit output plugin specification. However, since the S3 use case is to upload large files, generally much larger than 2 MB, its behavior is different. The S3 "flush callback function" simply buffers the incoming chunk to the filesystem, and returns an `FLB_OK`. _Consequently, the prometheus metrics available via the Fluent Bit http server are meaningless for S3._ In addition, the `storage.total_limit_size` parameter is not meaningful for S3 since it has its own buffering system in the `store_dir`. Instead, use `store_dir_limit_size`. Finally, *S3 always requires a write-able filesystem*; running Fluent Bit on a read-only filesystem will not work with the S3 output.
-
-S3 uploads are primarily initiated via the S3 "timer callback function", which runs separately from its "flush callback function". Because S3 has its own system of buffering and its own callback to upload data, the normal sequential data ordering of chunks provided by the Fluent Bit engine may be compromised. Consequently, S3 has the `presevere_data_ordering` option which will ensure data is uploaded in the original order it was collected by Fluent Bit.
+The S3 output plugin is used to upload large files to an Amazon S3 bucket, while
+most other outputs send many requests to upload data in batches of a few
+megabytes or less.
+
+When Fluent Bit receives logs, it stores them in chunks, either in memory or the
+filesystem depending on your settings. Chunks are usually around 2 MB in size.
+Fluent Bit sends chunks, in order, to each output that matches their tag. Most outputs
+then send the chunk immediately to their destination. A chunk is sent to the output's
+`flush` callback function, which must return one of `FLB_OK`, `FLB_RETRY`, or
+`FLB_ERROR`. Fluent Bit keeps count of the return values from each output's
+`flush` callback function. These counters are the data source for Fluent Bit's error, retry,
+and success metrics available in Prometheus format through its monitoring interface.
+
+The S3 output plugin conforms to the Fluent Bit output plugin specification.
+Since S3's use case is to upload large files (over 2 MB), its behavior is different.
+S3's `flush` callback function buffers the incoming chunk to the filesystem, and
+returns an `FLB_OK`. This means Prometheus metrics available from the Fluent
+Bit HTTP server are meaningless for S3. In addition, the `storage.total_limit_size`
+parameter is not meaningful for S3 since it has its own buffering system in the
+`store_dir`. Instead, use `store_dir_limit_size`. S3 requires a writeable filesystem.
+Running Fluent Bit on a read-only filesystem won't work with the S3 output.
+
+S3 uploads are primarily initiated by the S3 output's timer callback function,
+which runs separately from its `flush` callback.
+
+S3 has its own buffering system and its own callback to upload data, so the normal
+sequential data ordering of chunks provided by the Fluent Bit engine may be
+compromised. S3 has the `preserve_data_ordering` option which ensures data is
+uploaded in the original order it was collected by Fluent Bit.
### Summary: Uniqueness in S3 Plugin
-1. _The HTTP Monitoring interface output metrics are not meaningful for S3_: AWS understands that this is non-ideal; we have [opened an issue with a design](https://github.com/fluent/fluent-bit/issues/6141) that will allow S3 to manage its own output metrics.
-2. _You must use `store_dir_limit_size` to limit the space on disk used by S3 buffer files_.
-3. _The original ordering of data inputted to Fluent Bit may not be preserved unless you enable `preserve_data_ordering On`_.
+- The HTTP Monitoring interface output metrics are not meaningful for S3. AWS
+ understands that this is non-ideal; we have
+ [opened an issue with a design](https://github.com/fluent/fluent-bit/issues/6141)
+ to allow S3 to manage its own output metrics.
+- You must use `store_dir_limit_size` to limit the space on disk used by S3 buffer files.
+- The original ordering of data input to Fluent Bit may not be preserved unless you enable
+  `preserve_data_ordering On`, as shown in the sketch below.
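+
+A sketch that combines these S3-specific controls; the size limit is illustrative and `store_dir` is shown at its default value:
+
+```python
+[OUTPUT]
+    Name                   s3
+    Match                  *
+    bucket                 my-bucket
+    region                 us-east-1
+    # default buffer directory, with an illustrative size cap
+    store_dir              /tmp/fluent-bit/s3
+    store_dir_limit_size   512M
+    # keep chunks in their original order
+    preserve_data_ordering On
+```
+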
## S3 Key Format and Tag Delimiters
-In Fluent Bit, all logs have an associated tag. The `s3_key_format` option lets you inject the tag into the s3 key using the following syntax:
+In Fluent Bit, all logs have an associated tag. The `s3_key_format` option lets you
+inject the tag into the S3 key using the following syntax:
-* `$TAG` => the full tag
-* `$TAG[n]` => the nth part of the tag (index starting at zero). This syntax is copied from the rewrite tag filter. By default, “parts” of the tag are separated with dots, but you can change this with `s3_key_format_tag_delimiters`.
+- `$TAG`: The full tag.
+- `$TAG[n]`: The nth part of the tag (index starting at zero). This syntax is copied
+ from the rewrite tag filter. By default, “parts” of the tag are separated with
+ dots, but you can change this with `s3_key_format_tag_delimiters`.
-In the example below, assume the date is January 1st, 2020 00:00:00 and the tag associated with the logs in question is `my_app_name-logs.prod`.
+In the following example, assume the date is January 1st, 2020 00:00:00 and the tag
+associated with the logs in question is `my_app_name-logs.prod`.
-```
+```python
[OUTPUT]
- Name s3
- Match *
+ Name s3
+ Match *
bucket my-bucket
region us-west-2
total_file_size 250M
@@ -107,34 +159,49 @@ In the example below, assume the date is January 1st, 2020 00:00:00 and the tag
s3_key_format_tag_delimiters .-
```
-With the delimiters as . and -, the tag will be split into parts as follows:
+With the delimiters as `.` and `-`, the tag splits into parts as follows:
-* `$TAG[0]` = my\_app\_name
-* `$TAG[1]` = logs
-* `$TAG[2]` = prod
+- `$TAG[0]` = `my_app_name`
+- `$TAG[1]` = `logs`
+- `$TAG[2]` = `prod`
-So the key in S3 will be `/prod/my_app_name/2020/01/01/00/00/00/bgdHN1NM.gz`.
+The key in S3 will be `/prod/my_app_name/2020/01/01/00/00/00/bgdHN1NM.gz`.
### Allowing a file extension in the S3 Key Format with $UUID
-The Fluent Bit S3 output was designed to ensure that previous uploads will never be over-written by a subsequent upload. Consequently, the `s3_key_format` supports time formatters, `$UUID`, and `$INDEX`. `$INDEX` is special because it is saved in the `store_dir`; if you restart Fluent Bit with the same disk, then it can continue incrementing the index from its last value in the previous run.
+The Fluent Bit S3 output was designed to ensure that previous uploads will never be
+overwritten by a subsequent upload. The `s3_key_format` supports time formatters,
+`$UUID`, and `$INDEX`. `$INDEX` is special because it is saved in the `store_dir`. If
+you restart Fluent Bit with the same disk, it can continue incrementing the
+index from its last value in the previous run.
-For files uploaded with the PutObject API, the S3 output requires that a unique random string be present in the S3 key. This is because many of the use cases for PutObject uploads involve a short time period between uploads such that a timestamp in the S3 key may not be unique enough between uploads. For example, if you only specify minute granularity timestamps in the S3 key, with a small upload size, it is possible to have two uploads that have timestamps set in the same minute. This "requirement" can be disabled with `static_file_path On`.
+For files uploaded with the `PutObject` API, the S3 output requires that a unique
+random string be present in the S3 key. Many of the use cases for
+`PutObject` uploads involve a short time period between uploads, so a timestamp
+in the S3 key may not be unique enough between uploads. For example, if you only
+specify minute granularity timestamps in the S3 key, with a small upload size, it is
+possible to have two uploads that have timestamps set in the same minute. This
+requirement can be disabled with `static_file_path On`.
-There are three cases where the PutObject API is used:
+The `PutObject` API is used in these cases:
-1. When you explicitly set `use_put_object On`
-2. On startup when the S3 output finds old buffer files in the `store_dir` from a previous run and attempts to send all of them at once.
-3. On shutdown, when to prevent data loss the S3 output attempts to send all currently buffered data at once.
+- When you explicitly set `use_put_object On`.
+- On startup when the S3 output finds old buffer files in the `store_dir` from
+ a previous run and attempts to send all of them at once.
+- On shutdown. To prevent data loss the S3 output attempts to send all currently
+ buffered data at once.
-Consequently, you should always specify `$UUID` somewhere in your S3 key format. Otherwise, if the PutObject API is used, S3 will append a random 8 character UUID to the end of your S3 key. This means that a file extension set at the end of an S3 key will have the random UUID appended to it. This behavior can be disabled with `static_file_path On`.
+You should always specify `$UUID` somewhere in your S3 key format. Otherwise, if the
+`PutObject` API is used, S3 appends a random eight-character UUID to the end of your
+S3 key. This means that a file extension set at the end of an S3 key will have the
+random UUID appended to it. Disable this behavior with `static_file_path On`.
-Let's walk through this via an example. First case, we attempt to set a `.gz` extension without specifying `$UUID`.
+For example, we attempt to set a `.gz` extension without specifying `$UUID`:
-```
+```python
[OUTPUT]
- Name s3
- Match *
+ Name s3
+ Match *
bucket my-bucket
region us-west-2
total_file_size 50M
@@ -143,189 +210,244 @@ Let's walk through this via an example. First case, we attempt to set a `.gz` ex
s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S.gz
```
-In the case where pending data is uploaded on shutdown, if the tag was `app`, the S3 key in the S3 bucket might be:
+In the case where pending data is uploaded on shutdown, if the tag was `app`, the S3
+key in the S3 bucket might be:
-```
+```text
/app/2022/12/25/00_00_00.gz-apwgylqg
```
-The S3 output appended a random string to the "extension", since this upload on shutdown used the PutObject API.
-
-There are two ways of disabling this behavior. Option 1, use `static_file_path`:
-
-```
-[OUTPUT]
- Name s3
- Match *
- bucket my-bucket
- region us-west-2
- total_file_size 50M
- use_put_object Off
- compression gzip
- s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S.gz
- static_file_path On
-```
-
-Option 2, explicitly define where the random UUID will go in the S3 key format:
-
-```
-[OUTPUT]
- Name s3
- Match *
- bucket my-bucket
- region us-west-2
- total_file_size 50M
- use_put_object Off
- compression gzip
- s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz
-```
+The S3 output appended a random string to the file extension, since this upload
+on shutdown used the `PutObject` API.
+
+There are two ways of disabling this behavior:
+
+- Use `static_file_path`:
+
+ ```python
+ [OUTPUT]
+ Name s3
+ Match *
+ bucket my-bucket
+ region us-west-2
+ total_file_size 50M
+ use_put_object Off
+ compression gzip
+ s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S.gz
+ static_file_path On
+ ```
+
+- Explicitly define where the random UUID will go in the S3 key format:
+
+ ```python
+ [OUTPUT]
+ Name s3
+ Match *
+ bucket my-bucket
+ region us-west-2
+ total_file_size 50M
+ use_put_object Off
+ compression gzip
+ s3_key_format /$TAG/%Y/%m/%d/%H_%M_%S/$UUID.gz
+ ```
## Reliability
-The `store_dir` is used to temporarily store data before it is uploaded. If Fluent Bit is stopped suddenly it will try to send all data and complete all uploads before it shuts down. If it can not send some data, on restart it will look in the `store_dir` for existing data and will try to send it.
-
-Multipart uploads are ideal for most use cases because they allow the plugin to upload data in small chunks over time. For example, 1 GB file can be created from 200 5MB chunks. While the file size in S3 will be 1 GB, only 5 MB will be buffered on disk at any one point in time.
-
-There is one minor drawback to multipart uploads- the file and data will not be visible in S3 until the upload is completed with a [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API\_CompleteMultipartUpload.html) call. The plugin will attempt to make this call whenever Fluent Bit is shut down to ensure your data is available in s3. It will also store metadata about each upload in the `store_dir`, ensuring that uploads can be completed when Fluent Bit restarts (assuming it has access to persistent disk and the `store_dir` files will still be present on restart).
+The `store_dir` is used to temporarily store data before upload. If Fluent Bit
+stops suddenly, it will try to send all data and complete all uploads before it
+shuts down. If it cannot send some data, on restart it will look in the `store_dir`
+for existing data and try to send it.
+
+Multipart uploads are ideal for most use cases because they allow the plugin to
+upload data in small chunks over time. For example, a 1 GB file can be created
+from 200 5 MB chunks. While the file size in S3 will be 1 GB, only
+5 MB will be buffered on disk at any one point in time.
+
+One drawback to multipart uploads is that the file and data aren't visible in S3
+until the upload is completed with a
+[CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html)
+call. The plugin attempts to make this call whenever Fluent Bit is shut down to
+ensure your data is available in S3. It also stores metadata about each upload in
+the `store_dir`, ensuring that uploads can be completed when Fluent Bit restarts
+(assuming it has access to persistent disk and the `store_dir` files will still be
+present on restart).
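+
+For example, the following is a minimal sketch of a multipart-oriented configuration
+that persists buffered chunks and upload metadata in `store_dir`. The bucket name,
+path, and sizes are illustrative only:
+
+```python
+[OUTPUT]
+    Name              s3
+    Match             *
+    bucket            my-bucket
+    region            us-west-2
+    # Buffered data and multipart upload metadata survive restarts here
+    store_dir         /var/fluent-bit/s3
+    total_file_size   1G
+    upload_chunk_size 5M
+    upload_timeout    10m
+```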
### Using S3 without persisted disk
-If you run Fluent Bit in an environment without persistent disk, or without the ability to restart Fluent Bit and give it access to the data stored in the `store_dir` from previous executions- some considerations apply. This might occur if you run Fluent Bit on [AWS Fargate](https://aws.amazon.com/fargate/).
+If you run Fluent Bit in an environment without persistent disk, or without the
+ability to restart Fluent Bit and give it access to the data stored in the
+`store_dir` from previous executions, some considerations apply. This might occur if
+you run Fluent Bit on [AWS Fargate](https://aws.amazon.com/fargate/).
-In these situations, we recommend using the PutObject API, and sending data frequently, to avoid local buffering as much as possible. This will limit data loss in the event Fluent Bit is killed unexpectedly.
+In these situations, we recommend using the `PutObject` API and sending data
+frequently, to avoid local buffering as much as possible. This will limit data loss
+in the event Fluent Bit is killed unexpectedly.
The following settings are recommended for this use case:
-```
+```python
[OUTPUT]
- Name s3
- Match *
- bucket your-bucket
- region us-east-1
- total_file_size 1M
- upload_timeout 1m
- use_put_object On
+ Name s3
+ Match *
+ bucket your-bucket
+ region us-east-1
+ total_file_size 1M
+ upload_timeout 1m
+ use_put_object On
```
## S3 Multipart Uploads
-With `use_put_object Off` (default), S3 will attempt to send files using multipart uploads. For each file, S3 first calls [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html), then a series of calls to [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) for each fragment (targeted to be `upload_chunk_size` bytes), and finally [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html) to create the final file in S3.
-
-### Fallback to PutObject
-
-S3 [requires](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html) each [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) fragment to be at least 5,242,880 bytes, otherwise the upload is rejected.
-
-Consequently, the S3 output must sometimes fallback to the [PutObject API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html).
-
-Uploads are triggered by three settings:
-1. `total_file_size` and `upload_chunk_size`: When S3 has buffered data in the `store_dir` that meets the desired `total_file_size` (for `use_put_object On`) or the `upload_chunk_size` (for Multipart), it will trigger an upload operation.
-2. `upload_timeout`: Whenever locally buffered data has been present on the filesystem in the `store_dir` longer than the configured `upload_timeout`, it will be sent. This happens regardless of whether or not the desired byte size has been reached. Consequently, if you configure a small `upload_timeout`, your files may be smaller than the `total_file_size`. The timeout is evaluated against the time at which S3 started buffering data for each unqiue tag (that is, the time when new data was buffered for the unique tag after the last upload). The timeout is also evaluated against the [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html) time, so a multipart upload will be completed after `upload_timeout` has elapsed, even if the desired size has not yet been reached.
-
-If your `upload_timeout` triggers an upload before the pending buffered data reaches the `upload_chunk_size`, it may be too small for a multipart upload. S3 will consequently fallback to use the [PutObject API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html).
-
-When you enable compression, S3 applies the compression algorithm at send time. The size settings noted above trigger uploads based on the size of buffered data, not the final compressed size. Consequently, it is possible that after compression, buffered data no longer meets the required minimum S3 [UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) size. If this occurs, you will see a log message like:
-
-
-```
-[ info] [output:s3:s3.0] Pre-compression upload_chunk_size= 5630650, After compression, chunk is only 1063320 bytes, the chunk was too small, using PutObject to upload
-```
-
-If you encounter this frequently, use the numbers in the messages to guess your compression factor. For example, in this case, the buffered data was reduced from 5,630,650 bytes to 1,063,320 bytes. The compressed size is 1/5 the actual data size, so configuring `upload_chunk_size 30M` should ensure each part is large enough after compression to be over the min required part size of 5,242,880 bytes.
-
-The S3 API allows the last part in an upload to be less than the 5,242,880 byte minimum. Therefore, if a part is too small for an existing upload, the S3 output will upload that part and then complete the upload.
-
-### upload_timeout constrains total multipart upload time for a single file
-
-The `upload_timeout` is evaluated against the [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html) time. So a multipart upload will be completed after `upload_timeout` has elapsed, even if the desired size has not yet been reached.
+With `use_put_object Off` (default), S3 will attempt to send files using multipart
+uploads. For each file, S3 first calls
+[CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html),
+then a series of calls to
+[UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html) for
+each fragment (targeted to be `upload_chunk_size` bytes), and finally
+[CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html)
+to create the final file in S3.
+
+### Fallback to `PutObject`
+
+S3 [requires](https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html) each
+[UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html)
+fragment to be at least 5,242,880 bytes, otherwise the upload is rejected.
+
+The S3 output must sometimes fall back to the [`PutObject`
+API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html).
+
+Uploads are triggered by these settings:
+
+- `total_file_size` and `upload_chunk_size`: When S3 has buffered data in the
+ `store_dir` that meets the desired `total_file_size` (for `use_put_object On`) or
+ the `upload_chunk_size` (for Multipart), it will trigger an upload operation.
+- `upload_timeout`: Whenever locally buffered data has been present on the filesystem
+ in the `store_dir` longer than the configured `upload_timeout`, it will be sent
+ even when the desired byte size hasn't been reached.
+ If you configure a small `upload_timeout`, your files may be smaller
+ than the `total_file_size`. The timeout is evaluated against the time at which S3
+ started buffering data for each unique tag (that is, the time when new data was
+ buffered for the unique tag after the last upload). The timeout is also evaluated
+ against the
+ [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html)
+ time, so a multipart upload will be completed after `upload_timeout` has elapsed,
+ even if the desired size has not yet been reached.
+
+If your `upload_timeout` triggers an upload before the pending buffered data reaches
+the `upload_chunk_size`, it may be too small for a multipart upload. S3 will
+fall back to using the [`PutObject` API](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html).
+
+When you enable compression, S3 applies the compression algorithm at send time. The
+size settings trigger uploads based on the size of buffered data, not the
+final compressed size. It's possible that after compression, buffered data no longer
+meets the required minimum S3
+[UploadPart](https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html)
+size. If this occurs, you will see a log message like:
+
+```text
+[ info] [output:s3:s3.0] Pre-compression upload_chunk_size= 5630650, After compression, chunk is only 1063320 bytes, the chunk was too small, using PutObject to upload
+```
+
+If you encounter this frequently, use the numbers in the messages to guess your
+compression factor. In this example, the buffered data was reduced from
+5,630,650 bytes to 1,063,320 bytes. The compressed size is one-fifth the actual data size.
+Configuring `upload_chunk_size 30M` should ensure each part is large enough after
+compression to be over the minimum required part size of 5,242,880 bytes.
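+
+For example, a minimal sketch of that adjustment might look like the following. The
+bucket name and region are illustrative, and the 5x ratio is only an estimate taken
+from the log message above:
+
+```python
+[OUTPUT]
+    Name              s3
+    Match             *
+    bucket            my-bucket
+    region            us-west-2
+    use_put_object    Off
+    compression       gzip
+    # Assuming roughly 5x compression, buffering ~30M of raw data per part keeps
+    # each compressed part above the 5,242,880-byte UploadPart minimum.
+    upload_chunk_size 30M
+```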
+
+The S3 API allows the last part in an upload to be less than the 5,242,880 byte
+minimum. If a part is too small for an existing upload, the S3 output will
+upload that part and then complete the upload.
+
+### `upload_timeout` constrains total multipart upload time for a single file
+
+The `upload_timeout` is evaluated against the
+[CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html)
+time. A multipart upload will be completed after `upload_timeout` elapses, even if
+the desired size has not yet been reached.
### Completing uploads
-When [CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html) is called, an `UploadID` is returned. S3 stores these IDs for active uploads in the `store_dir`. Until [CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html) is called, the uploaded data will not be visible in S3.
-
-On shutdown, S3 output will attempt to complete all pending uploads. If it fails to complete an upload, the ID will remain buffered in the `store_dir` in a directory called `multipart_upload_metadata`. If you restart the S3 output with the same `store_dir` it will discover the old UploadIDs and complete the pending uploads. The [S3 documentation](https://aws.amazon.com/blogs/aws-cloud-financial-management/discovering-and-deleting-incomplete-multipart-uploads-to-lower-amazon-s3-costs/) also has suggestions on discovering and deleting/completing dangling uploads in your buckets.
-
-## Worker support
-
-Fluent Bit 1.7 adds a new feature called `workers` which enables outputs to have dedicated threads. This `s3` plugin has partial support for workers. **The plugin can only support a single worker; enabling multiple workers will lead to errors/indeterminate behavior.**
-
-Example:
-
-```
-[OUTPUT]
- Name s3
- Match *
- bucket your-bucket
- region us-east-1
- total_file_size 1M
- upload_timeout 1m
- use_put_object On
- workers 1
-```
-
-If you enable a single worker, you are enabling a dedicated thread for your S3 output. We recommend starting without workers, evaluating the performance, and then enabling a worker if needed. For most users, the plugin can provide sufficient throughput without workers.
+When
+[CreateMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CreateMultipartUpload.html)
+is called, an `UploadID` is returned. S3 stores these IDs for active uploads in the
+`store_dir`. Until
+[CompleteMultipartUpload](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html)
+is called, the uploaded data isn't visible in S3.
+
+On shutdown, S3 output attempts to complete all pending uploads. If an upload fails
+to complete, the ID remains buffered in the `store_dir` in a directory called
+`multipart_upload_metadata`. If you restart the S3 output with the same `store_dir`,
+it will discover the old UploadIDs and complete the pending uploads. The [S3
+documentation](https://aws.amazon.com/blogs/aws-cloud-financial-management/discovering-and-deleting-incomplete-multipart-uploads-to-lower-amazon-s3-costs/)
+has suggestions on discovering and deleting or completing dangling uploads in your
+buckets.
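+
+If you need to inspect or clean up dangling uploads manually, a sketch using the AWS
+CLI might look like the following. The bucket name, key, and upload ID are
+placeholders:
+
+```text
+# List in-progress multipart uploads for a bucket
+aws s3api list-multipart-uploads --bucket my-bucket
+
+# Abort a specific dangling upload by key and UploadId
+aws s3api abort-multipart-upload --bucket my-bucket --key app/2022/12/25/00_00_00.gz --upload-id EXAMPLE_ID
+```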
## Usage with MinIO
-[MinIO](https://min.io/) is a high-performance, S3 compatible object storage and you can build your app with S3 functionality without S3.
+[MinIO](https://min.io/) is a high-performance, S3 compatible object storage and you
+can build your app with S3 functionality without S3.
-Assume you run [a MinIO server](https://docs.min.io/docs/minio-quickstart-guide.html) at localhost:9000, and create a bucket of `your-bucket` by referring [the client docs](https://docs.min.io/docs/minio-client-quickstart-guide).
+The following example runs [a MinIO server](https://docs.min.io/docs/minio-quickstart-guide.html)
+at `localhost:9000` and creates a bucket named `your-bucket`.
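+
+If you use the MinIO client (`mc`), creating the bucket might look like the
+following sketch. The alias name and the default `minioadmin` credentials are
+assumptions; replace them with your own:
+
+```text
+mc alias set local http://localhost:9000 minioadmin minioadmin
+mc mb local/your-bucket
+```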
Example:
-```
+```python
[OUTPUT]
- Name s3
- Match *
- bucket your-bucket
- endpoint http://localhost:9000
+ Name s3
+ Match *
+ bucket your-bucket
+ endpoint http://localhost:9000
```
-Then, the records will be stored into the MinIO server.
+The records are stored in the MinIO server.
-## Getting Started
+## Get Started
-In order to send records into Amazon S3, you can run the plugin from the command line or through the configuration file.
+To send records into Amazon S3, you can run the plugin from the command line or
+through the configuration file.
### Command Line
-The **s3** plugin, can read the parameters from the command line through the **-p** argument (property), e.g:
+The S3 plugin reads parameters from the command line through the `-p` argument:
-```
-$ fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -p -m '*' -f 1
+```text
+fluent-bit -i cpu -o s3 -p bucket=my-bucket -p region=us-west-2 -m '*' -f 1
```
### Configuration File
-In your main configuration file append the following _Output_ section:
+In your main configuration file append the following `Output` section:
-```
+```python
[OUTPUT]
- Name s3
- Match *
- bucket your-bucket
- region us-east-1
- store_dir /home/ec2-user/buffer
- total_file_size 50M
- upload_timeout 10m
+ Name s3
+ Match *
+ bucket your-bucket
+ region us-east-1
+ store_dir /home/ec2-user/buffer
+ total_file_size 50M
+ upload_timeout 10m
```
-An example that using PutObject instead of multipart:
+An example using `PutObject` instead of multipart:
-```
+```python
[OUTPUT]
- Name s3
- Match *
- bucket your-bucket
- region us-east-1
- store_dir /home/ec2-user/buffer
- use_put_object On
- total_file_size 10M
- upload_timeout 10m
+ Name s3
+ Match *
+ bucket your-bucket
+ region us-east-1
+ store_dir /home/ec2-user/buffer
+ use_put_object On
+ total_file_size 10M
+ upload_timeout 10m
```
## AWS for Fluent Bit
-Amazon distributes a container image with Fluent Bit and this plugins.
+Amazon distributes a container image with Fluent Bit and these plugins.
### GitHub
@@ -333,76 +455,91 @@ Amazon distributes a container image with Fluent Bit and this plugins.
### Amazon ECR Public Gallery
-[aws-for-fluent-bit](https://gallery.ecr.aws/aws-observability/aws-for-fluent-bit)
+Our images are available in the Amazon ECR Public Gallery as
+[aws-for-fluent-bit](https://gallery.ecr.aws/aws-observability/aws-for-fluent-bit).
-Our images are available in Amazon ECR Public Gallery. You can download images with different tags by following command:
+You can download images with different tags using the following command:
-```
+```text
docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:
```
-For example, you can pull the image with latest version by:
+For example, you can pull the image with the latest version using:
-```
+```text
docker pull public.ecr.aws/aws-observability/aws-for-fluent-bit:latest
```
-If you see errors for image pull limits, try log into public ECR with your AWS credentials:
+If you see errors for image pull limits, try signing in to public ECR with your
+AWS credentials:
-```
+```text
aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
```
-You can check the [Amazon ECR Public official doc](https://docs.aws.amazon.com/AmazonECR/latest/public/get-set-up-for-amazon-ecr.html) for more details.
+See the
+[Amazon ECR Public official documentation](https://docs.aws.amazon.com/AmazonECR/latest/public/get-set-up-for-amazon-ecr.html)
+for more details.
### Docker Hub
[amazon/aws-for-fluent-bit](https://hub.docker.com/r/amazon/aws-for-fluent-bit/tags)
+is also available on Docker Hub.
### Amazon ECR
-You can use our SSM Public Parameters to find the Amazon ECR image URI in your region:
+Use our SSM Public Parameters to find the Amazon ECR image URI in your region:
-```
+```text
aws ssm get-parameters-by-path --path /aws/service/aws-for-fluent-bit/
```
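+
+To resolve a single image URI, you can query a specific parameter under that path.
+The `latest` parameter name below is an assumption; substitute the version you need:
+
+```text
+aws ssm get-parameters --names /aws/service/aws-for-fluent-bit/latest --region us-east-1
+```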
-For more see [the AWS for Fluent Bit github repo](https://github.com/aws/aws-for-fluent-bit#public-images).
+For more information, see the
+[AWS for Fluent Bit GitHub repo](https://github.com/aws/aws-for-fluent-bit#public-images).
## Advanced usage
### Use Apache Arrow for in-memory data processing
-Starting from Fluent Bit v1.8, the Amazon S3 plugin includes the support for [Apache Arrow](https://arrow.apache.org/). The support is currently not enabled by default, as it depends on a shared version of `libarrow` as the prerequisite.
+With Fluent Bit v1.8 or greater, the Amazon S3 plugin includes the support for
+[Apache Arrow](https://arrow.apache.org/). Support isn't enabled by
+default, and has a dependency on a shared version of `libarrow`.
-To use this feature, `FLB_ARROW` must be turned on at compile time:
+To use this feature, `FLB_ARROW` must be turned on at compile time. Use the following
+commands:
-```
-$ cd build/
-$ cmake -DFLB_ARROW=On ..
-$ cmake --build .
+```text
+cd build/
+cmake -DFLB_ARROW=On ..
+cmake --build .
```
-Once compiled, Fluent Bit can upload incoming data to S3 in Apache Arrow format. For example:
+After being compiled, Fluent Bit can upload incoming data to S3 in Apache Arrow format.
-```
+For example:
+
+```python
[INPUT]
- Name cpu
+ Name cpu
[OUTPUT]
- Name s3
- Bucket your-bucket-name
- total_file_size 1M
- use_put_object On
- upload_timeout 60s
- Compression arrow
+ Name s3
+ Bucket your-bucket-name
+ total_file_size 1M
+ use_put_object On
+ upload_timeout 60s
+ Compression arrow
```
-As shown in this example, setting `Compression` to `arrow` makes Fluent Bit to convert payload into Apache Arrow format.
+Setting `Compression` to `arrow` makes Fluent Bit convert the payload into Apache
+Arrow format.
-The stored data is very easy to load, analyze and process using popular data processing tools (such as Python pandas, Apache Spark and Tensorflow). The following code uses `pyarrow` to analyze the uploaded data:
+Load, analyze, and process stored data using popular data
+processing tools such as Python pandas, Apache Spark, and TensorFlow.
-```
+The following example uses `pyarrow` to analyze the uploaded data:
+
+```text
>>> import pyarrow.feather as feather
>>> import pyarrow.fs as fs
>>>
@@ -410,7 +547,7 @@ The stored data is very easy to load, analyze and process using popular data pro
>>> file = s3.open_input_file("my-bucket/fluent-bit-logs/cpu.0/2021/04/27/09/36/15-object969o67ZF")
>>> df = feather.read_feather(file)
>>> print(df.head())
- date cpu_p user_p system_p cpu0.p_cpu cpu0.p_user cpu0.p_system
+ date cpu_p user_p system_p cpu0.p_cpu cpu0.p_user cpu0.p_system
0 2021-04-27T09:33:53.539346Z 1.0 1.0 0.0 1.0 1.0 0.0
1 2021-04-27T09:33:54.539330Z 0.0 0.0 0.0 0.0 0.0 0.0
2 2021-04-27T09:33:55.539305Z 1.0 0.0 1.0 1.0 0.0 1.0
diff --git a/pipeline/outputs/skywalking.md b/pipeline/outputs/skywalking.md
index 9919567a5..1d6206bf1 100644
--- a/pipeline/outputs/skywalking.md
+++ b/pipeline/outputs/skywalking.md
@@ -11,6 +11,7 @@ The **Apache SkyWalking** output plugin, allows to flush your records to a [Apac
| auth_token | Authentication token if needed for Apache SkyWalking OAP | None |
| svc_name | Service name that fluent-bit belongs to | sw-service |
| svc_inst_name | Service instance name of fluent-bit | fluent-bit |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### TLS / SSL
@@ -57,6 +58,6 @@ This message is packed into the following protocol format and written to the OAP
"json": {
"json": "{\"log\": \"This is the original log message\"}"
}
- }
+ }
}]
```
diff --git a/pipeline/outputs/slack.md b/pipeline/outputs/slack.md
index 0ef7d9d9d..5cbee7f03 100644
--- a/pipeline/outputs/slack.md
+++ b/pipeline/outputs/slack.md
@@ -17,6 +17,7 @@ Once you have obtained the Webhook address you can place it in the configuration
| Key | Description | Default |
| :--- | :--- | :--- |
| webhook | Absolute address of the Webhook provided by Slack | |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### Configuration File
@@ -28,4 +29,3 @@ Get started quickly with this configuration file:
match *
webhook https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
```
-
diff --git a/pipeline/outputs/splunk.md b/pipeline/outputs/splunk.md
index f038909fc..545e85d4c 100644
--- a/pipeline/outputs/splunk.md
+++ b/pipeline/outputs/splunk.md
@@ -23,7 +23,7 @@ Connectivity, transport and authentication configuration properties:
| compress | Set payload compression mechanism. The only available option is `gzip`. | |
| channel | Specify X-Splunk-Request-Channel Header for the HTTP Event Collector interface. | |
| http_debug_bad_request | If the HTTP server response code is 400 (bad request) and this flag is enabled, it will print the full HTTP request and response to the stdout interface. This feature is available for debugging purposes. | |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |
Content and Splunk metadata \(fields\) handling configuration properties:
@@ -168,9 +168,9 @@ The following configuration gathers CPU metrics, nests the appropriate field, ad
name cpu
tag cpu
-# Move CPU metrics to be nested under "fields" and
+# Move CPU metrics to be nested under "fields" and
# add the prefix "metric_name:" to all metrics
-# NOTE: you can change Wildcard field to only select metric fields
+# NOTE: you can change Wildcard field to only select metric fields
[FILTER]
Name nest
Match cpu
@@ -183,18 +183,18 @@ The following configuration gathers CPU metrics, nests the appropriate field, ad
[FILTER]
Name modify
Match cpu
- Set index cpu-metrics
+ Set index cpu-metrics
Set source fluent-bit
Set sourcetype custom
# ensure splunk_send_raw is on
[OUTPUT]
- name splunk
+ name splunk
match *
host
port 8088
splunk_send_raw on
- splunk_token f9bd5bdb-c0b2-4a83-bcff-9625e5e908db
+ splunk_token f9bd5bdb-c0b2-4a83-bcff-9625e5e908db
tls on
tls.verify off
```
diff --git a/pipeline/outputs/stackdriver.md b/pipeline/outputs/stackdriver.md
index 759e629d8..54fe89a38 100644
--- a/pipeline/outputs/stackdriver.md
+++ b/pipeline/outputs/stackdriver.md
@@ -32,7 +32,7 @@ Before to get started with the plugin configuration, make sure to obtain the pro
| severity\_key | Specify the name of the key from the original record that contains the severity information. | `logging.googleapis.com/severity`. See [Stackdriver Special Fields][StackdriverSpecialFields] for more info. |
| project_id_key | The value of this field is used by the Stackdriver output plugin to find the gcp project id from jsonPayload and then extract the value of it to set the PROJECT_ID within LogEntry logName, which controls the gcp project that should receive these logs. | `logging.googleapis.com/projectId`. See [Stackdriver Special Fields][StackdriverSpecialFields] for more info. |
| autoformat\_stackdriver\_trace | Rewrite the _trace_ field to include the projectID and format it for use with Cloud Trace. When this flag is enabled, the user can get the correct result by printing only the traceID (usually 32 characters). | false |
-| Workers | Enables dedicated thread(s) for this output. | 1 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
| custom\_k8s\_regex | Set a custom regex to extract field like pod\_name, namespace\_name, container\_name and docker\_id from the local\_resource\_id in logs. This is helpful if the value of pod or node name contains dots. | `(?[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?[^_]+)_(?.+)-(?[a-z0-9]{64})\.log$` |
| resource_labels | An optional list of comma separated strings specifying resource labels plaintext assignments (`new=value`) and/or mappings from an original field in the log entry to a destination field (`destination=$original`). Nested fields and environment variables are also supported using the [record accessor syntax](https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/record-accessor). If configured, *all* resource labels will be assigned using this API only, with the exception of `project_id`. See [Resource Labels](#resource-labels) for more details. | |
| compress | Set payload compression mechanism. The only available option is `gzip`. Default = "", which means no compression.| |
diff --git a/pipeline/outputs/standard-output.md b/pipeline/outputs/standard-output.md
index 44ddf0580..69e3e44f2 100644
--- a/pipeline/outputs/standard-output.md
+++ b/pipeline/outputs/standard-output.md
@@ -9,7 +9,7 @@ The **stdout** output plugin allows to print to the standard output the data rec
| Format | Specify the data format to be printed. Supported formats are _msgpack_, _json_, _json\_lines_ and _json\_stream_. | msgpack |
| json\_date\_key | Specify the name of the time key in the output record. To disable the time key just set the value to `false`. | date |
| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 1 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
### Command Line
@@ -35,4 +35,3 @@ Fluent Bit v1.x.x
```
No more, no less, it just works.
-
diff --git a/pipeline/outputs/syslog.md b/pipeline/outputs/syslog.md
index 8c5b56c91..9c6f17e23 100644
--- a/pipeline/outputs/syslog.md
+++ b/pipeline/outputs/syslog.md
@@ -31,6 +31,7 @@ You must be aware of the structure of your original record so you can configure
| syslog\_sd\_key | The key name from the original record that contains a map of key/value pairs to use as Structured Data \(SD\) content. The key name is included in the resulting SD field as shown in examples below. This configuration is optional. | |
| syslog\_message\_key | The key name from the original record that contains the message to deliver. Note that this property is **mandatory**, otherwise the message will be empty. | |
| allow\_longer\_sd\_id| If true, Fluent-bit allows SD-ID that is longer than 32 characters. Such long SD-ID violates RFC 5424.| false |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
### TLS / SSL
@@ -123,7 +124,7 @@ Example configuration file:
syslog_hostname_key hostname
syslog_appname_key appname
syslog_procid_key procid
- syslog_msgid_key msgid
+ syslog_msgid_key msgid
syslog_sd_key uls@0
syslog_message_key log
```
@@ -156,19 +157,19 @@ Example output:
### Adding Structured Data Authentication Token
-Some services use the structured data field to pass authentication tokens (e.g. `[@41018]`), which would need to be added to each log message dynamically.
-However, this requires setting the token as a key rather than as a value.
+Some services use the structured data field to pass authentication tokens (e.g. `[@41018]`), which would need to be added to each log message dynamically.
+However, this requires setting the token as a key rather than as a value.
Here's an example of how that might be achieved, using `AUTH_TOKEN` as a [variable](../../administration/configuring-fluent-bit/classic-mode/variables.md):
{% tabs %}
{% tab title="fluent-bit.conf" %}
```text
-[FILTER]
+[FILTER]
name lua
match *
call append_token
code function append_token(tag, timestamp, record) record["${AUTH_TOKEN}"] = {} return 2, timestamp, record end
-
+
[OUTPUT]
name syslog
match *
@@ -213,4 +214,4 @@ Here's an example of how that might be achieved, using `AUTH_TOKEN` as a [variab
tls.crt_file: /path/to/my.crt
```
{% endtab %}
-{% endtabs %}
\ No newline at end of file
+{% endtabs %}
diff --git a/pipeline/outputs/tcp-and-tls.md b/pipeline/outputs/tcp-and-tls.md
index 545063593..55de1b07c 100644
--- a/pipeline/outputs/tcp-and-tls.md
+++ b/pipeline/outputs/tcp-and-tls.md
@@ -11,7 +11,7 @@ The **tcp** output plugin allows to send records to a remote TCP server. The pay
| Format | Specify the data format to be printed. Supported formats are _msgpack_ _json_, _json\_lines_ and _json\_stream_. | msgpack |
| json\_date\_key | Specify the name of the time key in the output record. To disable the time key just set the value to `false`. | date |
| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double |
-| Workers | Enables dedicated thread(s) for this output. Default value is set since version 1.8.13. For previous versions is 0. | 2 |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `2` |
## TLS Configuration Parameters
diff --git a/pipeline/outputs/treasure-data.md b/pipeline/outputs/treasure-data.md
index ff2a070bf..22991f239 100644
--- a/pipeline/outputs/treasure-data.md
+++ b/pipeline/outputs/treasure-data.md
@@ -12,6 +12,7 @@ The plugin supports the following configuration parameters:
| Database | Specify the name of your target database. | |
| Table | Specify the name of your target table where the records will be stored. | |
| Region | Set the service region, available values: US and JP | US |
+| Workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Getting Started
@@ -41,4 +42,3 @@ In your main configuration file append the following _Input_ & _Output_ sections
Database fluentbit
Table cpu_samples
```
-
diff --git a/pipeline/outputs/vivo-exporter.md b/pipeline/outputs/vivo-exporter.md
index 69c00dfcb..156ae257a 100644
--- a/pipeline/outputs/vivo-exporter.md
+++ b/pipeline/outputs/vivo-exporter.md
@@ -9,6 +9,8 @@ Vivo Exporter is an output plugin that exposes logs, metrics, and traces through
| `empty_stream_on_read` | If enabled, when an HTTP client consumes the data from a stream, the stream content will be removed. | Off |
| `stream_queue_size` | Specify the maximum queue size per stream. Each specific stream for logs, metrics and traces can hold up to `stream_queue_size` bytes. | 20M |
| `http_cors_allow_origin` | Specify the value for the HTTP Access-Control-Allow-Origin header (CORS). | |
+| `workers` | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `1` |
+
### Getting Started
@@ -25,7 +27,7 @@ Here is a simple configuration of Vivo Exporter, note that this example is not b
match *
empty_stream_on_read off
stream_queue_size 20M
-
http_cors_allow_origin *
+ http_cors_allow_origin *
```
### How it works
diff --git a/pipeline/outputs/websocket.md b/pipeline/outputs/websocket.md
index fc5d4ab08..a5a049df1 100644
--- a/pipeline/outputs/websocket.md
+++ b/pipeline/outputs/websocket.md
@@ -13,6 +13,7 @@ The **websocket** output plugin allows to flush your records into a WebSocket en
| Format | Specify the data format to be used in the HTTP request body, by default it uses _msgpack_. Other supported formats are _json_, _json\_stream_ and _json\_lines_ and _gelf_. | msgpack |
| json\_date\_key | Specify the name of the date field in output | date |
| json\_date\_format | Specify the format of the date. Supported formats are _double_, _epoch_, _iso8601_ (eg: _2018-05-30T09:39:52.000681Z_) and _java_sql_timestamp_ (eg: _2018-05-30 09:39:52.000681_) | double |
+| workers | The number of [workers](../../administration/multithreading.md#outputs) to perform flush operations for this output. | `0` |
## Getting Started
@@ -63,6 +64,7 @@ Websocket plugin is working with tcp keepalive mode, please refer to [networking
Listen 0.0.0.0
Port 5170
Format json
+
[OUTPUT]
Name websocket
Match *
diff --git a/pipeline/parsers/configuring-parser.md b/pipeline/parsers/configuring-parser.md
index 43c7de0cb..9af5dd37f 100644
--- a/pipeline/parsers/configuring-parser.md
+++ b/pipeline/parsers/configuring-parser.md
@@ -32,6 +32,7 @@ Multiple parsers can be defined and each section has it own properties. The foll
| Time\_Format | Specify the format of the time field so it can be recognized and analyzed properly. Fluent-bit uses `strptime(3)` to parse time so you can refer to [strptime documentation](https://linux.die.net/man/3/strptime) for available modifiers. |
| Time\_Offset | Specify a fixed UTC time offset \(e.g. -0600, +0200, etc.\) for local dates. |
| Time\_Keep | By default when a time key is recognized and parsed, the parser will drop the original time field. Enabling this option will make the parser to keep the original time field and it value in the log entry. |
+| Time\_System\_Timezone | If there is no timezone (`%z`) specified in the given `Time_Format`, enabling this option will make the parser detect and use the system's configured timezone. The configured timezone is detected from the [`TZ` environment variable](https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html). |
| Types | Specify the data type of parsed field. The syntax is `types : : ...`. The supported types are `string`\(default\), `integer`, `bool`, `float`, `hex`. The option is supported by `ltsv`, `logfmt` and `regex`. |
| Decode\_Field | Decode a field value, the only decoder available is `json`. The syntax is: `Decode_Field json `. |
| Skip\_Empty\_Values | Specify a boolean which determines if the parser should skip empty values. The default is `true`. |
diff --git a/pipeline/parsers/decoders.md b/pipeline/parsers/decoders.md
index 38e9244f6..4fb4016f9 100644
--- a/pipeline/parsers/decoders.md
+++ b/pipeline/parsers/decoders.md
@@ -1,29 +1,35 @@
# Decoders
-There are certain cases where the log messages being parsed contains encoded data, a typical use case can be found in containerized environments with Docker: application logs it data in JSON format but becomes an escaped string, Consider the following example
+There are cases where the log messages being parsed contain encoded data. A typical
+use case can be found in containerized environments with Docker. Docker logs its
+data in JSON format, which uses escaped strings.
-Original message generated by the application:
+Consider the following message generated by the application:
```text
{"status": "up and running"}
```
-Then the Docker log message become encapsulated as follows:
+Docker then encapsulates the log message as follows:
```text
{"log":"{\"status\": \"up and running\"}\r\n","stream":"stdout","time":"2018-03-09T01:01:44.851160855Z"}
```
-as you can see the original message is handled as an escaped string. Ideally in Fluent Bit we would like to keep having the original structured message and not a string.
+The original message is handled as an escaped string. Ideally, Fluent Bit should
+keep the original structured message, not a string.
## Getting Started
-Decoders are a built-in feature available through the Parsers file, each Parser definition can optionally set one or multiple decoders. There are two type of decoders type:
+Decoders are a built-in feature available through the Parsers file. Each parser
+definition can optionally set one or more decoders. There are two types of decoders:
-* Decode\_Field: if the content can be decoded in a structured message, append that structure message \(keys and values\) to the original log message.
-* Decode\_Field\_As: any content decoded \(unstructured or structured\) will be replaced in the same key/value, no extra keys are added.
+- `Decode_Field`: If the content can be decoded in a structured message, append
+ the structured message (keys and values) to the original log message.
+- `Decode_Field_As`: Any decoded content (unstructured or structured) will be
+ replaced in the same key/value, and no extra keys are added.
-Our pre-defined Docker Parser have the following definition:
+Our pre-defined Docker parser has the following definition:
```text
[PARSER]
@@ -37,35 +43,40 @@ Our pre-defined Docker Parser have the following definition:
Decode_Field_As escaped log
```
-Each line in the parser with a key _Decode\_Field_ instruct the parser to apply a specific decoder on a given field, optionally it offer the option to take an extra action if the decoder cannot succeed.
+Each line in the parser with a key `Decode_Field` instructs the parser to apply
+a specific decoder on a given field. Optionally, it can take an extra action if
+the decoder doesn't succeed.
-### Decoders
+### Decoder options
-| Name | Description |
-| :--- | :--- |
-| json | handle the field content as a JSON map. If it find a JSON map it will replace the content with a structured map. |
-| escaped | decode an escaped string. |
-| escaped\_utf8 | decode a UTF8 escaped string. |
+| Name | Description |
+| -------------- | ----------- |
+| `json` | Handle the field content as a JSON map. If it finds a JSON map, it replaces the content with a structured map. |
+| `escaped` | Decode an escaped string. |
+| `escaped_utf8` | Decode a UTF8 escaped string. |
### Optional Actions
-By default if a decoder fails to decode the field or want to try a next decoder, is possible to define an optional action. Available actions are:
+If a decoder fails to decode the field, or you want to try another decoder, you can
+define an optional action, as shown in the example after the table. Available
+actions are:
| Name | Description |
-| :--- | :--- |
-| try\_next | if the decoder failed, apply the next Decoder in the list for the same field. |
-| do\_next | if the decoder succeeded or failed, apply the next Decoder in the list for the same field. |
+| -----| ----------- |
+| `try_next` | If the decoder failed, apply the next decoder in the list for the same field. |
+| `do_next` | If the decoder succeeded or failed, apply the next decoder in the list for the same field. |
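+
+For example, a hypothetical parser (not one of the pre-defined parsers) could first
+try the `json` decoder on the `log` field and fall back to the `escaped` decoder if
+that fails:
+
+```text
+[PARSER]
+    Name            custom_json_fallback
+    Format          json
+    Time_Key        time
+    Time_Format     %Y-%m-%dT%H:%M:%S %z
+    # If "log" can't be decoded as JSON, apply the next decoder in the list
+    Decode_Field_As json    log try_next
+    Decode_Field_As escaped log
+```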
-Note that actions are affected by some restrictions:
+Actions are affected by some restrictions:
-* on Decode\_Field\_As, if succeeded, another decoder of the same type in the same field can be applied only if the data continues being an unstructured message \(raw text\).
-* on Decode\_Field, if succeeded, can only be applied once for the same field. By nature Decode\_Field aims to decode a structured message.
+- `Decode_Field_As`: If successful, another decoder of the same type and the same
+ field can be applied only if the data continues being an unstructured message (raw text).
+- `Decode_Field`: If successful, can only be applied once for the same field.
+ `Decode_Field` is intended to decode a structured message.
### Examples
-### escaped\_utf8
+#### `escaped_utf8`
-Example input \(from `/path/to/log.log` in configuration below\)
+Example input from `/path/to/log.log`:
```text
{"log":"\u0009Checking indexes...\n","stream":"stdout","time":"2018-02-19T23:25:29.1845444Z"}
@@ -73,18 +84,18 @@ Example input \(from `/path/to/log.log` in configuration below\)
{"log":"\u0009Done\n","stream":"stdout","time":"2018-02-19T23:25:29.1845622Z"}
```
-Example output
+Example output:
```text
-[24] tail.0: [1519082729.184544400, {"log"=>" Checking indexes...
+[24] tail.0: [1519082729.184544400, {"log"=>" Checking indexes...
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845444Z"}]
[25] tail.0: [1519082729.184553600, {"log"=>" Validated: _audit _internal _introspection _telemetry _thefishbucket history main snmp_data summary
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845536Z"}]
-[26] tail.0: [1519082729.184562200, {"log"=>" Done
+[26] tail.0: [1519082729.184562200, {"log"=>" Done
", "stream"=>"stdout", "time"=>"2018-02-19T23:25:29.1845622Z"}]
```
-Configuration file
+Decoder configuration file:
```text
[SERVICE]
@@ -100,7 +111,7 @@ Configuration file
Match *
```
-The `fluent-bit-parsers.conf` file,
+The `fluent-bit-parsers.conf` file:
```text
[PARSER]
@@ -110,4 +121,3 @@ The `fluent-bit-parsers.conf` file,
Time_Format %Y-%m-%dT%H:%M:%S %z
Decode_Field_as escaped_utf8 log
```
-
diff --git a/pipeline/parsers/regular-expression.md b/pipeline/parsers/regular-expression.md
index 8cce3eeae..99deb4bb7 100644
--- a/pipeline/parsers/regular-expression.md
+++ b/pipeline/parsers/regular-expression.md
@@ -1,28 +1,38 @@
# Regular Expression
-The **regex** parser allows to define a custom Ruby Regular Expression that will use a named capture feature to define which content belongs to which key name.
+The **Regex** parser lets you define a custom Ruby regular expression that uses
+a named capture feature to define which content belongs to which key name.
-Fluent Bit uses [Onigmo](https://github.com/k-takata/Onigmo) regular expression library on Ruby mode, for testing purposes you can use the following web editor to test your expressions:
+Use [Tail Multiline](../inputs/tail.md#multiline) when you need to support regexes
+across multiple lines from a `tail`. The [Tail](../inputs/tail.md) input plugin
+treats each line as a separate entity.
-[http://rubular.com/](http://rubular.com/)
+{% hint style="warning" %}
+Security Warning: Onigmo is a backtracking regex engine. When using expensive
+regex patterns, Onigmo can take a long time to perform pattern matching. Read
+["ReDoS"](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS)
+on OWASP for additional information.
+{% endhint %}
-Important: do not attempt to add multiline support in your regular expressions if you are using [Tail](../inputs/tail.md) input plugin since each line is handled as a separated entity. Instead use Tail [Multiline](../inputs/tail.md#multiline) support configuration feature.
+Setting the format to **regex** requires a `regex` configuration key.
-Security Warning: Onigmo is a _backtracking_ regex engine. You need to be careful not to use expensive regex patterns, or Onigmo can take very long time to perform pattern matching. For details, please read the article ["ReDoS"](https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS) on OWASP.
-
-> Note: understanding how regular expressions works is out of the scope of this content.
+## Configuration Parameters
-From a configuration perspective, when the format is set to **regex**, is mandatory and expected that a _Regex_ configuration key exists.
+The regex parser supports the following configuration parameters:
-## Configuration Parameters
+| Key | Description | Default Value |
+| --- | ----------- | ------------- |
+| `Skip_Empty_Values` | If enabled, the parser ignores empty values of the record. | `True` |
-The regex parser supports the following configuration parameters.
+Fluent Bit uses the [Onigmo](https://github.com/k-takata/Onigmo) regular expression
+library in Ruby mode.
-|Key|Description|Default Value|
-|-------|------------|--------|
-|`Skip_Empty_Values`|If enabled, the parser ignores empty value of the record.| True|
+You can use only alphanumeric characters and underscores in group names. For example,
+a group name like `(?<user-name>.*)` causes an error because it contains an invalid
+dash (`-`) character. Use the [Rubular](http://rubular.com/) web editor to test your
+expressions.
-The following parser configuration example aims to provide rules that can be applied to an Apache HTTP Server log entry:
+The following parser configuration example provides rules that can be applied to an
+Apache HTTP Server log entry:
```python
[PARSER]
@@ -34,13 +44,14 @@ The following parser configuration example aims to provide rules that can be app
Types code:integer size:integer
```
-As an example, takes the following Apache HTTP Server log entry:
+As an example, review the following Apache HTTP Server log entry:
```text
192.168.2.20 - - [29/Jul/2015:10:27:10 -0300] "GET /cgi-bin/try/ HTTP/1.0" 200 3395
```
-The above content do not provide a defined structure for Fluent Bit, but enabling the proper parser we can help to make a structured representation of it:
+This log entry doesn't provide a defined structure for Fluent Bit. Enabling the
+proper parser can help to make a structured representation of the entry:
```text
[1154104030, {"host"=>"192.168.2.20",
@@ -54,8 +65,3 @@ The above content do not provide a defined structure for Fluent Bit, but enablin
}
]
```
-
-A common pitfall is that you cannot use characters other than alphabets, numbers and underscore in group names. For example, a group name like `(?.*)` will cause an error due to containing an invalid character \(`-`\).
-
-In order to understand, learn and test regular expressions like the example above, we suggest you try the following Ruby Regular Expression Editor: [http://rubular.com/r/X7BH0M4Ivm](http://rubular.com/r/X7BH0M4Ivm)
-
diff --git a/pipeline/processors/content-modifier.md b/pipeline/processors/content-modifier.md
index 88943f3a1..cb42784ea 100644
--- a/pipeline/processors/content-modifier.md
+++ b/pipeline/processors/content-modifier.md
@@ -1,17 +1,45 @@
# Content Modifier
-The **content_modifier** processor allows you to manipulate the metadata/attributes and content of Logs and Traces.
+The **content_modifier** processor allows you to manipulate the messages, metadata/attributes, and content of Logs and Traces.
+
Similar to the functionality exposed by filters, this processor presents a unified mechanism to perform such operations for data manipulation. The most significant difference is that processors perform better than filters, and when chaining them, there are no encoding/decoding performance penalties.
Note that processors and this specific component can only be enabled using the new YAML configuration format. Classic mode configuration format doesn't support processors.
+## Contexts
+
+The processor works on top of what we call a __context__, meaning _the place_ where the content modification will happen. Different contexts are provided to manipulate the desired information. The following contexts are available:
+
+| Context Name | Signal | Description |
+| -- | -- | -- |
+| `attributes` | Logs | Modify the attributes or metadata of a Log record. |
+| `body` | Logs | Modify the content of a Log record. |
+| `span_name` | Traces | Modify the name of a Span. |
+| `span_kind` | Traces | Modify the kind of a Span. |
+| `span_status` | Traces | Modify the status of a Span. |
+| `span_attributes` | Traces | Modify the attributes of a Span. |
+
+
+### OpenTelemetry Contexts
+
+In addition, we provide special contexts to operate on data that follows an __OpenTelemetry Log Schema__. All of them operate on shared data across a group of records:
+
+| Context Name | Signal | Description |
+| -- | -- | -- |
+| `otel_resource_attributes` | Logs | Modify the attributes of the Log Resource. |
+| `otel_scope_name` | Logs | Modify the name of a Log Scope. |
+| `otel_scope_version` | Logs | Modify version of a Log Scope. |
+| `otel_scope_attributes` | Logs | Modify the attributes of a Log Scope. |
+
+> TIP: If your data doesn't follow the OpenTelemetry Log Schema and your backend or destination expects logs in an OpenTelemetry schema, take a look at the OpenTelemetry Envelope processor, which you can use in conjunction with this processor to transform your data to be compatible with the OpenTelemetry Log schema.
+
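+For example, a minimal sketch that combines the OpenTelemetry Envelope processor with
+this processor to upsert a resource attribute. The attribute key and value are
+illustrative:
+
+```yaml
+pipeline:
+  inputs:
+    - name: dummy
+      dummy: '{"message": "Hello World"}'
+      processors:
+        logs:
+          - name: opentelemetry_envelope
+
+          - name: content_modifier
+            context: otel_resource_attributes
+            action: upsert
+            key: "service.name"
+            value: "my-service"
+
+  outputs:
+    - name: stdout
+      match: '*'
+```
+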
## Configuration Parameters
| Key | Description |
| :---------- | :--- |
-| action | Define the operation to run on the target content. This field is mandatory; for more details about the actions available, check the table below. |
-| context | Specify which component of the Telemetry type will be affected. When processing Logs the following contexts are available: `attributes` or `body`. When processing Traces the following contexts are available: `span_name`, `span_kind`, `span_status`, `span_attributes`. |
+| context | Specify the context where the modifications will happen (more details above). The following contexts are available: `attributes`, `body`, `span_name`, `span_kind`, `span_status`, `span_attributes`, `otel_resource_attributes`, `otel_scope_name`, `otel_scope_version`, `otel_scope_attributes`. |
| key | Specify the name of the key that will be used to apply the modification. |
| value | Based on the action type, `value` might be required and might represent different things. Check the detailed information for the specific actions. |
| pattern | Defines a regular expression pattern. This property is only used by the `extract` action. |
@@ -23,13 +51,13 @@ The actions specify the type of operation to run on top of a specific key or con
| Action | Description |
| ------- | ------------------------------------------------------------ |
-| insert | Insert a new key with a value into the target context. The `key` and `value` parameters are required. |
-| upsert | Given a specific key with a value, the `upsert` operation will try to update the value of the key. If the key does not exist, the key will be created. The `key` and `value` parameters are required. |
-| delete | Delete a key from the target context. The `key` parameter is required. |
-| rename | Change the name of a key. The `value` set in the configuration will represent the new name. The `key` and `value` parameters are required. |
-| hash | Replace the key value with a hash generated by the SHA-256 algorithm, the binary value generated is finally set as an hex string representation. The `key` parameter is required. |
-| extract | Allows to extact the value of a single key as a list of key/value pairs. This action needs the configuration of a regular expression in the `pattern` property . The `key` and `pattern` parameters are required. For more details check the examples below. |
-| convert | Convert the data type of a key value. The `key` and `converted_type` parameters are required. |
+| `insert` | Insert a new key with a value into the target context. The `key` and `value` parameters are required. |
+| `upsert` | Given a specific key with a value, the `upsert` operation will try to update the value of the key. If the key does not exist, the key will be created. The `key` and `value` parameters are required. |
+| `delete` | Delete a key from the target context. The `key` parameter is required. |
+| `rename` | Change the name of a key. The `value` set in the configuration will represent the new name. The `key` and `value` parameters are required. |
+| `hash` | Replace the key value with a hash generated by the SHA-256 algorithm. The generated binary value is set as a hex string representation. The `key` parameter is required. |
+| `extract` | Extracts the value of a single key as a list of key/value pairs. This action requires a regular expression to be configured in the `pattern` property. The `key` and `pattern` parameters are required. For more details, check the examples below. |
+| `convert` | Convert the data type of a key value. The `key` and `converted_type` parameters are required. |
#### Insert example
@@ -74,7 +102,7 @@ pipeline:
action: upsert
key: "key2"
value: "example"
-
+
outputs:
- name : stdout
match: '*'
@@ -97,8 +125,8 @@ pipeline:
logs:
- name: content_modifier
action: delete
- key: "key2"
-
+ key: "key2"
+
outputs:
- name : stdout
match: '*'
@@ -168,7 +196,7 @@ pipeline:
action: extract
key: "http.url"
pattern: ^(?https?):\/\/(?[^\/\?]+)(?\/[^?]*)?(?:\?(?.*))?
-
+
outputs:
- name : stdout
match: '*'
@@ -198,7 +226,7 @@ pipeline:
action: convert
key: key2
converted_type: boolean
-
+
outputs:
- name : stdout
match: '*'
diff --git a/pipeline/processors/labels.md b/pipeline/processors/labels.md
new file mode 100644
index 000000000..cc1c663a0
--- /dev/null
+++ b/pipeline/processors/labels.md
@@ -0,0 +1,111 @@
+# Labels
+
+
+The **labels** processor lets you manipulate the labels of metrics.
+
+Similar to filters, this processor presents an enriching/modifying mechanism to
+perform operations for labels manipulation. The most significant difference is
+that processors perform better than filters, and when chaining them there are no
+encoding or decoding performance penalties.
+
+{% hint style="info" %}
+**Note:** Both processors and this specific component can be enabled only by using
+the YAML configuration format. Classic mode configuration format doesn't support
+processors.
+{% endhint %}
+
+## Configuration Parameters
+
+| Key | Description |
+| :----- | :---------- |
+| update | Update an existing key with a value in the labels of metrics. The key/value pair is required. If the specified key doesn't exist, the operation silently fails and has no effect. |
+| insert | Insert a new key with a value into the labels of metrics. The key/value pair is required. |
+| upsert | Upsert a specific key with a value. The `upsert` operation will try to update the value of the key; if the key doesn't exist, the key will be created. The key/value pair is required. |
+| delete | Delete a key from the labels of metrics. The key/value pair is required. If the specified key doesn't exist, the operation silently fails and has no effect. |
+| hash | Replace the key value with a hash generated by the SHA-256 algorithm from the specified label name. The generated binary value is set as a hex string. |
+
+#### Update example
+
+Change the value of the `name` to `fluentbit`:
+
+```yaml
+pipeline:
+ inputs:
+ - name: fluentbit_metrics
+ processors:
+ metrics:
+ - name: labels
+ update: name fluentbit
+ outputs:
+ - name : stdout
+ match: '*'
+```
+
+#### Insert example
+
+The following example appends the key `agent` with the value `fluentbit` as the label
+of metrics:
+
+```yaml
+pipeline:
+ inputs:
+ - name: fluentbit_metrics
+ processors:
+ metrics:
+ - name: labels
+ insert: agent fluentbit
+ outputs:
+ - name : stdout
+ match: '*'
+```
+
+#### Upsert example
+
+Upsert the key `name` with the value `fluentbit`:
+
+```yaml
+pipeline:
+ inputs:
+ - name: fluentbit_metrics
+ processors:
+ metrics:
+ - name: labels
+ upsert: name fluentbit
+ outputs:
+ - name : stdout
+ match: '*'
+```
+
+#### Delete example
+
+Delete the `name` key from the labels of metrics:
+
+```yaml
+pipeline:
+ inputs:
+ - name: fluentbit_metrics
+ processors:
+ metrics:
+ - name: labels
+ delete: name
+ outputs:
+ - name : stdout
+ match: '*'
+```
+
+#### Hash example
+
+Apply the SHA-256 algorithm to the value of the key `hostname`:
+
+```yaml
+pipeline:
+ inputs:
+ - name: fluentbit_metrics
+ processors:
+ metrics:
+ - name: labels
+ hash: hostname
+ outputs:
+ - name : stdout
+ match: '*'
+```
diff --git a/pipeline/processors/metrics-selector.md b/pipeline/processors/metrics-selector.md
index 962577ad3..262075f9a 100644
--- a/pipeline/processors/metrics-selector.md
+++ b/pipeline/processors/metrics-selector.md
@@ -2,6 +2,8 @@
The **metric_selector** processor allows you to select metrics to include or exclude (similar to the `grep` filter for logs).
+
+
## Configuration Parameters
The native processor plugin supports the following configuration parameters:
@@ -9,9 +11,10 @@ The native processor plugin supports the following configuration parameters:
| Key | Description | Default |
| :---------- | :--- | :--- |
| Metric\_Name | Keep metrics in which the metric of name matches with the actual name or the regular expression. | |
-| Context | Specify matching context. Currently, metric_name is only supported. | `Metrics_Name` |
+| Context | Specify matching context. Currently, only metric\_name and delete\_label\_value are supported. | `Metrics_Name` |
| Action | Specify the action for specified metrics. INCLUDE and EXCLUDE are allowed. | |
| Operation\_Type | Specify the operation type of action for metrics payloads. PREFIX and SUBSTRING are allowed. | |
+| Label | Specify a label key and value pair. | |
## Configuration Examples
@@ -44,6 +47,35 @@ pipeline:
delete: name
+ outputs:
+ - name: stdout
+ match: '*'
+```
+{% endtab %}
+
+{% tab title="context-delete\_label\_value.yaml" %}
+```yaml
+service:
+ flush: 5
+ daemon: off
+ log_level: info
+
+pipeline:
+ inputs:
+ - name: fluentbit_metrics
+ tag: fluentbit.metrics
+ scrape_interval: 10
+
+ processors:
+ metrics:
+ - name: metrics_selector
+ context: delete_label_value
+ label: name stdout.0
+
+ - name: labels
+ delete: name
+
+
outputs:
- name: stdout
match: '*'
diff --git a/pipeline/processors/opentelemetry-envelope.md b/pipeline/processors/opentelemetry-envelope.md
new file mode 100644
index 000000000..f9df45a3c
--- /dev/null
+++ b/pipeline/processors/opentelemetry-envelope.md
@@ -0,0 +1,165 @@
+# OpenTelemetry Envelope
+
+The _OpenTelemetry Envelope_ processor transforms your data to make it compatible with the OpenTelemetry Log schema. Use it when your data was **not** generated by the [OpenTelemetry input](../inputs/opentelemetry.md) but your backend or log destination expects data in the OpenTelemetry schema.
+
+![](/imgs/processor_opentelemetry_envelope.png)
+
+## Configuration Parameters
+
+The processor doesn't provide any extra configuration parameters. It can be used directly in your _processors_ YAML directive.
+
+## Usage Example
+
+In this example, the Dummy input plugin generates a sample message every second. Right after each message is created, the `opentelemetry_envelope` processor transforms the data to be compatible with the OpenTelemetry Log schema. The output is sent to the standard output and also to an OpenTelemetry collector that is receiving data on port 4318.
+
+
+__fluent-bit.yaml__
+
+```yaml
+service:
+ flush: 1
+ log_level: info
+
+pipeline:
+ inputs:
+ - name: dummy
+ dummy: '{"message": "Hello World"}'
+
+ processors:
+ logs:
+ - name: opentelemetry_envelope
+
+ outputs:
+ - name: stdout
+ match: '*'
+
+ - name: opentelemetry
+ match: '*'
+ host: 127.0.0.1
+ port: 4318
+```
+
+__otel-collector.yaml__
+
+```yaml
+receivers:
+ otlp:
+ protocols:
+ http:
+ endpoint: 127.0.0.1:4318
+
+exporters:
+ file:
+ path: out.json
+ logging:
+ loglevel: info
+
+service:
+ telemetry:
+ logs:
+ level: debug
+ pipelines:
+ logs:
+ receivers: [otlp]
+ exporters: [file, logging]
+```
+
+The standard output of Fluent Bit prints the raw representation of the schema, while the OpenTelemetry collector receives the data in the OpenTelemetry Log schema.
+
+Inspecting the output file `out.json`, you can see the data in the OpenTelemetry Log schema:
+
+
+```json
+{
+ "resourceLogs": [
+ {
+ "resource": {},
+ "scopeLogs": [
+ {
+ "scope": {},
+ "logRecords": [
+ {
+ "timeUnixNano": "1722904188085758000",
+ "body": {
+ "stringValue": "dummy"
+ },
+ "traceId": "",
+ "spanId": ""
+ }
+ ]
+ }
+ ]
+ }
+ ]
+}
+```
+
+While the OpenTelemetry Envelope processor enriches your logs with the schema, you might want to take a step further and use the [Content Modifier](../processors/content-modifier.md) processor to modify the content of your logs. Here is a quick example that adds a resource attribute to your logs:
+
+```yaml
+service:
+ flush: 1
+ log_level: info
+
+pipeline:
+ inputs:
+ - name: dummy
+ dummy: '{"message": "Hello World"}'
+
+ processors:
+ logs:
+ - name: opentelemetry_envelope
+
+ - name: content_modifier
+ context: otel_resource_attributes
+ action: upsert
+ key: service.name
+ value: my-service
+
+ outputs:
+ - name: stdout
+ match: '*'
+
+ - name: opentelemetry
+ match: '*'
+ host: 127.0.0.1
+ port: 4318
+```
+
+The collector JSON output will look like this:
+
+```json
+{
+ "resourceLogs": [
+ {
+ "resource": {
+ "attributes": [
+ {
+ "key": "service.name",
+ "value": {
+ "stringValue": "my-service"
+ }
+ }
+ ]
+ },
+ "scopeLogs": [
+ {
+ "scope": {},
+ "logRecords": [
+ {
+ "timeUnixNano": "1722904465173450000",
+ "body": {
+ "stringValue": "Hello World"
+ },
+ "traceId": "",
+ "spanId": ""
+ }
+ ]
+ }
+ ]
+ }
+ ]
+}
+```
+
+For more details about further processing, read the [Content Modifier](../processors/content-modifier.md) processor documentation.
diff --git a/pipeline/processors/sql.md b/pipeline/processors/sql.md
index d1b06d461..47482f80a 100644
--- a/pipeline/processors/sql.md
+++ b/pipeline/processors/sql.md
@@ -2,6 +2,8 @@
The **sql** processor provides a simple interface to select content from Logs by also supporting conditional expressions.
+
+
The SQL processor doesn't depend on a database or indexing; it runs everything on the fly. There is no concept of tables: the query runs against the STREAM.
This processor differs from the "stream processor interface" that runs after the filters; it can only be used in the `processors` section of input plugins when using the YAML configuration mode.
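+As a minimal sketch of what this looks like in practice, a query can be attached to the `processors` section of an input. The `dummy` input, the record keys, and the `query` string below are illustrative assumptions rather than examples taken from a specific release:
+
+```yaml
+pipeline:
+  inputs:
+    - name: dummy
+      dummy: '{"message": "Hello World", "level": "info"}'
+      processors:
+        logs:
+          - name: sql
+            query: "SELECT message FROM STREAM WHERE level = 'info';"
+  outputs:
+    - name: stdout
+      match: '*'
+```
+
+Under this sketch, only records whose `level` is `info` continue through the pipeline, and only their `message` key is kept.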
diff --git a/vale-styles/FluentBit/AMPM.yml b/vale-styles/FluentBit/AMPM.yml
new file mode 100644
index 000000000..5b696b3b8
--- /dev/null
+++ b/vale-styles/FluentBit/AMPM.yml
@@ -0,0 +1,10 @@
+
+extends: existence
+message: "Use 'AM' or 'PM' (preceded by a space)."
+link: 'https://developers.google.com/style/word-list'
+level: suggestion
+nonword: true
+tokens:
+ - '\d{1,2}[AP]M'
+ - '\d{1,2} ?[ap]m'
+ - '\d{1,2} ?[aApP]\.[mM]\.'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Acronyms.yml b/vale-styles/FluentBit/Acronyms.yml
new file mode 100644
index 000000000..19936b25c
--- /dev/null
+++ b/vale-styles/FluentBit/Acronyms.yml
@@ -0,0 +1,95 @@
+extends: conditional
+message: "Spell out '%s', if it's unfamiliar to the audience."
+link: 'https://developers.google.com/style/abbreviations'
+level: suggestion
+ignorecase: false
+# Ensures that the existence of 'first' implies the existence of 'second'.
+first: '\b([A-Z]{3,5})\b'
+second: '(?:\b[A-Z][a-z]+ )+\(([A-Z]{3,5})\)'
+# ... with the exception of these:
+exceptions:
+ - ACL
+ - API
+ - ARN
+ - ASC
+ - ASP
+ - AWS
+ - CIDR
+ - CLI
+ - CPU
+ - CRD
+ - CSS
+ - CSV
+ - DEBUG
+ - DESC
+ - DOM
+ - DNS
+ - DPI
+ - DPPS
+ - FAQ
+ - FIPS
+ - GCC
+ - GCP
+ - GDB
+ - GET
+ - GNU
+ - GPG
+ - GPU
+ - GTK
+ - GUI
+ - GZIP
+ - HPA
+ - IAM
+ - HTML
+ - HTTP
+ - HTTPS
+ - IDE
+ - JAR
+ - JSON
+ - JSX
+ - LESS
+ - LLDB
+ - LTS
+ - NET
+ - NOTE
+ - NVDA
+ - OSS
+ - PATH
+ - PEM
+ - PDF
+ - PHP
+ - POSIX
+ - POST
+ - RAM
+ - REPL
+ - REST
+ - RHEL
+ - RPC
+ - RSA
+ - SASL
+ - SCM
+ - SCSS
+ - SDK
+ - SIEM
+ - SLA
+ - SQL
+ - SSH
+ - SSL
+ - SSO
+ - SVG
+ - TBD
+ - TCP
+ - TLS
+ - TRE
+ - TODO
+ - UDP
+ - URI
+ - URL
+ - USB
+ - UTC
+ - UTF
+ - UUID
+ - XML
+ - XSS
+ - YAML
+ - ZIP
diff --git a/vale-styles/FluentBit/AmSpelling.yml b/vale-styles/FluentBit/AmSpelling.yml
new file mode 100644
index 000000000..9a8ea2e30
--- /dev/null
+++ b/vale-styles/FluentBit/AmSpelling.yml
@@ -0,0 +1,8 @@
+extends: existence
+message: "In general, use American spelling instead of '%s'."
+link: 'https://developers.google.com/style/spelling'
+ignorecase: true
+level: suggestion
+tokens:
+ - '(?:\w+)nised?'
+ - '(?:\w+)logue'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Ampersand.yml b/vale-styles/FluentBit/Ampersand.yml
new file mode 100644
index 000000000..75117bc08
--- /dev/null
+++ b/vale-styles/FluentBit/Ampersand.yml
@@ -0,0 +1,9 @@
+---
+extends: existence
+message: "Don't use an ampersand in place of the word 'and'. Always write out 'and' unless the ampersand is part of a proper name."
+nonword: true
+ignorecase: false
+level: suggestion
+scope: sentence
+tokens:
+ - '[^\*{2}].*.&.*[^\*{2}]\n'
diff --git a/vale-styles/FluentBit/Colons.yml b/vale-styles/FluentBit/Colons.yml
new file mode 100644
index 000000000..aee9281c3
--- /dev/null
+++ b/vale-styles/FluentBit/Colons.yml
@@ -0,0 +1,8 @@
+extends: existence
+message: "'%s' should be in lowercase."
+link: 'https://developers.google.com/style/colons'
+nonword: true
+level: suggestion
+scope: sentence
+tokens:
+ - ':\s[A-Z]'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Contractions.yml b/vale-styles/FluentBit/Contractions.yml
new file mode 100644
index 000000000..525687ca6
--- /dev/null
+++ b/vale-styles/FluentBit/Contractions.yml
@@ -0,0 +1,30 @@
+extends: substitution
+message: "Feel free to use '%s' instead of '%s'."
+link: 'https://developers.google.com/style/contractions'
+level: suggestion
+ignorecase: true
+action:
+ name: replace
+swap:
+ are not: aren't
+ cannot: can't
+ could not: couldn't
+ did not: didn't
+ do not: don't
+ does not: doesn't
+ has not: hasn't
+ have not: haven't
+ how is: how's
+ is not: isn't
+ it is: it's
+ should not: shouldn't
+ that is: that's
+ they are: they're
+ was not: wasn't
+ we are: we're
+ we have: we've
+ were not: weren't
+ what is: what's
+ when is: when's
+ where is: where's
+ will not: won't
\ No newline at end of file
diff --git a/vale-styles/FluentBit/DateFormat.yml b/vale-styles/FluentBit/DateFormat.yml
new file mode 100644
index 000000000..c70b3b7bc
--- /dev/null
+++ b/vale-styles/FluentBit/DateFormat.yml
@@ -0,0 +1,9 @@
+extends: existence
+message: "Use 'July 31, 2016' format, not '%s'."
+link: 'https://developers.google.com/style/dates-times'
+ignorecase: true
+level: suggestion
+nonword: true
+tokens:
+ - '\d{1,2}(?:\.|/)\d{1,2}(?:\.|/)\d{4}'
+ - '\d{1,2} (?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)|May|Jun(?:e)|Jul(?:y)|Aug(?:ust)|Sep(?:tember)?|Oct(?:ober)|Nov(?:ember)?|Dec(?:ember)?) \d{4}'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Directional.yml b/vale-styles/FluentBit/Directional.yml
new file mode 100644
index 000000000..1d18a0f4d
--- /dev/null
+++ b/vale-styles/FluentBit/Directional.yml
@@ -0,0 +1,8 @@
+---
+extends: existence
+message: "Verify your use of '%s' with the Style Guide."
+level: suggestion
+ignorecase: true
+tokens:
+ - above
+ - below
diff --git a/vale-styles/FluentBit/DontUse.yml b/vale-styles/FluentBit/DontUse.yml
new file mode 100644
index 000000000..8308555d7
--- /dev/null
+++ b/vale-styles/FluentBit/DontUse.yml
@@ -0,0 +1,18 @@
+---
+extends: existence
+message: "We don't use '%s'."
+ignorecase: true
+level: suggestion
+tokens:
+ - a.k.a.
+ - aka
+ - and/or
+ - at this point
+ - desire
+ - it is recommended that
+ - just
+ - note that
+ - please
+ - quite
+ - such that
+ - thus
diff --git a/vale-styles/FluentBit/Drilldown.yml b/vale-styles/FluentBit/Drilldown.yml
new file mode 100644
index 000000000..ca7cb2ed7
--- /dev/null
+++ b/vale-styles/FluentBit/Drilldown.yml
@@ -0,0 +1,8 @@
+---
+extends: sequence
+message: "Use drilldown as an adjective or noun."
+level: suggestion
+ignorecase: true
+tokens:
+ - tag: NN|JJ
+ pattern: '(?:drill down|drill-down)'
diff --git a/vale-styles/FluentBit/DrilldownVerb.yml b/vale-styles/FluentBit/DrilldownVerb.yml
new file mode 100644
index 000000000..3dd6ca047
--- /dev/null
+++ b/vale-styles/FluentBit/DrilldownVerb.yml
@@ -0,0 +1,8 @@
+---
+extends: sequence
+message: "Use drill down as a verb."
+level: suggestion
+ignorecase: true
+tokens:
+ - tag: VB|VBD|VBG|VBN|VBP|VBZ
+ pattern: '(?:drilldown|drill-down)'
diff --git a/vale-styles/FluentBit/Ellipses.yml b/vale-styles/FluentBit/Ellipses.yml
new file mode 100644
index 000000000..6b93a0c44
--- /dev/null
+++ b/vale-styles/FluentBit/Ellipses.yml
@@ -0,0 +1,9 @@
+extends: existence
+message: "In general, don't use an ellipsis."
+link: 'https://developers.google.com/style/ellipses'
+nonword: true
+level: suggestion
+action:
+ name: remove
+tokens:
+ - '\.\.\.'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/EmDash.yml b/vale-styles/FluentBit/EmDash.yml
new file mode 100644
index 000000000..c0de45e89
--- /dev/null
+++ b/vale-styles/FluentBit/EmDash.yml
@@ -0,0 +1,12 @@
+extends: existence
+message: "Don't put a space before or after a dash."
+link: 'https://developers.google.com/style/dashes'
+nonword: true
+level: suggestion
+action:
+ name: edit
+ params:
+ - remove
+ - ' '
+tokens:
+ - '\s[—–]\s'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/EnDash.yml b/vale-styles/FluentBit/EnDash.yml
new file mode 100644
index 000000000..e69e4bcbb
--- /dev/null
+++ b/vale-styles/FluentBit/EnDash.yml
@@ -0,0 +1,13 @@
+extends: existence
+message: "Use an em dash ('—') instead of '–'."
+link: 'https://developers.google.com/style/dashes'
+nonword: true
+level: suggestion
+action:
+ name: edit
+ params:
+ - replace
+ - '-'
+ - '—'
+tokens:
+ - '–'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Exclamation.yml b/vale-styles/FluentBit/Exclamation.yml
new file mode 100644
index 000000000..b77798361
--- /dev/null
+++ b/vale-styles/FluentBit/Exclamation.yml
@@ -0,0 +1,7 @@
+extends: existence
+message: "Don't use exclamation points in text."
+link: 'https://developers.google.com/style/exclamation-points'
+nonword: true
+level: suggestion
+tokens:
+ - '\w!(?:\s|$)'
diff --git a/vale-styles/FluentBit/FirstPerson.yml b/vale-styles/FluentBit/FirstPerson.yml
new file mode 100644
index 000000000..e8793eddf
--- /dev/null
+++ b/vale-styles/FluentBit/FirstPerson.yml
@@ -0,0 +1,13 @@
+extends: existence
+message: "Avoid first-person pronouns such as '%s'."
+link: 'https://developers.google.com/style/pronouns#personal-pronouns'
+ignorecase: true
+level: suggestion
+nonword: true
+tokens:
+ - (?:^|\s)I\s
+ - (?:^|\s)I,\s
+ - \bI'm\b
+ - \bme\b
+ - \bmy\b
+ - \bmine\b
\ No newline at end of file
diff --git a/vale-styles/FluentBit/FutureTense.yml b/vale-styles/FluentBit/FutureTense.yml
new file mode 100644
index 000000000..17d1bd6a6
--- /dev/null
+++ b/vale-styles/FluentBit/FutureTense.yml
@@ -0,0 +1,10 @@
+---
+extends: existence
+message: "'%s' might be in future tense. Strive for active voice and present tense in your documentation."
+ignorecase: true
+level: suggestion
+raw:
+ - "(going to( |\n|[[:punct:]])[a-zA-Z]*|"
+ - "will( |\n|[[:punct:]])[a-zA-Z]*|"
+ - "won't( |\n|[[:punct:]])[a-zA-Z]*|"
+ - "[a-zA-Z]*'ll( |\n|[[:punct:]])[a-zA-Z]*)"
diff --git a/vale-styles/FluentBit/Gender.yml b/vale-styles/FluentBit/Gender.yml
new file mode 100644
index 000000000..9b5689ebd
--- /dev/null
+++ b/vale-styles/FluentBit/Gender.yml
@@ -0,0 +1,9 @@
+extends: existence
+message: "Don't use '%s' as a gender-neutral pronoun."
+link: 'https://developers.google.com/style/pronouns#gender-neutral-pronouns'
+level: suggestion
+ignorecase: true
+tokens:
+ - he/she
+ - s/he
+ - \(s\)he
\ No newline at end of file
diff --git a/vale-styles/FluentBit/GenderBias.yml b/vale-styles/FluentBit/GenderBias.yml
new file mode 100644
index 000000000..3a3b6985e
--- /dev/null
+++ b/vale-styles/FluentBit/GenderBias.yml
@@ -0,0 +1,45 @@
+extends: substitution
+message: "Consider using '%s' instead of '%s'."
+link: 'https://developers.google.com/style/inclusive-documentation'
+ignorecase: true
+level: suggestion
+swap:
+ (?:alumna|alumnus): graduate
+ (?:alumnae|alumni): graduates
+ air(?:m[ae]n|wom[ae]n): pilot(s)
+ anchor(?:m[ae]n|wom[ae]n): anchor(s)
+ authoress: author
+ camera(?:m[ae]n|wom[ae]n): camera operator(s)
+ chair(?:m[ae]n|wom[ae]n): chair(s)
+ congress(?:m[ae]n|wom[ae]n): member(s) of congress
+ door(?:m[ae]|wom[ae]n): concierge(s)
+ draft(?:m[ae]n|wom[ae]n): drafter(s)
+ fire(?:m[ae]n|wom[ae]n): firefighter(s)
+ fisher(?:m[ae]n|wom[ae]n): fisher(s)
+ fresh(?:m[ae]n|wom[ae]n): first-year student(s)
+ garbage(?:m[ae]n|wom[ae]n): waste collector(s)
+ lady lawyer: lawyer
+ ladylike: courteous
+ landlord: building manager
+ mail(?:m[ae]n|wom[ae]n): mail carriers
+ man and wife: husband and wife
+ man enough: strong enough
+ mankind: humankind
+ manmade: manufactured
+ manpower: personnel
+ men and girls: men and women
+ middle(?:m[ae]n|wom[ae]n): intermediary
+ news(?:m[ae]n|wom[ae]n): journalist(s)
+ ombuds(?:man|woman): ombuds
+ oneupmanship: upstaging
+ poetess: poet
+ police(?:m[ae]n|wom[ae]n): police officer(s)
+ repair(?:m[ae]n|wom[ae]n): technician(s)
+ sales(?:m[ae]n|wom[ae]n): salesperson or sales people
+ service(?:m[ae]n|wom[ae]n): soldier(s)
+ steward(?:ess)?: flight attendant
+ tribes(?:m[ae]n|wom[ae]n): tribe member(s)
+ waitress: waiter
+ woman doctor: doctor
+ woman scientist[s]?: scientist(s)
+ work(?:m[ae]n|wom[ae]n): worker(s)
\ No newline at end of file
diff --git a/vale-styles/FluentBit/HeadingPunctuation.yml b/vale-styles/FluentBit/HeadingPunctuation.yml
new file mode 100644
index 000000000..659260b27
--- /dev/null
+++ b/vale-styles/FluentBit/HeadingPunctuation.yml
@@ -0,0 +1,13 @@
+extends: existence
+message: "Don't put a period at the end of a heading."
+link: 'https://developers.google.com/style/capitalization#capitalization-in-titles-and-headings'
+nonword: true
+level: suggestion
+scope: heading
+action:
+ name: edit
+ params:
+ - remove
+ - '.'
+tokens:
+ - '[a-z0-9][.]\s*$'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Headings.yml b/vale-styles/FluentBit/Headings.yml
new file mode 100644
index 000000000..2be26b3b3
--- /dev/null
+++ b/vale-styles/FluentBit/Headings.yml
@@ -0,0 +1,73 @@
+extends: capitalization
+message: "'%s' should use sentence-style capitalization."
+link: 'https://developers.google.com/style/capitalization#capitalization-in-titles-and-headings'
+level: suggestion
+scope: heading
+match: $sentence
+indicators:
+ - ':'
+exceptions:
+ - Amazon
+ - Amazon CloudWatch
+ - Amazon Kinesis Firehose
+ - Amazon Kinesis Streams
+ - API
+ - APIs
+ - Azure
+ - BuildKite
+ - CircleCI
+ - CLI
+ - CloudWatch
+ - Code
+ - Collector
+ - Cosmos
+ - Crowdstrike
+ - cURL
+ - Datadog
+ - Docker
+ - DogStatsD
+ - Elastic Cloud
+ - Emmet
+ - EventBridge
+ - Fluent Bit
+ - GCP
+ - GitLab
+ - GitHub
+ - Google
+ - Google Cloud
+ - Google Cloud Platform
+ - Grafana
+ - gRPC
+ - I
+ - InfluxDB
+ - Kinesis
+ - Kubernetes
+ - LaunchDarkly
+ - Linux
+ - macOS
+ - Marketplace
+ - MongoDB
+ - New Relic
+ - Observability Platform
+ - Okta
+ - OpenMetrics
+ - OpenTelemetry
+ - Opsgenie
+ - PagerDuty
+ - Prometheus
+ - PromQL
+ - REPL
+ - ServiceMonitor
+ - SignalFx
+ - Slack
+ - StatsD
+ - Studio
+ - Tanzu
+ - Telemetry Pipeline
+ - Terraform
+ - TypeScript
+ - URLs
+ - VictorOps
+ - Visual
+ - VS
+ - Windows
diff --git a/vale-styles/FluentBit/Latin.yml b/vale-styles/FluentBit/Latin.yml
new file mode 100644
index 000000000..ca6f3abb7
--- /dev/null
+++ b/vale-styles/FluentBit/Latin.yml
@@ -0,0 +1,17 @@
+extends: substitution
+message: "Use '%s' instead of '%s'."
+link: 'https://developers.google.com/style/abbreviations'
+ignorecase: true
+level: suggestion
+nonword: true
+action:
+ name: replace
+swap:
+ '\b(?:eg|e\.g\.)[\s,]': for example
+ '\b(?:ie|i\.e\.)[\s,]': that is
+ 'ad-hoc': if needed
+ '[\s]et al[\s]': and others
+ '[\s]etc[\s|.]': and so on
+ '[\s]via[\s]': through or by using
+ 'vice versa': and the reverse
+ '[\s]vs[\s|.]': versus
diff --git a/vale-styles/FluentBit/LyHyphens.yml b/vale-styles/FluentBit/LyHyphens.yml
new file mode 100644
index 000000000..97e092a33
--- /dev/null
+++ b/vale-styles/FluentBit/LyHyphens.yml
@@ -0,0 +1,14 @@
+extends: existence
+message: "'%s' doesn't need a hyphen."
+link: 'https://developers.google.com/style/hyphens'
+level: suggestion
+ignorecase: false
+nonword: true
+action:
+ name: edit
+ params:
+ - replace
+ - '-'
+ - ' '
+tokens:
+ - '\s[^\s-]+ly-'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/MayMightCan.yml b/vale-styles/FluentBit/MayMightCan.yml
new file mode 100644
index 000000000..56ce95f06
--- /dev/null
+++ b/vale-styles/FluentBit/MayMightCan.yml
@@ -0,0 +1,7 @@
+---
+extends: existence
+message: "Use 'can' for permissions or 'might' for possibility."
+level: suggestion
+ignorecase: true
+tokens:
+ - may
diff --git a/vale-styles/FluentBit/NonStandardQuotes.yml b/vale-styles/FluentBit/NonStandardQuotes.yml
new file mode 100644
index 000000000..40feaafb3
--- /dev/null
+++ b/vale-styles/FluentBit/NonStandardQuotes.yml
@@ -0,0 +1,8 @@
+---
+extends: existence
+message: 'Use standard single quotes or double quotes only. Do not use left or right quotes.'
+level: suggestion
+ignorecase: true
+scope: raw
+raw:
+ - '[‘’“”]'
diff --git a/vale-styles/FluentBit/OptionalPlurals.yml b/vale-styles/FluentBit/OptionalPlurals.yml
new file mode 100644
index 000000000..50bf2c247
--- /dev/null
+++ b/vale-styles/FluentBit/OptionalPlurals.yml
@@ -0,0 +1,12 @@
+extends: existence
+message: "Don't use plurals in parentheses such as in '%s'."
+link: 'https://developers.google.com/style/plurals-parentheses'
+level: suggestion
+nonword: true
+action:
+ name: edit
+ params:
+ - remove
+ - '(s)'
+tokens:
+ - '\b\w+\(s\)'
diff --git a/vale-styles/FluentBit/Ordinal.yml b/vale-styles/FluentBit/Ordinal.yml
new file mode 100644
index 000000000..cd836d5a5
--- /dev/null
+++ b/vale-styles/FluentBit/Ordinal.yml
@@ -0,0 +1,7 @@
+extends: existence
+message: "Spell out all ordinal numbers ('%s') in text."
+link: 'https://developers.google.com/style/numbers'
+level: suggestion
+nonword: true
+tokens:
+ - \d+(?:st|nd|rd|th)
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Passive.yml b/vale-styles/FluentBit/Passive.yml
new file mode 100644
index 000000000..b52c01204
--- /dev/null
+++ b/vale-styles/FluentBit/Passive.yml
@@ -0,0 +1,184 @@
+extends: existence
+link: 'https://developers.google.com/style/voice'
+message: "In general, use active voice instead of passive voice ('%s')."
+ignorecase: true
+level: suggestion
+raw:
+ - \b(am|are|were|being|is|been|was|be)\b\s*
+tokens:
+ - '[\w]+ed'
+ - awoken
+ - beat
+ - become
+ - been
+ - begun
+ - bent
+ - beset
+ - bet
+ - bid
+ - bidden
+ - bitten
+ - bled
+ - blown
+ - born
+ - bought
+ - bound
+ - bred
+ - broadcast
+ - broken
+ - brought
+ - built
+ - burnt
+ - burst
+ - cast
+ - caught
+ - chosen
+ - clung
+ - come
+ - cost
+ - crept
+ - cut
+ - dealt
+ - dived
+ - done
+ - drawn
+ - dreamt
+ - driven
+ - drunk
+ - dug
+ - eaten
+ - fallen
+ - fed
+ - felt
+ - fit
+ - fled
+ - flown
+ - flung
+ - forbidden
+ - foregone
+ - forgiven
+ - forgotten
+ - forsaken
+ - fought
+ - found
+ - frozen
+ - given
+ - gone
+ - gotten
+ - ground
+ - grown
+ - heard
+ - held
+ - hidden
+ - hit
+ - hung
+ - hurt
+ - kept
+ - knelt
+ - knit
+ - known
+ - laid
+ - lain
+ - leapt
+ - learnt
+ - led
+ - left
+ - lent
+ - let
+ - lighted
+ - lost
+ - made
+ - meant
+ - met
+ - misspelt
+ - mistaken
+ - mown
+ - overcome
+ - overdone
+ - overtaken
+ - overthrown
+ - paid
+ - pled
+ - proven
+ - put
+ - quit
+ - read
+ - rid
+ - ridden
+ - risen
+ - run
+ - rung
+ - said
+ - sat
+ - sawn
+ - seen
+ - sent
+ - set
+ - sewn
+ - shaken
+ - shaven
+ - shed
+ - shod
+ - shone
+ - shorn
+ - shot
+ - shown
+ - shrunk
+ - shut
+ - slain
+ - slept
+ - slid
+ - slit
+ - slung
+ - smitten
+ - sold
+ - sought
+ - sown
+ - sped
+ - spent
+ - spilt
+ - spit
+ - split
+ - spoken
+ - spread
+ - sprung
+ - spun
+ - stolen
+ - stood
+ - stridden
+ - striven
+ - struck
+ - strung
+ - stuck
+ - stung
+ - stunk
+ - sung
+ - sunk
+ - swept
+ - swollen
+ - sworn
+ - swum
+ - swung
+ - taken
+ - taught
+ - thought
+ - thrived
+ - thrown
+ - thrust
+ - told
+ - torn
+ - trodden
+ - understood
+ - upheld
+ - upset
+ - wed
+ - wept
+ - withheld
+ - withstood
+ - woken
+ - won
+ - worn
+ - wound
+ - woven
+ - written
+ - wrung
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Periods.yml b/vale-styles/FluentBit/Periods.yml
new file mode 100644
index 000000000..0333c3160
--- /dev/null
+++ b/vale-styles/FluentBit/Periods.yml
@@ -0,0 +1,7 @@
+extends: existence
+message: "Don't use periods with acronyms or initialisms such as '%s'."
+link: 'https://developers.google.com/style/abbreviations'
+level: suggestion
+nonword: true
+tokens:
+ - '\b(?:[A-Z]\.){3,}'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Possessives.yml b/vale-styles/FluentBit/Possessives.yml
new file mode 100644
index 000000000..1d0a74e0c
--- /dev/null
+++ b/vale-styles/FluentBit/Possessives.yml
@@ -0,0 +1,7 @@
+---
+extends: existence
+message: "Rewrite '%s' to not use 's."
+level: suggestion
+ignorecase: true
+tokens:
+ - Bit's
diff --git a/vale-styles/FluentBit/Quotes.yml b/vale-styles/FluentBit/Quotes.yml
new file mode 100644
index 000000000..f9c927459
--- /dev/null
+++ b/vale-styles/FluentBit/Quotes.yml
@@ -0,0 +1,7 @@
+extends: existence
+message: "Commas and periods go inside quotation marks."
+link: 'https://developers.google.com/style/quotation-marks'
+level: suggestion
+nonword: true
+tokens:
+ - '"[^"]+"[.,?]'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Ranges.yml b/vale-styles/FluentBit/Ranges.yml
new file mode 100644
index 000000000..78af6f999
--- /dev/null
+++ b/vale-styles/FluentBit/Ranges.yml
@@ -0,0 +1,7 @@
+extends: existence
+message: "Don't add words such as 'from' or 'between' to describe a range of numbers."
+link: 'https://developers.google.com/style/hyphens'
+nonword: true
+level: suggestion
+tokens:
+ - '(?:from|between)\s\d+\s?-\s?\d+'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Repetition.yml b/vale-styles/FluentBit/Repetition.yml
new file mode 100644
index 000000000..a5158b8b9
--- /dev/null
+++ b/vale-styles/FluentBit/Repetition.yml
@@ -0,0 +1,7 @@
+---
+extends: repetition
+message: '"%s" is repeated.'
+level: suggestion
+alpha: true
+tokens:
+ - '[^\s]+'
diff --git a/vale-styles/FluentBit/SentenceLengthLong.yml b/vale-styles/FluentBit/SentenceLengthLong.yml
new file mode 100644
index 000000000..556580b10
--- /dev/null
+++ b/vale-styles/FluentBit/SentenceLengthLong.yml
@@ -0,0 +1,7 @@
+---
+extends: occurrence
+message: "Improve readability by using fewer than 35 words in this sentence."
+scope: sentence
+level: suggestion
+max: 35
+token: \b(\w+)\b
diff --git a/vale-styles/FluentBit/Simplicity.yml b/vale-styles/FluentBit/Simplicity.yml
new file mode 100644
index 000000000..e9b779763
--- /dev/null
+++ b/vale-styles/FluentBit/Simplicity.yml
@@ -0,0 +1,12 @@
+---
+extends: existence
+message: 'Avoid words like "%s" that imply ease of use, because the user may find this action difficult.'
+level: suggestion
+ignorecase: true
+tokens:
+ - easy
+ - easily
+ - handy
+ - simple
+ - simply
+ - useful
diff --git a/vale-styles/FluentBit/Slang.yml b/vale-styles/FluentBit/Slang.yml
new file mode 100644
index 000000000..b43eeb299
--- /dev/null
+++ b/vale-styles/FluentBit/Slang.yml
@@ -0,0 +1,11 @@
+extends: existence
+message: "Don't use internet slang abbreviations such as '%s'."
+link: 'https://developers.google.com/style/abbreviations'
+ignorecase: true
+level: suggestion
+tokens:
+ - 'tl;dr'
+ - ymmv
+ - rtfm
+ - imo
+ - fwiw
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Spacing.yml b/vale-styles/FluentBit/Spacing.yml
new file mode 100644
index 000000000..57c52f046
--- /dev/null
+++ b/vale-styles/FluentBit/Spacing.yml
@@ -0,0 +1,8 @@
+extends: existence
+message: "'%s' should have one space."
+link: 'https://developers.google.com/style/sentence-spacing'
+level: suggestion
+nonword: true
+tokens:
+ - '[a-z][.?!] {2,}[A-Z]'
+ - '[a-z][.?!][A-Z]'
\ No newline at end of file
diff --git a/vale-styles/FluentBit/Spelling-exceptions.txt b/vale-styles/FluentBit/Spelling-exceptions.txt
new file mode 100644
index 000000000..fa3157ce2
--- /dev/null
+++ b/vale-styles/FluentBit/Spelling-exceptions.txt
@@ -0,0 +1,179 @@
+accessor
+Alertmanager
+allowlist
+API
+APIs
+Appname
+autoscale
+autoscaler
+autoscaling
+backoff
+Blackhole
+blocklist
+Buildkite
+cAdvisor
+Calyptia
+chronotf
+clickstreams
+CloudWatch
+Config
+Coralogix
+Crowdstrike
+CRDs
+DaemonSet
+Datadog
+Datagen
+datapoint
+datapoints
+Datastream
+declaratively
+deduplicate
+Deployer
+deprovision
+deprovisioned
+deprovisioning
+deprovisions
+Devo
+DogStatsD
+downsample
+downsampled
+downsamples
+downsampling
+downscale
+downscaling
+downscales
+dri
+Dynatrace
+Elasticsearch
+endcode
+endhint
+endtab
+endtabs
+Exabeam
+Fargate
+Firehose
+FluentBit
+Fluentd
+Golang
+golib
+Grafana
+Graphite
+Greylog
+grpc_code
+grpc_method
+grpc_service
+gzip
+HashiCorp
+hostname
+Hostname
+Ingester
+Keepalive
+Istio
+keepalive
+Kinesis
+kubectl
+kubelet
+Kubernetes
+Kusto
+labelset
+loadgenerator
+Logstash
+Lua
+Lucene
+macOS
+Mandiant
+matchers
+Minishift
+minikube
+MTTx
+namespace
+namespaces
+Nginx
+OAuth
+Okta
+Oniguruma
+OpenTelemetry
+Opsgenie
+OTel
+PagerDuty
+performant
+persistable
+Postgres
+PowerShell
+prepopulate
+Profiler
+Prometheus
+PromQL
+Protobuf
+proxying
+Pulumi
+Pushgateway
+quantile
+quantiles
+queryable
+Queryable
+rdkafka
+Redpanda
+rollup
+Rollup
+rollups
+Rollups
+routable
+runbook
+runbooks
+Scalyr
+SDKs
+SELinux
+serverless
+ServiceDiscovery
+ServiceMonitor
+ServiceMonitors
+sharding
+SignalFx
+Signup
+sparkline
+sparklines
+Sparklines
+Splunk
+Stackdriver
+StatsD
+stderr
+stdout
+strftime
+subcommand
+subcommands
+subquery
+subrecord
+substring
+syslog
+systemctl
+Systemd
+Tanzu
+Telegraf
+templated
+temporality
+Terraform
+Thanos
+Timeshift
+tolerations
+tooltip
+tooltips
+uber
+unaggregated
+unary
+Unary
+unmuted
+unsort
+UUIDs
+Vectra
+Vercel
+VictoriaMetrics
+VictorOps
+Vivo
+VMs
+Wavefront
+Worldmap
+Zipkin
+Zsh
+Zstandard
+zstd
diff --git a/vale-styles/FluentBit/Spelling.yml b/vale-styles/FluentBit/Spelling.yml
new file mode 100644
index 000000000..46cc6dc74
--- /dev/null
+++ b/vale-styles/FluentBit/Spelling.yml
@@ -0,0 +1,5 @@
+extends: spelling
+message: "Spelling check: '%s'?"
+level: suggestion
+ignore:
+ - FluentBit/Spelling-exceptions.txt
diff --git a/vale-styles/FluentBit/Subjunctive.yml b/vale-styles/FluentBit/Subjunctive.yml
new file mode 100644
index 000000000..f7087389c
--- /dev/null
+++ b/vale-styles/FluentBit/Subjunctive.yml
@@ -0,0 +1,14 @@
+---
+extends: existence
+message: "Use the indicative or imperative moods when writing docs."
+ignorecase: true
+level: suggestion
+tokens:
+ - should
+ - shouldn't
+ - should not
+ - won't
+ - would
+ - wouldn't
+ - could
+ - couldn't
diff --git a/vale-styles/FluentBit/Terms.yml b/vale-styles/FluentBit/Terms.yml
new file mode 100644
index 000000000..6ad4cf2e3
--- /dev/null
+++ b/vale-styles/FluentBit/Terms.yml
@@ -0,0 +1,13 @@
+---
+extends: substitution
+message: Use '%s' instead of '%s'.
+level: suggestion
+ignorecase: true
+scope: paragraph
+action:
+ name: replace
+swap:
+ datapoints: data points
+ Terraform Provider: Terraform provider
+ timeseries: time series
+ topology: placement
diff --git a/vale-styles/FluentBit/Units.yml b/vale-styles/FluentBit/Units.yml
new file mode 100644
index 000000000..786c1d8bb
--- /dev/null
+++ b/vale-styles/FluentBit/Units.yml
@@ -0,0 +1,11 @@
+extends: existence
+message: "Put a nonbreaking space between the number and the unit in '%s'."
+link: 'https://developers.google.com/style/units-of-measure'
+nonword: true
+level: suggestion
+tokens:
+ - \d+(?:B|kB|MB|GB|TB)
+ - \d+(?:ns|ms|s|min|h|d)
+
+exceptions:
+ - k3s
diff --git a/vale-styles/FluentBit/UserFocus.yml b/vale-styles/FluentBit/UserFocus.yml
new file mode 100644
index 000000000..340ad71e7
--- /dev/null
+++ b/vale-styles/FluentBit/UserFocus.yml
@@ -0,0 +1,17 @@
+---
+extends: existence
+message: "Rewrite to put the focus on what the user wants to do rather than on how the document is laid out."
+ignorecase: true
+level: suggestion
+tokens:
+ - The purpose of this document is
+ - This document (?:describes|explains|shows)
+ - This page (?:describes|explains|shows)
+ - This section (?:describes|explains|shows)
+ - The following page (?:describes|explains|shows)
+ - The following document (?:describes|explains|shows)
+ - The following section (?:describes|explains|shows)
+ - This topic (?:describes|explains|shows)
diff --git a/vale-styles/FluentBit/We.yml b/vale-styles/FluentBit/We.yml
new file mode 100644
index 000000000..ffe69e65d
--- /dev/null
+++ b/vale-styles/FluentBit/We.yml
@@ -0,0 +1,11 @@
+extends: existence
+message: "Try to avoid using first-person plural like '%s'."
+link: 'https://developers.google.com/style/pronouns#personal-pronouns'
+level: suggestion
+ignorecase: true
+tokens:
+ - we
+ - we'(?:ve|re)
+ - ours?
+ - us
+ - let's
\ No newline at end of file
diff --git a/vale-styles/FluentBit/WordList.yml b/vale-styles/FluentBit/WordList.yml
new file mode 100644
index 000000000..aca77474a
--- /dev/null
+++ b/vale-styles/FluentBit/WordList.yml
@@ -0,0 +1,79 @@
+extends: substitution
+message: "Use '%s' instead of '%s'."
+link: 'https://developers.google.com/style/word-list'
+level: suggestion
+ignorecase: false
+action:
+ name: replace
+swap:
+ '(?:API Console|dev|developer) key': API key
+ '(?:cell ?phone|smart ?phone)': phone|mobile phone
+ '(?:dev|developer|APIs) console': API console
+ '(?:e-mail|Email|E-mail)': email
+ '(?:file ?path|path ?name)': path
+ '(?:kill|terminate|abort)': stop|exit|cancel|end
+ '(?:OAuth ?2|Oauth)': OAuth 2.0
+ '(?:ok|Okay)': OK|okay
+ '(?:WiFi|wifi)': Wi-Fi
+ '[\.]+apk': APK
+ '3\-D': 3D
+ 'Google (?:I\-O|IO)': Google I/O
+ 'tap (?:&|and) hold': touch & hold
+ 'un(?:check|select)': clear
+ # account name: username
+ action bar: app bar
+ admin: administrator
+ Ajax: AJAX
+ allows you to: lets you
+ Android device: Android-powered device
+ android: Android
+ API explorer: APIs Explorer
+ application: app
+ approx\.: approximately
+ as well as: and
+ authN: authentication
+ authZ: authorization
+ autoupdate: automatically update
+ cellular data: mobile data
+ cellular network: mobile network
+ chapter: documents|pages|sections
+ check box: checkbox
+ check: select
+ click on: click|click in
+ Container Engine: Kubernetes Engine
+ content type: media type
+ curated roles: predefined roles
+ data are: data is
+ Developers Console: Google API Console|API Console
+ ephemeral IP address: ephemeral external IP address
+ fewer data: less data
+ file name: filename
+ firewalls: firewall rules
+ functionality: capability|feature
+ Google account: Google Account
+ Google accounts: Google Accounts
+ Googling: search with Google
+ grayed-out: unavailable
+ HTTPs: HTTPS
+ in order to: to
+ # ingest: import|load
+ k8s: Kubernetes
+ long press: touch & hold
+ network IP address: internal IP address
+ omnibox: address bar
+ open-source: open source
+ overview screen: recents screen
+ regex: regular expression
+ SHA1: SHA-1|HMAC-SHA1
+ sign into: sign in to
+ \w* ?sign-?on: single sign-on
+ static IP address: static external IP address
+ stylesheet: style sheet
+ synch: sync
+ tablename: table name
+ tablet: device
+ touch: tap
+ url: URL
+ vs\.: versus
+ wish: want
+ World Wide Web: web