diff --git a/SUMMARY.md b/SUMMARY.md index 432b8fff3..0390e31aa 100644 --- a/SUMMARY.md +++ b/SUMMARY.md @@ -49,6 +49,15 @@ ## Administration * [Configuring Fluent Bit](administration/configuring-fluent-bit/README.md) + * [YAML Configuration Sections](administration/configuring-fluent-bit/yaml/README.md) + * [Service](administration/configuring-fluent-bit/yaml/service-section.md) + * [Parsers](administration/configuring-fluent-bit/yaml/parsers-section.md) + * [Multiline Parsers](administration/configuring-fluent-bit/yaml/multiline-parsers-section.md) + * [Pipeline](administration/configuring-fluent-bit/yaml/pipeline-section.md) + * [Environment Variables](administration/configuring-fluent-bit/yaml/environment-variables-section.md) + * [Includes](administration/configuring-fluent-bit/yaml/includes-section.md) + + * [Configuration File](administration/configuring-fluent-bit/yaml/configuration-file.md) * [Classic mode](administration/configuring-fluent-bit/classic-mode/README.md) * [Format and Schema](administration/configuring-fluent-bit/classic-mode/format-schema.md) * [Configuration File](administration/configuring-fluent-bit/classic-mode/configuration-file.md) @@ -56,8 +65,6 @@ * [Commands](administration/configuring-fluent-bit/classic-mode/commands.md) * [Upstream Servers](administration/configuring-fluent-bit/classic-mode/upstream-servers.md) * [Record Accessor](administration/configuring-fluent-bit/classic-mode/record-accessor.md) - * [YAML Configuration](administration/configuring-fluent-bit/yaml/README.md) - * [Configuration File](administration/configuring-fluent-bit/yaml/configuration-file.md) * [Unit Sizes](administration/configuring-fluent-bit/unit-sizes.md) * [Multiline Parsing](administration/configuring-fluent-bit/multiline-parsing.md) * [Transport Security](administration/transport-security.md) diff --git a/administration/configuring-fluent-bit/README.md b/administration/configuring-fluent-bit/README.md index 794d41950..29498ed79 100644 --- a/administration/configuring-fluent-bit/README.md +++ b/administration/configuring-fluent-bit/README.md @@ -1,14 +1,13 @@ # Configuring Fluent Bit -Fluent Bit supports these configuration formats: +Currently, Fluent Bit supports two configuration formats: -- [Classic mode](classic-mode/README.md) -- [YAML](yaml/README.md) (Fluent Bit 2.0 or greater) +* [YAML](yaml/README.md): the standard configuration format as of v3.2. +* [Classic mode](classic-mode/README.md): scheduled for deprecation at the end of 2025. -## CLI flags +## Command line interface -Fluent Bit also supports a CLI with various flags for the available configuration -options. +Fluent Bit exposes most of its features through the command line interface. Run it with the `-h` (or `--help`) option to get a list of the available options: ```shell $ docker run --rm -it fluent/fluent-bit --help diff --git a/administration/configuring-fluent-bit/yaml/README.md b/administration/configuring-fluent-bit/yaml/README.md index ce6e7f8c4..3efe405bf 100644 --- a/administration/configuring-fluent-bit/yaml/README.md +++ b/administration/configuring-fluent-bit/yaml/README.md @@ -1,3 +1,44 @@ -# Fluent Bit YAML configuration +# Fluent Bit YAML Configuration -YAML configuration feature was introduced since FLuent Bit version 1.9 as experimental, and it is production ready since Fluent Bit 2.0. +## Before You Get Started + +Fluent Bit traditionally offered a `classic` configuration mode, a custom configuration format that we are gradually phasing out.
While `classic` mode has served well for many years, it has several limitations. Its basic design only supports grouping sections with key-value pairs and lacks the ability to handle sub-sections or complex data structures like lists. + +YAML is now a mainstream configuration format and has become essential across the cloud ecosystem, where most tooling is configured this way. To minimize friction and provide a more intuitive experience for creating data pipelines, we strongly encourage users to transition to YAML. The YAML format enables features, such as processors, that are not possible to configure in `classic` mode. + +As of Fluent Bit v3.2, you can configure everything in YAML. + +## List of Available Sections + +Configuring Fluent Bit with YAML introduces the following root-level sections: + +| Section Name | Description | +|----------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------| +| `service` | Describes the global configuration for the Fluent Bit service. This section is optional; if not set, default values will apply. Only one `service` section can be defined. | +| `parsers` | Lists parsers to be used by components like inputs, processors, filters, or output plugins. You can define multiple `parsers` sections, which can also be loaded from external files included in the main YAML configuration. | +| `multiline_parsers` | Lists multiline parsers, functioning similarly to `parsers`. Multiple definitions can exist either in the root or in included files. | +| `pipeline` | Defines a pipeline composed of inputs, processors, filters, and output plugins. You can define multiple `pipeline` sections, but they will not operate independently. Instead, all components will be merged into a single pipeline internally. | +| `plugins` | Specifies the path to external plugins (.so files) to be loaded by Fluent Bit at runtime. | +| `upstream_servers` | Refers to a group of node endpoints that can be referenced by output plugins that support this feature. | +| `env` | Sets a list of environment variables for Fluent Bit. Note that system environment variables are available, while the ones defined in the configuration apply only to Fluent Bit. | +| `includes` | Specifies a list of additional YAML configuration files to be merged into the current configuration. | + +A minimal configuration that combines several of these sections is sketched at the end of this page. + +## Section Documentation + +To access detailed configuration guides for each section, use the following links: + +- [Service Section documentation](service-section.md) + - Overview of global settings, configuration options, and examples. +- [Parsers Section documentation](parsers-section.md) + - Detailed guide on defining parsers and supported formats. +- [Multiline Parsers Section documentation](multiline-parsers-section.md) + - Explanation of multiline parsing configuration. +- [Pipeline Section documentation](pipeline-section.md) + - Details on setting up pipelines and using processors. +- [Plugins Section documentation](plugins-section.md) + - How to load external plugins. +- [Upstream Servers Section documentation](upstream-servers-section.md) + - Guide on setting up and using upstream nodes with supported plugins. +- [Environment Variables Section documentation](environment-variables-section.md) + - Information on setting environment variables and their scope within Fluent Bit. +- [Includes Section documentation](includes-section.md) + - Description of how to include external YAML files.
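To see how these root-level sections fit together before diving into the individual pages, here is a minimal sketch of a complete YAML configuration. It only combines fragments already shown in the section pages (an `env` variable, a `service` block, a `json` parser, and a `random`-to-`stdout` pipeline) and is meant as an illustration rather than a recommended production setup:

```yaml
# Minimal sketch combining several root-level sections.
env:
  FLUSH_INTERVAL: 1

service:
  flush: ${FLUSH_INTERVAL}
  log_level: info

parsers:
  - name: json
    format: json

pipeline:
  inputs:
    - name: random

  outputs:
    - name: stdout
      match: '*'
      format: json_lines
```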
diff --git a/administration/configuring-fluent-bit/yaml/environment-variables-section.md b/administration/configuring-fluent-bit/yaml/environment-variables-section.md new file mode 100644 index 000000000..7ca377ac2 --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/environment-variables-section.md @@ -0,0 +1,61 @@ +# Environment Variables Section + +The `env` section allows you to define environment variables directly within the configuration file. These variables can then be used to dynamically replace values throughout your configuration using the `${VARIABLE_NAME}` syntax. + +Values set in the `env` section are case-sensitive. However, as a best practice, we recommend using uppercase names for environment variables. The example below defines two variables, `FLUSH_INTERVAL` and `STDOUT_FMT`, which can be accessed in the configuration using `${FLUSH_INTERVAL}` and `${STDOUT_FMT}`: + +```yaml +env: + FLUSH_INTERVAL: 1 + STDOUT_FMT: 'json_lines' + +service: + flush: ${FLUSH_INTERVAL} + log_level: info + +pipeline: + inputs: + - name: random + + outputs: + - name: stdout + match: '*' + format: ${STDOUT_FMT} +``` + +## Predefined Variables + +Fluent Bit provides a set of predefined environment variables that can be used in your configuration: + +| Name | Description | +|--|--| +| `${HOSTNAME}` | The system’s hostname. | + +## External Variables + +In addition to variables defined in the configuration file or the predefined ones, Fluent Bit can access system environment variables set in the user space. These external variables can be referenced in the configuration using the same `${VARIABLE_NAME}` pattern. + +For example, to set the `FLUSH_INTERVAL` system environment variable to `2` and use it in your configuration: + +```bash +export FLUSH_INTERVAL=2 +``` + +In the configuration file, you can then access this value as follows: + +```yaml +service: + flush: ${FLUSH_INTERVAL} + log_level: info + +pipeline: + inputs: + - name: random + + outputs: + - name: stdout + match: '*' + format: json_lines +``` + +This approach allows you to easily manage and override configuration values using environment variables, providing flexibility in various deployment environments. diff --git a/administration/configuring-fluent-bit/yaml/includes-section.md b/administration/configuring-fluent-bit/yaml/includes-section.md new file mode 100644 index 000000000..c36e4b755 --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/includes-section.md @@ -0,0 +1,32 @@ +# Includes Section + +The `includes` section allows you to specify additional YAML configuration files to be merged into the current configuration. These files are identified as a list of filenames and can include relative or absolute paths. If no absolute path is provided, the file is assumed to be located relative to the file that references it. + +This feature is useful for organizing complex configurations into smaller, manageable files and including them as needed. + +## Usage + +Below is an example demonstrating how to include additional YAML files using relative path references. This is the file system path structure: + +``` +├── fluent-bit.yaml +├── inclusion-1.yaml +└── subdir + └── inclusion-2.yaml +``` + +The content of `fluent-bit.yaml`: + +```yaml +includes: + - inclusion-1.yaml + - subdir/inclusion-2.yaml +``` + +## Key Points + +- Relative Paths: If a path is not specified as absolute, it will be treated as relative to the file that includes it.
+ +- Organized Configurations: Using the `includes` section helps keep your configuration modular and easier to maintain. + +> Note: Ensure that the included files are formatted correctly and contain valid YAML configurations for seamless integration. diff --git a/administration/configuring-fluent-bit/yaml/multiline-parsers-section.md b/administration/configuring-fluent-bit/yaml/multiline-parsers-section.md new file mode 100644 index 000000000..340fdea28 --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/multiline-parsers-section.md @@ -0,0 +1,26 @@ +# Multiline Parsers + +Multiline parsers are used to combine logs that span multiple events into a single, cohesive message. This is particularly useful for handling stack traces, error logs, or any log entry that contains multiple lines of information. + +In YAML configuration, the syntax for defining multiline parsers differs slightly from the classic configuration format, introducing minor breaking changes, specifically in how the rules are defined. + +Below is an example demonstrating how to define a multiline parser directly in the main configuration file; additional definitions can also be loaded from included external files: + +```yaml +multiline_parsers: + - name: multiline-regex-test + type: regex + flush_timeout: 1000 + rules: + - state: start_state + regex: '/([a-zA-Z]+ \d+ \d+:\d+:\d+)(.*)/' + next_state: cont + - state: cont + regex: '/^\s+at.*/' + next_state: cont +``` + +The example above defines a multiline parser named `multiline-regex-test` that uses regular expressions to handle multi-event logs. The parser contains two rules: the first rule transitions from `start_state` to `cont` when a matching log entry is detected, and the second rule continues to match subsequent lines. + +For more detailed information on configuring multiline parsers, including advanced options and use cases, please refer to the Configuring Multiline Parsers section. + diff --git a/administration/configuring-fluent-bit/yaml/parsers-section.md b/administration/configuring-fluent-bit/yaml/parsers-section.md new file mode 100644 index 000000000..96b6fd4f9 --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/parsers-section.md @@ -0,0 +1,23 @@ +# Parsers Section + +Parsers enable Fluent Bit components to transform unstructured data into a structured internal representation. You can define parsers either directly in the main configuration file or in separate external files for better organization. + +This page provides a general overview of how to declare parsers. + +The main section name is `parsers`, and it allows you to define a list of parser configurations. The following example demonstrates how to set up two simple parsers: + +```yaml +parsers: + - name: json + format: json + + - name: docker + format: json + time_key: time + time_format: "%Y-%m-%dT%H:%M:%S.%L" + time_keep: true +``` + +You can define multiple `parsers` sections, either within the main configuration file or distributed across included files. A slightly richer, regex-based parser is sketched at the end of this page. + +For more detailed information on parser options and advanced configurations, please refer to the [Configuring Parsers]() section.
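As referenced above, the sketch below complements the JSON-based examples with a regex-based parser. The parser name and capture pattern are illustrative only (they are not part of the official parser set); the `format: regex`, `time_key`, and `time_format` keys are assumed to behave the same way they do in the classic parsers file:

```yaml
parsers:
  # Illustrative parser: splits "<time> <level> <message>" lines into fields
  # using named capture groups.
  - name: simple_kv            # hypothetical name, not a built-in parser
    format: regex
    regex: '^(?<time>[^ ]+) (?<level>[^ ]+) (?<message>.*)$'
    time_key: time
    time_format: '%Y-%m-%dT%H:%M:%S'
```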
diff --git a/administration/configuring-fluent-bit/yaml/pipeline-section.md b/administration/configuring-fluent-bit/yaml/pipeline-section.md new file mode 100644 index 000000000..421a941bb --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/pipeline-section.md @@ -0,0 +1,149 @@ +# Pipeline Section + +The `pipeline` section defines the flow of how data is collected, processed, and sent to its final destination. It encompasses the following core concepts: + +| Name | Description | +|---|---| +| `inputs` | Specifies the name of the plugin responsible for collecting or receiving data. This component serves as the data source in the pipeline. Examples of input plugins include `tail`, `http`, and `random`. | +| `processors` | **Unique to YAML configuration**, processors are specialized plugins that handle data processing directly attached to input plugins. Unlike filters, processors are not dependent on tag or matching rules. Instead, they work closely with the input to modify or enrich the data before it reaches the filtering or output stages. Processors are defined within an input plugin section. | +| `filters` | Filters are used to transform, enrich, or discard events based on specific criteria. They allow matching tags using strings or regular expressions, providing a more flexible way to manipulate data. Filters run as part of the main event loop and can be applied across multiple inputs. Examples of filters include `modify`, `grep`, and `nest`. | +| `outputs` | Defines the destination for processed data. Outputs specify where the data will be sent, such as to a remote server, a file, or another service. Each output plugin is configured with matching rules to determine which events are sent to that destination. Common output plugins include `stdout`, `elasticsearch`, and `kafka`. | + +## Example Configuration + +Here’s a simple example of a pipeline configuration: + +```yaml +pipeline: + inputs: + - name: tail + path: /var/log/example.log + parser: json + + processors: + logs: + - name: record_modifier + filters: + - name: grep + match: '*' + regex: key pattern + + outputs: + - name: stdout + match: '*' +``` + +## Pipeline Processors + +Processors operate on specific signals such as logs, metrics, and traces. They are attached to an input plugin and must specify the signal type they will process. + +### Example of a Processor + +In the example below, the `content_modifier` processor inserts or updates (upserts) the key `my_new_key` with the value `123` for all log records generated by the `tail` plugin.
This processor is only applied to log signals: + +```yaml +parsers: + - name: json + format: json + +pipeline: + inputs: + - name: tail + path: /var/log/example.log + parser: json + + processors: + logs: + - name: content_modifier + action: upsert + key: my_new_key + value: 123 + filters: + - name: grep + match: '*' + regex: key pattern + + outputs: + - name: stdout + match: '*' +``` + +Here is a more complete example with multiple processors: + +```yaml +service: + log_level: info + http_server: on + http_listen: 0.0.0.0 + http_port: 2021 + +pipeline: + inputs: + - name: random + tag: test-tag + interval_sec: 1 + processors: + logs: + - name: modify + add: hostname monox + - name: lua + call: append_tag + code: | + function append_tag(tag, timestamp, record) + new_record = record + new_record["tag"] = tag + return 1, timestamp, new_record + end + + outputs: + - name: stdout + match: '*' + processors: + logs: + - name: lua + call: add_field + code: | + function add_field(tag, timestamp, record) + new_record = record + new_record["output"] = "new data" + return 1, timestamp, new_record + end +``` + +You might have noticed that processors can be attached not only to inputs, but also to outputs. + +### How Are Processors Different from Filters? + +While processors and filters are similar in that they can transform, enrich, or drop data from the pipeline, there is a significant difference in how they operate: + +- Processors: Run in the same thread as the input plugin when the input plugin is configured to be threaded (`threaded: true`). This design provides better performance, especially in multi-threaded setups. + +- Filters: Run in the main event loop. When multiple filters are used, they can introduce performance overhead, particularly under heavy workloads. + +## Running Filters as Processors + +You can configure existing [Filters](https://docs.fluentbit.io/manual/pipeline/filters) to run as processors. There are no specific changes needed; you simply use the filter name as if it were a native processor. + +### Example of a Filter Running as a Processor + +In the example below, the `grep` filter is used as a processor to filter log events based on a pattern: + +```yaml +parsers: + - name: json + format: json + +pipeline: + inputs: + - name: tail + path: /var/log/example.log + parser: json + + processors: + logs: + - name: grep + regex: log aa + outputs: + - name: stdout + match: '*' +``` diff --git a/administration/configuring-fluent-bit/yaml/plugins-section.md b/administration/configuring-fluent-bit/yaml/plugins-section.md new file mode 100644 index 000000000..c3df7be12 --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/plugins-section.md @@ -0,0 +1,54 @@ +# Plugins Section + +While Fluent Bit comes with a variety of built-in plugins, it also supports loading external plugins at runtime. This feature is especially useful for loading Go or Wasm plugins that are built as shared object files (.so). Fluent Bit's YAML configuration provides two ways to load these external plugins: + +## 1. Inline YAML Section + +You can specify external plugins directly within your main YAML configuration file using the `plugins` section. Here’s an example: + +```yaml +plugins: + - /path/to/out_gstdout.so + +service: + log_level: info + +pipeline: + inputs: + - name: random + + outputs: + - name: gstdout + match: '*' +``` + +## 2. YAML Plugins File Included via the `plugins_file` Option
+ +Alternatively, you can load external plugins from a separate YAML file by specifying the `plugins_file` option in the `service` section. Here’s how to configure this: + +```yaml +service: + log_level: info + plugins_file: extra_plugins.yaml + +pipeline: + inputs: + - name: random + + outputs: + - name: gstdout + match: '*' +``` + +In this setup, the `extra_plugins.yaml` file might contain the following `plugins` section: + +```yaml +plugins: + - /other/path/to/out_gstdout.so +``` + +### Key Points + +- Built-in vs. External: Fluent Bit comes with many built-in plugins, but you can load external plugins at runtime to extend the tool’s functionality. +- Loading Mechanism: External plugins must be shared object files (.so). You can define them inline in the main YAML configuration or include them from a separate YAML file for better modularity. + diff --git a/administration/configuring-fluent-bit/yaml/service-section.md b/administration/configuring-fluent-bit/yaml/service-section.md new file mode 100644 index 000000000..b3aca6665 --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/service-section.md @@ -0,0 +1,44 @@ +# Service Section + +The `service` section defines global properties of the service. The available configuration keys are: + +| Key | Description | Default | +|---|---|---| +| `flush` | Sets the flush time in `seconds.nanoseconds`. The engine loop uses a flush timeout to determine when to flush records ingested by input plugins to output plugins. | `1` | +| `grace` | Sets the grace time in `seconds` as an integer value. The engine loop uses a grace timeout to define the wait time before exiting. | `5` | +| `daemon` | Boolean. Specifies whether Fluent Bit should run as a daemon (background process). Allowed values are: `yes`, `no`, `on`, and `off`. Do not enable when using a Systemd-based unit, such as the one provided in Fluent Bit packages. | `off` | +| `dns.mode` | Sets the primary transport layer protocol used by the asynchronous DNS resolver. Can be overridden on a per-plugin basis. | `UDP` | +| `log_file` | Absolute path for an optional log file. By default, all logs are redirected to the standard error interface (stderr). | _none_ | +| `log_level` | Sets the logging verbosity level. Allowed values are: `off`, `error`, `warn`, `info`, `debug`, and `trace`. Values are cumulative. If `debug` is set, it will include `error`, `warn`, `info`, and `debug`. Trace mode is only available if Fluent Bit was built with the _`WITH_TRACE`_ option enabled. | `info` | +| `parsers_file` | Path for a `parsers` configuration file. Multiple `parsers_file` entries can be defined within the section. However, with the new YAML configuration schema, defining parsers using this key is now optional. Parsers can be declared directly in the `parsers` section of your YAML configuration, offering a more streamlined and integrated approach. | _none_ | +| `plugins_file` | Path for a `plugins` configuration file. This file specifies the paths to external plugins (.so files) that Fluent Bit can load at runtime. With the new YAML schema, the `plugins_file` key is optional. External plugins can now be referenced directly within the `plugins` section, simplifying the plugin management process. [See an example](https://github.com/fluent/fluent-bit/blob/master/conf/plugins.conf). | _none_ | +| `streams_file` | Path for the Stream Processor configuration file. This file defines the rules and operations for stream processing within Fluent Bit.
The `streams_file` key is optional, as Stream Processor configurations can be defined directly in the `streams` section of the YAML schema. This flexibility allows for easier and more centralized configuration. [Learn more about Stream Processing configuration](../../../stream-processing/introduction.md). | _none_ | +| `http_server` | Enables the built-in HTTP Server. | `off` | +| `http_listen` | Sets the listening interface for the HTTP Server when it's enabled. | `0.0.0.0` | +| `http_port` | Sets the TCP port for the HTTP Server. | `2020` | +| `coro_stack_size` | Sets the coroutine stack size in bytes. The value must be greater than the page size of the running system. Setting the value too small (`4096`) can cause coroutine threads to overrun the stack buffer. The default value of this parameter should not be changed. | `24576` | +| `scheduler.cap` | Sets a maximum retry time in seconds. Supported in v1.8.7 and greater. | `2000` | +| `scheduler.base` | Sets the base of exponential backoff. Supported in v1.8.7 and greater. | `5` | +| `json.convert_nan_to_null` | If enabled, `NaN` is converted to `null` when Fluent Bit converts `msgpack` to `json`. | `false` | +| `sp.convert_from_str_to_num` | If enabled, the Stream Processor converts strings that represent numbers to a numeric type. | `true` | + +### Configuration Example + +Below is a simple configuration example that defines a `service` section and a pipeline with a `random` input and `stdout` output: + +```yaml +service: + flush: 1 + log_level: info + http_server: true + http_listen: 0.0.0.0 + http_port: 2020 + +pipeline: + inputs: + - name: random + + outputs: + - name: stdout + match: '*' +``` diff --git a/administration/configuring-fluent-bit/yaml/upstream-servers-section.md b/administration/configuring-fluent-bit/yaml/upstream-servers-section.md new file mode 100644 index 000000000..e9f13e00c --- /dev/null +++ b/administration/configuring-fluent-bit/yaml/upstream-servers-section.md @@ -0,0 +1,46 @@ +# Upstream Servers Section + +The `Upstream Servers` section defines a group of endpoints, referred to as nodes, which are used by output plugins to distribute data in a round-robin fashion. This is particularly useful for plugins that require load balancing when sending data. Examples of plugins that support this capability include [Forward](https://docs.fluentbit.io/manual/pipeline/outputs/forward) and [Elasticsearch](https://docs.fluentbit.io/manual/pipeline/outputs/elasticsearch). + +In YAML, this section is named `upstream_servers` and requires specifying a `name` for the group and a list of `nodes`. Below is an example that defines two upstream server groups: `forward-balancing` and `forward-balancing-2`: + +```yaml +upstream_servers: + - name: forward-balancing + nodes: + - name: node-1 + host: 127.0.0.1 + port: 43000 + + - name: node-2 + host: 127.0.0.1 + port: 44000 + + - name: node-3 + host: 127.0.0.1 + port: 45000 + tls: true + tls_verify: false + shared_key: secret + + - name: forward-balancing-2 + nodes: + - name: node-A + host: 192.168.1.10 + port: 50000 + + - name: node-B + host: 192.168.1.11 + port: 51000 +``` + +### Key Concepts + +- Nodes: Each node in the upstream_servers group must specify a name, host, and port. Additional settings like tls, tls_verify, and shared_key can be configured as needed for secure communication. + + +### Usage Note + +While the `upstream_servers` section can be defined globally, some output plugins may require the configuration to be specified in a separate YAML file. 
Be sure to consult the documentation for each specific output plugin to understand its requirements.
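As a rough illustration, the sketch below shows how a `forward` output is commonly wired to an upstream group. It assumes the output's `upstream` option, which in classic mode points to a separate file containing the upstream definition, accepts a YAML file holding the `upstream_servers` section shown above; verify the exact option name and file layout in the Forward output documentation:

```yaml
# fluent-bit.yaml (sketch)
pipeline:
  inputs:
    - name: random

  outputs:
    - name: forward
      match: '*'
      # Assumed: path to a separate YAML file containing the
      # upstream_servers group (for example, forward-balancing).
      upstream: upstream-servers.yaml
```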