diff --git a/charts/falco/CHANGELOG.md b/charts/falco/CHANGELOG.md
index 2f513295..4e308328 100644
--- a/charts/falco/CHANGELOG.md
+++ b/charts/falco/CHANGELOG.md
@@ -3,6 +3,23 @@
 This file documents all notable changes to Falco Helm Chart. The release
 numbering uses [semantic versioning](http://semver.org).

+## v4.9.0
+* Bump Falco to v0.39.0
+* update(falco): add new configuration entries for Falco
+    This commit adds new config keys introduced in Falco 0.39.0.
+    Furthermore, updates the unit tests for the latest changes
+    in the values.yaml.
+* cleanup(falco): remove deprecated falco configuration
+    This commit removes the "output" config key that has
+    been deprecated in falco.
+* update(falco): mount proc filesystem for plugins
+    The following PR in libs https://github.com/falcosecurity/libs/pull/1969
+    introduces a new platform for plugins that requires access to the
+    proc filesystem.
+* fix(falco): update broken link pointing to Falco docs
+    After the changes made by the following PR to the Falco docs https://github.com/falcosecurity/falco-website/pull/1362
+    this commit updates a broken link.
+
 ## v4.8.3

 * The init container, when driver.kind=auto, automatically generates
diff --git a/charts/falco/Chart.yaml b/charts/falco/Chart.yaml
index 319c3ab2..7b8b4030 100644
--- a/charts/falco/Chart.yaml
+++ b/charts/falco/Chart.yaml
@@ -1,7 +1,7 @@
 apiVersion: v2
 name: falco
-version: 4.8.3
-appVersion: "0.38.2"
+version: 4.9.0
+appVersion: "0.39.0-rc2"
 description: Falco
 keywords:
   - monitoring
diff --git a/charts/falco/README.gotmpl b/charts/falco/README.gotmpl
index a50c32d0..f7d86d1d 100644
--- a/charts/falco/README.gotmpl
+++ b/charts/falco/README.gotmpl
@@ -47,7 +47,7 @@ The cluster in our example has three nodes, one *control-plane* node and two *wo
 ### Falco, Event Sources and Kubernetes
 Starting from Falco 0.31.0 the [new plugin system](https://falco.org/docs/plugins/) is stable and production ready.
The **plugin system** can be seen as the next step in the evolution of Falco. Historically, Falco monitored system events from the **kernel** trying to detect malicious behaviors on Linux systems. It also had the capability to process k8s Audit Logs to detect suspicious activities in Kubernetes clusters. Since Falco 0.32.0 all the related code to the k8s Audit Logs in Falco was removed and ported in a [plugin](https://github.com/falcosecurity/plugins/tree/master/plugins/k8saudit). At the time being Falco supports different event sources coming from **plugins** or **drivers** (system events). -Note that **a Falco instance can handle multiple event sources in parallel**. you can deploy Falco leveraging **drivers** for syscall events and at the same time loading **plugins**. A step by step guide on how to deploy Falco with multiple sources can be found [here](https://falco.org/docs/getting-started/third-party/learning/#falco-with-multiple-sources). +Note that **a Falco instance can handle multiple event sources in parallel**. you can deploy Falco leveraging **drivers** for syscall events and at the same time loading **plugins**. A step by step guide on how to deploy Falco with multiple sources can be found [here](https://falco.org/docs/getting-started/learning-environments/#falco-with-multiple-sources). #### About Drivers diff --git a/charts/falco/README.md b/charts/falco/README.md index b6be5bae..b8c0490f 100644 --- a/charts/falco/README.md +++ b/charts/falco/README.md @@ -47,7 +47,7 @@ The cluster in our example has three nodes, one *control-plane* node and two *wo ### Falco, Event Sources and Kubernetes Starting from Falco 0.31.0 the [new plugin system](https://falco.org/docs/plugins/) is stable and production ready. The **plugin system** can be seen as the next step in the evolution of Falco. Historically, Falco monitored system events from the **kernel** trying to detect malicious behaviors on Linux systems. 
It also had the capability to process k8s Audit Logs to detect suspicious activities in Kubernetes clusters. Since Falco 0.32.0 all the related code to the k8s Audit Logs in Falco was removed and ported in a [plugin](https://github.com/falcosecurity/plugins/tree/master/plugins/k8saudit). At the time being Falco supports different event sources coming from **plugins** or **drivers** (system events). -Note that **a Falco instance can handle multiple event sources in parallel**. you can deploy Falco leveraging **drivers** for syscall events and at the same time loading **plugins**. A step by step guide on how to deploy Falco with multiple sources can be found [here](https://falco.org/docs/getting-started/third-party/learning/#falco-with-multiple-sources). +Note that **a Falco instance can handle multiple event sources in parallel**. you can deploy Falco leveraging **drivers** for syscall events and at the same time loading **plugins**. A step by step guide on how to deploy Falco with multiple sources can be found [here](https://falco.org/docs/getting-started/learning-environments/#falco-with-multiple-sources). #### About Drivers @@ -581,7 +581,7 @@ If you use a Proxy in your cluster, the requests between `Falco` and `Falcosidek ## Configuration -The following table lists the main configurable parameters of the falco chart v4.8.3 and their default values. See [values.yaml](./values.yaml) for full list. +The following table lists the main configurable parameters of the falco chart v4.9.0 and their default values. See [values.yaml](./values.yaml) for full list. ## Values @@ -646,6 +646,7 @@ The following table lists the main configurable parameters of the falco chart v4 | extra.args | list | `[]` | Extra command-line arguments. | | extra.env | list | `[]` | Extra environment variables that will be pass onto Falco containers. | | extra.initContainers | list | `[]` | Additional initContainers for Falco pods. 
| +| falco.append_output | list | `[]` | | | falco.base_syscalls | object | `{"custom_set":[],"repair":false}` | - [Suggestions] NOTE: setting `base_syscalls.repair: true` automates the following suggestions for you. These suggestions are subject to change as Falco and its state engine evolve. For execve* events: Some Falco fields for an execve* syscall are retrieved from the associated `clone`, `clone3`, `fork`, `vfork` syscalls when spawning a new process. The `close` syscall is used to purge file descriptors from Falco's internal thread / process cache table and is necessary for rules relating to file descriptors (e.g. open, openat, openat2, socket, connect, accept, accept4 ... and many more) Consider enabling the following syscalls in `base_syscalls.custom_set` for process rules: [clone, clone3, fork, vfork, execve, execveat, close] For networking related events: While you can log `connect` or `accept*` syscalls without the socket syscall, the log will not contain the ip tuples. Additionally, for `listen` and `accept*` syscalls, the `bind` syscall is also necessary. We recommend the following as the minimum set for networking-related rules: [clone, clone3, fork, vfork, execve, execveat, close, socket, bind, getsockopt] Lastly, for tracking the correct `uid`, `gid` or `sid`, `pgid` of a process when the running process opens a file or makes a network connection, consider adding the following to the above recommended syscall sets: ... setresuid, setsid, setuid, setgid, setpgid, setresgid, setsid, capset, chdir, chroot, fchdir ... | | falco.buffered_outputs | bool | `false` | Enabling buffering for the output queue can offer performance optimization, efficient resource usage, and smoother data flow, resulting in a more reliable output mechanism. By default, buffering is disabled (false). 
| | falco.config_files[0] | string | `"/etc/falco/config.d"` | | @@ -665,6 +666,7 @@ The following table lists the main configurable parameters of the falco chart v4 | falco.http_output.insecure | bool | `false` | Tell Falco to not verify the remote server. | | falco.http_output.keep_alive | bool | `false` | keep_alive whether to keep alive the connection. | | falco.http_output.mtls | bool | `false` | Tell Falco to use mTLS | +| falco.json_include_message_property | bool | `false` | | | falco.json_include_output_property | bool | `true` | When using JSON output in Falco, you have the option to include the "output" property itself in the generated JSON output. The "output" property provides additional information about the purpose of the rule. To reduce the logging volume, it is recommended to turn it off if it's not necessary for your use case. | | falco.json_include_tags_property | bool | `true` | When using JSON output in Falco, you have the option to include the "tags" field of the rules in the generated JSON output. The "tags" field provides additional metadata associated with the rule. To reduce the logging volume, if the tags associated with the rule are not needed for your use case or can be added at a later stage, it is recommended to turn it off. | | falco.json_output | bool | `false` | When enabled, Falco will output alert messages and rules file loading/validation results in JSON format, making it easier for downstream programs to process and consume the data. By default, this option is disabled. | @@ -673,9 +675,8 @@ The following table lists the main configurable parameters of the falco chart v4 | falco.log_level | string | `"info"` | The `log_level` setting determines the minimum log level to include in Falco's logs related to the functioning of the software. This setting is separate from the `priority` field of rules and specifically controls the log level of Falco's operational logging. 
By specifying a log level, you can control the verbosity of Falco's operational logs. Only logs of a certain severity level or higher will be emitted. Supported levels: "emergency", "alert", "critical", "error", "warning", "notice", "info", "debug". | | falco.log_stderr | bool | `true` | Send information logs to stderr. Note these are *not* security notification logs! These are just Falco lifecycle (and possibly error) logs. | | falco.log_syslog | bool | `true` | Send information logs to syslog. Note these are *not* security notification logs! These are just Falco lifecycle (and possibly error) logs. | -| falco.metrics | object | `{"convert_memory_to_mb":true,"enabled":false,"include_empty_values":false,"interval":"1h","kernel_event_counters_enabled":true,"libbpf_stats_enabled":true,"output_rule":true,"resource_utilization_enabled":true,"rules_counters_enabled":true,"state_counters_enabled":true}` | - [Usage] `enabled`: Disabled by default. `interval`: The stats interval in Falco follows the time duration definitions used by Prometheus. https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations Time durations are specified as a number, followed immediately by one of the following units: ms - millisecond s - second m - minute h - hour d - day - assuming a day has always 24h w - week - assuming a week has always 7d y - year - assuming a year has always 365d Example of a valid time duration: 1h30m20s10ms A minimum interval of 100ms is enforced for metric collection. However, for production environments, we recommend selecting one of the following intervals for optimal monitoring: 15m 30m 1h 4h 6h `output_rule`: To enable seamless metrics and performance monitoring, we recommend emitting metrics as the rule "Falco internal: metrics snapshot". This option is particularly useful when Falco logs are preserved in a data lake. Please note that to use this option, the Falco rules config `priority` must be set to `info` at a minimum. 
`output_file`: Append stats to a `jsonl` file. Use with caution in production as Falco does not automatically rotate the file. `resource_utilization_enabled`: Emit CPU and memory usage metrics. CPU usage is reported as a percentage of one CPU and can be normalized to the total number of CPUs to determine overall usage. Memory metrics are provided in raw units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) and can be uniformly converted to megabytes (MB) using the `convert_memory_to_mb` functionality. In environments such as Kubernetes when deployed as daemonset, it is crucial to track Falco's container memory usage. To customize the path of the memory metric file, you can create an environment variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. By default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to monitor container memory usage, which aligns with Kubernetes' `container_memory_working_set_bytes` metric. Finally, we emit the overall host CPU and memory usages, along with the total number of processes and open file descriptors (fds) on the host, obtained from the proc file system unrelated to Falco's monitoring. These metrics help assess Falco's usage in relation to the server's workload intensity. `rules_counters_enabled`: Emit counts for each rule. `resource_utilization_enabled`: Emit CPU and memory usage metrics. CPU usage is reported as a percentage of one CPU and can be normalized to the total number of CPUs to determine overall usage. Memory metrics are provided in raw units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) and can be uniformly converted to megabytes (MB) using the `convert_memory_to_mb` functionality. In environments such as Kubernetes when deployed as daemonset, it is crucial to track Falco's container memory usage. 
To customize the path of the memory metric file, you can create an environment variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. By default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to monitor container memory usage, which aligns with Kubernetes' `container_memory_working_set_bytes` metric. Finally, we emit the overall host CPU and memory usages, along with the total number of processes and open file descriptors (fds) on the host, obtained from the proc file system unrelated to Falco's monitoring. These metrics help assess Falco's usage in relation to the server's workload intensity. `state_counters_enabled`: Emit counters related to Falco's state engine, including added, removed threads or file descriptors (fds), and failed lookup, store, or retrieve actions in relation to Falco's underlying process cache table (threadtable). We also log the number of currently cached containers if applicable. `kernel_event_counters_enabled`: Emit kernel side event and drop counters, as an alternative to `syscall_event_drops`, but with some differences. These counters reflect monotonic values since Falco's start and are exported at a constant stats interval. `libbpf_stats_enabled`: Exposes statistics similar to `bpftool prog show`, providing information such as the number of invocations of each BPF program attached by Falco and the time spent in each program measured in nanoseconds. To enable this feature, the kernel must be >= 5.1, and the kernel configuration `/proc/sys/kernel/bpf_stats_enabled` must be set. This option, or an equivalent statistics feature, is not available for non `*bpf*` drivers. Additionally, please be aware that the current implementation of `libbpf` does not support granularity of statistics at the bpf tail call level. `include_empty_values`: When the option is set to true, fields with an empty numeric value will be included in the output. 
However, this rule does not apply to high-level fields such as `n_evts` or `n_drops`; they will always be included in the output even if their value is empty. This option can be beneficial for exploring the data schema and ensuring that fields with empty values are included in the output. todo: prometheus export option todo: syscall_counters_enabled option | +| falco.metrics | object | `{"convert_memory_to_mb":true,"enabled":false,"include_empty_values":false,"interval":"1h","kernel_event_counters_enabled":true,"kernel_event_counters_per_cpu_enabled":false,"libbpf_stats_enabled":true,"output_rule":true,"resource_utilization_enabled":true,"rules_counters_enabled":true,"state_counters_enabled":true}` | - [Usage] `enabled`: Disabled by default. `interval`: The stats interval in Falco follows the time duration definitions used by Prometheus. https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations Time durations are specified as a number, followed immediately by one of the following units: ms - millisecond s - second m - minute h - hour d - day - assuming a day has always 24h w - week - assuming a week has always 7d y - year - assuming a year has always 365d Example of a valid time duration: 1h30m20s10ms A minimum interval of 100ms is enforced for metric collection. However, for production environments, we recommend selecting one of the following intervals for optimal monitoring: 15m 30m 1h 4h 6h `output_rule`: To enable seamless metrics and performance monitoring, we recommend emitting metrics as the rule "Falco internal: metrics snapshot". This option is particularly useful when Falco logs are preserved in a data lake. Please note that to use this option, the Falco rules config `priority` must be set to `info` at a minimum. `output_file`: Append stats to a `jsonl` file. Use with caution in production as Falco does not automatically rotate the file. `resource_utilization_enabled`: Emit CPU and memory usage metrics. 
CPU usage is reported as a percentage of one CPU and can be normalized to the total number of CPUs to determine overall usage. Memory metrics are provided in raw units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) and can be uniformly converted to megabytes (MB) using the `convert_memory_to_mb` functionality. In environments such as Kubernetes when deployed as daemonset, it is crucial to track Falco's container memory usage. To customize the path of the memory metric file, you can create an environment variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. By default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to monitor container memory usage, which aligns with Kubernetes' `container_memory_working_set_bytes` metric. Finally, we emit the overall host CPU and memory usages, along with the total number of processes and open file descriptors (fds) on the host, obtained from the proc file system unrelated to Falco's monitoring. These metrics help assess Falco's usage in relation to the server's workload intensity. `rules_counters_enabled`: Emit counts for each rule. `resource_utilization_enabled`: Emit CPU and memory usage metrics. CPU usage is reported as a percentage of one CPU and can be normalized to the total number of CPUs to determine overall usage. Memory metrics are provided in raw units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) and can be uniformly converted to megabytes (MB) using the `convert_memory_to_mb` functionality. In environments such as Kubernetes when deployed as daemonset, it is crucial to track Falco's container memory usage. To customize the path of the memory metric file, you can create an environment variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. 
By default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to monitor container memory usage, which aligns with Kubernetes' `container_memory_working_set_bytes` metric. Finally, we emit the overall host CPU and memory usages, along with the total number of processes and open file descriptors (fds) on the host, obtained from the proc file system unrelated to Falco's monitoring. These metrics help assess Falco's usage in relation to the server's workload intensity. `state_counters_enabled`: Emit counters related to Falco's state engine, including added, removed threads or file descriptors (fds), and failed lookup, store, or retrieve actions in relation to Falco's underlying process cache table (threadtable). We also log the number of currently cached containers if applicable. `kernel_event_counters_enabled`: Emit kernel side event and drop counters, as an alternative to `syscall_event_drops`, but with some differences. These counters reflect monotonic values since Falco's start and are exported at a constant stats interval. `kernel_event_counters_per_cpu_enabled`: Detailed kernel event and drop counters per CPU. Typically used when debugging and not in production. `libbpf_stats_enabled`: Exposes statistics similar to `bpftool prog show`, providing information such as the number of invocations of each BPF program attached by Falco and the time spent in each program measured in nanoseconds. To enable this feature, the kernel must be >= 5.1, and the kernel configuration `/proc/sys/kernel/bpf_stats_enabled` must be set. This option, or an equivalent statistics feature, is not available for non `*bpf*` drivers. Additionally, please be aware that the current implementation of `libbpf` does not support granularity of statistics at the bpf tail call level. `include_empty_values`: When the option is set to true, fields with an empty numeric value will be included in the output. 
However, this rule does not apply to high-level fields such as `n_evts` or `n_drops`; they will always be included in the output even if their value is empty. This option can be beneficial for exploring the data schema and ensuring that fields with empty values are included in the output. todo: prometheus export option todo: syscall_counters_enabled option | | falco.output_timeout | int | `2000` | The `output_timeout` parameter specifies the duration, in milliseconds, to wait before considering the deadline exceeded. By default, the timeout is set to 2000ms (2 seconds), meaning that the consumer of Falco outputs can block the Falco output channel for up to 2 seconds without triggering a timeout error. Falco actively monitors the performance of output channels. With this setting the timeout error can be logged, but please note that this requires setting Falco's operational logs `log_level` to a minimum of `notice`. It's important to note that Falco outputs will not be discarded from the output queue. This means that if an output channel becomes blocked indefinitely, it indicates a potential issue that needs to be addressed by the user. | -| falco.outputs | object | `{"max_burst":1000,"rate":0}` | A throttling mechanism, implemented as a token bucket, can be used to control the rate of Falco outputs. Each event source has its own rate limiter, ensuring that alerts from one source do not affect the throttling of others. The following options control the mechanism: - rate: the number of tokens (i.e. right to send a notification) gained per second. When 0, the throttling mechanism is disabled. Defaults to 0. - max_burst: the maximum number of tokens outstanding. Defaults to 1000. For example, setting the rate to 1 allows Falco to send up to 1000 notifications initially, followed by 1 notification per second. The burst capacity is fully restored after 1000 seconds of no activity. 
Throttling can be useful in various scenarios, such as preventing notification floods, managing system load, controlling event processing, or complying with rate limits imposed by external systems or APIs. It allows for better resource utilization, avoids overwhelming downstream systems, and helps maintain a balanced and controlled flow of notifications. With the default settings, the throttling mechanism is disabled. | | falco.outputs_queue | object | `{"capacity":0}` | Falco utilizes tbb::concurrent_bounded_queue for handling outputs, and this parameter allows you to customize the queue capacity. Please refer to the official documentation: https://oneapi-src.github.io/oneTBB/main/tbb_userguide/Concurrent_Queue_Classes.html. On a healthy system with optimized Falco rules, the queue should not fill up. If it does, it is most likely happening due to the entire event flow being too slow, indicating that the server is under heavy load. `capacity`: the maximum number of items allowed in the queue is determined by this value. Setting the value to 0 (which is the default) is equivalent to keeping the queue unbounded. In other words, when this configuration is set to 0, the number of allowed items is effectively set to the largest possible long value, disabling this setting. In the case of an unbounded queue, if the available memory on the system is consumed, the Falco process would be OOM killed. When using this option and setting the capacity, the current event would be dropped, and the event loop would continue. This behavior mirrors kernel-side event drops when the buffer between kernel space and user space is full. | | falco.plugins | list | `[{"init_config":null,"library_path":"libk8saudit.so","name":"k8saudit","open_params":"http://:9765/k8s-audit"},{"library_path":"libcloudtrail.so","name":"cloudtrail"},{"init_config":"","library_path":"libjson.so","name":"json"}]` | Customize subsettings for each enabled plugin. 
These settings will only be applied when the corresponding plugin is enabled using the `load_plugins` option. | | falco.priority | string | `"debug"` | Any rule with a priority level more severe than or equal to the specified minimum level will be loaded and run by Falco. This allows you to filter and control the rules based on their severity, ensuring that only rules of a certain priority or higher are active and evaluated by Falco. Supported levels: "emergency", "alert", "critical", "error", "warning", "notice", "info", "debug" | @@ -722,7 +723,7 @@ The following table lists the main configurable parameters of the falco chart v4 | falcoctl.image.pullPolicy | string | `"IfNotPresent"` | The image pull policy. | | falcoctl.image.registry | string | `"docker.io"` | The image registry to pull from. | | falcoctl.image.repository | string | `"falcosecurity/falcoctl"` | The image repository to pull from. | -| falcoctl.image.tag | string | `"0.9.0"` | The image tag to pull. | +| falcoctl.image.tag | string | `"0.10.0"` | The image tag to pull. | | falcosidekick | object | `{"enabled":false,"fullfqdn":false,"listenPort":""}` | For configuration values, see https://github.com/falcosecurity/charts/blob/master/charts/falcosidekick/values.yaml | | falcosidekick.enabled | bool | `false` | Enable falcosidekick deployment. | | falcosidekick.fullfqdn | bool | `false` | Enable usage of full FQDN of falcosidekick service (useful when a Proxy is used). | @@ -740,11 +741,12 @@ The following table lists the main configurable parameters of the falco chart v4 | image.repository | string | `"falcosecurity/falco-no-driver"` | The image repository to pull from | | image.tag | string | `""` | The image tag to pull. Overrides the image tag whose default is the chart appVersion. | | imagePullSecrets | list | `[]` | Secrets containing credentials when pulling from private/secure registries. 
| -| metrics | object | `{"convertMemoryToMB":true,"enabled":false,"includeEmptyValues":false,"interval":"1h","kernelEventCountersEnabled":true,"libbpfStatsEnabled":true,"outputRule":false,"resourceUtilizationEnabled":true,"rulesCountersEnabled":true,"service":{"create":true,"ports":{"metrics":{"port":8765,"protocol":"TCP","targetPort":8765}},"type":"ClusterIP"},"stateCountersEnabled":true}` | metrics configures Falco to enable and expose the metrics. | +| metrics | object | `{"convertMemoryToMB":true,"enabled":false,"includeEmptyValues":false,"interval":"1h","kernelEventCountersEnabled":true,"kernelEventCountersPerCPUEnabled":false,"libbpfStatsEnabled":true,"outputRule":false,"resourceUtilizationEnabled":true,"rulesCountersEnabled":true,"service":{"create":true,"ports":{"metrics":{"port":8765,"protocol":"TCP","targetPort":8765}},"type":"ClusterIP"},"stateCountersEnabled":true}` | metrics configures Falco to enable and expose the metrics. | | metrics.convertMemoryToMB | bool | `true` | convertMemoryToMB specifies whether the memory should be converted to mb. | | metrics.enabled | bool | `false` | enabled specifies whether the metrics should be enabled. | | metrics.includeEmptyValues | bool | `false` | includeEmptyValues specifies whether the empty values should be included in the metrics. | | metrics.interval | string | `"1h"` | interval is stats interval in Falco follows the time duration definitions used by Prometheus. https://prometheus.io/docs/prometheus/latest/querying/basics/#time-durations Time durations are specified as a number, followed immediately by one of the following units: ms - millisecond s - second m - minute h - hour d - day - assuming a day has always 24h w - week - assuming a week has always 7d y - year - assuming a year has always 365d Example of a valid time duration: 1h30m20s10ms A minimum interval of 100ms is enforced for metric collection. 
However, for production environments, we recommend selecting one of the following intervals for optimal monitoring: 15m 30m 1h 4h 6h | +| metrics.kernelEventCountersPerCPUEnabled | bool | `false` | kernelEventCountersPerCPUEnabled specifies whether the event counters per cpu should be enabled. | | metrics.libbpfStatsEnabled | bool | `true` | libbpfStatsEnabled exposes statistics similar to `bpftool prog show`, providing information such as the number of invocations of each BPF program attached by Falco and the time spent in each program measured in nanoseconds. To enable this feature, the kernel must be >= 5.1, and the kernel configuration `/proc/sys/kernel/bpf_stats_enabled` must be set. This option, or an equivalent statistics feature, is not available for non `*bpf*` drivers. Additionally, please be aware that the current implementation of `libbpf` does not support granularity of statistics at the bpf tail call level. | | metrics.outputRule | bool | `false` | outputRule enables seamless metrics and performance monitoring, we recommend emitting metrics as the rule "Falco internal: metrics snapshot". This option is particularly useful when Falco logs are preserved in a data lake. Please note that to use this option, the Falco rules config `priority` must be set to `info` at a minimum. | | metrics.resourceUtilizationEnabled | bool | `true` | resourceUtilizationEnabled`: Emit CPU and memory usage metrics. CPU usage is reported as a percentage of one CPU and can be normalized to the total number of CPUs to determine overall usage. Memory metrics are provided in raw units (`kb` for `RSS`, `PSS` and `VSZ` or `bytes` for `container_memory_used`) and can be uniformly converted to megabytes (MB) using the `convert_memory_to_mb` functionality. In environments such as Kubernetes when deployed as daemonset, it is crucial to track Falco's container memory usage. 
To customize the path of the memory metric file, you can create an environment variable named `FALCO_CGROUP_MEM_PATH` and set it to the desired file path. By default, Falco uses the file `/sys/fs/cgroup/memory/memory.usage_in_bytes` to monitor container memory usage, which aligns with Kubernetes' `container_memory_working_set_bytes` metric. Finally, we emit the overall host CPU and memory usages, along with the total number of processes and open file descriptors (fds) on the host, obtained from the proc file system unrelated to Falco's monitoring. These metrics help assess Falco's usage in relation to the server's workload intensity. | @@ -757,7 +759,6 @@ The following table lists the main configurable parameters of the falco chart v4 | metrics.service.ports.metrics.protocol | string | `"TCP"` | protocol specifies the network protocol that the Service should use for the associated port. | | metrics.service.ports.metrics.targetPort | int | `8765` | targetPort is the port on which the Pod is listening. | | metrics.service.type | string | `"ClusterIP"` | type denotes the service type. Setting it to "ClusterIP" we ensure that are accessible from within the cluster. | -| mounts.enforceProcMount | bool | `false` | By default, `/proc` from the host is only mounted into the Falco pod when `driver.enabled` is set to `true`. This flag allows it to override this behaviour for edge cases where `/proc` is needed but syscall data source is not enabled at the same time (e.g. for specific plugins). | | mounts.volumeMounts | list | `[]` | A list of volumes you want to add to the Falco pods. | | mounts.volumes | list | `[]` | A list of volumes you want to add to the Falco pods. | | nameOverride | string | `""` | Put here the new name if you want to override the release name used for Falco components. 
|
diff --git a/charts/falco/templates/_helpers.tpl b/charts/falco/templates/_helpers.tpl
index f611a539..b5147965 100644
--- a/charts/falco/templates/_helpers.tpl
+++ b/charts/falco/templates/_helpers.tpl
@@ -427,6 +427,7 @@ Based on the use input it populates the metrics configuration in the falco confi
 {{- $_ = set .Values.falco.metrics "resource_utilization_enabled" .Values.metrics.resourceUtilizationEnabled -}}
 {{- $_ = set .Values.falco.metrics "state_counters_enabled" .Values.metrics.stateCountersEnabled -}}
 {{- $_ = set .Values.falco.metrics "kernel_event_counters_enabled" .Values.metrics.kernelEventCountersEnabled -}}
+{{- $_ = set .Values.falco.metrics "kernel_event_counters_per_cpu_enabled" .Values.metrics.kernelEventCountersPerCPUEnabled -}}
 {{- $_ = set .Values.falco.metrics "libbpf_stats_enabled" .Values.metrics.libbpfStatsEnabled -}}
 {{- $_ = set .Values.falco.metrics "convert_memory_to_mb" .Values.metrics.convertMemoryToMB -}}
 {{- $_ = set .Values.falco.metrics "include_empty_values" .Values.metrics.includeEmptyValues -}}
diff --git a/charts/falco/templates/pod-template.tpl b/charts/falco/templates/pod-template.tpl
index d062336d..8ed3844f 100644
--- a/charts/falco/templates/pod-template.tpl
+++ b/charts/falco/templates/pod-template.tpl
@@ -135,10 +135,8 @@ spec:
         {{- end }}
         - mountPath: /root/.falco
           name: root-falco-fs
-        {{- if or .Values.driver.enabled .Values.mounts.enforceProcMount }}
         - mountPath: /host/proc
           name: proc-fs
-        {{- end }}
         {{- if and .Values.driver.enabled (not .Values.driver.loader.enabled) }}
           readOnly: true
         - mountPath: /host/boot
@@ -289,11 +287,9 @@ spec:
     {{- end }}
     {{- end }}
     {{- end }}
-    {{- if or .Values.driver.enabled .Values.mounts.enforceProcMount }}
     - name: proc-fs
       hostPath:
         path: /proc
-    {{- end }}
     {{- if eq .Values.driver.kind "gvisor" }}
     - name: runsc-path
       hostPath:
diff --git a/charts/falco/tests/unit/metricsConfig_test.go b/charts/falco/tests/unit/metricsConfig_test.go
index 2d0cc33d..e983f58c 100644
--- a/charts/falco/tests/unit/metricsConfig_test.go
+++ b/charts/falco/tests/unit/metricsConfig_test.go
@@ -26,16 +26,17 @@ import (
 )
 
 type metricsConfig struct {
-	Enabled                    bool   `yaml:"enabled"`
-	ConvertMemoryToMB          bool   `yaml:"convert_memory_to_mb"`
-	IncludeEmptyValues         bool   `yaml:"include_empty_values"`
-	KernelEventCountersEnabled bool   `yaml:"kernel_event_counters_enabled"`
-	ResourceUtilizationEnabled bool   `yaml:"resource_utilization_enabled"`
-	RulesCountersEnabled       bool   `yaml:"rules_counters_enabled"`
-	LibbpfStatsEnabled         bool   `yaml:"libbpf_stats_enabled"`
-	OutputRule                 bool   `yaml:"output_rule"`
-	StateCountersEnabled       bool   `yaml:"state_counters_enabled"`
-	Interval                   string `yaml:"interval"`
+	Enabled                          bool   `yaml:"enabled"`
+	ConvertMemoryToMB                bool   `yaml:"convert_memory_to_mb"`
+	IncludeEmptyValues               bool   `yaml:"include_empty_values"`
+	KernelEventCountersEnabled       bool   `yaml:"kernel_event_counters_enabled"`
+	KernelEventCountersPerCPUEnabled bool   `yaml:"kernel_event_counters_per_cpu_enabled"`
+	ResourceUtilizationEnabled       bool   `yaml:"resource_utilization_enabled"`
+	RulesCountersEnabled             bool   `yaml:"rules_counters_enabled"`
+	LibbpfStatsEnabled               bool   `yaml:"libbpf_stats_enabled"`
+	OutputRule                       bool   `yaml:"output_rule"`
+	StateCountersEnabled             bool   `yaml:"state_counters_enabled"`
+	Interval                         string `yaml:"interval"`
 }
 
 type webServerConfig struct {
@@ -63,7 +64,7 @@ func TestMetricsConfigInFalcoConfig(t *testing.T) {
 			"defaultValues",
 			nil,
 			func(t *testing.T, metricsConfig, webServerConfig any) {
-				require.Len(t, metricsConfig, 10, "should have ten items")
+				require.Len(t, metricsConfig, 11, "should have ten items")
 
 				metrics, err := getMetricsConfig(metricsConfig)
 				require.NoError(t, err)
@@ -78,6 +79,7 @@ func TestMetricsConfigInFalcoConfig(t *testing.T) {
 				require.True(t, metrics.LibbpfStatsEnabled)
 				require.True(t, metrics.OutputRule)
 				require.True(t, metrics.StateCountersEnabled)
+				require.False(t, metrics.KernelEventCountersPerCPUEnabled)
 
 				webServer, err := getWebServerConfig(webServerConfig)
 				require.NoError(t, err)
@@ -92,7 +94,7 @@ func TestMetricsConfigInFalcoConfig(t *testing.T) {
 				"metrics.enabled": "true",
 			},
 			func(t *testing.T, metricsConfig, webServerConfig any) {
-				require.Len(t, metricsConfig, 10, "should have ten items")
+				require.Len(t, metricsConfig, 11, "should have ten items")
 
 				metrics, err := getMetricsConfig(metricsConfig)
 				require.NoError(t, err)
@@ -107,6 +109,7 @@ func TestMetricsConfigInFalcoConfig(t *testing.T) {
 				require.True(t, metrics.LibbpfStatsEnabled)
 				require.False(t, metrics.OutputRule)
 				require.True(t, metrics.StateCountersEnabled)
+				require.False(t, metrics.KernelEventCountersPerCPUEnabled)
 
 				webServer, err := getWebServerConfig(webServerConfig)
 				require.NoError(t, err)
@@ -118,19 +121,20 @@ func TestMetricsConfigInFalcoConfig(t *testing.T) {
 		{
 			"Flip/Change Values",
 			map[string]string{
-				"metrics.enabled":                    "true",
-				"metrics.convertMemoryToMB":          "false",
-				"metrics.includeEmptyValues":         "true",
-				"metrics.kernelEventCountersEnabled": "false",
-				"metrics.resourceUtilizationEnabled": "false",
-				"metrics.rulesCountersEnabled":       "false",
-				"metrics.libbpfStatsEnabled":         "false",
-				"metrics.outputRule":                 "false",
-				"metrics.stateCountersEnabled":       "false",
-				"metrics.interval":                   "1s",
+				"metrics.enabled":                          "true",
+				"metrics.convertMemoryToMB":                "false",
+				"metrics.includeEmptyValues":               "true",
+				"metrics.kernelEventCountersEnabled":       "false",
+				"metrics.resourceUtilizationEnabled":       "false",
+				"metrics.rulesCountersEnabled":             "false",
+				"metrics.libbpfStatsEnabled":               "false",
+				"metrics.outputRule":                       "false",
+				"metrics.stateCountersEnabled":             "false",
+				"metrics.interval":                         "1s",
+				"metrics.kernelEventCountersPerCPUEnabled": "true",
 			},
 			func(t *testing.T, metricsConfig, webServerConfig any) {
-				require.Len(t, metricsConfig, 10, "should have ten items")
+				require.Len(t, metricsConfig, 11, "should have ten items")
 
 				metrics, err := getMetricsConfig(metricsConfig)
 				require.NoError(t, err)
@@ -145,6 +149,7 @@ func TestMetricsConfigInFalcoConfig(t *testing.T) {
 				require.False(t, metrics.LibbpfStatsEnabled)
 				require.False(t, metrics.OutputRule)
 				require.False(t, metrics.StateCountersEnabled)
+				require.True(t, metrics.KernelEventCountersPerCPUEnabled)
 
 				webServer, err := getWebServerConfig(webServerConfig)
 				require.NoError(t, err)
diff --git a/charts/falco/values.yaml b/charts/falco/values.yaml
index bd8f2a61..761e76b7 100644
--- a/charts/falco/values.yaml
+++ b/charts/falco/values.yaml
@@ -240,6 +240,8 @@ metrics:
   convertMemoryToMB: true
   # -- includeEmptyValues specifies whether the empty values should be included in the metrics.
   includeEmptyValues: false
+  # -- kernelEventCountersPerCPUEnabled specifies whether the event counters per cpu should be enabled.
+  kernelEventCountersPerCPUEnabled: false
   # -- service exposes the metrics service to be accessed from within the cluster.
   # ref: https://kubernetes.io/docs/concepts/services-networking/service/
   service:
@@ -265,8 +267,6 @@ mounts:
   volumes: []
   # -- A list of volumes you want to add to the Falco pods.
   volumeMounts: []
-  # -- By default, `/proc` from the host is only mounted into the Falco pod when `driver.enabled` is set to `true`. This flag allows it to override this behaviour for edge cases where `/proc` is needed but syscall data source is not enabled at the same time (e.g. for specific plugins).
-  enforceProcMount: false
 
 # Driver settings (scenario requirement)
 driver:
@@ -471,7 +471,7 @@ falcoctl:
     # -- The image repository to pull from.
     repository: falcosecurity/falcoctl
     # -- The image tag to pull.
-    tag: "0.9.0"
+    tag: "0.10.0"
   artifact:
     # -- Runs "falcoctl artifact install" command as an init container. It is used to install artfacts before
     # Falco starts. It provides them to Falco by using an emptyDir volume.
@@ -834,6 +834,15 @@ falco:
   # be added at a later stage, it is recommended to turn it off.
   json_include_tags_property: true
 
+  # [Incubating] `json_include_message_property`
+  #
+  # When using JSON output in Falco, you have the option to include the formatted
+  # rule output without timestamp or priority. For instance, if a rule specifies
+  # an "output" property like "Opened process %proc.name" the "message" field will
+  # only contain "Opened process bash" whereas the "output" field will contain more
+  # information.
+  json_include_message_property: false
+
   # [Stable] `buffered_outputs`
   #
   # -- Enabling buffering for the output queue can offer performance optimization,
@@ -841,30 +850,49 @@ falco:
   # output mechanism. By default, buffering is disabled (false).
   buffered_outputs: false
 
-  # [Stable] `outputs`
-  #
-  # -- A throttling mechanism, implemented as a token bucket, can be used to control
-  # the rate of Falco outputs. Each event source has its own rate limiter,
-  # ensuring that alerts from one source do not affect the throttling of others.
-  # The following options control the mechanism:
-  # - rate: the number of tokens (i.e. right to send a notification) gained per
-  #   second. When 0, the throttling mechanism is disabled. Defaults to 0.
-  # - max_burst: the maximum number of tokens outstanding. Defaults to 1000.
+  # [Sandbox] `append_output`
+  #
+  # Add information to the Falco output.
+  # With this setting you can add more information to the Falco output message, customizable by
+  # rule, tag or source.
+  # You can also add additional data that will appear in the output_fields property
+  # of JSON formatted messages or gRPC output but will not be part of the regular output message.
+  # This allows you to add custom fields that can help you filter your Falco events without
+  # polluting the message text.
+  #
+  # Each append_output entry has an optional `match` map which specifies which rules will be
+  # affected.
+  # `match`:
+  #   `rule`: append output only to a specific rule
+  #   `source`: append output only to a specific source
+  #   `tags`: append output only to rules that have all of the specified tags
+  # If none of the above are specified (or `match` is omitted)
+  # output is appended to all events.
+  # If more than one match condition is specified output will be appended to events
+  # that match all conditions.
+  # And several options to add output:
+  # `extra_output`: add output to the Falco message
+  # `extra_fields`: add new fields to the JSON output and structured output, which will not
+  #                 affect the regular Falco message in any way. These can be specified as a
+  #                 custom name with a custom format or as any supported field
+  #                 (see: https://falco.org/docs/reference/rules/supported-fields/)
   #
-  # For example, setting the rate to 1 allows Falco to send up to 1000
-  # notifications initially, followed by 1 notification per second. The burst
-  # capacity is fully restored after 1000 seconds of no activity.
+  # Example:
   #
-  # Throttling can be useful in various scenarios, such as preventing notification
-  # floods, managing system load, controlling event processing, or complying with
-  # rate limits imposed by external systems or APIs. It allows for better resource
-  # utilization, avoids overwhelming downstream systems, and helps maintain a
-  # balanced and controlled flow of notifications.
+  # append_output:
+  #   - match:
+  #       source: syscall
+  #     extra_output: "on CPU %evt.cpu"
+  #     extra_fields:
+  #       - home_directory: "${HOME}"
+  #       - evt.hostname
   #
-  # With the default settings, the throttling mechanism is disabled.
-  outputs:
-    rate: 0
-    max_burst: 1000
+  # In the example above every event coming from the syscall source will get an extra message
+  # at the end telling the CPU number. In addition, if `json_output` is true, in the "output_fields"
+  # property you will find three new ones: "evt.cpu", "home_directory" which will contain the value of the
+  # environment variable $HOME, and "evt.hostname" which will contain the hostname.
+  append_output: []
+
 
   ##########################
   # Falco outputs channels #
@@ -1323,6 +1351,9 @@ falco:
   # counters reflect monotonic values since Falco's start and are exported at a
   # constant stats interval.
   #
+  # `kernel_event_counters_per_cpu_enabled`: Detailed kernel event and drop counters
+  # per CPU. Typically used when debugging and not in production.
+  #
   # `libbpf_stats_enabled`: Exposes statistics similar to `bpftool prog show`,
   # providing information such as the number of invocations of each BPF program
   # attached by Falco and the time spent in each program measured in nanoseconds.
@@ -1352,6 +1383,7 @@ falco:
     libbpf_stats_enabled: true
     convert_memory_to_mb: true
     include_empty_values: false
+    kernel_event_counters_per_cpu_enabled: false
 
   #######################################