Skip to content

Commit

Permalink
redaction: fix crash related to redaction filters + improve performance
Browse files Browse the repository at this point in the history
[upstream commit: 91012a8]

The previous redaction filters implementation was unsound, as it made modifications to the
process object within an event message, but this object was already being shared in the
event cache. This caused a data race that in turn resulted in a frequent crash when
applying redaction filters on a modest workload. Moreover, the prior implementation was
not very performant, since it relied on iterating over all fields in an event message to
redact strings. In practice, we only really care about arguments in the majority of use
cases.

To address the above issues, we make a small breaking change in how the redaction filters
work to focus ONLY on arguments and apply them much earlier in the pipeline. This both
fixes the crash and significantly reduces performance impact, while solving the primary
use case.

Signed-off-by: William Findlay <[email protected]>
  • Loading branch information
willfindlay committed Apr 11, 2024
1 parent f531959 commit 7b9b305
Show file tree
Hide file tree
Showing 27 changed files with 541 additions and 1,499 deletions.
3 changes: 2 additions & 1 deletion api/v1/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -1158,8 +1158,9 @@ AggregationOptions defines configuration options for aggregating events.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| match | [Filter](#tetragon-Filter) | repeated | Match events that the redaction filter will apply to. |
| match | [Filter](#tetragon-Filter) | repeated | **Deprecated.** Deprecated, do not use. |
| redact | [string](#string) | repeated | Regular expressions to use for redaction. Strings inside capture groups are redacted. |
| binary_regex | [string](#string) | repeated | Regular expression to match binary name. If supplied, redactions will only be applied to matching processes. |



Expand Down
236 changes: 125 additions & 111 deletions api/v1/tetragon/events.pb.go

Large diffs are not rendered by default.

6 changes: 4 additions & 2 deletions api/v1/tetragon/events.proto
Original file line number Diff line number Diff line change
Expand Up @@ -54,10 +54,12 @@ message Filter {
}

message RedactionFilter {
// Match events that the redaction filter will apply to.
repeated Filter match = 1;
// Deprecated, do not use.
repeated Filter match = 1 [deprecated=true];
// Regular expressions to use for redaction. Strings inside capture groups are redacted.
repeated string redact = 2;
// Regular expression to match binary name. If supplied, redactions will only be applied to matching processes.
repeated string binary_regex = 3;
}

// Determins the behaviour of a field filter
Expand Down
17 changes: 11 additions & 6 deletions cmd/tetragon/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -103,9 +103,16 @@ func getFieldFilters() ([]*tetragon.FieldFilter, error) {
return filters, nil
}

func getRedactionFilters() (fieldfilters.RedactionFilterList, error) {
func setRedactionFilters() error {
var err error
redactionFilters := viper.GetString(option.KeyRedactionFilters)
return fieldfilters.ParseRedactionFilterList(redactionFilters)
fieldfilters.RedactionFilters, err = fieldfilters.ParseRedactionFilterList(redactionFilters)
if err == nil {
log.WithFields(logrus.Fields{"redactionFilters": redactionFilters}).Info("Configured redaction filters")
} else {
log.WithError(err).Error("Error configuring redaction filters")
}
return err
}

// Save daemon information so it is used by client cli but
Expand Down Expand Up @@ -414,18 +421,16 @@ func tetragonExecute() error {

hookRunner := rthooks.GlobalRunner().WithWatcher(k8sWatcher)

redactionFilters, err := getRedactionFilters()
err = setRedactionFilters()
if err != nil {
return err
}
log.WithFields(logrus.Fields{"redactionFilters": redactionFilters}).Info("Configured redaction filters")

pm, err := tetragonGrpc.NewProcessManager(
ctx,
&cleanupWg,
observer.GetSensorManager(),
hookRunner,
redactionFilters)
hookRunner)
if err != nil {
return err
}
Expand Down

Large diffs are not rendered by default.

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 4 additions & 5 deletions docs/content/en/docs/concepts/events.md
Original file line number Diff line number Diff line change
Expand Up @@ -204,7 +204,7 @@ Since Tetragon traces the entire system, event exports might sometimes contain
sensitive information (for example, a secret passed via a command line argument
to a process). To prevent this information from being exfiltrated via Tetragon
JSON export, Tetragon provides a mechanism called Redaction Filters which can be
used to select events and string patterns to redact. These filters are written
used to string patterns to redact from exported process arguments. These filters are written
in JSON and passed to the Tetragon agent via the `--redaction-filters` command
line flag or the `redactionFilters` Helm value.

Expand All @@ -217,9 +217,8 @@ When writing regular expressions in JSON, it is important to escape backslash
characters. For instance `\Wpasswd\W?` would be written as `{"redact": "\\Wpasswd\\W?"}`.
{{< /warning >}}

Redaction filters select events using the `match` field, which contains one or
more filters (these filters are defined the same way as export filters). If no
match filter is defined, all events are selected.
For more control, you can select which binary or binaries should have their
arguments redacted with the `binary_regex` field.

As a concrete example, the following will redact all passwords passed to
processes with the `"--password"` argument:
Expand All @@ -235,7 +234,7 @@ Suppose we also see some passwords passed via the -p shorthand for a specific bi
We can also redact these as follows:

```json
{"match": [{"binary_regex": "(?:^|/)foo$"}], "redact": ["-p(?:\\s+|=)(\\S*)"]}
{"binary_regex": ["(?:^|/)foo$"], "redact": ["-p(?:\\s+|=)(\\S*)"]}
```

With both of the above redaction filters in place, we are now redacting all
Expand Down
3 changes: 2 additions & 1 deletion docs/content/en/docs/reference/grpc-api.md
Original file line number Diff line number Diff line change
Expand Up @@ -705,8 +705,9 @@ AggregationOptions defines configuration options for aggregating events.

| Field | Type | Label | Description |
| ----- | ---- | ----- | ----------- |
| match | [Filter](#tetragon-Filter) | repeated | Match events that the redaction filter will apply to. |
| match | [Filter](#tetragon-Filter) | repeated | **Deprecated.** Deprecated, do not use. |
| redact | [string](#string) | repeated | Regular expressions to use for redaction. Strings inside capture groups are redacted. |
| binary_regex | [string](#string) | repeated | Regular expression to match binary name. If supplied, redactions will only be applied to matching processes. |

<a name="tetragon-EventType"></a>

Expand Down
9 changes: 4 additions & 5 deletions install/kubernetes/values.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -106,14 +106,13 @@ tetragon:
# {"event_set": ["PROCESS_KPROBE"], "fields": "process", "action": "INCLUDE"}
#
fieldFilters: ""
# Filters to redact secrets from string fields in Tetragon events. To perform
# Filters to redact secrets from the args fields in Tetragon events. To perform
# redactions, redaction filters define regular expressions in the `redact`
# field. Any capture groups in these regular expressions are redacted and
# replaced with "*****".
#
# Redaction filters select events using the `match` field, which contains one
# or more filters (these filters are defined the same way as export filters).
# If no match filter is defined, all events are selected.
# For more control, you can select which binary or binaries should have their
# arguments redacted with the `binary_regex` field.
#
# NOTE: When writing regular expressions in JSON, it is important to escape
# backslash characters. For instance `\Wpasswd\W?` would be written as
Expand All @@ -130,7 +129,7 @@ tetragon:
# Suppose we also see some passwords passed via the -p shorthand for a specific binary, foo.
# We can also redact these as follows:
#
# {"match": [{"binary_regex": "(?:^|/)foo$"}], "redact": ["-p(?:\\s+|=)(\\S*)"]}
# {"binary_regex": ["(?:^|/)foo$"], "redact": ["-p(?:\\s+|=)(\\S*)"]}
#
# With both of the above redaction filters in place, we are now redacting all
# password arguments.
Expand Down
4 changes: 1 addition & 3 deletions pkg/bench/bench.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,6 @@ import (
"github.com/cilium/tetragon/pkg/cilium"
"github.com/cilium/tetragon/pkg/defaults"
"github.com/cilium/tetragon/pkg/exporter"
"github.com/cilium/tetragon/pkg/fieldfilters"
"github.com/cilium/tetragon/pkg/grpc"
"github.com/cilium/tetragon/pkg/logger"
"github.com/cilium/tetragon/pkg/observer"
Expand Down Expand Up @@ -229,8 +228,7 @@ func startBenchmarkExporter(ctx context.Context, obs *observer.Observer, summary
ctx,
&wg,
observer.GetSensorManager(),
hookRunner,
fieldfilters.RedactionFilterList{})
hookRunner)
if err != nil {
return err
}
Expand Down
5 changes: 2 additions & 3 deletions pkg/exporter/exporter_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,6 @@ import (

"github.com/cilium/tetragon/api/v1/tetragon"
"github.com/cilium/tetragon/pkg/encoder"
"github.com/cilium/tetragon/pkg/fieldfilters"
"github.com/cilium/tetragon/pkg/ratelimit"
"github.com/cilium/tetragon/pkg/rthooks"
"github.com/cilium/tetragon/pkg/server"
Expand Down Expand Up @@ -86,7 +85,7 @@ func TestExporter_Send(t *testing.T) {
eventNotifier := newFakeNotifier()
ctx, cancel := context.WithCancel(context.Background())
dr := rthooks.DummyHookRunner{}
grpcServer := server.NewServer(ctx, &wg, eventNotifier, &server.FakeObserver{}, dr, fieldfilters.RedactionFilterList{})
grpcServer := server.NewServer(ctx, &wg, eventNotifier, &server.FakeObserver{}, dr)
numRecords := 2
results := newArrayWriter(numRecords)
encoder := encoder.NewProtojsonEncoder(results)
Expand Down Expand Up @@ -191,7 +190,7 @@ func Test_rateLimitExport(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
eventNotifier := newFakeNotifier()
dr := rthooks.DummyHookRunner{}
grpcServer := server.NewServer(ctx, &wg, eventNotifier, &server.FakeObserver{}, dr, fieldfilters.RedactionFilterList{})
grpcServer := server.NewServer(ctx, &wg, eventNotifier, &server.FakeObserver{}, dr)
results := newArrayWriter(tt.totalEvents)
encoder := encoder.NewProtojsonEncoder(results)
request := &tetragon.GetEventsRequest{}
Expand Down
23 changes: 23 additions & 0 deletions pkg/fieldfilters/benchmark_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -177,3 +177,26 @@ func BenchmarkSerialize_FieldFilters_NoProcesInfoKeepExecid(b *testing.B) {
assert.NoError(b, err, "event must encode")
}
}

// Apply a redaction filter to the event. This doesn't exactly capture how it's done in the real code path but it's a close approximation.
func BenchmarkSerialize_RedactionFilters(b *testing.B) {
b.StopTimer()
gen := newRandomEventGenerator(b, Seed)
encoder := getEncoder()
evs := gen.GenerateN(b)
filterList := `{"redact": ["(a)"]}`
ff, err := ParseRedactionFilterList(filterList)
require.NoError(b, err)
b.StartTimer()

for i := 0; i < b.N; i++ {
ev := evs[i]
getProcess, ok := ev.Event.(interface{ GetProcess() *tetragon.Process })
if ok {
process := getProcess.GetProcess()
process.Arguments = ff.Redact(process.Binary, process.Arguments)
}
err := encoder.Encode(ev)
assert.NoError(b, err, "event must encode")
}
}
Loading

0 comments on commit 7b9b305

Please sign in to comment.