new(driver, libsinsp, libscap): Add kernel signals exe_ino, exe_ino_ctime, exe_ino_mtime, pidns_init_start_ts + derived filter fields #595
Conversation
This is a huge PR @incertum!
Aside from this, it looks really cool, thank you!
Hello @incertum! The container drift detection seems to be something really significant, thanks for spending time on it and trying to detect this kind of behavior! The main problem of this approach is that it relies on overlayfs, and so it cannot work with old kernels or with container runtimes that do not use it. It also needs to be tested across a wider variety of kernels to be sure that it's working, since it was like an experiment for me. I would be happy to know what you think about it!
@FedeDP thank you, let me look into … @loresuso was actually lurking around that … @loresuso more signals are needed for detecting memory attacks or RCE in a more general and robust way (executables are just one aspect); one step at a time though. And I saw you also refactored the … Re fetching the container start time: the pid namespace creation time works too. Still monkeying around with whether this is best implemented kernel side. Something like fetching the start time of pid=1 as seen from the process namespace, or the creation ts of the pid namespace the process belongs to ... would you have any thoughts on this?
Hey folks, I'd like to add my thoughts to the discussion since I originally introduced the … Regarding attack scenarios, the proposed fields would give us another way to filter events and reduce the noise from this kind of rule. I would love for Falco to have a set of rules to deal with the standard "drop + execute" case. This is what comes to mind:
@incertum's idea is definitely clever, I think, as it allows us to add the time dimension to the above. You can say "if a regular user is running an executable that they can modify AND it has been modified 'recently', then alert". This allows us to detect drops without drowning in noise caused by system-wide software updates and new deployments. The same goes for containers; in that case I like the stronger properties of … In conclusion, I probably want all of these fields 😎 I actually wanted …
@LucaGuerra ❤️ 😎 as always a fantastic summary and technical assessment of what the actual problem here is. I fully agree that all these signals combined will be super valuable in addition to the existing metadata fields. It's nice to see three folks coming to similar conclusions, namely that (1) it is at process startup that we need to fetch better kernel signals, and (2) the old "drop + execute" problem has not yet been well addressed. Of course the "host" is the trickier one; that doesn't change the fact that I have been asked to fix / solve this ... so I am thinking we won't get away without determining a pattern of past behavior of the applications that are running, and analyzing behaviors outside that past behavior. There will be both data modeling challenges and software implementation challenges; the good news is that similar problems have been solved in the industry before and we can build upon this. Needless to say, let's start more basic and iterate. How about first merging @loresuso's PR that features …
Would you have ideas re the best forum to expand on those Threat Modeling discussions?
Yeah, you can never just have nice things in security, hence why I am a big fan of multi-signal correlations.
Thanks @incertum @LucaGuerra, this conversation is getting more and more interesting! I also believe that we have to expand the conversation (maybe in Slack or a GitHub issue?) to other attack patterns. I think we may want to research fileless execution a bit (especially the kind implemented with …)
Edited: We have moved all brainstorming to #615 in order to keep this PR focused.
Kernel-side solution, for robustness reasons: add the pid namespace init task start ts to generically approximate the container (or host) start ts, and compute time deltas useful for detections, such as the container duration or the duration between the pidns start ts and the ctime of the exe file when the pidns ts predates the ctime. A general detection use case: if suspicious events happen in multiple containers of a deployment near container start, it is more likely to be "normal"; the longer a container runs, the longer it is "exposed". What questions do you have re the proposed approach? Would it be possible to check the soundness of this approach? That would be much appreciated. Initial experimentation showed correct ts values for various scenarios, but I will continue testing.
Another kernel-side signal that I would like to look into and possibly add to this PR would be: …
Please note, I am not talking about the use case where you run the interpreter and pass the script, like … Any thoughts on the above? @LucaGuerra @loresuso @FedeDP @Andreagit97 After that, this PR should be feature complete and I can start finalizing it, followed by a code optimization review.
Signed-off-by: Melissa Kilby <[email protected]>
Consistently have constant m_boot_ts_epoch for pidns_init_start_ts when vpid != pid. Signed-off-by: Melissa Kilby <[email protected]>
* Add pidns_init_start_time to sched_prog_fork_3.
* Ensure consistent unsigned long long usage and init variables properly.
Signed-off-by: Melissa Kilby <[email protected]>
…to sched bpfs
* cleanup some debugging leftovers.
Signed-off-by: Melissa Kilby <[email protected]>
* address minor reviewer comments
* properly init some variables to 0 that were overlooked
* use new macro CHECK_RES(res)
* perform pidns start ts lookup only when in childtid (raw syscall tracepoints)
* formalize consistent helper function epoch_ns_from_time also in modern_bpf
* minor modern_bpf refactor based on reviewer comments
* additional cleanup after a fresh look
Co-authored-by: Andrea Terzolo <[email protected]> Signed-off-by: Melissa Kilby <[email protected]>
* remove redundant CHECK_RES(res) when possible
* cleanup epoch_ns_from_time helper function
* modern_bpf: rename function variable for extract__task_pidns_start_time
Co-authored-by: Hendrik Brueckner <[email protected]> Signed-off-by: Melissa Kilby <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
Co-authored-by: Luca Guerra <[email protected]> Signed-off-by: Melissa Kilby <[email protected]>
The last commit should fix the Windows CI; just removed …
lol classic, thanks for keeping the new, more reliable method to get a constant boot ts :)
/approve
LGTM label has been added. Git tree hash: 7f7068dcd291f5ed76d4ec430bd06adb37f263bf
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: FedeDP, incertum, LucaGuerra.
Needs approval from an approver in each of these files: …
Approvers can indicate their approval by writing …
What type of PR is this?
/kind cleanup
/kind feature
Any specific area of the project related to this PR?
/area driver-kmod
/area driver-bpf
/area driver-modern-bpf
/area libscap
/area libsinsp
Does this PR require a change in the driver versions?
/version driver-SCHEMA-version-major
What this PR does / why we need it:
Dropping an implant, making the file executable, and executing the implant is among the oldest tricks. While memory-based cyber attacks mostly circumvent touching disk, reliably detecting drift, that is, a suspicious new executable being executed, is often considered a crucial baseline detection.
Falco's upstream rules `Container Drift Detected (chmod)` and `Container Drift Detected (open+create)` aim to detect the creation of a new executable in a container (drift). However, both rules are disabled by default, because they can be noisy in un-profiled environments and workloads. Finally, there are currently no easy or robust mechanisms to correlate the above rules, which are based on file operation events, with the events where the executable is run (`execve`).

This PR attempts to address this gap by adding enhanced kernel signals to spawned processes. While the proposed signals won't replace the need to monitor file operation events, they can help reduce the search space for tracking spawned processes where, for example, `chmod +x` was run against the executable file on disk prior to execution (this causes the inode's `ctime` to change, but we don't know whether it was chmod-related or a different status change operation). In addition, end users could use these fields in selected rules to augment the information available for incident response.

New derived filter fields based on new kernel signals
* `proc.exe_ino` — "Inode number of executable image file on disk": The inode number of the executable image file on disk. Can be correlated with `fd.ino`.
* `proc.exe_ino.ctime` — "Last status change time (ctime - epoch ns) of exe file on disk": Last status change time (ctime - epoch nanoseconds) of the executable image file on disk (inode->ctime). The time is changed by writing or by setting inode information, e.g. owner, group, link count, mode etc.
* `proc.exe_ino.mtime` — "Last modification time (mtime - epoch ns) of exe file on disk": Last modification time (mtime - epoch nanoseconds) of the executable image file on disk (inode->mtime). The time is changed by file modifications, e.g. by mknod, truncate, utime, write of more than zero bytes etc. For tracking changes in owner, group, link count or mode, use `proc.exe_ino.ctime` instead.
* `proc.exe_ino.ctime_duration_proc_start` — "Number of nanoseconds between ctime exe file and proc clone ts": Number of nanoseconds between modifying the status of the executable image and spawning a new process using the changed executable image.
* `proc.exe_ino.ctime_duration_pidns_start` — "Number of nanoseconds between pidns start ts and ctime exe file": Number of nanoseconds between the pid namespace start ts and the ctime of the exe file, if the pidns start predates the ctime.
* `proc.pidns_init_start_ts` — "Start ts of pid namespace (epoch ns)": Start ts (epoch ns) of the pid namespace; approximate start ts of the container if the pid is in a container, or start ts of the host if the pid is in the host namespace.
* `container.start_ts` — "Container start ts (epoch in ns)": Container start ts (epoch in ns) based on `proc.pidns_init_start_ts`.
* `container.duration` — "Number of nanoseconds since the container start ts".

Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Includes cleanup, mainly making the `sched_prog_exec_4` and `execve_family_flags` fillers alike in terms of style. Refactored (no logic changes) `get_exe_writable` to avoid a few redundant `_READ()`s on the same kernel structures within the same filler (@LucaGuerra).

This PR is not yet ready. Hoping for some early feedback to make these new signals better :)
Checklist (this PR)
Checklist (future PR)
Does this PR introduce a user-facing change?: