Falco unable to retrieve correct uid and container names from LXC containers #3213

Closed
Lameterra opened this issue May 23, 2024 · 13 comments

@Lameterra

Describe the bug
Falco has a hard time retrieving information from LXC containers (using falco-modern-bpf installed on the host).
Commonly affected fields are %container.name, %container.id and %user.loginname.

We have a rule that triggers an event log whenever a command is executed by a physical or virtual user, with some conditions so that we don't log every command a user runs.

For users, the problem is that the logs Falco generates contain <NA>, nulls, and sometimes wrong values when a host and its container have different usernames for the same uid. The value retrieved is the name associated with the uid in the host's /etc/passwd file, which can be wrong because that uid may belong to a different user inside the container. Moreover, if the uid does not exist on the host, the value can be blank or <NA>.

Regarding the container name, an LXC container is always named <NA>.
We don't have the same problem with our Docker containers or bare-metal machines, even though the rule is the same.

How to reproduce it
Install Falco on a machine hosting LXC containers with an Ansible playbook, following the installation details, and use our rule:

- rule: User executing commands
  desc: Detect when a user executes commands
  condition: >
    evt.type = execve and
    container.id != host and
    (proc.pname in (bash, sh, zsh, fish, <NA>)) and
    proc.cmdline != bash and
    not user.name in (jenkins, awx, nagios, prometheus)
  output: >
    User has executed a command (proc_cwd=%proc.cwd proc_pcmdline=%proc.pcmdline user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid user_loginname=%user.loginname group_gid=%group.gid group_name=%group.name process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline proc_ses=%proc.sid proc_vpid=%proc.vpid proc_vpgid=%proc.vpgid terminal=%proc.tty timestamp=%evt.datetime.s hostname=%evt.hostname container_id=%container.id container_name=%container.name)
  priority: INFORMATIONAL

Expected behaviour
It should display the expected container hostname instead of <NA> and give the username associated with the uid inside the container.

Screenshots

Environment

  • Falco version:
Falco version: 0.37.1    <---
Libs version:  0.14.3
Plugin API:    3.2.0
Engine:        0.31.0
Driver:
  API version:    8.0.0
  Schema version: 2.0.0
  Default driver: 7.0.0+driver
  • System info:
Thu May 23 10:28:31 2024: Falco version: 0.37.1 (x86_64)
Thu May 23 10:28:31 2024: Falco initialized with configuration file: /etc/falco/falco.yaml
Thu May 23 10:28:31 2024: System info: Linux version 6.1.0-20-amd64 ([email protected]) (gcc-12 (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC Debian 6.1.85-1 (2024-04-11)
Thu May 23 10:28:31 2024: Loading rules from file /etc/falco/falco_rules.yaml
Thu May 23 10:28:32 2024: Loading rules from file /etc/falco/rules.d/custom_rules.yaml
{
  "machine": "x86_64",
  "nodename": "lxc-machine-52",
  "release": "6.1.0-20-amd64",
  "sysname": "Linux",
  "version": "#1 SMP PREEMPT_DYNAMIC Debian 6.1.85-1 (2024-04-11)"
}
  • Cloud provider or hardware configuration:
    LXC
  • OS:
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
  • Kernel:
Linux lxc-machine-52 6.1.0-20-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.85-1 (2024-04-11) x86_64 GNU/Linux
  • Installation method:

Installed following the installation details in the Falco documentation, through an Ansible playbook we developed.

Additional context
We don't install clang, llvm or dialog (we set FALCO_FRONTEND to noninteractive and set FALCO_DRIVER_CHOICE during one playbook task).

@FedeDP
Contributor

FedeDP commented May 28, 2024

Hi! Thanks for opening this issue!

For the users, the problem is that logs generated by events from Falco contains <NA>, nulls, and sometimes wrong values when a host and its container have a different username for the same uid. The value retrieved is the name associated with the uid in the /etc/passwd file from the host machine, which can be wrong as the user corresponding to this uid is not the same on the container. Moreover, if the uid is non-existent in the host, the value can be blank or <NA>.

When the uid is matched against a host user, it means that Falco failed to detect that the process is inside a container, and it is therefore looking up the uid in the host user map.
When it is <NA>, it means that Falco could not find any host user with the requested uid.
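
As a rough illustration of the lookup behaviour described here, the following Python sketch resolves the same uid against two user maps. It is not Falco's actual code: the passwd contents, `parse_passwd` and `resolve` are all hypothetical and only mimic the symptom from the report (a host name returned for a container uid, or <NA> when the uid is unknown on the host).

```python
# Hypothetical passwd contents. On a real system these would be the host's
# /etc/passwd and the passwd file inside the container's root filesystem.
HOST_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash
"""

CONTAINER_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
deploy:x:1000:1000::/home/deploy:/bin/bash
ee:x:5667:5667::/home/ee:/bin/bash
"""


def parse_passwd(text):
    """Build a uid -> username map from passwd-formatted text."""
    users = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # skip malformed entries
        users[int(fields[2])] = fields[0]
    return users


def resolve(uid, user_map):
    """Return the username for uid, or "<NA>" when the map has no entry."""
    return user_map.get(uid, "<NA>")


host_users = parse_passwd(HOST_PASSWD)
container_users = parse_passwd(CONTAINER_PASSWD)

# If the process is wrongly treated as a host process, the host map is used.
print(resolve(1000, host_users))       # name from the host, wrong for the container
print(resolve(1000, container_users))  # name the container actually uses
print(resolve(5667, host_users))       # uid unknown on the host
```

With these made-up files, uid 1000 resolves to a different name depending on which map is consulted, and uid 5667 has no host entry at all, which matches the wrong and <NA> values reported above.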

Regarding the container name, an LXC container is always named <NA>.

I think we are simply failing to detect LXC containers (i.e. the support might be broken :/ ).
I will take a look once the new release (Falco 0.38.0) is out in the upcoming days!

/assign

/milestone 0.39.0

@FedeDP
Contributor

FedeDP commented May 29, 2024

Hi! I opened falcosecurity/libs#1879 with the fix; it seems we never supported the new LXC cgroup layout (introduced in LXC 4.0): https://linuxcontainers.org/lxc/news/2020_03_25_13_03.html
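
To illustrate the kind of cgroup matching involved, here is a minimal Python sketch. It is not the code from falcosecurity/libs#1879. It assumes that the legacy layout places a container under `/lxc/<name>` and that the LXC 4.0+ layout uses `/lxc.payload.<name>`; the function name and the sample paths are hypothetical.

```python
import re

# Assumed cgroup layouts (not taken from the Falco sources):
#   legacy:     .../lxc/<name>/...
#   LXC >= 4.0: .../lxc.payload.<name>/...
_LXC_PATTERNS = (
    re.compile(r"/lxc\.payload\.([^/]+)"),
    re.compile(r"/lxc/([^/]+)"),
)


def lxc_name_from_cgroup(path):
    """Return the LXC container name embedded in a cgroup path, or None."""
    for pattern in _LXC_PATTERNS:
        match = pattern.search(path)
        if match:
            return match.group(1)
    return None


# Hypothetical cgroup paths for illustration.
for sample in (
    "/lxc.payload.lxc-machine-25/system.slice/ssh.service",
    "/lxc/web01",
    "/user.slice/user-1000.slice",
):
    print(sample, "->", lxc_name_from_cgroup(sample))
```

A matcher that only knows the legacy `/lxc/` prefix would return None for the first path, which is consistent with container names showing up as <NA>.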

Unfortunately, the fix will be part of the next Falco release (ie: Falco 0.39 released by the end of September).
I will ping you once the fix is merged and ready to be tested from Falco master images, if you are willing to test the fix!

@Lameterra
Author

Hi Federico!
Thanks a lot for your quick feedback on this problem.

We would be glad to give your fix a try once it's merged. Feel free to ping me once it's done.
Thanks again for your support on this issue.

@FedeDP
Contributor

FedeDP commented May 29, 2024

In the end, since we discovered a more worrisome bug in libs 0.17.0 (to be used by Falco 0.38.0), I decided to also include the LXC fix in libs 0.17.1; Falco 0.38.0 will be released with the fix 😃 #3221

@FedeDP
Contributor

FedeDP commented May 29, 2024

/milestone 0.38.0

@poiana poiana modified the milestones: 0.39.0, 0.38.0 May 29, 2024
@FedeDP
Contributor

FedeDP commented May 29, 2024

Can you try with https://github.com/falcosecurity/falco/releases/tag/0.38.0-rc5?
You can find the Docker images on Docker Hub or the rpm/deb packages on download.falco.org

@Lameterra
Author

Thanks for the update !
We will try it tomorrow and let you know how it goes

@FedeDP
Contributor

FedeDP commented May 30, 2024

Falco 0.38.0 is now released with my LXC fix ;) so you can now use normal Falco images!

@Lameterra
Author

Hi Federico,
We tried the test version you linked yesterday. At first glance, there is a huge improvement in the information retrieved from LXC hosts and containers.
About LXC container names, it's all good. No more <NA> occurring, your fix did the job!
Regarding user_loginname, there is also a nice improvement. We still get some <NA>, but I think this is not related to LXC itself, as it also occurs on the host, for example.

To illustrate, below are two Falco logs:

14:12:44.265659523: Informational User has executed a command (proc_cwd=/ proc_pcmdline=sh -c dpkg -l auditd | grep auditd | awk '{print $3}' user=root user_uid=0 user_loginuid=-1 user_loginname=<NA> group_gid=0 group_name=root process=dpkg proc_exepath=/usr/bin/dpkg parent=sh command=dpkg -l auditd proc_ses=4193792 proc_vpid=2700422 proc_vpgid=4193792 terminal=0 timestamp=2024-05-30 14:12:44 hostname=lxc-host-2 container_id=host container_name=host)

14:12:41.342505987: Informational User has executed a command (proc_cwd=/var/www/ee/site/releases/7d6482598b6614088b39bebc0ad7c5be2b178541/ proc_pcmdline=sh -c stty 2>&1 user=ee user_uid=5667 user_loginuid=-1 user_loginname=<NA> group_gid=5667 group_name=ee process=sh proc_exepath=/usr/bin/bash parent=sh command=sh -c stty 2>&1 proc_ses=1094273 proc_vpid=1953761 proc_vpgid=2603583 terminal=0 timestamp=2024-05-30 14:12:41 hostname=lxc-host-2 container_id=lxc-machine-25 container_name=lxc-machine-25)

These two entries share a value of -1 in the user_loginuid field.
Do you have any idea why?

And thanks for the update regarding the release of Falco 0.38.0 !

@FedeDP
Contributor

FedeDP commented May 30, 2024

Hi! Yes, user_loginuid can be -1 when the command is not executed from a login shell; for example, see this stackoverflow post: https://stackoverflow.com/questions/22914627/some-uids-in-proc-pid-loginuid-are-strange
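
For context, the kernel exposes the login uid in /proc/<pid>/loginuid, and an unset value reads as 4294967295, which is -1 when interpreted as a signed 32-bit integer. The Python sketch below shows that conversion; it is only an illustration under that assumption, and the helper names are hypothetical rather than anything from Falco.

```python
UNSET_LOGINUID = 0xFFFFFFFF  # 4294967295, i.e. (uint32)-1, meaning "not set"


def loginuid_as_signed(raw):
    """Convert the text read from /proc/<pid>/loginuid to -1 or a real uid."""
    value = int(raw.strip())
    return -1 if value == UNSET_LOGINUID else value


def read_loginuid(pid="self"):
    """Read the login uid of a process (Linux only, hypothetical helper)."""
    with open(f"/proc/{pid}/loginuid") as handle:
        return loginuid_as_signed(handle.read())


print(loginuid_as_signed("4294967295\n"))  # unset login uid
print(loginuid_as_signed("1000\n"))        # login uid set at session start
```

Processes whose session never had a login uid assigned report the unset value, so user_loginname has nothing to resolve for them and <NA> is expected.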

@Lameterra
Author

Oh, I was not aware of this, thank you for the link. We will deal with it directly through Falco rules then.

Well I think that is all regarding this issue, you fixed our problem and I am grateful for that. Thanks a lot !

@FedeDP
Contributor

FedeDP commented May 30, 2024

You are welcome!
Thanks for helping us with testing :)
/close

@poiana poiana closed this as completed May 30, 2024
