Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Testing framework shared between all the drivers #783

Closed
wants to merge 58 commits into from

Conversation

Andreagit97
Copy link
Member

What type of PR is this?

/kind cleanup

/kind feature

Any specific area of the project related to this PR?

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area tests

Does this PR require a change in the driver versions?

No

What this PR does / why we need it:

Finally, we have the testing framework shared between all the 3 drivers! Right now it supports only the syscalls implemented in the modern probe but every time we add a new syscall in the modern and so a new test we can check that this test runs also for the other 2 drivers.
I know this PR is quite huge, and sorry for that, when I started I didn't imagine it would become something like this 😟 BTW we can think of splitting it in some way not so easy to do but we can find something.

One of the most widespread issues is the conversion from 32-bit signed numbers to 64-bit signed numbers. What we usually do
is:

  • take a 32-bit signed number like -1 (FFFFFFFF)
  • put it into an unsigned long (00000000FFFFFFFF)
  • push it to userspace, which expects an int64 so -1 should be (FFFFFFFFFFFFFFFF)

We have other issues with syscalls that receive data like (recvfrom, recvmsg), because we rely on userspace structs that could be empty so in some cases, we are not able to extract the desired data (you can find more info in every single commit).

When a syscall doesn't behave as we expect I've chosen to skip the test case, but I don't skip it a priori, before doing that, I check that the event sent to userspace is correct, and only at the end I skip the test case, this is the reason why in the old probe or in the kmod you could see some tests SKIPPED

The idea is to run a sort of matrix in CI (I've still to check if all works well in CI), right now I would to run them on an ubuntu 22.04 machine and an ubuntu 20.04 one (only x86, unfortunately, we cannot run these tests in CI with aarch64 and s390x and that is really a pity since I tried locally and they work also on these architectures :( )

Which issue(s) this PR fixes:

Special notes for your reviewer:

I suggest reviewing it commit by commit, every commit should address a particular issue.
The PR is still not ready, I've to check the CI part, but I've tested it locally on the following machines:

  • 5.11.0-051100 (x86)
  • 5.15.0-1023 (x86)
  • 4.14.276 (x86)
  • 5.15.0-1019 (aarch64)
  • 5.11.0-051100 (s390x)

Does this PR introduce a user-facing change?:

NONE

@poiana
Copy link
Contributor

poiana commented Dec 9, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Andreagit97
Copy link
Member Author

and yes, I didn't change 202 files I've just renamed the external folder, and now we can build tests also without the modern bpf probe, I wrote everything in the README so please take a look :)

@Andreagit97
Copy link
Member Author

Cool it seems that gtest doesn't define GTEST_SKIP() when we use system deps 🙄

@hbrueckner
Copy link
Contributor

Hi @Andreagit97,

the PR reads awesome ❤️ ! And my thanks for also testing it on s390x! The s390 CI topic is still on my todo list ;-)

and yes, I didn't change 202 files I've just renamed the external folder, and now we can build tests also without the modern bpf probe, I wrote everything in the README so please take a look :)

I will have a look at it ;-)

@Andreagit97 Andreagit97 force-pushed the test_framework branch 2 times, most recently from 1b17109 to ede6d73 Compare December 9, 2022 12:32
@Andreagit97
Copy link
Member Author

The question about build-modern-bpf-arm64 and build-modern-bpf-s390x(point [2] #782) is quite clear here we are using 2 jobs just to build the cpp tests without running them :/

@Andreagit97
Copy link
Member Author

I supposed that ubuntu-20.04 and ubuntu-22.04 runners were different but they run on the same kernel so it's completely useless to have both of them I will remove ubuntu-20.04

Copy link
Member Author

@Andreagit97 Andreagit97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A sort of little self-review

build-and-test-modern-bpf-x86:
name: build-and-test-modern-bpf-x86 😇 (bundled_deps)
# This job checks that we correctly run `scap-open` for all the 3 drivers.
test-scap-open-x86:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a test to check all drivers correctly start (only x86, it would be amazing to test also other archs)

@@ -38,6 +38,7 @@ option(USE_BUNDLED_DEPS "Enable bundled dependencies instead of using the system
option(MINIMAL_BUILD "Produce a minimal build with only the essential features (no eBPF probe driver, no kubernetes, no mesos, no marathon and no container metadata)" OFF)
option(MUSL_OPTIMIZED_BUILD "Enable if you want a musl optimized build" OFF)
option(USE_BUNDLED_DRIVER "Use the driver/ subdirectory in the build process (only available in Linux)" ON)
option(ENABLE_DRIVERS_TESTS "Enable driver tests (bpf, kernel module, modern bpf)" OFF)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to disable it by default otherwise some CI jobs that use system deps will break since the gtest version is different

@@ -1227,7 +1215,8 @@ FILLER(sys_socketpair_x, true)
{
struct unix_sock *us = NULL;
struct sock *speer = NULL;
int fds[2] = { 0 };
/* In case of failure we send invalid fd (-1) */
int fds[2] = {-1, -1};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like in the modern probe

/* Levels different from `SOL_SOCKET` are not supported
* right now.
*/
if(level != SOL_SOCKET)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic is the same as before, i've just removed a nested if

Comment on lines +1286 to +1302
case SO_RCVTIMEO:
#endif
#if (defined(SO_RCVTIMEO_OLD) && !defined(SO_RCVTIMEO)) || (defined(SO_RCVTIMEO_OLD) && (SO_RCVTIMEO_OLD != SO_RCVTIMEO))
case SO_RCVTIMEO_OLD:
#endif
#if (defined(SO_RCVTIMEO_NEW) && !defined(SO_RCVTIMEO)) || (defined(SO_RCVTIMEO_NEW) && (SO_RCVTIMEO_NEW != SO_RCVTIMEO))
case SO_RCVTIMEO_NEW:
#endif
#ifdef SO_SNDTIMEO
case SO_SNDTIMEO:
case SO_SNDTIMEO:
#endif
if (bpf_probe_read(&u.tv, sizeof(u.tv), optval))
return PPM_FAILURE_INVALID_USER_MEMORY;
return bpf_val_to_ring_dyn(data, u.tv.tv_sec * 1000000000 + u.tv.tv_usec * 1000, PT_RELTIME, PPM_SOCKOPT_IDX_TIMEVAL);
#if (defined(SO_SNDTIMEO_OLD) && !defined(SO_SNDTIMEO)) || (defined(SO_SNDTIMEO_OLD) && (SO_SNDTIMEO_OLD != SO_SNDTIMEO))
case SO_SNDTIMEO_OLD:
#endif
#if (defined(SO_SNDTIMEO_NEW) && !defined(SO_SNDTIMEO)) || (defined(SO_SNDTIMEO_NEW) && (SO_SNDTIMEO_NEW != SO_SNDTIMEO))
case SO_SNDTIMEO_NEW:
#endif
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a workaround to avoid 2 case with the same value, for example:
if SO_SNDTIMEO is defined its value will be SO_SNDTIMEO_OLD or SO_SNDTIMEO_NEW so we cannot have a case for each of them

@@ -0,0 +1,79 @@
#include "../../event_class/event_class.h"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again this is not a new test case!

Comment on lines +27 to +39
if(ret_pid == 0)
{
/* In this way in the father we know if the call was successful or not. */
if(syscall(__NR_bpf, cmd, attr, size) == -1)
{
/* SUCCESS because we want the call to fail */
exit(EXIT_SUCCESS);
}
else
{
exit(EXIT_FAILURE);
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to call the bpf syscall from the child because the main process throws lots of bpf and sometimes we caught the wrong event from the ring buffer. With this workaround, we have just one bpf syscall produced by the child

Same for ioctl since the kmod use many ioctl calls

Comment on lines +144 to +151
/* Here we scan the parent just to obtain some info for the child */
struct proc_info info = {0};
pid_t pid = ::getpid();
if(!get_proc_info(pid, &info))
{
FAIL() << "Unable to get all the info from proc" << std::endl;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the approach in all process syscalls. Now the father collects some data and asserts some parameters of the child. In the previous approach, the child asserted itself but the problem is that now we use a scap_handle and this should not be shared between more than one process for this reason i chose this new method :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for all other process syscalls

@@ -0,0 +1,83 @@
#include "../../event_class/event_class.h"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a new test!

uint64_t request = SIOCGIFCOUNT;
char* argp = NULL;

/* Here we need to call the `ioctl` from a child because the main process throws lots of
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same approach described in bpf syscall

@Andreagit97
Copy link
Member Author

/hold

@Andreagit97
Copy link
Member Author

/milestone next-driver

@poiana poiana added this to the next-driver milestone Dec 9, 2022
Signed-off-by: Andrea Terzolo <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
Signed-off-by: Andrea Terzolo <[email protected]>
@Andreagit97
Copy link
Member Author

Probably this is too complex to review in one shot, I will keep this PR open just to show you the final result, but I will split its content into smaller PRs

We merged all the required PRs I think we can close this one :)

@hbrueckner
Copy link
Contributor

Hi @Andreagit97 ,
with rebase of #809, I started running the driver_test against the BPF probe. There is yet a different behavior for sys_accept_x where modern BPF tests for successful return and BPF driver always provides information. I would add below to #809 unless you like to have a separate PR for it. Wdyt?

--- a/driver/bpf/fillers.h
+++ b/driver/bpf/fillers.h
@@ -2867,25 +2867,32 @@ FILLER(sys_accept_x, true)
        int res = bpf_val_to_ring_type(data, (s64)fd, PT_FD);
        CHECK_RES(res);
 
-       /* Parameter 2: tuple (type: PT_SOCKTUPLE) */
-       long size = bpf_fd_to_socktuple(data, fd, NULL, 0, false, true, data->tmp_scratch);
-       data->curarg_already_on_frame = true;
-       res = __bpf_val_to_ring(data, 0, size, PT_SOCKTUPLE, -1, false, KERNEL);
-       CHECK_RES(res);
-
        u32 queuelen = 0;
        u32 queuemax = 0;
        u8 queuepct = 0;
 
-       /* Get the listening socket (first syscall parameter) */
-       s32 listening_fd = (s32)bpf_syscall_get_argument(data, 0);
-       struct socket * sock = bpf_sockfd_lookup(data, listening_fd);
-       struct sock *sk = _READ(sock->sk);
-       queuelen = _READ(sk->sk_ack_backlog);
-       queuemax = _READ(sk->sk_max_ack_backlog);
-       if(queuelen && queuemax)
+       if (fd >= 0)
+       {
+               /* Parameter 2: tuple (type: PT_SOCKTUPLE) */
+               long size = bpf_fd_to_socktuple(data, fd, NULL, 0, false, true, data->tmp_scratch);
+               data->curarg_already_on_frame = true;
+               res = __bpf_val_to_ring(data, 0, size, PT_SOCKTUPLE, -1, false, KERNEL);
+               CHECK_RES(res);
+
+               /* Get the listening socket (first syscall parameter) */
+               s32 listening_fd = (s32)bpf_syscall_get_argument(data, 0);
+               struct socket * sock = bpf_sockfd_lookup(data, listening_fd);
+               struct sock *sk = _READ(sock->sk);
+               queuelen = _READ(sk->sk_ack_backlog);
+               queuemax = _READ(sk->sk_max_ack_backlog);
+               if(queuelen && queuemax)
+               {
+                       queuepct = (u8)((u64)queuelen * 100 / queuemax);
+               }
+       }
+       else
        {
-               queuepct = (u8)((u64)queuelen * 100 / queuemax);
+               bpf_push_empty_param(data);
        }
 
        /* Parameter 3: queuepct (type: PT_UINT8) */

@Andreagit97
Copy link
Member Author

Hey @hbrueckner thank you for that! feel free to add these changes into #809 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants