Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(libsinsp/container_engine): proper containerd support #2195

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

therealbobo
Copy link
Contributor

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

What this PR does / why we need it:

Up until now the container information were populated only through the cri api. There are cases in which this is not sufficient (e.g. bottlerocket): for example the containerd runtime keeps the containers creates through the cri api separated from the ones created with the containerd api. This PR aims to support also the containers created with the vanilla containerd api.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@poiana
Copy link
Contributor

poiana commented Dec 9, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: therealbobo
Once this PR has been reviewed and has the lgtm label, please assign incertum for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana requested review from hbrueckner and mstemm December 9, 2024 17:37
@poiana poiana added the size/XXL label Dec 9, 2024
@therealbobo therealbobo force-pushed the broader-container-id-support branch from 893a76f to 7c3c360 Compare December 9, 2024 17:42
if(m_container_engine_mask & (1 << CT_CONTAINERD)) {
auto containerd_engine = std::make_shared<container_engine::containerd>(*this);
m_container_engines.push_back(containerd_engine);
// m_container_engine_by_type[CT_CONTAINERD] = containerd_engine;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This depends on this #2141

Copy link

github-actions bot commented Dec 9, 2024

Perf diff from master - unit tests

     8.52%     -0.70%  [.] sinsp_evt::get_type
    11.65%     -0.63%  [.] sinsp::next
     3.86%     -0.56%  [.] gzfile_read
     5.80%     +0.46%  [.] next_event_from_file
     0.69%     +0.41%  [.] sinsp_evt::get_direction
     0.73%     +0.40%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char const*>
     0.53%     +0.37%  [.] std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>
     3.31%     +0.30%  [.] sinsp_evt::load_params
     0.16%     +0.26%  [.] libsinsp::runc::match_one_container_id
     0.59%     -0.25%  [.] std::_Hashtable<long, std::pair<long const, std::shared_ptr<sinsp_fdinfo> >, std::allocator<std::pair<long const, std::shared_ptr<sinsp_fdinfo> > >, std::__detail::_Select1st, std::equal_to<long>, std::hash<long>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node

Heap diff from master - unit tests

peak heap memory consumption: -5.79K
peak RSS (including heaptrack overhead): 0B
total memory leaked: -3.93K

Heap diff from master - scap file

peak heap memory consumption: -5.97K
peak RSS (including heaptrack overhead): 0B
total memory leaked: -3.93K

Benchmarks diff from master

Comparing gbench_data.json to /root/actions-runner/_work/libs/libs/build/gbench_data.json
Benchmark                                                         Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------
BM_sinsp_split_mean                                            +0.0211         +0.0211           144           147           144           147
BM_sinsp_split_median                                          +0.0150         +0.0150           145           147           145           147
BM_sinsp_split_stddev                                          -0.3990         -0.3981             3             2             3             2
BM_sinsp_split_cv                                              -0.4114         -0.4106             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_mean                  -0.0459         -0.0459            63            60            63            60
BM_sinsp_concatenate_paths_relative_path_median                -0.0437         -0.0437            63            60            63            60
BM_sinsp_concatenate_paths_relative_path_stddev                +1.2445         +1.2470             0             0             0             0
BM_sinsp_concatenate_paths_relative_path_cv                    +1.3525         +1.3551             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_mean                     +0.0309         +0.0309            24            25            24            25
BM_sinsp_concatenate_paths_empty_path_median                   +0.0310         +0.0310            24            24            24            24
BM_sinsp_concatenate_paths_empty_path_stddev                   -0.0794         -0.0792             0             0             0             0
BM_sinsp_concatenate_paths_empty_path_cv                       -0.1070         -0.1068             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_mean                  -0.0413         -0.0413            64            61            64            61
BM_sinsp_concatenate_paths_absolute_path_median                -0.0435         -0.0435            64            61            64            61
BM_sinsp_concatenate_paths_absolute_path_stddev                +1.3749         +1.3749             0             0             0             0
BM_sinsp_concatenate_paths_absolute_path_cv                    +1.4773         +1.4773             0             0             0             0
BM_sinsp_split_container_image_mean                            +0.0159         +0.0158           390           396           390           396
BM_sinsp_split_container_image_median                          +0.0140         +0.0140           390           395           390           395
BM_sinsp_split_container_image_stddev                          -0.0367         -0.0364             3             3             3             3
BM_sinsp_split_container_image_cv                              -0.0517         -0.0514             0             0             0             0

Copy link

codecov bot commented Dec 9, 2024

Codecov Report

Attention: Patch coverage is 29.82456% with 160 lines in your changes missing coverage. Please review.

Project coverage is 75.05%. Comparing base (9ee57c8) to head (3978470).

Files with missing lines Patch % Lines
userspace/libsinsp/container_engine/containerd.cpp 10.44% 120 Missing ⚠️
userspace/libsinsp/container_engine/containerd.h 0.00% 23 Missing ⚠️
userspace/libsinsp/container.h 28.57% 5 Missing ⚠️
userspace/libsinsp/container.cpp 90.90% 3 Missing ⚠️
userspace/libsinsp/container_engine/cri.cpp 85.71% 2 Missing ⚠️
userspace/libsinsp/container_engine/bpm.cpp 0.00% 1 Missing ⚠️
...serspace/libsinsp/container_engine/docker/base.cpp 0.00% 1 Missing ⚠️
...serspace/libsinsp/container_engine/libvirt_lxc.cpp 0.00% 1 Missing ⚠️
userspace/libsinsp/container_engine/lxc.cpp 0.00% 1 Missing ⚠️
userspace/libsinsp/container_engine/mesos.cpp 0.00% 1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2195      +/-   ##
==========================================
- Coverage   75.38%   75.05%   -0.34%     
==========================================
  Files         264      266       +2     
  Lines       34014    34189     +175     
  Branches     5805     5914     +109     
==========================================
+ Hits        25642    25661      +19     
- Misses       8372     8528     +156     
Flag Coverage Δ
libsinsp 75.05% <29.82%> (-0.34%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@therealbobo therealbobo force-pushed the broader-container-id-support branch from 7c3c360 to b0e0a90 Compare December 9, 2024 17:54
@therealbobo therealbobo force-pushed the broader-container-id-support branch from b0e0a90 to bc7d08d Compare December 10, 2024 12:18
@therealbobo therealbobo force-pushed the broader-container-id-support branch 5 times, most recently from 82b4b68 to 3371202 Compare December 16, 2024 16:00
@therealbobo therealbobo force-pushed the broader-container-id-support branch 2 times, most recently from 2dc0158 to 3978470 Compare December 17, 2024 13:57
@Andreagit97 Andreagit97 added this to the TBD milestone Dec 17, 2024
@therealbobo therealbobo force-pushed the broader-container-id-support branch from 3978470 to 3c6b970 Compare December 17, 2024 19:54
@therealbobo therealbobo changed the title [WIP] proper containerd support feat(libsinsp/container_engine): proper containerd support Dec 18, 2024
@therealbobo therealbobo marked this pull request as ready for review December 18, 2024 06:32
@therealbobo therealbobo force-pushed the broader-container-id-support branch from cf13c54 to 13aead4 Compare December 19, 2024 10:16
@@ -393,7 +397,7 @@ TEST_F(sys_call_test, container_docker_bad_socket) {
ASSERT_NE(PPME_CONTAINER_JSON_2_E, param.m_evt->get_type());

sinsp_threadinfo* tinfo = param.m_evt->get_thread_info(false);
ASSERT_TRUE(tinfo->m_container_id.length() == 12);
ASSERT_TRUE(tinfo->m_container_id.length() <= 12);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Care to expand on this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until now the container.id field was just the truncated version of the container.full_id which was a 64 char string. This is not always the case (e.g. containerd permits to define an arbitrary container id string) and we should accept container id of arbitrary length. Given that, the container id can now be less than 12 chars.

@@ -68,6 +70,7 @@ void event_capture::start(bool dump, libsinsp::events::set<ppm_sc_code>& sc_set)
}
}
}
m_before_open(m_inspector.get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you need to introduce these new callback?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if it needs to stay, at the very least i'd rename before_close to after_capture too.

Copy link
Contributor Author

@therealbobo therealbobo Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The container engines are initialised before the start of the capture and so we need an hook to set the cri settings before the start of the capture and after the open.

Also, if it needs to stay, at the very least i'd rename before_close to after_capture too.

👍


target_link_libraries(sinsp PUBLIC cri_v1alpha2 cri_v1)
target_link_libraries(sinsp PRIVATE containerd_interface)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think that cri_v1alpha2 and cri_v1 can be moved to PRIVATE too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be a problem given that the libsinsp client shouldn't access the cri interfaces (at least not directly).

@@ -250,6 +251,82 @@ function(prepare_cri_grpc api_version)
endif()
endfunction()

function(prepare_containerd_grpc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function seems pretty similar to the prepare_cri_grpc above; do you think we can easily extract a single cmake helper?
I am not sure though, there are some subtle differences.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a great way of doing things but I'll clean up most of it with a later PR, once we upgrade protobuf we can use better helpers.

m_container_engine_by_type[CT_CRI] = cri_engine;
m_container_engine_by_type[CT_CRIO] = cri_engine;
m_container_engine_by_type[CT_CONTAINERD] = cri_engine;
// Get CRI socket paths from settings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this PR still based on #2141?
These changes seems related to that PR. In this case, i'd move this PR as wip until the other gets merged and this one rebased.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeap! exactly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#2141 is now merged!

@@ -224,13 +232,20 @@ class sinsp_container_manager : public libsinsp::container_engine::container_cac
* This method effectively checks if m_lookups[container_id][ctype]
* exists, without creating unnecessary map entries along the way.
*/
bool should_lookup(const std::string& container_id, sinsp_container_type ctype) override {
bool should_lookup(const std::string& container_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the comment?
Also, can you expand on why were these changes needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of more than one cri runtime, we should have a way to address (in this case by index) all the runtimes caches. I'll add a comment.

using namespace libsinsp::container_engine;
using namespace libsinsp::runc;

constexpr const cgroup_layout CONTAINERD_CGROUP_LAYOUT[] = {{"/default/", ""}, {nullptr, nullptr}};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a full cgroup example to be added as a comment here?

std::string container_id;
sinsp_container_type container_type;
unsigned long uid;
bool request_rw_size;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually unused; do you plan to add support for update_with_size API to containerd or we can drop it?

@@ -269,7 +245,7 @@ bool cri::resolve(sinsp_threadinfo *tinfo, bool query_os_for_missing_info) {
} else {
cache->notify_new_container(container, tinfo);
}
return true;
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you comment on this change?

// In some container runtimes the container the container id is not
// necessarly CONTAINER_ID_LENGTH long and can be arbitrarly defined.
// To keep it simple we only discard the container id > of CONTAINER_ID_LENGTH.
if(end_pos - start_pos > CONTAINER_ID_LENGTH || end_pos - start_pos == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we only accept container_id whose length is <= CONTAINER_ID_LENGTH right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below in this function we call

container_id = cgroup.substr(start_pos, REPORTED_CONTAINER_ID_LENGTH);

What if cgroup length is less than REPORTED_CONTAINER_ID_LENGTH? It would throw an out_of_range exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ops! good point!

@FedeDP
Copy link
Contributor

FedeDP commented Dec 19, 2024

/hold

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants