-
In your proposed Kubernetes provider setup for asset discovery and scanning, when dealing with container images, you mentioned that images can be discovered from multiple sources such as running containers, the Kubernetes node API, and configurations within Deployments, Jobs, ReplicaSets, and Pods. Among these sources, which one would you prioritize for obtaining the most accurate and comprehensive image data, considering factors like image size, architecture, and potential variations due to deployment stages?
-
Maybe better to use the
But in that case we are missing the "pod" info, right?
I would say that for all scanners it's better to unpack the image; that way there won't be any changes to the scanners, they all still get a folder path as an input, right? (Even the SBOM analyzer should get the unpacked image path.)
Can you elaborate more on how it is done?
Are we sure it is possible for multiple readers from the same PVC in all k8s providers?
+1 for that.
-
Why do we need to push to a Docker registry and scan it with another scanner? Why don't we just commit -> save -> unpack -> scan?
-
Goal
To create a provider for Kubernetes which can discover and scan assets on a cluster.
Types of Assets
In Kubernetes there are a number of different assets which could be discovered and scanned for issues. To start with, Containers and Container Images will be the supported assets. Something like a "Pod" could be considered an asset in the future, but it is a logical/virtual asset, so we'd need asset relationships as a prerequisite for this. For now, the "Pod" a container is part of can be considered part of its location.
Discovery
Containers
To discover running containers the provider will list all pods and then inspect their status. Within the status field there is a subfield "containerStatuses" which contains all the information about each container that is part of the pod. If there are multiple containers in a pod then all the statuses are here.
Each container status has identifying fields for the container and its image, including the container's name, containerID, image, and imageID.
Each container status also has a state field which indicates whether the container is running or terminated. If a container is terminated then the asset should be considered terminated too.
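A minimal sketch of this discovery step, assuming pod objects in the JSON shape returned by the Kubernetes API (the field names under `status.containerStatuses` are real; the asset dict shape here is illustrative, and a real provider would use a client library rather than raw dicts):

```python
def discover_containers(pods):
    """Extract one container asset per entry in status.containerStatuses.

    `pods` is a list of pod objects in Kubernetes API JSON shape.
    A container whose state is 'terminated' yields a terminated asset.
    """
    assets = []
    for pod in pods:
        for status in pod.get("status", {}).get("containerStatuses", []):
            assets.append({
                "name": status["name"],
                "containerID": status.get("containerID"),
                "image": status.get("image"),
                "imageID": status.get("imageID"),
                # Location: the pod (and namespace) this container is part of.
                "location": f'{pod["metadata"]["namespace"]}/{pod["metadata"]["name"]}',
                "terminated": "terminated" in status.get("state", {}),
            })
    return assets
```

Multiple containers in one pod simply yield multiple assets sharing the same location.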
Container Images
The container images on the cluster can be discovered in a number of ways:
First is the container images that the running containers (collected above) were started from; this information is available in the containerStatuses. The image IDs are available either in a tagged format (for example `docker.io/library/nginx:1.25`) or pinned by digest (for example `docker.io/library/nginx@sha256:…`).
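A small helper for telling the two reference formats apart (illustrative, not part of the proposal; it assumes OCI-style image references):

```python
def classify_image_ref(ref):
    """Classify an image reference as digest-pinned or tagged.

    Returns ('digest', repo, digest) for repo@sha256:... references, or
    ('tag', repo, tag) for repo:tag references (tag defaults to 'latest').
    """
    if "@" in ref:
        repo, digest = ref.split("@", 1)
        return ("digest", repo, digest)
    # Split on the last ':' only if it belongs to a tag, not a registry port.
    repo, sep, tag = ref.rpartition(":")
    if sep and "/" not in tag:
        return ("tag", repo, tag)
    return ("tag", ref, "latest")
```

Digest-pinned references identify the exact image content, so they are the better key for deduplicating image assets across pods.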
Second is from the Kubernetes node API. Each node has a list of all the container images which are currently stored on that node's container runtime. This will give a more complete picture of all the possible running container images, and include historic images if kubelet hasn't garbage collected them yet. Each entry in the node's image list includes the names the image is known by and its size in bytes (for example `sizeBytes: 73699188`).
I think the most complete way to find the images from the cluster is to use the nodes API. This will give us the best data to populate the assets: the image size in bytes, plus the architecture and OS info inherited from the node the image is stored on.
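A sketch of image discovery via the node API, again assuming node objects in Kubernetes API JSON shape (`status.images` and `status.nodeInfo` are real Node fields; the asset dict shape is illustrative):

```python
def discover_images(nodes):
    """Build one image asset per unique image name found across all nodes.

    Architecture and OS are taken from the node the image was found on.
    """
    assets = {}
    for node in nodes:
        info = node.get("status", {}).get("nodeInfo", {})
        for image in node.get("status", {}).get("images", []):
            for name in image.get("names", []):
                # Keep the first sighting; the same image may exist on many nodes.
                assets.setdefault(name, {
                    "name": name,
                    "sizeBytes": image.get("sizeBytes"),
                    "architecture": info.get("architecture"),
                    "os": info.get("operatingSystem"),
                })
    return list(assets.values())
```

Deduplicating by name keeps one asset per image even on multi-node clusters where the same image is cached on several nodes.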
Scanning
Container Images
Container image scanning should take a similar form to what we do in KubeClarity today. The VMClarity CLI accepts a container image ID as an input for some of the families (SBOM and Vulnerability) and some of the others need to be extended to support it (Malware, Misconfiguration).
The VMClarity CLI will then pull the image and if necessary unpack the image into a temporary file system mount. Once this occurs the scan will be the same as any other file system.
This can be run as a Kubernetes Job on the cluster, the same as we do for KubeClarity. Similar to KubeClarity, we should handle images that require an ImagePullSecret by detecting, and then using, the image pull secret and namespace of another pod in the system that uses this image. The downside of this is that unless there is a pod running the image, we won't be able to discover the credentials required to pull private images for scanning.
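The pull-secret lookup described above could be sketched like this (illustrative; `spec.containers[].image` and `spec.imagePullSecrets` are real pod fields, the return shape is an assumption):

```python
def find_pull_secret(pods, image):
    """Find the namespace and imagePullSecrets of a pod that runs `image`.

    Returns (namespace, secret_names) for the first matching pod, or None
    if no pod with pull secrets uses the image -- in which case a private
    image cannot be pulled for scanning.
    """
    for pod in pods:
        containers = pod.get("spec", {}).get("containers", [])
        if any(c.get("image") == image for c in containers):
            secrets = [s["name"] for s in pod.get("spec", {}).get("imagePullSecrets", [])]
            if secrets:
                return (pod["metadata"]["namespace"], secrets)
    return None
```

The scan Job would then be created in the returned namespace so the referenced secrets are resolvable.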
Containers
Similar to the docker provider, containers will be scanned by taking snapshots of their filesystem at the time the scan is run, and then that will be analysed by the VMClarity CLI the same as we do for VMs. In Kubernetes this is a little more complicated to perform than it would be on a single Docker daemon, because the data can be distributed across multiple nodes.
This can be solved by utilising the Kubernetes persistent storage, or co-locating the scan job with the location of the original container.
Using persistent storage would allow a flow that mirrors what we do for virtual machines:
Without persistent storage we could handle things in a single operation with no need to store the snapshot volume:
Another alternative is to use a container registry as an intermediary:
If we want to support as many Kubernetes clusters as possible, developing a solution which doesn't require persistent storage is the best option IMO. The providers don't have to follow the take-a-snapshot, create-a-volume, mount-the-volume logic if it's possible to do it another way, because the Provider API is generic.
We can safely handle different container runtimes by detecting them from the Kubernetes nodes API and generating a different script for docker/containerd etc. to handle the exporting logic. We can make this extensible.
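Runtime detection can key off the node's reported `status.nodeInfo.containerRuntimeVersion` (a real Node field, e.g. `containerd://1.6.8` or `docker://24.0.5`); the script mapping below is a hypothetical illustration of the extensibility point:

```python
def detect_runtime(node):
    """Return the container runtime name ('containerd', 'docker', ...)
    parsed from status.nodeInfo.containerRuntimeVersion, or None."""
    version = node.get("status", {}).get("nodeInfo", {}).get("containerRuntimeVersion", "")
    runtime, sep, _ = version.partition("://")
    return runtime if sep else None

# Hypothetical mapping from detected runtime to an export script for the
# snapshot step; adding a runtime means adding one entry here.
EXPORT_SCRIPTS = {
    "containerd": "export-containerd.sh",
    "docker": "export-docker.sh",
}
```

The scan Job for a container would then be templated with the script matching the runtime of the node that container is scheduled on.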