Add self-assessment doc #183

Open: niladrih wants to merge 2 commits into develop from security-self-assessment

Conversation

niladrih
Member

@niladrih niladrih commented Feb 5, 2025

Resolves #152

@niladrih force-pushed the security-self-assessment branch from f1d83ab to 3be0ddf on February 5, 2025 10:56
@niladrih niladrih requested a review from avishnu February 5, 2025 10:56
…tures, secure dev practices

Signed-off-by: Niladri Halder <[email protected]>
@niladrih changed the title from "feat(security/self-assessment): add actors and basic actions writeup" to "Add self-assessment doc" on Feb 6, 2025
@tiagolobocastro
Collaborator

hmm shouldn't this go here?

@niladrih
Member Author

niladrih commented Feb 6, 2025

hmm shouldn't this go here?

Yes, I'll eventually raise the PR there. Wanted to get an internal round of review.

- Uses CSI volume modes and NVMe Reservations to guarantee exclusive volume access.

- **Secure Communication:**
- Utilizes secure HTTP/gRPC channels between control-plane and node components.
Collaborator

hmm secure how?

Member Author

The REST API doesn't have TLS. Does the gRPC layer not have it either?

Collaborator

Not the io-engine gRPC

Comment on lines +61 to +64
Built-in data redundancy across nodes
Automated failure detection and recovery
Dynamic scaling of storage resources
Advanced storage features typically found in enterprise storage systems
Member

Suggested change
Built-in data redundancy across nodes
Automated failure detection and recovery
Dynamic scaling of storage resources
Advanced storage features typically found in enterprise storage systems
- Built-in data redundancy across nodes
- Automated failure detection and recovery
- Dynamic scaling of storage resources
- Advanced storage features typically found in enterprise storage systems

- **Replicated PV Mayastor Core Agent:** This acts as the control-plane for a Mayastor cluster. It communicates with other Mayastor services via HTTP (gRPC).
- **Replicated PV Mayastor Etcd persistent store:** This persists the state of a Mayastor cluster. Uses replication and self-healing for redundancy and high-availability.
- **Replicated PV Mayastor HA Cluster Agent:** This is a Mayastor control-plane agent which provides highly available volume target management. This communicates to the Mayastor's core agent via HTTP (gRPC).
- **Replicated PV Mayastor HA Node Agent:** This is a Mayastor control-plane agent which mounts a hostpath directory and makes use of NVMe commands to execute volume target failovers.
Member

Detection, reporting and replacement of failed target paths. I think we should mention these somewhere.

Member Author

Why are these required?

Member

Because I am not sure what you mean by "makes use of NVMe commands to execute failover". There are multiple actors involved in the whole process. The part that the HA node agent handles is what I listed above.

Member

It interacts with the CSI node plugin via gRPC over a Unix domain socket (UDS), and with the cluster agent over gRPC.

- **Replicated PV Mayastor HA Cluster Agent:** This is a Mayastor control-plane agent which provides highly available volume target management. This communicates to the Mayastor's core agent via HTTP (gRPC).
- **Replicated PV Mayastor HA Node Agent:** This is a Mayastor control-plane agent which mounts a hostpath directory and makes use of NVMe commands to execute volume target failovers.
- **Replicated PV Mayastor CSI Controller plugin:** This is a CSI-controller plugin which communicates with the Mayastor storage API (HTTP) and the Kubernetes APIs to orchestrate volume provisioning, de-provisioning, expansion, snapshot operations for Mayastor volumes
- **Replicated PV Mayastor CSI Node plugin:** This is a CSI-node plugin which communicates with the Mayastor control-plane via HTTP (gRPC) and executes host-level volume operations. It mounts hostpath directories for accessing sysfs APIs and kernel device events.
Member

Kernel device events: I'm not sure it does that. What do you mean by this? Also, isn't its main aim to perform CSI operations orchestrated by the kubelet?

Member Author

It uses udevadm to talk to udev. Udev listens to kernel device events.

Member

Isn't that the HA node agent you are talking about?

Member Author

This is the CSI node

- **Replicated PV Mayastor HA Node Agent:** This is a Mayastor control-plane agent which mounts a hostpath directory and makes use of NVMe commands to execute volume target failovers.
- **Replicated PV Mayastor CSI Controller plugin:** This is a CSI-controller plugin which communicates with the Mayastor storage API (HTTP) and the Kubernetes APIs to orchestrate volume provisioning, de-provisioning, expansion, snapshot operations for Mayastor volumes
- **Replicated PV Mayastor CSI Node plugin:** This is a CSI-node plugin which communicates with the Mayastor control-plane via HTTP (gRPC) and executes host-level volume operations. It mounts hostpath directories for accessing sysfs APIs and kernel device events.
- **Replicated PV Mayastor IO Engine:** This is a userspace storage controller which polls for IO requests and serves a volume target for Kubernetes containers. It consumes a high degree of CPU and memory resources to provide low-latency, resilient storage. This communicates with the Mayastor control plane using HTTP (gRPC).
Member

serves a volume target --> serves volume targets

- **Replicated PV Mayastor CSI Controller plugin:** This is a CSI-controller plugin which communicates with the Mayastor storage API (HTTP) and the Kubernetes APIs to orchestrate volume provisioning, de-provisioning, expansion, snapshot operations for Mayastor volumes
- **Replicated PV Mayastor CSI Node plugin:** This is a CSI-node plugin which communicates with the Mayastor control-plane via HTTP (gRPC) and executes host-level volume operations. It mounts hostpath directories for accessing sysfs APIs and kernel device events.
- **Replicated PV Mayastor IO Engine:** This is a userspace storage controller which polls for IO requests and serves a volume target for Kubernetes containers. It consumes a high degree of CPU and memory resources to provide low-latency, resilient storage. This communicates with the Mayastor control plane using HTTP (gRPC).
- **Replicated PV Mayastor IO Engine metrics exporter:** This exposes volume controller stats data in prometheus-compatible format. This communicates with IO engines using intra Pod IPC.
Member

It exports pool metrics as well, right?

Member Author

Yes, it does.

- **Replicated PV Mayastor IO Engine:** This is a userspace storage controller which polls for IO requests and serves a volume target for Kubernetes containers. It consumes a high degree of CPU and memory resources to provide low-latency, resilient storage. This communicates with the Mayastor control plane using HTTP (gRPC).
- **Replicated PV Mayastor IO Engine metrics exporter:** This exposes volume controller stats data in prometheus-compatible format. This communicates with IO engines using intra Pod IPC.
- **Replicated PV Mayastor Stats and Call-home plugin:** This is a plugin for reporting anonymous usage data from the Kubernetes cluster. It communicates with the Kubernetes API, and the Mayastor storage API to collect data.
- **Clients:** This actor interacts with an OpenEBS cluster using standard Kubernetes tools and/or specialised clients for accessing storage layer functionality. This is usually a Kubernetes cluster admin or a storage admin.
Member

I don't see the Mayastor REST API listed here.

Member Author

I've missed that one. I'll add it.


- **Privileged Container Operations:**
- **What Happens:**
- Node-level plugins, including those for LocalPV and Replicated PV Mayastor, run as privileged containers. This enables them to access system-level OS APIs and execute low-level operations such as direct I/O on block devices and hostPath mounts.
Member

LocalPV --> Local PV. Let's follow this throughout the docs.

Member Author

Hmmm, okay.

Extensive unit and integration tests validate functionality and secure behavior.

- **Automated CI/CD Pipelines:**
Secure pipelines enforce coding standards, run vulnerability scans, and perform dependency checks before deployment.
Member

run vulnerability scans --> we don't do that in CI/CD; rather, it's done by Dependabot.

Member Author

We do it on linux-utils. But yes, we don't directly do vulnerability scans.

Labels
CNCF CNCF interactions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Required] Document Security Self-Assessment.
4 participants