openebs · bors-openebs-mayastor · Jan 30, 2025 · Jan 23, 2025 · Jan 21, 2025 · Jan 23, 2025
@@ -7,7 +7,6 @@
 [![Community Meetings](https://img.shields.io/badge/Community-Meetings-blue)](https://us05web.zoom.us/j/87535654586?pwd=CigbXigJPn38USc6Vuzt7qSVFoO79X.1)
 [![built with nix](https://builtwithnix.org/badge.svg)](https://builtwithnix.org)
 
-
 ## Table of contents
 
 ---
@@ -23,7 +22,7 @@
 - [Frequently asked questions](/doc/FAQ.md)
 
 <p align="justify">
-<strong>Mayastor</strong> is a cloud-native declarative data plane written in <strong>Rust.</strong>
+<strong>Mayastor</strong> is a cloud-native declarative data plane written in <strong>Rust</strong>.
 Our goal is to abstract storage resources and their differences through the data plane such that users only need to
 supply the <strong>what</strong> and do not have to worry about the <strong>how</strong>
 so that individual teams stay in control.
@@ -53,24 +52,30 @@ The official user documentation for the Mayastor Project is published at: [OpenE
 
 ## Overview
 
+![OpenEBS Mayastor](./doc/img/overview.drawio.png)
+
 At a high-level, Mayastor consists of two major components.
 
 ### **Control plane:**
 
-- A microservices patterned control plane, centered around a core agent which publically exposes a RESTful API.
+- A microservices patterned control plane, centered around a core agent and a RESTful API.
   This is extended by a dedicated operator responsible for managing the life cycle of "Disk Pools"
   (an abstraction for devices supplying the cluster with persistent backing storage) and a CSI compliant
-  external provisioner (controller).
-  Source code for the control plane components is located in its [own repository](https://github.com/openebs/mayastor-control-plane)
+  external provisioner (controller). \
 
-- A daemonset _mayastor-csi_ plugin which implements the identity and node grpc services from CSI protocol.
+  Source code for the control plane components is located in the [controller repository](https://github.com/openebs/mayastor-control-plane). \
+  The helm chart as well as other k8s specific extensions (ex: kubectl-plugin) are located in the [extensions repository](https://github.com/openebs/mayastor-extensions).
+
+- CSI plugins:
+  - A daemonset _csi-node_ plugin which implements the identity and node services.
+  - A deployment _csi-controller_ plugin which implements the identity and controller services.
 
 ### **Data plane:**
 
-- Each node you wish to use for storage or storage services will have to run an IO Engine daemonset. Mayastor itself has
-  two major components: the Nexus and a local storage component.
+- Each node you wish to use for storage or storage services will have to run an I/O Engine instance. The Mayastor data-plane (i/o engine) itself has
+  two major components: the volume target (nexus) and local storage pools. Pools can be carved out into logical volumes (replicas), which in turn can be shared with other i/o engines via NVMe-oF.
 
-## Nexus
+## Volume Target / Nexus
 
 <p align="justify">
 The Nexus is responsible for attaching to your storage resources and making it available to the host that is
@@ -89,7 +94,7 @@ they way we do things. Moreover, due to hardware [changes](https://searchstorage
 we in fact are forced to think about it.
 
 Based on storage URIs the Nexus knows how to connect to the resources and will make these resources available as
-a single device to a protocol standard protocol. These storage URIs are generated automatically by MOAC and it keeps
+a single device over a standard protocol. These storage URIs are managed by the control-plane, which keeps
 track of what resources belong to what Nexus instance and subsequently to what PVC.
 
 You can also directly use the nexus from within your application code. For example:
@@ -138,7 +143,7 @@ buf.as_slice().into_iter().map(|b| assert_eq!(b, 0xff)).for_each(drop);
 <p align="justify">
 
 We think this can help a lot of database projects as well, where they typically have all the smarts in their database engine
-and they want the most simple (but fast) storage device. For a more elaborate example see some of the tests in mayastor/tests.
+and they want the most simple (but fast) storage device. For a more elaborate example see some of the tests in io-engine/tests.
 
 To communicate with the children, the Nexus uses industry standard protocols. The Nexus supports direct access to local
 storage and remote storage using NVMe-oF TCP. Another advantage of the implementation is that if you were to remove
@@ -159,8 +164,8 @@ What model fits best for you? You get to decide!
 <p align="justify">
 If you do not have a storage system, and just have local storage, i.e block devices attached to your system, we can
 consume these and make a "storage system" out of these local devices such that
-you can leverage features like snapshots, clones, thin provisioning, and the likes. Our K8s tutorial does that under
-the water today. Currently, we are working on exporting your local storage implicitly when needed, such that you can
+you can leverage features like snapshots, clones, thin provisioning, and the like. Our K8s deployment does that under
+the hood. Currently, we are working on exporting your local storage implicitly when needed, such that you can
 share storage between nodes. This means that your application, when re-scheduled, can still connect to your local storage
 except for the fact that it is not local anymore.
 
@@ -192,12 +197,8 @@ In following example of a client session is assumed that mayastor has been
 started and is running:
 
 ```
-$ dd if=/dev/zero of=/tmp/disk bs=1024 count=102400
-102400+0 records in
-102400+0 records out
-104857600 bytes (105 MB, 100 MiB) copied, 0.235195 s, 446 MB/s
-$ sudo losetup /dev/loop8 /tmp/disk
-$ io-engine-client pool create tpool /dev/loop8
+$ fallocate -l 100M /tmp/disk.img
+$ io-engine-client pool create tpool aio:///tmp/disk.img
 $ io-engine-client pool list
 NAME                 STATE        CAPACITY         USED   DISKS
 tpool                0            96.0 MiB          0 B   tpool
@@ -232,5 +233,4 @@ Unless you explicitly state otherwise, any contribution intentionally submitted
 inclusion in Mayastor by you, as defined in the Apache-2.0 license, licensed as above,
 without any additional terms or conditions.
 
-
 [![FOSSA Status](https://app.fossa.com/api/projects/custom%2B162%2Fgithub.com%2Fopenebs%2Fmayastor.svg?type=large&issueType=license)](https://app.fossa.com/projects/custom%2B162%2Fgithub.com%2Fopenebs%2Fmayastor?ref=badge_large&issueType=license)
@@ -7,10 +7,45 @@ document.
 Basic workflow starting from registration is as follows:
 
 1. csi-node-driver-registrar retrieves information about csi plugin (mayastor) using csi identity service.
-1. csi-node-driver-registrar registers csi plugin with kubelet passing plugin's csi endpoint as parameter.
-1. kubelet uses csi identity and node services to retrieve information about the plugin (including plugin's ID string).
-1. kubelet creates a custom resource (CR) "csi node info" for the CSI plugin.
-1. kubelet issues requests to publish/unpublish and stage/unstage volume to the CSI plugin when mounting the volume.
+2. csi-node-driver-registrar registers csi plugin with kubelet passing plugin's csi endpoint as parameter.
+3. kubelet uses csi identity and node services to retrieve information about the plugin (including plugin's ID string).
+4. kubelet creates a custom resource (CR) "csi node info" for the CSI plugin.
+5. kubelet issues requests to publish/unpublish and stage/unstage volume to the CSI plugin when mounting the volume.
 
-The registration of mayastor storage nodes with control plane (moac) is handled
-by a separate protocol using NATS message bus that is independent on CSI plugin.
+The registration of the storage nodes (i/o engines) with the control plane is handled
+by a gRPC service which is independent of the CSI plugin.
+
+<br>
+
+```mermaid
+graph LR
+;
+    PublicApi{"Public<br>API"}
+    CO[["Container<br>Orchestrator"]]
+
+    subgraph "Mayastor Control-Plane"
+        Rest["Rest"]
+        InternalApi["Internal<br>API"]
+        InternalServices["Agents"]
+    end
+
+    subgraph "Mayastor Data-Plane"
+        IO_Node_1["Node 1"]
+    end
+
+    subgraph "Mayastor CSI"
+        Controller["Controller<br>Plugin"]
+        Node_1["Node<br>Plugin"]
+    end
+
+%% Connections
+    CO -.-> Node_1
+    CO -.-> Controller
+    Controller -->|REST/http| PublicApi
+    PublicApi -.-> Rest
+    Rest -->|gRPC| InternalApi
+    InternalApi -.->|gRPC| InternalServices
+    Node_1 <--> PublicApi
+    Node_1 -.->|NVMe-oF| IO_Node_1
+    IO_Node_1 <-->|gRPC| InternalServices
+```
@@ -0,0 +1,171 @@
+# Control Plane Behaviour
+
+This document describes the types of behaviour that the control plane will exhibit under various situations. By
+providing a high-level view it is hoped that the reader will be able to more easily reason about the control plane. \
+<br>
+
+## REST API Idempotency
+
+Idempotency is a term used a lot but which is often misconstrued. The following definition is taken from
+the [Mozilla Glossary](https://developer.mozilla.org/en-US/docs/Glossary/Idempotent):
+
+> An [HTTP](https://developer.mozilla.org/en-US/docs/Web/HTTP) method is **idempotent** if an identical request can be
+> made once or several times in a row with the same effect while leaving the server in the same state. In other words,
+> an idempotent method should not have any side-effects (except for keeping statistics). Implemented correctly, the `GET`,
+> `HEAD`, `PUT`, and `DELETE` methods are idempotent, but not the `POST` method.
+> All [safe](https://developer.mozilla.org/en-US/docs/Glossary/Safe) methods are also ***idempotent***.
+
+OK, so making multiple identical requests should produce the same result ***without side effects***. Great, so does the
+return value for each request have to be the same? The article goes on to say:
+
+> To be idempotent, only the actual back-end state of the server is considered, the status code returned by each request
+> may differ: the first call of a `DELETE` will likely return a `200`, while successive ones will likely return a `404`.
+
+The control plane will behave exactly as described above. If, for example, multiple `create volume` calls are made for
+the same volume, the first will return success (`HTTP 200` code) while subsequent calls will return a failure status
+code (`HTTP 409` code) indicating that the resource already exists. \
+<br>
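
The behaviour described above can be sketched as follows. This is a minimal illustration only: `VolumeRegistry` and `create_volume` are hypothetical names, and an in-memory map stands in for the real control plane state.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the control plane's view of volumes.
#[derive(Default)]
struct VolumeRegistry {
    volumes: HashMap<String, u64>, // volume id -> size in bytes
}

impl VolumeRegistry {
    /// Returns the HTTP status code a `create volume` call would produce.
    fn create_volume(&mut self, id: &str, size: u64) -> u16 {
        if self.volumes.contains_key(id) {
            // The volume already exists: report a conflict, leave state untouched.
            return 409;
        }
        self.volumes.insert(id.to_string(), size);
        200
    }
}

fn main() {
    let mut registry = VolumeRegistry::default();
    let first = registry.create_volume("vol-1", 1 << 30);
    let second = registry.create_volume("vol-1", 1 << 30);
    println!("first: {first}, second: {second}");
    // The back-end state is the same after one call or many.
    assert_eq!(registry.volumes.len(), 1);
}
```

The status codes differ between the two calls, but the stored state does not, which is the property the definition above requires.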
+
+## Handling Failures
+
+There are various ways in which the control plane could fail to satisfy a `REST` request:
+
+- Control plane dies in the middle of an operation.
+- Control plane fails to update the persistent store.
+- A gRPC request to Mayastor fails to complete successfully. \
+  <br>
+
+Regardless of the type of failure, the control plane has to decide what it should do:
+
+1. Fail the operation back to the caller but leave any created resources alone.
+
+2. Fail the operation back to the caller but destroy any created resources.
+
+3. Act like Kubernetes and keep retrying in the hope that it will eventually succeed. \
+<br>
+
+Approach 3 is discounted. If we never responded to the caller it would eventually time out and probably retry itself.
+This would likely present even more issues/complexity in the control plane.
+
+So the decision becomes: should we destroy resources that have already been created as part of the operation? \
+<br>
+
+### Keep Created Resources
+
+Preventing the control plane from having to unwind operations is convenient as it keeps the implementation simple. A
+separate asynchronous process could then periodically scan for unused resources and destroy them.
+
+There is a potential issue with the above described approach. If an operation fails, it would be reasonable to assume
+that the user would retry it. Is it possible for this subsequent request to fail as a result of the existing unused
+resources lingering (i.e. because they have not yet been destroyed)? If so, this would hamper any retry logic
+implemented in the upper layers.
+
+### Destroy Created Resources
+
+This is the optimal approach. For any given operation, failure results in newly created resources being destroyed. The
+responsibility lies with the control plane tracking which resources have been created and destroying them in the event
+of a failure.
+
+However, what happens if destruction of a resource fails? It is possible for the control plane to retry the operation
+but at some point it will have to give up. In effect the control plane will do its best, but it cannot provide any
+guarantee. So does this mean that these resources are permanently leaked? Not necessarily. Like in
+the [Keep Created Resources](#keep-created-resources) section, there could be a separate process which destroys unused
+resources. \
+<br>
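
A minimal sketch of the "destroy created resources" approach, assuming hypothetical `create` and `destroy` callbacks in place of the real gRPC calls. `Resource` and `create_all` are illustrative names, not part of the control plane.

```rust
/// Hypothetical resource handle; in practice this would be a replica or a nexus.
#[derive(Debug, Clone, PartialEq)]
struct Resource(String);

/// Creates every named resource. If any step fails, the resources created so
/// far are destroyed on a best-effort basis and the error is returned.
fn create_all(
    names: &[&str],
    mut create: impl FnMut(&str) -> Result<Resource, String>,
    mut destroy: impl FnMut(&Resource) -> Result<(), String>,
) -> Result<Vec<Resource>, String> {
    let mut created = Vec::new();
    for name in names {
        match create(name) {
            Ok(resource) => created.push(resource),
            Err(error) => {
                // Unwind in reverse order. A failed destroy is ignored here and
                // left for a separate clean-up process to pick up.
                for resource in created.iter().rev() {
                    let _ = destroy(resource);
                }
                return Err(error);
            }
        }
    }
    Ok(created)
}

fn main() {
    let mut destroyed = Vec::new();
    let result = create_all(
        &["replica-1", "replica-2", "nexus-1"],
        |name| {
            if name == "nexus-1" {
                Err(format!("failed to create {name}"))
            } else {
                Ok(Resource(name.to_string()))
            }
        },
        |resource| {
            destroyed.push(resource.clone());
            Ok(())
        },
    );
    println!("result: {result:?}, destroyed: {destroyed:?}");
}
```

Ignoring destroy errors mirrors the point made above: the control plane does its best, and anything it cannot clean up is left to a separate process.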
+
+## Use of the Persistent Store
+
+For a control plane to be effective it must maintain information about the system it is interacting with and make
+decisions accordingly. An in-memory registry is used to store such information.
+
+Because the registry is stored in memory, it is volatile - meaning all information is lost if the service is restarted.
+As a consequence critical information must be backed up to a highly available persistent store (for more detailed
+information see [persistent-store.md](./persistent-store.md)).
+
+The types of data that need persisting broadly fall into 3 categories:
+
+1. Desired state
+
+2. Actual state
+
+3. Control plane specific information \
+   <br>
+
+### Desired State
+
+This is the declarative specification of a resource provided by the user. As an example, the user may request a new
+volume with the following requirements:
+
+- Replica count of 3
+
+- Size
+
+- Preferred nodes
+
+- Number of nexuses
+
+Once the user has provided these constraints, the expectation is that the control plane should create a resource that
+meets the specification. How the control plane achieves this is of no concern.
+
+So what happens if the control plane is unable to meet these requirements? The operation is failed. This prevents any
+ambiguity. If an operation succeeds, the requirements have been met and the user has exactly what they asked for. If the
+operation fails, the requirements couldn’t be met. In this case the control plane should provide an appropriate means of
+diagnosing the issue, e.g. a log message.
+
+What happens to resources created before the operation failed? This will be dependent on the chosen failure strategy
+outlined in [Handling Failures](#handling-failures).
+
+### Actual State
+
+This is the runtime state of the system as provided by Mayastor. Whenever this changes, the control plane must reconcile
+this state against the desired state to ensure that we are still meeting the user's requirements. If not, the control
+plane will take action to try to rectify this.
+
+Whenever a user makes a request for state information, it will be this state that is returned (Note: If necessary an API
+may be provided which returns the desired state also). \
+<br>
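
As a rough illustration of reconciling actual state against desired state, the sketch below compares a desired replica count with the number of healthy replicas and picks an action. All type and function names here are hypothetical, and only the replica count is modelled.

```rust
use std::cmp::Ordering;

/// Hypothetical desired state: only the replica count is modelled here.
struct VolumeSpec {
    replicas: usize,
}

/// Hypothetical actual state as reported by the data plane.
struct VolumeState {
    healthy_replicas: usize,
}

#[derive(Debug, PartialEq)]
enum Action {
    Nothing,
    AddReplicas(usize),
    RemoveReplicas(usize),
}

/// Decides what is needed to bring the actual state back in line with the spec.
fn reconcile(spec: &VolumeSpec, state: &VolumeState) -> Action {
    match state.healthy_replicas.cmp(&spec.replicas) {
        Ordering::Less => Action::AddReplicas(spec.replicas - state.healthy_replicas),
        Ordering::Greater => Action::RemoveReplicas(state.healthy_replicas - spec.replicas),
        Ordering::Equal => Action::Nothing,
    }
}

fn main() {
    let spec = VolumeSpec { replicas: 3 };
    let state = VolumeState { healthy_replicas: 2 };
    println!("{:?}", reconcile(&spec, &state));
}
```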
+
+## Control Plane Information
+
+This information is required to aid the control plane across restarts. It will be used to store the state of a resource
+independent of the desired or actual state.
+
+The following sequence will be followed when creating a resource:
+
+1. Add resource specification to the store with a state of “creating”
+
+2. Create the resource
+
+3. Mark the state of the resource as “complete”
+
+If the control plane then crashes mid-operation, on restart it can query the state of each resource. Any resource not in
+the “complete” state can then be destroyed as they will be remnants of a failed operation. The expectation here will be
+that the user will reissue the operation if they wish to.
+
+Likewise, deleting a resource will look like:
+
+1. Mark the resource as “deleting” in the store
+
+2. Delete the resource
+
+3. Remove the resource from the store.
+
+For complex operations like creating a volume, all resources that make up the volume will be marked as “creating”. Only
+when all resources have been successfully created will their corresponding states be changed to “complete”. This will
+look something like:
+
+1. Add volume specification to the store with a state of “creating”
+
+2. Add nexus specifications to the store with a state of “creating”
+
+3. Add replica specifications to the store with a state of “creating”
+
+4. Create replicas
+
+5. Create nexus
+
+6. Mark replica states as “complete”
+
+7. Mark nexus states as “complete”
+
+8. Mark volume state as “complete”
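
The sequences above can be sketched with a small state machine. This is an illustration under stated assumptions: `Store` is a hypothetical in-memory stand-in for the persistent store, and the closures passed to `create` and `delete` stand in for the real operations.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum SpecState {
    Creating,
    Complete,
    Deleting,
}

/// Hypothetical stand-in for the persistent store: resource id -> state.
#[derive(Default)]
struct Store {
    specs: BTreeMap<String, SpecState>,
}

impl Store {
    /// Create sequence: mark "creating", create the resource, mark "complete".
    fn create(&mut self, id: &str, create: impl FnOnce() -> Result<(), String>) -> Result<(), String> {
        self.specs.insert(id.to_string(), SpecState::Creating);
        create()?; // on failure the spec stays in `Creating`
        self.specs.insert(id.to_string(), SpecState::Complete);
        Ok(())
    }

    /// Delete sequence: mark "deleting", delete the resource, remove the spec.
    fn delete(&mut self, id: &str, delete: impl FnOnce() -> Result<(), String>) -> Result<(), String> {
        self.specs.insert(id.to_string(), SpecState::Deleting);
        delete()?;
        self.specs.remove(id);
        Ok(())
    }

    /// On restart: forget every spec that is not "complete" and return the
    /// ids of the resources that should be destroyed.
    fn recover(&mut self) -> Vec<String> {
        let stale: Vec<String> = self
            .specs
            .iter()
            .filter(|(_, state)| **state != SpecState::Complete)
            .map(|(id, _)| id.clone())
            .collect();
        for id in &stale {
            self.specs.remove(id);
        }
        stale
    }
}

fn main() {
    let mut store = Store::default();
    store.create("replica-1", || Ok(())).unwrap();
    // Simulate a failure mid-operation: the spec is left in `Creating`.
    let _ = store.create("replica-2", || Err("node unreachable".to_string()));
    println!("to destroy after restart: {:?}", store.recover());
    store.delete("replica-1", || Ok(())).unwrap();
    println!("remaining specs: {}", store.specs.len());
}
```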
@@ -0,0 +1,47 @@
+# DiskPool Custom Resource for K8s
+
+The DiskPool operator is a [K8s] specific component which manages pools in a K8s environment. \
+Simplistically, it drives pools across the various states listed below.
+
+In [K8s], mayastor pools are represented as [Custom Resources][k8s-cr], which is an extension on top of the existing [K8s API][k8s-api]. \
+This allows users to declaratively create a [diskpool], and mayastor will not only eventually create the corresponding mayastor pool but will
+also ensure that it gets re-imported after pod restarts, node restarts, crashes, etc.
+
+> **NOTE**: mayastor pool (msp) has been renamed to diskpool (dsp)
+
+## DiskPool States
+
+> *NOTE*
+> Non-exhaustive enums could have additional variants added in the future. Therefore, when matching against variants of non-exhaustive enums, an extra
+> wildcard arm must be added to account for future variants.
+
+- Creating \
+The pool is a new OR missing resource, and it has not been created or imported yet. The pool spec ***MAY*** be present but ***DOES NOT*** have a status field.
+
+- Created \
+The pool has been created in the designated i/o engine node by the control-plane.
+
+- Terminating \
+A deletion request has been issued by the user. The pool will eventually be deleted by the control-plane and eventually the DiskPool Custom Resource will also get removed from the K8s API.
+
+- Error (*Deprecated*) \
+The attempt to transition to the next state has exceeded the maximum number of retries. The retry counts are implemented using an exponential back-off, which by default is set to 10. Once the error state is entered, reconciliation stops. Only external events (a new resource version) will trigger a new attempt.
+  > NOTE: this State has been deprecated since API version **v1beta1**
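
The note on non-exhaustive enums can be illustrated with a short sketch. `PoolState` below is a hypothetical mirror of the states listed above, not the operator's actual type. Note that the compiler only enforces the wildcard arm for code in other crates; inside the defining crate it is optional.

```rust
/// Hypothetical mirror of the DiskPool states listed above.
#[non_exhaustive]
#[derive(Debug, Clone, Copy)]
enum PoolState {
    Creating,
    Created,
    Terminating,
    Error,
}

fn describe(state: PoolState) -> &'static str {
    match state {
        PoolState::Creating => "not created or imported yet",
        PoolState::Created => "created on the i/o engine node",
        PoolState::Terminating => "deletion requested",
        // Wildcard arm: covers the deprecated `Error` state and any variant
        // that may be added in the future.
        _ => "unknown or deprecated state",
    }
}

fn main() {
    for state in [
        PoolState::Creating,
        PoolState::Created,
        PoolState::Terminating,
        PoolState::Error,
    ] {
        println!("{state:?}: {}", describe(state));
    }
}
```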
+
+## Reconciler actions
+
+The operator responds to two types of events:
+
+- Scheduled \
+When, for example, we try to submit a new PUT request for a pool. On failure (e.g., a network error) we will reschedule the operation after 5 seconds.
+
+- CRD updates \
+When the CRD is changed, the resource version is changed. This will trigger a new reconcile loop. This process is typically known as “watching.”
+
+- Observability \
+During the transition, the operator will emit events to K8s, which can be obtained by kubectl. This gives visibility into the state and its transitions.
+
+[K8s]: https://kubernetes.io/
+[k8s-cr]: https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/
+[k8s-api]: https://kubernetes.io/docs/concepts/overview/kubernetes-api/
+[diskpool]: https://openebs.io/docs/user-guides/replicated-storage-user-guide/replicated-pv-mayastor/rs-configuration
@@ -0,0 +1,179 @@
+# Kubectl Plugin
+
+## Overview
+
+The kubectl-mayastor plugin follows the instructions outlined in
+the [K8s] [official documentation](https://kubernetes.io/docs/tasks/extend-kubectl/kubectl-plugins/).
+
+The name of the plugin binary dictates how it is used. From the documentation:
+> For example, a plugin named `kubectl-foo` provides a command `kubectl foo`.
+
+In our case the name of the binary is specified in the Cargo.toml file as `kubectl-mayastor`, therefore the command is
+`kubectl mayastor`.
+
+This document outlines all workflows and interactions between the plugin, the Mayastor control plane, and [K8s].
+It provides a high-level overview of the plugin's general operation, the features it currently supports, and how
+these features integrate with the APIs.
+
+This is the general flow of the request to generate an output from the plugin:
+
+1. The flow starts with the CLI command, to be entered from console.
+
+2. The respective command is supposed to hit the specific API endpoint dedicated for that purpose.
+
+3. The API request is then forwarded to the Core Agent of the Control Plane.
+
+4. Core Agent is responsible for the further propagation of the request based on its METHOD and purpose.
+
+5. A GET request would not bring in any change in spec or state, it would get the needed information from registry and
+   return it as a response to the request.
+
+6. A PUT request would bring a change in the spec, and thus a synchronous action would be performed by mayastor.
+   The updated spec and state would then be returned as a response.
+
+> ***NOTE***: A command might have targets other than the Core Agent, and it might not even be sent to the
+> control-plane, example: could be sent to a K8s endpoint.
+
+For a list of commands you can refer to the
+docs [here](https://github.com/openebs/mayastor-extensions/blob/HEAD/k8s/plugin/README.md#usage).
+
+## Command Line Interface
+
+Some goals for the kubectl-mayastor plugin are:
+
+- Provide an intuitive and user-friendly CLI for Mayastor.
+- Function in similar ways to existing Kubernetes CLI tools.
+- Support common Mayastor operations.
+
+> **NOTE**: There are many principles for a good CLI. An interesting set of guidelines can be
+> seen [here](https://clig.dev/) for example.
+
+All the plugin commands are verb based, providing the user with a similar experience to
+the official [kubectl](https://kubernetes.io/docs/reference/kubectl/#operations).
+
+All the plugin commands and their arguments are defined using a very powerful cli library: [clap].
+Some of its features are:
+
+- define every command and their arguments in a type-safe way
+- add default values for any argument
+- custom long and short (single letter) argument names
+- parse any argument with a powerful value parser
+- add custom or well-defined possible values for an argument
+- define conflicts between arguments
+- define requirements between arguments
+- flatten arguments for code encapsulation
+- many more!
+
+The output of each command can be formatted as `tabled`, `JSON` or `YAML`.
+The `tabled` format is mainly useful for human usage, whereas the others allow for integration with tools (ex: jq, yq) which
+can capture, parse and filter.
+
+Each command (and sub-commands) accepts the `--help | -h` argument, which documents the operation and the supported
+arguments.
+
+> **NOTE**: Not all commands and their arguments are as well documented as we'd wish, and any help improving this would
+> be very welcome! \
+> We can also consider auto-generating CLI documentation as markdown.
+
+## Connection to the K8s Cluster
+
+Exactly like the K8s kubectl, the kubectl-mayastor plugin runs on the user's system whereas mayastor is running in the K8s cluster.
+A mechanism is then required in order to bridge this gap and allow the plugin to talk to the mayastor services running in the cluster.
+
+The plugin currently supports 2 distinct modes:
+
+1. Kube ApiServer Proxy
+2. Port Forwarding
+
+### Kube ApiServer Proxy
+
+It's built into the K8s apiserver and allows a user outside of the cluster to connect via the apiserver to a clusterIp which would otherwise
+be unreachable.
+It proxies using HTTPS and it's capable of doing load balancing for service endpoints.
+
+```mermaid
+graph LR
+  subgraph Control Plane
+    APIServer["Api Server"]
+  end
+
+  subgraph Worker Nodes
+    Pod_1["pod"]
+    Pod_2["pod"]
+    Pod_3["pod"]
+    SLB["Service<br>LB"]
+  end
+
+  %% These don't display on GitHub :(
+  %%Internet(<img src='https://icons.terrastruct.com/azure%2FCompute%20Service%20Color%2FCloud%20Services%20%28Classic%29.svg' />)
+  %%User(<img src='https://icons.terrastruct.com/essentials/005-programmer.svg' width='32' />)
+
+  User ==> |"kubectl"| APIServer
+  User -.- |proxied| Pod_1
+  APIServer -.-> |"kubectl"| Pod_1
+  Internet --> SLB
+  SLB --> Pod_1
+  SLB --> Pod_2
+  SLB --> Pod_3
+```
+
+Above we highlight the difference between this approach and a load balancer service which exposes the IP externally.
+You can try this out yourself with [kubectl proxy][kubectl-proxy].
+
+### Port Forwarding
+
+K8s provides [Port Forwarding](https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/) to access
+applications in a cluster.
+This works by forwarding local ports to the cluster.
+
+You can try this out yourself with [kubectl port-forward][kubectl-port-forward].
+
+> *NOTE*: kubectl port-forward is currently implemented for TCP ports only.
+
+<br>
+
+## Distribution
+
+We distribute the plugin in similar ways to what's recommended by the kubectl plugin docs:
+
+1. Krew \
+   [Krew] offers a cross-platform way to package and distribute your plugins. This way, you use a single packaging format
+   for all target platforms (Linux, Windows, macOS etc) and deliver updates to your users. \
+   Krew also maintains a plugin index so that other people can discover your plugin and install it.
+2. "Naked" binary packaged in a tarball \
+   This is available as a [GitHub] release asset for the specific version: \
+   `vX.Y.Z: https://github.com/openebs/mayastor/releases/download/v$X.$Y.$Z/kubectl-mayastor-$platform.tar.gz` \
+   For example, the x86_64 plugin for v2.7.3 can be
+   retrieved [here](https://github.com/openebs/mayastor/releases/download/v2.7.3/kubectl-mayastor-x86_64-linux-musl.tar.gz).
+3. Source code \
+   You can download the source code for the released version and build it yourself. \
+   You can check the build docs for reference [here](../../build-all.md#building).
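
For scripted downloads, the release asset URL can be built from the template given in item 2. The helper below is only a sketch: `release_url` is a hypothetical name, and the platform string is taken from the v2.7.3 example link above.

```rust
/// Builds the release asset URL from the `vX.Y.Z` template shown above.
/// `version` is given without the leading `v`, e.g. "2.7.3".
fn release_url(version: &str, platform: &str) -> String {
    format!(
        "https://github.com/openebs/mayastor/releases/download/v{version}/kubectl-mayastor-{platform}.tar.gz"
    )
}

fn main() {
    println!("{}", release_url("2.7.3", "x86_64-linux-musl"));
}
```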
+
+## Supported Platforms
+
+Although the mayastor installation is only officially supported for Linux x86_64 at the time of writing, the plugin
+actually supports a wider range of platforms. \
+This is because although most production K8s clusters are running Linux x86_64, users and admins may interact with the
+clusters from a wider range of platforms.
+
+- [x] Linux
+  - [x] x86_64
+  - [x] aarch64
+- [x] macOS
+  - [x] x86_64
+  - [x] aarch64
+- [ ] Windows
+  - [x] x86_64
+  - [ ] aarch64
+
+[K8s]: https://kubernetes.io/
+
+[clap]: https://docs.rs/clap/latest/clap/
+
+[GitHub]: https://github.com/openebs/mayastor
+
+[Krew]: https://krew.sigs.k8s.io/
+
+[kubectl-proxy]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#proxy
+
+[kubectl-port-forward]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#port-forward
@@ -50,24 +50,25 @@ thin provisioning) within Mayastor.\
    Users can resize volumes online.
    Snapshots are managed transparently.
 
-Features
----
-
-- [x] Pool Operations
-  - [x] Create
-  - [x] Destroy
-  - [x] Import
-  - [x] List
-- [x] Replica Operations
-  - [x] Create
-  - [x] Destroy
-  - [x] Share/Unshare
-  - [x] Resize
-  - [x] List
-- [ ] Thin Provisioning
-- [ ] Snapshots
-- [ ] Clones
-- [ ] RAIDx
+## Features
+
+- [ ] I/O Engine (data-plane)
+  - [x] Pool Operations
+    - [x] Create
+    - [x] Destroy
+    - [x] Import
+    - [x] List
+  - [x] Replica Operations
+    - [x] Create
+    - [x] Destroy
+    - [x] Share/Unshare
+    - [x] Resize
+    - [x] List
+  - [ ] Thin Provisioning
+  - [ ] Snapshots
+  - [ ] Clones
+  - [ ] RAIDx
+- [ ] Control-Plane
 
 ### Limitation
 
@@ -97,9 +98,9 @@ graph TD;
    end
 
    subgraph Physical Volumes
-      PV_1 --> VG_1["Volume Group - VG 1"]
-      PV_2 --> VG_1
-      PV_3 --> VG_2["Volume Group - VG 2"]
+      PV_1["PV 1"] --> VG_1["Vol Group 1"]
+      PV_2["PV 2"] --> VG_1
+      PV_3["PV 3"] --> VG_2["Vol Group 2"]
    end
 
    subgraph Node1

@@ -0,0 +1,30 @@
+# Mayastor Public API
+
+Mayastor exposes a public api from its [REST] service.
+This is a [RESTful][REST] API which can be leveraged by parties external to mayastor (ex: users or 3rd party tools) as well as
+mayastor components which are part of the control-plane.
+
+## OpenAPI
+
+The mayastor public API is defined using [OpenAPI], which has many benefits:
+
+1. Standardized: OpenAPI allows us to define an API in a standard way, well-used in the industry.
+
+2. Integration: As a standard, it's easy to integrate with other systems, tools, and platforms (anyone can write a
+   plugin for it!).
+
+3. Automation: Auto generate the server and client libraries, reducing manual effort and the potential for errors.
+
+4. Documentation: Each method and type is documented which makes it easier to understand.
+
+5. Tooling: There's an abundance of tools and libraries which support the OpenAPI spec, making it easier to develop,
+   test, and deploy.
+
+The spec is
+available [here](https://raw.githubusercontent.com/openebs/mayastor-control-plane/HEAD/control-plane/rest/openapi-specs/v0_api_spec.yaml),
+and you can interact with it using one of the many ready-made
+tools [here](https://editor.swagger.io/?url=https://raw.githubusercontent.com/openebs/mayastor-control-plane/HEAD/control-plane/rest/openapi-specs/v0_api_spec.yaml).
+
+[OpenAPI]: https://www.openapis.org/what-is-openapi
+
+[REST]: https://en.wikipedia.org/wiki/REST
@@ -0,0 +1,115 @@
+# REST Authentication
+
+## References
+
+- https://auth0.com/blog/build-an-api-in-rust-with-jwt-authentication-using-actix-web/
+- https://jwt.io/
+- https://russelldavies.github.io/jwk-creator/
+- https://blog.logrocket.com/how-to-secure-a-rest-api-using-jwt-7efd83e71432/
+- https://blog.logrocket.com/jwt-authentication-in-rust/
+
+## Overview
+
+The [REST API][REST] provides a means of controlling Mayastor. It allows the consumer of the API to perform operations
+such as creation and deletion of pools, replicas, nexus and volumes.
+
+It is important to secure the [REST] API to prevent access by unauthorised personnel. This is achieved through the use
+of [JSON Web Tokens (JWT)][JWT], which are sent with every [REST] request.
+
+Upon receipt of a request the [REST] server extracts the [JWT] and verifies its authenticity. If authentic, the request
+is allowed to proceed; otherwise the request is failed with an [HTTP] `401` Unauthorized error.
+
+## JSON Web Token (JWT)
+
+Definition taken from here:
+
+> JSON Web Token ([JWT]) is an open standard ([RFC 7519][JWT]) that defines a compact and self-contained way for
+> securely transmitting information between parties as a JSON object. \
+> This information can be verified and trusted because it is digitally signed. \
+> [JWT]s can be signed using a secret (with the [HMAC] algorithm) or a public/private key pair using [RSA] or
+> [ECDSA].
+
+The [REST] server expects the [JWT] to be signed with a private key and for the public key to be accessible as
+a [JSON Web Key (JWK)][JWK].
+
+The JWK is used to authenticate the [JWT] by checking that it was indeed signed by the corresponding private key.
+
+The [JWT] comprises three parts, each separated by a full stop:
+
+`<header>.<payload>.<signature>`
+
+Each of the above parts is a [Base64-URL] encoded string.
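
To make the token layout concrete, the sketch below splits a token into its three parts and checks that each part uses only the Base64-URL alphabet. It does not decode or verify anything; `JwtParts`, `is_base64url` and `split_jwt` are hypothetical helper names, and the token in `main` is a made-up placeholder.

```rust
/// The three dot-separated parts of a JWT, still Base64-URL encoded.
#[derive(Debug, PartialEq)]
struct JwtParts<'a> {
    header: &'a str,
    payload: &'a str,
    signature: &'a str,
}

/// True if `part` is non-empty and uses only the unpadded Base64-URL alphabet.
fn is_base64url(part: &str) -> bool {
    !part.is_empty()
        && part
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// Splits `<header>.<payload>.<signature>`; returns `None` for any other shape.
fn split_jwt(token: &str) -> Option<JwtParts<'_>> {
    let mut parts = token.split('.');
    let (header, payload, signature) = (parts.next()?, parts.next()?, parts.next()?);
    if parts.next().is_some() || ![header, payload, signature].iter().all(|p| is_base64url(p)) {
        return None;
    }
    Some(JwtParts { header, payload, signature })
}

fn main() {
    // Made-up token: the parts are placeholders, not a real signed JWT.
    let token = "aGVhZGVy.cGF5bG9hZA.c2lnbmF0dXJl";
    let parts = split_jwt(token).expect("three Base64-URL parts");
    println!(
        "header={} payload={} signature={}",
        parts.header, parts.payload, parts.signature
    );
}
```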
+
+## JSON Web Key (JWK)
+
+Definition taken from here:
+
+> A [JSON] Web Key ([JWK]) is a JavaScript Object Notation ([JSON - RFC 7159][JSON]) data structure that represents a
+> cryptographic key.
+
+An example of the [JWK] structure is shown below:
+
+```json
+{
+  "kty": "RSA",
+  "n": "tTtUE2YgN2te7Hd29BZxeGjmagg0Ch9zvDIlHRjl7Y6Y9Gankign24dOXFC0t_3XzylySG0w56YkAgZPbu-7NRUbjE8ev5gFEBVfHgXmPvFKwPSkCtZG94Kx-lK_BZ4oOieLSoqSSsCdm6Mr5q57odkWghnXXohmRgKVgrg2OS1fUcw5l2AYljierf2vsFDGU6DU1PqeKiDrflsu8CFxDBAkVdUJCZH5BJcUMhjK41FCyYImtEb13eXRIr46rwxOGjwj6Szthd-sZIDDP_VVBJ3bGNk80buaWYQnojtllseNBg9pGCTBtYHB-kd-NNm2rwPWQLjmcY1ym9LtJmrQCXvA4EUgsG7qBNj1dl2NHcG03eEoJBejQ5xwTNgQZ6311lXuKByP5gkiLctCtwn1wGTJpjbLKo8xReNdKgFqrIOT1mC76oZpT3AsWlVH60H4aVTthuYEBCJgBQh5Bh6y44ANGcybj-q7sOOtuWi96sXNOCLczEbqKYpeuckYp1LP",
+  "e": "AQAB",
+  "alg": "RS256",
+  "use": "sig"
+}
+```
+
+The meanings of these keys (as defined in [RFC 7517][JWK]) are:
+
+| Key Name | Meaning | Purpose |
+|:---------|:-------:|:-------:|
+| `kty` | Key Type | Denotes the cryptographic algorithm family used |
+| `n` | Modulus | The modulus used by the public key |
+| `e` | Exponent | The exponent used by the public key |
+| `alg` | Algorithm | This corresponds to the algorithm used to sign/encrypt the [JWT] |
+| `use` | Public Key Use | Can take one of two values, `sig` or `enc`. `sig` indicates the public key should be used only for signature verification, whereas `enc` denotes that it is used for encrypting the data |
+
+<br>
+
+## REST Server Authentication
+
+### Prerequisites
+
+1. The [JWT] is included in the [HTTP] Authorization Request Header
+2. The [JWK], used for signature verification, is accessible
+
+### Process
+
+The [REST] server makes use of the [jsonwebtoken] crate to perform [JWT] authentication.
+
+Upon receipt of a [REST] request the [JWT] is extracted from the header and split into two parts:
+
+1. message (comprising the header and payload)
+2. signature
+
+These are passed to the [jsonwebtoken] crate along with the decoding key and algorithm extracted from the [JWK].
+
+If authentication succeeds the [REST] request is permitted to continue. If authentication fails, the [REST] request is
+rejected with an [HTTP] `401` Unauthorized error.
+
+[REST]: https://en.wikipedia.org/wiki/REST
+
+[JWT]: https://datatracker.ietf.org/doc/html/rfc7519
+
+[JWK]: https://datatracker.ietf.org/doc/html/rfc7517
+
+[HTTP]: https://developer.mozilla.org/en-US/docs/Web/HTTP
+
+[Base64-URL]: https://base64.guru/standards/base64url
+
+[HMAC]: https://datatracker.ietf.org/doc/html/rfc2104
+
+[RSA]: https://en.wikipedia.org/wiki/RSA_(cryptosystem)
+
+[ECDSA]: https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm
+
+[JSON]: https://datatracker.ietf.org/doc/html/rfc7159
+
+[jsonwebtoken]: https://github.com/Keats/jsonwebtoken
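The first steps of the authentication process described in the document above (reading the token from the `Authorization` header and splitting it into message and signature) can be sketched with the standard library alone. This is an illustrative sketch, not the REST server's code: the function names `bearer_token` and `split_jwt` are hypothetical, and the use of the `Bearer` scheme is an assumption, since the document only states that the JWT is carried in the `Authorization` header. Signature verification itself is left to the jsonwebtoken crate and is not shown.

```rust
/// Extracts the token from an `Authorization` header value.
/// Assumption: the token is sent using the `Bearer` scheme.
fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Splits a JWT of the form `<header>.<payload>.<signature>` into the
/// signed message (`<header>.<payload>`) and the signature.
/// Returns `None` if the token does not have exactly three non-empty parts.
fn split_jwt(token: &str) -> Option<(&str, &str)> {
    // The signature is everything after the last separator.
    let (message, signature) = token.rsplit_once('.')?;
    // The message must itself contain exactly one separator.
    let (header, payload) = message.split_once('.')?;
    if header.is_empty()
        || payload.is_empty()
        || payload.contains('.')
        || signature.is_empty()
    {
        return None;
    }
    Some((message, signature))
}
```

The message is kept as a single string because a JWT signature is computed over the encoded `<header>.<payload>` text, so a verifier needs that exact byte sequence rather than the decoded parts.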
@@ -76,17 +76,40 @@ impl TryFrom<&Url> for Malloc {
             512
         };
 
-        let size: u32 = if let Some(value) = parameters.remove("size_mb") {
-            value.parse().context(bdev_api::IntParamParseFailed {
+        let size_mb: Option<u64> = if let Some(value) = parameters.remove("size_mb") {
+            Some(value.parse().context(bdev_api::IntParamParseFailed {
                 uri: uri.to_string(),
                 parameter: String::from("size_mb"),
                 value: value.clone(),
-            })?
+            })?)
         } else {
-            0
+            None
         };
 
-        let num_blocks: u32 = if let Some(value) = parameters.remove("num_blocks") {
+        let size_b: Option<u64> = if let Some(value) = parameters.remove("size") {
+            Some(
+                byte_unit::Byte::parse_str(&value, true)
+                    .map_err(|error| BdevError::InvalidUri {
+                        uri: uri.to_string(),
+                        message: format!("'size' is invalid: {error}"),
+                    })?
+                    .as_u64(),
+            )
+        } else {
+            None
+        };
+
+        let size = match (size_mb, size_b) {
+            (Some(_), Some(_)) => Err(BdevError::InvalidUri {
+                uri: uri.to_string(),
+                message: "Can't specify both size and size_mb".to_string(),
+            }),
+            (Some(size_mb), None) => Ok(size_mb * 1024 * 1024),
+            (None, Some(size)) => Ok(size),
+            (None, None) => Ok(0),
+        }?;
+
+        let num_blocks: u64 = if let Some(value) = parameters.remove("num_blocks") {
             value.parse().context(bdev_api::IntParamParseFailed {
                 uri: uri.to_string(),
                 parameter: String::from("num_blocks"),
@@ -133,8 +156,8 @@ impl TryFrom<&Url> for Malloc {
             num_blocks: if num_blocks != 0 {
                 num_blocks
             } else {
-                (size << 20) / blk_size
-            } as u64,
+                size / (blk_size as u64)
+            },
             blk_size,
             uuid,
             resizing,
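
The size handling introduced in the hunk above reduces to a small decision table over the optional `size_mb` and `size` parameters. The following standalone sketch mirrors that table; `resolve_size` is a hypothetical helper, and errors are plain `String`s instead of `BdevError`. One deliberate difference is labelled in the code: the diff multiplies `size_mb * 1024 * 1024` directly, while the sketch uses `checked_mul` to show one way an oversized value could be rejected instead of overflowing.

```rust
/// Resolves the device size in bytes from the optional `size_mb` and
/// `size` URI parameters (hypothetical helper, not part of the diff).
fn resolve_size(size_mb: Option<u64>, size_b: Option<u64>) -> Result<u64, String> {
    match (size_mb, size_b) {
        // Both given: ambiguous, so reject.
        (Some(_), Some(_)) => Err("Can't specify both size and size_mb".to_string()),
        // Only size_mb given: convert MiB to bytes.
        // Assumption: checked_mul is an addition of this sketch; the diff
        // uses a plain multiplication.
        (Some(mb), None) => mb
            .checked_mul(1024 * 1024)
            .ok_or_else(|| format!("size_mb={mb} does not fit in 64 bits")),
        // Only size given: already in bytes.
        (None, Some(bytes)) => Ok(bytes),
        // Neither given: 0 means "not specified", as in the diff.
        (None, None) => Ok(0),
    }
}
```

Returning `0` for the unspecified case keeps the later `size != 0` checks working unchanged, which appears to be why the diff maps `(None, None)` to `Ok(0)` rather than propagating an `Option`.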

@@ -63,16 +63,39 @@ impl TryFrom<&Url> for Null {
             });
         }
 
-        let size: u64 = if let Some(value) = parameters.remove("size_mb") {
-            value.parse().context(bdev_api::IntParamParseFailed {
+        let size_mb: Option<u64> = if let Some(value) = parameters.remove("size_mb") {
+            Some(value.parse().context(bdev_api::IntParamParseFailed {
                 uri: uri.to_string(),
                 parameter: String::from("size_mb"),
                 value: value.clone(),
-            })?
+            })?)
         } else {
-            0
+            None
+        };
+
+        let size_b: Option<u64> = if let Some(value) = parameters.remove("size") {
+            Some(
+                byte_unit::Byte::parse_str(&value, true)
+                    .map_err(|error| BdevError::InvalidUri {
+                        uri: uri.to_string(),
+                        message: format!("'size' is invalid: {error}"),
+                    })?
+                    .as_u64(),
+            )
+        } else {
+            None
         };
 
+        let size = match (size_mb, size_b) {
+            (Some(_), Some(_)) => Err(BdevError::InvalidUri {
+                uri: uri.to_string(),
+                message: "Can't specify both size and size_mb".to_string(),
+            }),
+            (Some(size_mb), None) => Ok(size_mb * 1024 * 1024),
+            (None, Some(size)) => Ok(size),
+            (None, None) => Ok(0),
+        }?;
+
         let num_blocks: u64 = if let Some(value) = parameters.remove("num_blocks") {
             value.parse().context(bdev_api::IntParamParseFailed {
                 uri: uri.to_string(),
@@ -86,8 +109,9 @@ impl TryFrom<&Url> for Null {
         if size != 0 && num_blocks != 0 {
             return Err(BdevError::InvalidUri {
                 uri: uri.to_string(),
-                message: "conflicting parameters num_blocks and size_mb are mutually exclusive"
-                    .to_string(),
+                message:
+                    "conflicting parameters num_blocks and size/size_mb are mutually exclusive"
+                        .to_string(),
             });
         }
 
@@ -104,7 +128,7 @@ impl TryFrom<&Url> for Null {
             num_blocks: if num_blocks != 0 {
                 num_blocks
             } else {
-                (size << 20) / (blk_size as u64)
+                size / (blk_size as u64)
             },
             blk_size,
             uuid: uuid.or_else(|| Some(Uuid::new_v4())),
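
Once the size is held in bytes, the block count in the hunks above becomes a plain division in place of the earlier `(size << 20) / blk_size`. A minimal sketch of that final step follows, assuming a hypothetical `block_count` helper with `String` errors. The zero block size guard is an addition of the sketch and does not come from the diff.

```rust
/// Computes the number of blocks for a device (hypothetical helper).
/// `num_blocks` and `size` use 0 to mean "not specified", as in the diff.
fn block_count(num_blocks: u64, size: u64, blk_size: u32) -> Result<u64, String> {
    if size != 0 && num_blocks != 0 {
        return Err(
            "conflicting parameters num_blocks and size/size_mb are mutually exclusive"
                .to_string(),
        );
    }
    // Assumption: this guard is added by the sketch to avoid dividing by zero.
    if blk_size == 0 {
        return Err("blk_size must be non-zero".to_string());
    }
    Ok(if num_blocks != 0 {
        num_blocks
    } else {
        // Integer division: any remainder smaller than one block is dropped.
        size / u64::from(blk_size)
    })
}
```

For example, a 64 MiB device (67108864 bytes) with 512-byte blocks yields 131072 blocks, while a size that is not a multiple of the block size is rounded down to the nearest whole block.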