Add error codes to the tests
xing-yang committed Jan 28, 2020
1 parent 73e206a commit 7333982
Showing 1 changed file with 15 additions and 11 deletions.
26 changes: 15 additions & 11 deletions keps/sig-storage/20190530-pv-health-monitor.md
@@ -93,13 +93,11 @@ Details will be described in the following proposal section.

Volume monitoring is the main focus of this proposal. Reactions are not in the scope of this proposal.

The following areas will be the focus of this proposal at first:
The following area will be the focus of this proposal at first:

- Provide a mechanism for CSI drivers to report volume health problems. For example, whether the volume is deleted, whether the usage is reaching the threshold, and so on.
- Mounting conditions checking.
- Other errors that could affect the usability of the volume.
* Provide a mechanism for CSI drivers to report volume health problems at the controller and node levels.

Three main parts are involved here in the architecture.
Two main parts are involved here in the architecture.

- External Controller:
- The external controller will be deployed as a sidecar together with the CSI controller driver, similar to how the external-provisioner sidecar is deployed.
@@ -370,10 +368,10 @@ message NodeServiceCapability {
### External controller

#### CSI interface
Call GetVolume() RPC for volumes periodically to check the health condition of volumes themselves. The frequency of the check should be tunalbe. A configure option will be available in the external controller to adjust this value.
Call GetVolume() RPC for volumes periodically to check the health condition of volumes themselves. The frequency of the check should be tunable. A configure option will be available in the external controller to adjust this value.
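
As an editorial illustration (not part of the commit), the periodic check with a tunable interval could be sketched as follows. The response shape of GetVolume() is not shown in this diff, so the `get_volume` callable below is an assumption: a stand-in that returns an error-code string, or `None` for a healthy volume. The names `check_once`, `monitor`, and `interval_seconds` are hypothetical.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional


def check_once(volume_ids: Iterable[str],
               get_volume: Callable[[str], Optional[str]]) -> Dict[str, str]:
    """Query every volume once; return {volume_id: error_code} for unhealthy ones."""
    problems = {}
    for vol_id in volume_ids:
        error_code = get_volume(vol_id)  # assumed stand-in for the GetVolume() RPC
        if error_code is not None:
            problems[vol_id] = error_code
    return problems


def monitor(volume_ids: List[str],
            get_volume: Callable[[str], Optional[str]],
            interval_seconds: float,
            rounds: int,
            sleep: Callable[[float], None] = time.sleep) -> List[Dict[str, str]]:
    """Run `rounds` health checks, waiting `interval_seconds` between checks."""
    reports = []
    for i in range(rounds):
        reports.append(check_once(volume_ids, get_volume))
        if i < rounds - 1:
            sleep(interval_seconds)  # the tunable check frequency
    return reports


# Fake driver state (hypothetical): vol-2 was deleted on the storage backend.
fake_state = {"vol-1": None, "vol-2": "VolumeNotFound"}
reports = monitor(["vol-1", "vol-2"], fake_state.get,
                  interval_seconds=60, rounds=2, sleep=lambda s: None)
```

Injecting `sleep` keeps the interval configurable while letting tests run without waiting.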

#### Node down event
* Watch node down events.
* Watch node down events by checking node status and also pinging the node.
* The controller will track which pods are using which PVCs and what nodes they got scheduled to.
* In the case that a node goes down, the controller will report an event for all PVCs on that node.
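
As an editorial illustration (not part of the commit), the bookkeeping described in the list above could be sketched like this: record which PVCs are used by pods on each node, then emit one event per PVC when that node goes down. `NodeDownTracker` and its methods are hypothetical names, and the event is modeled as a plain `(pvc, reason)` tuple rather than a real Kubernetes Event object.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class NodeDownTracker:
    """Tracks which PVCs are in use on which nodes (hypothetical helper)."""

    # node name -> names of PVCs used by pods scheduled to that node
    pvcs_by_node: Dict[str, Set[str]] = field(
        default_factory=lambda: defaultdict(set))

    def record_pod(self, node: str, pvcs: List[str]) -> None:
        """Remember that a pod scheduled to `node` uses the given PVCs."""
        self.pvcs_by_node[node].update(pvcs)

    def on_node_down(self, node: str) -> List[Tuple[str, str]]:
        """Return one (pvc, reason) event per PVC in use on the failed node."""
        return [(pvc, "NodeDown")
                for pvc in sorted(self.pvcs_by_node.get(node, ()))]


tracker = NodeDownTracker()
tracker.record_pod("node-a", ["pvc-1", "pvc-2"])
tracker.record_pod("node-b", ["pvc-3"])
events = tracker.on_node_down("node-a")
```

Here only the PVCs on `node-a` receive an event; PVCs on other nodes are left alone.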

@@ -551,12 +549,18 @@ This option is not in the main proposal because it is a push-based method while

## Test Plan
### Unit tests
* Unit tests for external controller volume health monitoring.
* Unit tests for external node agent volume health monitoring.
* Unit tests for external controller and external node agent volume health monitoring. The following error codes will be simulated in the unit tests:
* VolumeNotFound
* OutOfCapacity
* VolumeUnmounted
* NodeDown

### E2E tests
* e2e tests for external controller volume health monitoring.
* e2e tests for external node agent volume health monitoring.
* e2e tests for external controller and external node agent volume health monitoring. The following error codes will be tested in the e2e tests:
* VolumeNotFound
* OutOfCapacity
* VolumeUnmounted
* Add stress and scale tests before moving from beta to GA.

## Implementation History
