From 73339822ea693cc8a079db09f9e5471782307367 Mon Sep 17 00:00:00 2001
From: xing-yang
Date: Tue, 28 Jan 2020 20:32:07 +0000
Subject: [PATCH] Add error codes to the tests

---
 .../sig-storage/20190530-pv-health-monitor.md | 26 +++++++++++--------
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/keps/sig-storage/20190530-pv-health-monitor.md b/keps/sig-storage/20190530-pv-health-monitor.md
index 7b477818aba4..c756562c728d 100644
--- a/keps/sig-storage/20190530-pv-health-monitor.md
+++ b/keps/sig-storage/20190530-pv-health-monitor.md
@@ -93,13 +93,11 @@ Details will be described in the following proposal section.

 Volume monitoring is the main focus of this proposal. Reactions are not in the scope of this proposal.

-The following areas will be the focus of this proposal at first:
+The following area will be the focus of this proposal at first:

-- Provide a mechanism for CSI drivers to report volume health problems. For example, whether the volume is deleted, whether the usage is reaching the threshold, and so on.
-- Mounting conditions checking.
-- Other errors that could affect the usability of the volume.
+* Provide a mechanism for CSI drivers to report volume health problems at the controller and node levels.

-Three main parts are involved here in the architecture.
+Two main parts are involved here in the architecture.

 - External Controller:
   - The external controller will be deployed as a sidecar together with the CSI controller driver, similar to how the external-provisioner sidecar is deployed.
@@ -370,10 +368,10 @@ message NodeServiceCapability {
 ### External controller

 #### CSI interface
-Call GetVolume() RPC for volumes periodically to check the health condition of volumes themselves. The frequency of the check should be tunalbe. A configure option will be available in the external controller to adjust this value.
+Call GetVolume() RPC for volumes periodically to check the health condition of volumes themselves. The frequency of the check should be tunable. A configure option will be available in the external controller to adjust this value.

 #### Node down event
-* Watch node down events.
+* Watch node down events by checking node status and also pinging the node.
 * The controller will track which pods are using which PVCs and what nodes they got scheduled to.
 * In the case that a node goes down, the controller will report an event for all PVCs on that node.

@@ -551,12 +549,18 @@ This option is not in the main proposal because it is a push-based method while
 ## Test Plan

 ### Unit tests
-* Unit tests for external controller volume health monitoring.
-* Unit tests for external node agent volume health monitoring.
+* Unit tests for external controller and external node agent volume health monitoring. The following error codes will be simulated in the unit tests:
+  * VolumeNotFound
+  * OutOfCapacity
+  * VolumeUnmounted
+  * NodeDown

 ### E2E tests
-* e2e tests for external controller volume health monitoring.
-* e2e tests for external node agent volume health monitoring.
+* e2e tests for external controller and external node agent volume health monitoring. The following error codes will be tested in the e2e tests:
+  * VolumeNotFound
+  * OutOfCapacity
+  * VolumeUnmounted
+* Add stress and scale tests before moving from beta to GA.

 ## Implementation History
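
The node down handling described in the patch (track which pods use which PVCs on which nodes, then report an event for every PVC on a node that goes down) could be sketched roughly as below. This is only an illustration, not the KEP's implementation: `PodInfo`, `Tracker`, `AddPod`, and `NodeDownEvents` are hypothetical names, the event message format is made up, and a real controller would presumably use Kubernetes informers and event recorders rather than plain maps and strings.

```go
package main

import (
	"fmt"
	"sort"
)

// PodInfo is a hypothetical, simplified record of a scheduled pod.
type PodInfo struct {
	Name string
	Node string
	PVCs []string
}

// Tracker remembers, per node, the set of PVCs used by pods on that node.
type Tracker struct {
	pvcsByNode map[string]map[string]struct{}
}

// NewTracker returns an empty Tracker.
func NewTracker() *Tracker {
	return &Tracker{pvcsByNode: make(map[string]map[string]struct{})}
}

// AddPod records the PVCs used by a pod under the node it was scheduled to.
func (t *Tracker) AddPod(p PodInfo) {
	set, ok := t.pvcsByNode[p.Node]
	if !ok {
		set = make(map[string]struct{})
		t.pvcsByNode[p.Node] = set
	}
	for _, pvc := range p.PVCs {
		set[pvc] = struct{}{}
	}
}

// NodeDownEvents returns one event message per PVC tracked on the given
// node, sorted by PVC name. An unknown node yields an empty slice.
func (t *Tracker) NodeDownEvents(node string) []string {
	pvcs := make([]string, 0, len(t.pvcsByNode[node]))
	for pvc := range t.pvcsByNode[node] {
		pvcs = append(pvcs, pvc)
	}
	sort.Strings(pvcs)

	events := make([]string, 0, len(pvcs))
	for _, pvc := range pvcs {
		// "NodeDown" mirrors the error code name listed in the test plan;
		// the rest of the message is an assumed format.
		events = append(events, fmt.Sprintf("NodeDown: node %s hosting PVC %s is down", node, pvc))
	}
	return events
}

func main() {
	t := NewTracker()
	t.AddPod(PodInfo{Name: "web-0", Node: "node-a", PVCs: []string{"data-web-0"}})
	t.AddPod(PodInfo{Name: "db-0", Node: "node-a", PVCs: []string{"data-db-0", "logs-db-0"}})
	t.AddPod(PodInfo{Name: "web-1", Node: "node-b", PVCs: []string{"data-web-1"}})

	// Simulate node-a going down: every PVC on it gets an event.
	for _, e := range t.NodeDownEvents("node-a") {
		fmt.Println(e)
	}
}
```

A unit test simulating the NodeDown code could feed a tracker like this a few pods and assert that each PVC on the failed node receives exactly one event, while PVCs on other nodes receive none.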