docs: cli/advanced.md, environment-vars.md, https.md, authn.md, cli.md
* new content mostly around https
* v3.24 updates
* cross-references, etc. text works

Signed-off-by: Alex Aizman <[email protected]>
alex-aizman committed Sep 17, 2024
1 parent e53aaf3 commit e40ec36
Showing 8 changed files with 212 additions and 61 deletions.
12 changes: 11 additions & 1 deletion api/env/ais.go
@@ -23,13 +23,17 @@ var (
LocalRedirectCIDR string
PubIPv4CIDR string

// https
//
// HTTPS
// for details and background, see: https://github.com/NVIDIA/aistore/blob/main/docs/environment-vars.md#https
//
UseHTTPS string
// TLS: client side
Certificate string
CertKey string
ClientCA string
SkipVerifyCrt string
// TLS: server (aistore, AuthN) side (NOTE comment below)

// tests, CI
NumTarget string
@@ -53,12 +57,18 @@ var (

// false: HTTP transport, with all the TLS config (below) ignored
// true: HTTPS/TLS
// for details and background, see: https://github.com/NVIDIA/aistore/blob/main/docs/environment-vars.md#https
UseHTTPS: "AIS_USE_HTTPS", // cluster config: "net.http.use_https"

// TLS: client side
Certificate: "AIS_CRT",
CertKey: "AIS_CRT_KEY",
ClientCA: "AIS_CLIENT_CA",

// TLS: server (aistore, AuthN) side
// "AIS_SERVER_CRT" - TLS certificate (pathname)
// "AIS_SERVER_KEY" - private key (ditto)

// TLS: common
SkipVerifyCrt: "AIS_SKIP_VERIFY_CRT", // cluster config: "net.http.skip_verify"

3 changes: 2 additions & 1 deletion cmn/client.go
@@ -163,7 +163,8 @@ func NewClientTLS(cargs TransportArgs, sargs TLSArgs, intra bool) *http.Client {
return &http.Client{Transport: transport, Timeout: cargs.Timeout}
}

// see related: HTTPConf.ToTLS()
// EnvToTLS usage is limited to aisloader and tools
// NOTE that embedded intra-cluster clients utilize a similar method: `HTTPConf.ToTLS`
func EnvToTLS(sargs *TLSArgs) {
if s := os.Getenv(env.AIS.Certificate); s != "" {
sargs.Certificate = s
14 changes: 7 additions & 7 deletions docs/authn.md
@@ -176,17 +176,17 @@ Further references:
Environment variables used by the deployment script to set up the AuthN server:
| Variable | Default Value | Description |
|----------------------|---------------------|-------------------------------------------------------------------------------------------------|
| Variable | Default Value | Description |
|------------------------|---------------------|-------------------------------------------------------------------------------------------------|
| `AIS_AUTHN_SECRET_KEY` | `aBitLongSecretKey` | Secret key used to sign tokens |
| `AIS_AUTHN_ENABLED` | `false` | Enable AuthN server and token-based access in AIStore proxy (`true` to enable) |
| `AIS_AUTHN_PORT` | `52001` | Port on which AuthN listens to requests |
| `AIS_AUTHN_TTL` | `24h` | Token expiration time. Can be set to `0` for no expiration |
| `AIS_AUTHN_USE_HTTPS` | `false` | Enable HTTPS for AuthN server. If `true`, requires `AIS_SERVER_CRT` and `AIS_SERVER_KEY` to be set |
| `AIS_SERVER_CRT` | `""` | OpenSSL certificate. Required when `AIS_AUTHN_USE_HTTPS` is `true` |
| `AIS_SERVER_KEY` | `""` | OpenSSL key. Required when `AIS_AUTHN_USE_HTTPS` is `true` |
| `AIS_AUTHN_SU_NAME` | `admin` | Superuser (admin) name for AuthN |
| `AIS_AUTHN_SU_PASS` | `admin` | Superuser (admin) password for AuthN |
| `AIS_SERVER_CRT` | `""` | TLS certificate. Required when `AIS_AUTHN_USE_HTTPS` is `true` |
| `AIS_SERVER_KEY` | `""` | private key for the TLS certificate (above). |
| `AIS_AUTHN_SU_NAME` | `admin` | Superuser (admin) name for AuthN |
| `AIS_AUTHN_SU_PASS` | `admin` | Superuser (admin) password for AuthN |
All variables can be set at AIStore cluster deployment and will override values in the config.
Example of starting a cluster with AuthN enabled:
@@ -420,4 +420,4 @@ When a cluster is registered, an arbitrary alias can be assigned to the cluster.
| Operation | HTTP Action | Example |
|------------------------------|-------------|-----------------------------------------------------------------------------------------------|
| Get AuthN configuration | GET /v1/daemon | `curl -X GET $AUTHSRV/v1/daemon -H 'Authorization: Bearer <token>'` |
| Update AuthN configuration | PUT /v1/daemon | `curl -X PUT $AUTHSRV/v1/daemon -d '{"log":{"dir":"<log-dir>","level":"<log-level>"},"net":{"http":{"port":<port>,"use_https":false,"server_crt":"","server_key":""}},"auth":{"secret":"aBitLongSecretKey","expiration_time":"24h0m"},"timeout":{"default_timeout":"30s"}}' -H 'Authorization: Bearer <token>'` |
| Update AuthN configuration | PUT /v1/daemon | `curl -X PUT $AUTHSRV/v1/daemon -d '{"log":{"dir":"<log-dir>","level":"<log-level>"},"net":{"http":{"port":<port>,"use_https":false,"server_crt":"","server_key":""}},"auth":{"secret":"aBitLongSecretKey","expiration_time":"24h0m"},"timeout":{"default_timeout":"30s"}}' -H 'Authorization: Bearer <token>'` |
12 changes: 9 additions & 3 deletions docs/cli.md
@@ -245,9 +245,15 @@ In addition, environment can be used to **override** client-side TLS (aka, HTTPS
| `AIS_CLIENT_CA` | Certificate authority that authorized (signed) the certificate | "cluster.client_ca_tls" |
| `AIS_SKIP_VERIFY_CRT` | true: skip X.509 cert verification (usually enabled to circumvent limitations of self-signed certs) | "cluster.skip_verify_crt" |
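
As an illustration of how a client tool might consume these overrides, here is a minimal Go sketch that builds a `crypto/tls` configuration from the four variables. The helper `clientTLSFromEnv` is hypothetical and is not part of aistore. Two choices in it are assumptions of this sketch rather than documented behavior: `AIS_CLIENT_CA` is used as the pool for verifying the server, and `AIS_CRT` and `AIS_CRT_KEY` must be set together.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
	"strconv"
)

// clientTLSFromEnv builds a client-side tls.Config from the environment
// variables in the table above. getenv is injected so the logic can be
// exercised without touching the real process environment.
// NOTE: hypothetical helper, not an aistore API.
func clientTLSFromEnv(getenv func(string) string) (*tls.Config, error) {
	cfg := &tls.Config{MinVersion: tls.VersionTLS12}

	// AIS_SKIP_VERIFY_CRT: "true" skips X.509 verification (self-signed certs).
	if s := getenv("AIS_SKIP_VERIFY_CRT"); s != "" {
		skip, err := strconv.ParseBool(s)
		if err != nil {
			return nil, fmt.Errorf("AIS_SKIP_VERIFY_CRT: %w", err)
		}
		cfg.InsecureSkipVerify = skip
	}

	// AIS_CLIENT_CA: PEM file of the CA that signed the certificate.
	// Assumption: it populates the pool used to verify the server.
	if path := getenv("AIS_CLIENT_CA"); path != "" {
		pemBytes, err := os.ReadFile(path)
		if err != nil {
			return nil, err
		}
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(pemBytes) {
			return nil, errors.New("AIS_CLIENT_CA: no PEM certificates found")
		}
		cfg.RootCAs = pool
	}

	// AIS_CRT and AIS_CRT_KEY: certificate and private key pathnames.
	// Assumption: set both or neither.
	crt, key := getenv("AIS_CRT"), getenv("AIS_CRT_KEY")
	switch {
	case crt != "" && key != "":
		pair, err := tls.LoadX509KeyPair(crt, key)
		if err != nil {
			return nil, err
		}
		cfg.Certificates = []tls.Certificate{pair}
	case crt != "" || key != "":
		return nil, errors.New("AIS_CRT and AIS_CRT_KEY must be set together")
	}
	return cfg, nil
}

func main() {
	cfg, err := clientTLSFromEnv(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "TLS config error:", err)
		os.Exit(1)
	}
	fmt.Println("skip verify:", cfg.InsecureSkipVerify, "client certs:", len(cfg.Certificates))
}
```

The resulting configuration would typically be attached to an `http.Transport` via its `TLSClientConfig` field.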

See also:

* [HTTPS: loading, reloading, and generating certificates; switching cluster between HTTP and HTTPS](/docs/https.md)
### Further references

- [Generating self-signed certificates](/docs/https.md#generating-self-signed-certificates)
- [Deploying: 4 targets, 1 gateway, 6 mountpaths, AWS backend](/docs/https.md#deploying-4-targets-1-gateway-6-mountpaths-aws-backend)
- [Accessing HTTPS-based cluster](/docs/https.md#accessing-https-based-cluster)
- [Testing with self-signed certificates](/docs/https.md#testing-with-self-signed-certificates)
- [Observability: TLS related alerts](/docs/https.md#observability-tls-related-alerts)
- [Updating and reloading X.509 certificates](/docs/https.md#updating-and-reloading-x509-certificates)
- [Switching cluster between HTTP and HTTPS](/docs/https.md#switching-cluster-between-http-and-https)

## First steps

122 changes: 92 additions & 30 deletions docs/cli/advanced.md
@@ -7,42 +7,38 @@ redirect_from:
- /docs/cli/advanced.md/
---

# `ais advanced` commands
Commands for special use cases (e.g. scripting) and *advanced* usage scenarios, whereby a certain level of understanding of possible consequences is assumed (and required).

Commands for special use cases (e.g. scripting) and *advanced* usage scenarios, whereby a certain level of understanding of possible consequences is implied and required:
## Table of Contents
- [`ais advanced`](#ais-advanced)
- [Manual Resilvering](#manual-resilvering)
- [Preload bucket](#preload-bucket)
- [Remove node from Smap](#remove-node-from-smap)
- [Rotate logs: individual nodes or entire cluster](#rotate-logs-individual-nodes-or-entire-cluster)
- [Disable/Enable cloud backend at runtime](#disableenable-cloud-backend-at-runtime)
- [Load TLS certificate](#load-tls-certificate)

## `ais advanced`

```console
$ ais advanced --help
NAME:
ais advanced - special commands intended for development and advanced usage

USAGE:
ais advanced command [command options] [arguments...]

COMMANDS:
gen-shards generate and write random TAR shards, e.g.:
- gen-shards 'ais://bucket1/shard-{001..999}.tar' - write 999 random shards (default sizes) to ais://bucket1
- gen-shards 'gs://bucket2/shard-{01..20..2}.tgz' - 10 random gzipped tarfiles to Cloud bucket
(notice quotation marks in both cases)
resilver resilver user data on a given target (or all targets in the cluster): fix data redundancy
with respect to bucket configuration, remove migrated objects and old/obsolete workfiles
resilver resilver user data on a given target (or all targets in the cluster); entails:
- fix data redundancy with respect to bucket configuration;
- remove migrated objects and old/obsolete workfiles.
preload preload object metadata into in-memory cache
remove-from-smap immediately remove node from cluster map (beware: potential data loss!)
random-node print random node ID (by default, ID of a randomly selected target)
random-mountpath print a random mountpath from a given target
rotate-logs rotate aistore logs
enable-backend (re)enable cloud backend
disable-backend disable cloud backend
```

AIS CLI features a number of miscellaneous and advanced-usage commands.

## Table of Contents
- [Manual Resilvering](#manual-resilvering)
- [Preload bucket](#preload-bucket)
- [Remove node from Smap](#remove-node-from-smap)
- [Rotate logs: individual nodes or entire cluster](#rotate-logs-individual-nodes-or-entire-cluster)
- [Disable/Enable cloud backend at runtime](#disableenable-cloud-backend-at-runtime)
enable-backend (re)enable cloud backend (see also: 'ais config cluster backend')
disable-backend disable cloud backend (see also: 'ais config cluster backend')
load-X.509 (re)load TLS certificate
```

## Manual Resilvering

@@ -65,9 +61,7 @@ Started resilver "NGxmOthtE", use 'ais show job xaction NGxmOthtE' to monitor th

`ais advanced preload BUCKET`

Preload bucket's objects metadata into in-memory caches.

### Examples
Preload object metadata into the in-memory cache.

```console
$ ais advanced preload ais://bucket
@@ -77,7 +71,7 @@ $ ais advanced preload ais://bucket

`ais advanced remove-from-smap NODE_ID`

Immediately remove node from the cluster map.
Immediately remove node from the cluster map (a.k.a. Smap).

Beware! When the node in question is an ais target, the operation may (and likely will) result in data loss that cannot be undone. Use decommission and start/stop maintenance operations to perform graceful removal.

@@ -93,11 +87,15 @@ xVMNp8081 0.16% 31.12GiB 6m50s
MvwQp8080[P] 0.18% 31.12GiB 6m40s
NnPLp8082 0.16% 31.12GiB 6m50s


$ ais advanced remove-from-smap MvwQp8080
Node MvwQp 8080 is primary: cannot remove

$ ais advanced remove-from-smap p[xVMNp8081]
```

And the result:

```console
$ ais show cluster proxy
PROXY MEM USED % MEM AVAIL UPTIME
BcnQp8083 0.16% 31.12GiB 8m
@@ -151,6 +149,7 @@ This capability is now supported, and will be included in v3.24 release. And the
### Examples

**1)** say, there's a cloud bucket with 4 objects:

```console
$ ais ls s3://test-bucket
NAME SIZE CACHED
@@ -163,6 +162,7 @@ NAME SIZE CACHED
Note that only 2 objects out of 4 are in-cluster.

**2)** disable s3 backend:

```console
$ ais advanced disable-backend <TAB-TAB>
gcp aws azure
@@ -172,12 +172,14 @@ cluster: disabled aws backend
```

**3)** observe "offline" error when trying to list the bucket:

```console
$ ais ls s3://test-bucket
Error: ErrRemoteBucketOffline: bucket "s3://test-bucket" is currently unreachable
```

**4)** but (!) all in-cluster objects can still be listed:

```console
$ ais ls s3://test-bucket --cached
NAME SIZE
@@ -186,24 +188,28 @@ NAME SIZE
```

**5)** and read:

```console
$ ais get s3://test-bucket/111 /dev/null
GET and discard 111 from s3://test-bucket (15.97KiB)
GET (and discard) 111 from s3://test-bucket (15.97KiB)
```

**6)** expectedly, remote objects are not accessible:

```console
$ ais get s3://test-bucket/333 /dev/null
Error: object "s3://test-bucket/333" does not exist
```

**7)** let's now reconnect s3:

```console
$ ais advanced enable-backend aws
cluster: enabled aws backend
```

**8)** and observer that both in-cluster and remote content is now again available:
**8)** finally, observe that both in-cluster and remote content is now again available:

```console
$ ais ls s3://test-bucket
NAME SIZE CACHED
@@ -213,5 +219,61 @@ NAME SIZE CACHED
444 15.97KiB no

$ ais get s3://test-bucket/333 /dev/null
GET and discard 333 from s3://test-bucket (15.97KiB)
GET (and discard) 333 from s3://test-bucket (15.97KiB)
```

## Load TLS certificate

HTTPS deployment implies (and requires) that each AIS node has a valid TLS (a.k.a. [X.509](https://www.ssl.com/faqs/what-is-an-x-509-certificate/)) certificate.

The latter has a number of interesting properties ultimately intended to authenticate clients (users) to servers (AIS nodes). And vice versa.

In addition, TLS certificates tend to expire from time to time. In fact, each TLS certificate has an expiration date, with the standard-defined maximum being 13 months (397 days).

> Some sources claim 398 days but the (much) larger point remains: TLS certificates do expire. Which means, they must be periodically updated and timely reloaded.

Starting v3.24, AIStore:

* tracks certificate expiration times;
* automatically - upon update - reloads updated certificates;
* raises associated alerts.

### Associated alerts

```console
$ ais show cluster

PROXY MEM AVAIL LOAD AVERAGE UPTIME STATUS ALERT
p[KKFpNjqo][P] 127.77GiB [5.2 7.2 3.1] 108h30m40s online tls-cert-will-soon-expire
...

TARGET MEM AVAIL CAP USED(%) CAP AVAIL LOAD AVERAGE UPTIME STATUS ALERT
t[pDztYhhb] 98.02GiB 16% 960.824GiB [9.1 13.4 8.3] 108h30m1s online tls-cert-will-soon-expire
...
...
```

Overall, there are currently 3 (three) related alerts:

| alert | comment |
| -- | -- |
| `tls-cert-will-soon-expire` | a warning that X.509 cert will expire in less than 3 days |
| `tls-cert-expired` | red alert (as the name implies) |
| `tls-cert-invalid` | ditto |

### Loading and reloading certificate on demand

```console
$ ais advanced load-X.509
Done: all nodes.
```

### Further references

- [Generating self-signed certificates](/docs/https.md#generating-self-signed-certificates)
- [Deploying: 4 targets, 1 gateway, 6 mountpaths, AWS backend](/docs/https.md#deploying-4-targets-1-gateway-6-mountpaths-aws-backend)
- [Accessing HTTPS-based cluster](/docs/https.md#accessing-https-based-cluster)
- [Testing with self-signed certificates](/docs/https.md#testing-with-self-signed-certificates)
- [Observability: TLS related alerts](/docs/https.md#observability-tls-related-alerts)
- [Updating and reloading X.509 certificates](/docs/https.md#updating-and-reloading-x509-certificates)
- [Switching cluster between HTTP and HTTPS](/docs/https.md#switching-cluster-between-http-and-https)
2 changes: 1 addition & 1 deletion docs/configuration.md
@@ -534,7 +534,7 @@ If extended attributes are disabled globally when deploying a cluster, node IDs

## Enabling HTTPS

To switch from HTTP protocol to an encrypted HTTPS, configure `net.http.use_https`=`true` and modify `net.http.server_crt` and `net.http.server_key` values so they point to your OpenSSL certificate and key files respectively (see [AIStore configuration](/deploy/dev/local/aisnode_config.sh)).
To switch from HTTP protocol to an encrypted HTTPS, configure `net.http.use_https`=`true` and modify `net.http.server_crt` and `net.http.server_key` values so they point to your TLS certificate and key files respectively (see [AIStore configuration](/deploy/dev/local/aisnode_config.sh)).
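
The dependency between these three settings can be sketched as a small validation step in Go. The struct and function names below are hypothetical, and the nested JSON layout (`net` → `http`) is an assumption inferred from the dotted option names; only the keys `use_https`, `server_crt`, and `server_key` come from the text above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// httpConf mirrors only the three "net.http" fields named above.
// Hypothetical type for illustration; the real config has more fields.
type httpConf struct {
	UseHTTPS  bool   `json:"use_https"`
	ServerCrt string `json:"server_crt"`
	ServerKey string `json:"server_key"`
}

type conf struct {
	Net struct {
		HTTP httpConf `json:"http"`
	} `json:"net"`
}

// parseHTTPConf decodes the fragment and rejects HTTPS without cert and key.
func parseHTTPConf(raw []byte) (httpConf, error) {
	var c conf
	if err := json.Unmarshal(raw, &c); err != nil {
		return httpConf{}, err
	}
	h := c.Net.HTTP
	if h.UseHTTPS && (h.ServerCrt == "" || h.ServerKey == "") {
		return h, errors.New("net.http.use_https=true requires net.http.server_crt and net.http.server_key")
	}
	return h, nil
}

func main() {
	// Placeholder pathname; the key is deliberately left empty.
	raw := []byte(`{"net":{"http":{"use_https":true,"server_crt":"/path/to/server.crt","server_key":""}}}`)
	if _, err := parseHTTPConf(raw); err != nil {
		fmt.Println("invalid:", err)
	}
}
```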

The following HTTPS topics are also covered elsewhere:
