Show nvmeof gateways in "ceph status" #801

Merged: 2 commits merged into ceph:devel from nvmeof-ceph-status on Sep 9, 2024

Conversation

@VallariAg (Member) commented Aug 13, 2024

Register the nvmeof gateway service in the service_map.
This brings up nvmeof in the "ceph status" output.
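
A minimal sketch of what such a registration could look like, assuming the librados Python binding exposes service_daemon_register() and service_daemon_update() (this is not the PR's actual code; the pool, group, and suffix values are illustrative placeholders):

import socket

import rados

# Hypothetical sketch: register an nvmeof gateway in the cluster service map
# so it shows up under "services:" in "ceph -s". Assumes the librados Python
# binding's service_daemon_register()/service_daemon_update(); POOL, GROUP and
# SUFFIX are placeholder values, not the gateway's real configuration.
POOL = "mypool"
GROUP = "mygroup"
SUFFIX = "gaaaet"

with rados.Rados(conffile="/etc/ceph/ceph.conf") as cluster:
    host = socket.gethostname()
    daemon_name = f"{POOL}.{GROUP}.{host}.{SUFFIX}"
    metadata = {
        "id": daemon_name,
        "pool_name": POOL,
        "group": GROUP,
        "hostname": host,
        "daemon_type": "gateway",
    }
    # Service-map metadata values must all be strings.
    cluster.service_daemon_register("nvmeof", daemon_name, metadata)
    # Report liveness; an empty dict means "no extra status fields".
    cluster.service_daemon_update({})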

After gateway deployment, it shows all 4 gateways in the "ceph -s" output:

2024-08-20T11:51:41.617 INFO:tasks.workunit.client.2.smithi067.stdout:  cluster:
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    id:     de9bdec8-5ee8-11ef-bccf-c7b262605968
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    health: HEALTH_WARN
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:            Degraded data redundancy: 484/1452 objects degraded (33.333%), 32 pgs degraded, 32 pgs undersized
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:  services:
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    mon:    3 daemons, quorum a,c,b (age 9m)
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    mgr:    x(active, since 10m)
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    osd:    2 osds: 2 up (since 8m), 2 in (since 8m)
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    nvmeof: 4 gateways active (4 hosts)
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:  data:
2024-08-20T11:51:41.618 INFO:tasks.workunit.client.2.smithi067.stdout:    pools:   1 pools, 32 pgs
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:    objects: 484 objects, 1020 MiB
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:    usage:   1.5 GiB used, 177 GiB / 179 GiB avail
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:    pgs:     484/1452 objects degraded (33.333%)
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:             32 active+undersized+degraded
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:  io:
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:    client:   32 KiB/s rd, 2 op/s rd, 0 op/s wr
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:  progress:
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:    Global Recovery Event (0s)
2024-08-20T11:51:41.619 INFO:tasks.workunit.client.2.smithi067.stdout:      [............................]

After the nvmeof service is removed, nvmeof disappears from the "ceph status" output too:

2024-08-20T11:52:02.450 INFO:tasks.workunit.client.2.smithi067.stderr:+ ceph -s
2024-08-20T11:52:02.987 INFO:tasks.workunit.client.2.smithi067.stdout:  cluster:
2024-08-20T11:52:02.987 INFO:tasks.workunit.client.2.smithi067.stdout:    id:     de9bdec8-5ee8-11ef-bccf-c7b262605968
2024-08-20T11:52:02.987 INFO:tasks.workunit.client.2.smithi067.stdout:    health: HEALTH_WARN
2024-08-20T11:52:02.987 INFO:tasks.workunit.client.2.smithi067.stdout:            Degraded data redundancy: 484/1452 objects degraded (33.333%), 32 pgs degraded, 32 pgs undersized
2024-08-20T11:52:02.987 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:  services:
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    mon: 3 daemons, quorum a,c,b (age 9m)
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    mgr: x(active, since 10m)
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    osd: 2 osds: 2 up (since 8m), 2 in (since 8m)
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:  data:
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    pools:   1 pools, 32 pgs
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    objects: 484 objects, 1020 MiB
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    usage:   1.4 GiB used, 177 GiB / 179 GiB avail
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:    pgs:     484/1452 objects degraded (33.333%)
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:             32 active+undersized+degraded
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:52:02.988 INFO:tasks.workunit.client.2.smithi067.stdout:  io:
2024-08-20T11:52:02.989 INFO:tasks.workunit.client.2.smithi067.stdout:    client:   87 KiB/s rd, 0 B/s wr, 97 op/s rd, 55 op/s wr
2024-08-20T11:52:02.989 INFO:tasks.workunit.client.2.smithi067.stdout:
2024-08-20T11:52:02.989 INFO:tasks.workunit.client.2.smithi067.stdout:  progress:
2024-08-20T11:52:02.989 INFO:tasks.workunit.client.2.smithi067.stdout:    Global Recovery Event (0s)
2024-08-20T11:52:02.989 INFO:tasks.workunit.client.2.smithi067.stdout:      [............................]

https://pulpito.ceph.com/vallariag-2024-08-20_11:27:49-nvmeof-main-distro-default-smithi/

  • Fix the "4 stray daemon(s) not managed by cephadm" warning
  • Ensure all 4 gateways show up in "ceph -s"
  • Run it again with a good build; this build gives "[WRN] Health check failed: Degraded data redundancy: 2/6 objects degraded (33.333%), 2 pgs degraded (PG_DEGRADED)"

@caroav requested review from baum and gbregman on August 14, 2024 04:38
@VallariAg force-pushed the nvmeof-ceph-status branch 2 times, most recently from 0875891 to 4eeb243 on August 19, 2024 16:00
@VallariAg force-pushed the nvmeof-ceph-status branch 2 times, most recently from 91164e7 to c575bca on September 5, 2024 09:09
@baum (Collaborator) left a comment:

lgtm 🖖

@VallariAg (Member, Author) commented Sep 5, 2024

Added the group to the metadata.
In our GitHub CI the gateway is set up with an empty group name "" (ref: start_up.sh), so I tested on a deployment with the group name "mygroup" (a small verification sketch follows the dump below):

[vallariag@smithi049 ~]$ ceph service dump
{
    "epoch": 2517,
    "modified": "2024-09-05T14:55:56.434331+0000",
    "services": {
        "nvmeof": {
            "daemons": {
                "summary": "",
                "mypool.mygroup.smithi049.gaaaet": {
                    "start_epoch": 2517,
                    "start_stamp": "2024-09-05T14:55:55.953098+0000",
                    "gid": 14613,
                    "addr": "172.21.15.49:0/2235415121",
                    "metadata": {
                        "arch": "x86_64",
                        "ceph_release": "squid",
                        "ceph_version": "ceph version 19.3.0-4585-gb59673c4 (b59673c44bd569f9f3db37f87bced695dec5fcbf) squid (dev)",
                        "ceph_version_short": "19.3.0-4585-gb59673c4",
                        "container_hostname": "smithi049",
                        "container_image": "quay.io/vallari/nvmeof:1.3",
                        "cpu": "Intel(R) Xeon(R) CPU E5-1620 v3 @ 3.50GHz",
                        "daemon_type": "gateway",
                        "distro": "rhel",
                        "distro_description": "Red Hat Enterprise Linux 9.4 (Plow)",
                        "distro_version": "9.4",
                        "group": "mygroup",
                        "hostname": "smithi049",
                        "id": "mypool.mygroup.smithi049.gaaaet",
                        "kernel_description": "#1 SMP PREEMPT_DYNAMIC Tue Apr 9 12:57:02 UTC 2024",
                        "kernel_version": "5.14.0-437.el9.x86_64",
                        "mem_swap_kb": "0",
                        "mem_total_kb": "32488316",
                        "os": "Linux",
                        "pool_name": "mypool"
                    },
                    "task_status": {}
                }
            }
        }
    }
}
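
To double-check the new field, here is a hypothetical verification sketch (not the PR's test code) that pulls the service map with the "service dump" mon command and prints the "group" metadata of each nvmeof gateway entry:

import json

import rados

# Hypothetical sketch: fetch the service map via the "service dump" mon
# command and print each nvmeof gateway's "group" metadata field.
with rados.Rados(conffile="/etc/ceph/ceph.conf") as cluster:
    ret, out, errs = cluster.mon_command(
        json.dumps({"prefix": "service dump", "format": "json"}), b"")
    assert ret == 0, errs
    dump = json.loads(out)
    daemons = dump["services"]["nvmeof"]["daemons"]
    for name, info in daemons.items():
        if name == "summary":
            continue
        print(name, "group =", info["metadata"].get("group"))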

The two merged commits:

- This brings up nvmeof in "ceph status" output.
  Signed-off-by: Vallari Agrawal <[email protected]>

- Verify nvmeof service in ceph status output.
  Signed-off-by: Vallari Agrawal <[email protected]>
@VallariAg merged commit 63ce22c into ceph:devel on Sep 9, 2024
38 checks passed