Include resolving steps in check_cephcluster_status partial (#153)
Co-authored-by: Stephan Feurer <[email protected]>
DebakelOrakel and Stephan Feurer authored Jan 16, 2024
1 parent 4d230ed commit 718e309
Showing 2 changed files with 27 additions and 27 deletions.
27 changes: 0 additions & 27 deletions docs/modules/ROOT/pages/runbooks/CephOSDFlapping.adoc
Expand Up @@ -32,33 +32,6 @@ In particular, check whether section `State` for container `osd` has a note that

include::partial$runbooks/check_cephcluster_status.adoc[]

==== Check Ceph crash logs

[source,console]
----
$ ceph_cluster_ns=syn-rook-ceph-cluster
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- ceph crash ls <1>
[ ... list of crash logs ... ]
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- \
ceph crash info <CRASH_ID> <2>
[ ... detailed crash info ... ]
----
<1> List all crash logs, new and archived (`ceph crash ls-new` lists only those that haven't been archived yet)
<2> Show detailed information for the crash log with ID `<CRASH_ID>`

==== Archive Ceph crash logs

[source,console]
----
$ ceph_cluster_ns=syn-rook-ceph-cluster
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- \
ceph crash archive-all <1>
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- \
ceph crash archive <CRASH_ID> <2>
----
<1> Archive all crash logs that haven't been archived yet
<2> Archive the crash log with ID `<CRASH_ID>`

== icon:book[] Upstream documentation

https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-osd#flapping-osds
27 changes: 27 additions & 0 deletions docs/modules/ROOT/partials/runbooks/check_cephcluster_status.adoc
Expand Up @@ -17,3 +17,30 @@ $ kubectl -n ${ceph_cluster_ns} exec -it deploy/rook-ceph-tools -- ceph status
<1> General cluster health status
<2> One or more lines explaining why the cluster state is degraded.
Only shown if the cluster health isn't `HEALTH_OK`.

==== Check Ceph crash logs

[source,console]
----
$ ceph_cluster_ns=syn-rook-ceph-cluster
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- ceph crash ls <1>
[ ... list of crash logs ... ]
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- \
ceph crash info <CRASH_ID> <2>
[ ... detailed crash info ... ]
----
<1> List all crash logs, new and archived (`ceph crash ls-new` lists only those that haven't been archived yet)
<2> Show detailed information for the crash log with ID `<CRASH_ID>`
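
When several crashes are pending, running `ceph crash info` by hand for each ID gets tedious. The following is a minimal sketch, not part of the runbook, that loops over the crashes reported by `ceph crash ls-new`. The helper names `ceph_tools`, `new_crash_ids`, and `show_new_crashes` are hypothetical, and the sketch assumes the plain-text listing starts with a header row whose first column is `ID`.

```shell
# Sketch only: helper names are hypothetical and not part of the runbook.
ceph_cluster_ns="${ceph_cluster_ns:-syn-rook-ceph-cluster}"

# Run a ceph subcommand in the rook-ceph-tools deployment.
# No -it here: the output is captured by the script rather than shown on a terminal.
ceph_tools() {
  kubectl -n "${ceph_cluster_ns}" exec deploy/rook-ceph-tools -- ceph "$@"
}

# Print the first column (crash ID) of `ceph crash ls-new`.
# Assumption: the table starts with a header row whose first field is "ID".
new_crash_ids() {
  ceph_tools crash ls-new | awk 'NF > 0 && $1 != "ID" { print $1 }'
}

# Print detailed information for every crash that hasn't been archived yet.
show_new_crashes() {
  for crash_id in $(new_crash_ids); do
    echo "=== ${crash_id} ==="
    ceph_tools crash info "${crash_id}"
  done
}
```

After sourcing these functions, calling `show_new_crashes` prints one `ceph crash info` report per pending crash, each preceded by a separator line with the crash ID.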

==== Archive Ceph crash logs

[source,console]
----
$ ceph_cluster_ns=syn-rook-ceph-cluster
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- \
ceph crash archive-all <1>
$ kubectl -n "${ceph_cluster_ns}" exec -it deploy/rook-ceph-tools -- \
ceph crash archive <CRASH_ID> <2>
----
<1> Archive all crash logs that haven't been archived yet
<2> Archive the crash log with ID `<CRASH_ID>`
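
If only some crashes have been reviewed, `archive-all` may be too broad. As a rough sketch under the same assumptions as the commands above, the following archives just the crash IDs passed as arguments and reports how many succeeded. The helper names `ceph_tools` and `archive_crashes` are hypothetical and not part of the runbook.

```shell
# Sketch only: helper names are hypothetical and not part of the runbook.
ceph_cluster_ns="${ceph_cluster_ns:-syn-rook-ceph-cluster}"

# Run a ceph subcommand in the rook-ceph-tools deployment.
ceph_tools() {
  kubectl -n "${ceph_cluster_ns}" exec deploy/rook-ceph-tools -- ceph "$@"
}

# Archive each crash ID given as an argument, one at a time.
# A failure for one ID is reported on stderr and doesn't stop the loop.
archive_crashes() {
  archived=0
  for crash_id in "$@"; do
    if ceph_tools crash archive "${crash_id}"; then
      archived=$((archived + 1))
    else
      echo "failed to archive ${crash_id}" >&2
    fi
  done
  echo "archived ${archived} of $# crash logs"
}
```

Called as `archive_crashes <CRASH_ID> <CRASH_ID>`, this leaves any crash that wasn't listed untouched, so it still shows up in `ceph crash ls-new`.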
