rgw: enhances rgw-restore-bucket-index script
This enhances the script to both process versioned buckets correctly
and to handle object names that begin with underscore.

If the bucket is versioned, the script submits each version for
reindexing in chronological order (based on mtime) to "replay" the
modifications to each object. However, mtime is not a perfect
indicator, so the script also consults the OLH object to determine the
most recent version and makes sure that version is replayed last. The
order of earlier versions is likely correct, but not guaranteed to be so.
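
The replay order described above can be sketched in shell. This is a
minimal illustration, not the script's actual code: the function name
order_versions, its input format ("<mtime> <version-id>" lines on
stdin), and the sample version ids are all hypothetical, and the
current version id is assumed to have already been read from the OLH
object.

```shell
#!/bin/sh
# Hypothetical helper (not taken from rgw-restore-bucket-index).
# stdin: one "<mtime-epoch-seconds> <version-id>" pair per line
# $1:    the version id that the OLH object identifies as current
# stdout: version ids in replay order, current version last
order_versions() {
    current="$1"
    # Sort numerically by mtime, then hold back the current version
    # so that it is emitted after every other version.
    sort -n -k1,1 |
        awk -v cur="$current" '
            $2 == cur { held = $2; next }
            { print $2 }
            END { if (held != "") print held }
        '
}

# v2 is the current version even though v3 has a later mtime.
printf '%s\n' '300 v3' '100 v1' '200 v2' | order_versions v2
# prints v1, v3, v2 (one per line)
```

Holding the current version back, rather than trusting mtime alone,
mirrors the point made above: only the position of the latest version
is guaranteed, while the order of the earlier ones is best effort.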

Additional logic is added to handle objects with names that begin with
underscore ('_'). Because underscore is used as a delimiter, such names
need to be escaped, and rados object locators are also used.

A man page for the script is added.

Signed-off-by: J. Eric Ivancich <[email protected]>
ivancich committed Aug 30, 2023
1 parent f731570 commit c02906a
Showing 6 changed files with 272 additions and 128 deletions.
1 change: 1 addition & 0 deletions ceph.spec.in
@@ -1704,6 +1704,7 @@ exit 0
%{_mandir}/man8/rbd-replay-many.8*
%{_mandir}/man8/rbd-replay-prep.8*
%{_mandir}/man8/rgw-orphan-list.8*
%{_mandir}/man8/rgw-restore-bucket-index.8*
%dir %{_datadir}/ceph/
%{_datadir}/ceph/known_hosts_drop.ceph.com
%{_datadir}/ceph/id_rsa_drop.ceph.com
1 change: 1 addition & 0 deletions debian/radosgw.install
@@ -7,3 +7,4 @@ usr/bin/radosgw-token
usr/share/man/man8/ceph-diff-sorted.8
usr/share/man/man8/radosgw.8
usr/share/man/man8/rgw-orphan-list.8
usr/share/man/man8/rgw-restore-bucket-index.8
3 changes: 2 additions & 1 deletion doc/man/8/CMakeLists.txt
@@ -60,7 +60,8 @@ if(WITH_RADOSGW)
radosgw-admin.rst
rgw-orphan-list.rst
rgw-policy-check.rst
-ceph-diff-sorted.rst)
+ceph-diff-sorted.rst
+rgw-restore-bucket-index.rst)
endif()

if(WITH_RBD)
91 changes: 91 additions & 0 deletions doc/man/8/rgw-restore-bucket-index.rst
@@ -0,0 +1,91 @@
:orphan:

==================================================================================
rgw-restore-bucket-index -- try to restore a bucket's objects to its bucket index
==================================================================================

.. program:: rgw-restore-bucket-index

Synopsis
========

| **rgw-restore-bucket-index**

Description
===========

:program:`rgw-restore-bucket-index` is an *EXPERIMENTAL* RADOS gateway
user administration utility. It scans the data pool for objects that
belong to a given bucket and tries to add those objects back to the
bucket index. It's intended as a **last resort** after a
**catastrophic** loss of a bucket index. Please thoroughly review the
*Warnings* listed below.

The utility works with regular (i.e., un-versioned) buckets, versioned
buckets, and buckets where versioning has been suspended.

Warnings
========

This utility is currently considered *EXPERIMENTAL*.

The results are unpredictable if the bucket is in
active use while this utility is running.

The results are unpredictable if only some of the bucket's objects are
missing from the bucket index. In such a case, consider using the
"object reindex" subcommand of :program:`radosgw-admin` to restore
objects to the bucket index one by one.

For objects in versioned buckets, if the latest version is a delete
marker, it will be restored. If a delete marker has been written over
with a new version, then that delete marker will not be restored. This
should have minimal impact on results in that the utility recovers the
latest version, and previous versions are all accessible.

Command-Line Arguments
======================

.. option:: -b <bucket>

   Specify the bucket to be reindexed.

.. option:: -p <pool>

   Optional, specify the data pool containing head objects for the
   bucket. If omitted the utility will try to determine the data pool
   on its own.

.. option:: -l <rados-ls-output-file>

   Optional, specify a file containing the output of a rados listing
   of the data pool. Since listing the data pool can be an expensive
   and time-consuming operation, if trying to recover the indices for
   multiple buckets, it could be more efficient to re-use the same
   listing.

.. option:: -y

   Optional, proceed without further prompting. Without this option
   the utility will display some information and prompt the user as to
   whether to proceed. When provided, the utility will simply
   proceed. Please use caution when using this option.

Examples
========

Attempt to restore the index for a bucket named *summer-2023-photos*::

        $ rgw-restore-bucket-index -b summer-2023-photos

Availability
============

:program:`rgw-restore-bucket-index` is part of Ceph, a massively
scalable, open-source, distributed storage system. Please refer to
the Ceph documentation at https://docs.ceph.com for more information.

See also
========

:doc:`radosgw-admin <radosgw-admin>`\(8)
1 change: 1 addition & 0 deletions doc/man_index.rst
@@ -48,3 +48,4 @@
man/8/ceph-immutable-object-cache
man/8/ceph-diff-sorted
man/8/rgw-policy-check
man/8/rgw-restore-bucket-index