Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(DNM) D4N dirty prefix removal #20

Conversation

samarahu
Copy link

@samarahu samarahu commented Oct 29, 2024

Includes commits to do the following:

  1. Remove the dirty prefix from the D4N logic
  2. Remove the del method's implementation from the RedisDriver and replace del with a delete call in the RedisDriver unit test
  3. Update the D4N policy test by correcting the cache key to use the correct delimiter and add checks for the cache blocks

into bucket directory, object directory and then versions
representing file names.

This commit for extracts bucket id, object name, version,
offset, and length from input key and organizes data into
directories - with bucket id, object name being used to
create directories and filenames within them using version
and offset and length.

Data can be restored back to in memory LFUDA and dirty object
data structure once rgw is restarted.

Updating unit tests.

Signed-off-by: Pritha Srivastava <[email protected]>
@github-actions github-actions bot added the rgw label Oct 29, 2024
@samarahu samarahu force-pushed the d4n-dirty-prefix-removal branch from ac85da1 to 062c198 Compare November 4, 2024 17:04
src/rgw/rgw_ssd_driver.cc Outdated Show resolved Hide resolved
@samarahu samarahu force-pushed the d4n-dirty-prefix-removal branch from 062c198 to cc2beac Compare November 4, 2024 20:43
@github-actions github-actions bot added the tests label Nov 4, 2024
Samarah and others added 7 commits November 5, 2024 14:41
for unversioned buckets.

1. d4n/directory: Update logging
2. d4n/filter: Update `set_head_obj_dir_entry` to handle null versioning and update `prevVersion` for versioned head objects
3. rgw/d4n: Simplify code to only support unversioned bucket deletes

Signed-off-by: Samarah <[email protected]>
clean (non-dirty) objects for delete_obj method.

1. rgw/d4n: miscellaneous fixes that include modifying
set_head_obj_dir_entry() and delete_obj method to handle
clean objects only.
2. rgw/d4n: changes to design when write cache is turned off:
a. do not store a hash entry for latest version for versioned
buckets, the latest version will always be fetched from backend store,
to maintain correct order of versions during a delete object request.

Signed-off-by: Pritha Srivastava <[email protected]>
and BlockDirectory.

Adds methods to add, list a range and remove members from a
redis ordered set ordered by score passed in. Also adds unit
test cases to test the methods.

Signed-off-by: Pritha Srivastava <[email protected]>
WATCH, EXEC, INCR, MULTI and DISCARD redis commands.

1. d4n/directory: support for watch, exec and incr methods in Block
and Object Directory.
2. d4n/directory: support for redis MULTI and DISCARD command,
adding a test case to exercise usage of MULTI.

Signed-off-by: Pritha Srivastava <[email protected]>
operations atomic while modifying the latest and null hash entry
in case of non-versioned buckets.

Signed-off-by: Pritha Srivastava <[email protected]>
1. ordered set to maintain versions of a dirty object
2. creation of delete marker in case of a simple delete request
3. deletion of a specific version from the ordered set
4. cleaning method deletes from ordered set for dirty objects
5. use of redis atomicity constructs wherever needed

Signed-off-by: Pritha Srivastava <[email protected]>
to lazy deletion - delete dirty objects data
blocks in cleaning method and non-dirty objects
data blocks in eviction method

1. rgw/d4n: Implement lazy deletion
2. rgw/d4n: cleaning method now supports delete markers
also (both versioned and null). The delete markers
are written correctly to the backend store.
Also modifying the description of rgw_d4n_cache_cleaning_interval
to explicitly state that the duration is in seconds.
3. rgw/d4n: do not call invalidate_dirty_object in case
of a simple delete request for dirty objects belonging
to a versioned bucket.
In this case, a delete marker needs to be created instead of
invalidating/deleting an object.
4. rgw/d4n: Update lazy deletion in policy

Signed-off-by: Samarah <[email protected]>
Signed-off-by: Pritha Srivastava <[email protected]>
@samarahu samarahu force-pushed the d4n-dirty-prefix-removal branch from cc2beac to 3e82b54 Compare November 5, 2024 21:15
filter driver and ssd backed cache.

1. rgw/d4n: addressing concurrency issues by adding a refcount
to each block.
Blocks with positive refcounts are pinned (similar to dirty blocks)
and not eligible for eviction or deletion. Updating unit tests also.
2. rgw/cache: addressing concurrency issues while directories creation,
deletion, updating xattrs, get_attr and put.

Signed-off-by: Pritha Srivastava <[email protected]>
@samarahu samarahu force-pushed the d4n-dirty-prefix-removal branch from 3e82b54 to 67f2b59 Compare November 7, 2024 17:25
@samarahu samarahu changed the title rgw/d4n: Begin dirty prefix removal (DNM) D4N dirty prefix removal Nov 7, 2024
@@ -521,6 +514,13 @@ void D4NFilterObject::set_attrs_from_obj_state(const DoutPrefixProvider* dpp, op

bl_val.append(this->get_bucket()->get_name());
attrs[RGW_CACHE_ATTR_BUCKET_NAME] = std::move(bl_val);

if (dirty) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one generic comment for this PR: since you do not add an xattr for a clean block in handle_data(), lets do the same everywhere. We can avoid a call to cache by not setting an xattr in case of a clean block. So modify set_attrs_from_obj_state() to set the xattr for a head block only when it is dirty. The absence of the xattr signifies that the block is clean. Make bool dirty=false as the default argument to this method, so that you just need to pass in true only for dirty blocks. Also make changes while restoring data, only if the xattr is present and set to 1, then set that block to dirty, else set the block to clean.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what is the status of testing of this PR? Have you tested these changes? and the data restoration also? Let me know.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tested the PR with lazy deletion for both unversioned and versioned buckets (suspended and enabled), and I tested data restoration as well as restoration during a delete before the cleaning cycle. The data is handled correctly and the cache entries are deleted as expected during the cleaning cycles.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please test 'get' requests, after a 'put' for both versioned and unversioned buckets if you haven't done so.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also ensure that the 'dirty' xattr is set by using getfattr -d -- <filename>.

@samarahu samarahu force-pushed the d4n-dirty-prefix-removal branch 3 times, most recently from 4d0ac4c to c0d5e9d Compare November 11, 2024 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants