Skip to content

Synchronize the update of feature refcount #17632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

youzhongyang
Copy link
Contributor

@youzhongyang youzhongyang commented Aug 13, 2025

Motivation and Context

As reported and analyzed in #17184, the concurrent execution of feature_sync() can lead to panic due to unprotected update of the feature refcount.

Description

Use spa->spa_feat_stats_lock to synchronize the update of the refcount.

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Quality assurance (non-breaking change which makes the code more robust against bugs)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Aug 13, 2025
@@ -432,7 +433,20 @@ feature_do_action(spa_t *spa, spa_feature_t fid, feature_action_t action,
break;
}

feature_sync(spa, feature, refcount, tx);
Copy link
Contributor

@snajpa snajpa Aug 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the code duplication? wouldn't it be better to ensure feature_sync is OK to be called with spa_feat_stats_lock held and then enforce the lock being held in all instances feature_sync is called? (via an ASSERT)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@snajpa - I thought about it and didn't like the code duplication too. A new version is force pushed, please take a look. Thanks.

Copy link
Contributor

@behlendorf behlendorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned with how much is now being done in feature_sync() under this mutex. But after reading through it I don't see anything clearly problematic, so I'm good with this straight forward fix to close the race.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Code Review Needed Ready for review and testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants