client: fix pd client metrics registration #8994

Open
wants to merge 3 commits into
base: master

Conversation

Contributor

@Tema Tema commented Jan 14, 2025

What problem does this PR solve?

Issue Number: ref #8678

What is changed and how does it work?

@JmPotato, if code references metrics from the global context, as in tikv/client-go#1543 (comment), it can end up consuming metrics created in the metrics.init() method. However, when the global metrics are later recreated, reinitialized, and registered by metrics.InitAndRegisterMetrics(), the metrics consumed earlier are abandoned and never reported to Prometheus. This PR lets metrics consumers register themselves, so that if they consume metrics before metrics.InitAndRegisterMetrics() is invoked, they are reinitialized and registered afterwards.

cc: @rleungx

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Code changes

Side effects

  • Possible performance regression
  • Increased code complexity
  • Breaking backward compatibility

Related changes

Release note

None.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates the PR's author has signed the dco. labels Jan 14, 2025
Contributor

ti-chi-bot bot commented Jan 14, 2025

Hi @Tema. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 14, 2025
@ti-chi-bot ti-chi-bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Jan 14, 2025
@ti-chi-bot
Member

Now you can start all CI jobs with /test all in comment or query the triggers with /test ?

codecov bot commented Jan 14, 2025

Codecov Report

Attention: Patch coverage is 84.21053% with 3 lines in your changes missing coverage. Please review.

Project coverage is 76.28%. Comparing base (ad172c7) to head (f9a6540).
Report is 13 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8994      +/-   ##
==========================================
- Coverage   76.33%   76.28%   -0.05%     
==========================================
  Files         465      466       +1     
  Lines       70565    70750     +185     
==========================================
+ Hits        53867    53975     +108     
- Misses      13355    13421      +66     
- Partials     3343     3354      +11     
Flag Coverage Δ
unittests 76.28% <84.21%> (-0.05%) ⬇️


Comment on lines 33 to 34
var mutex sync.Mutex
var consumersInitializers []func()
Member

What about using an atomic.Value to store []func()?

Contributor Author

Just storing a reference to []func() won't be enough, because appending to a slice is not an atomic operation in Go (it might trigger races in the resize flow). Or am I missing something?

Member

Your consideration is thorough. My initial intention was to merge the lock on line 33 with line 34 into one structure, making it look more elegant.

Contributor Author

@Tema Tema Jan 17, 2025

My initial intention was to merge the lock on line 33 with line 34 into one structure, making it look more elegant.

Do you still insist on the change? If so, could you please show me how you propose to make it work with atomic.Value in a way that is atomic and race-free?

Member

I still prefer using a single structure; you could use a custom structure rather than atomic.Value.

Contributor Author

@JmPotato the lock also covers the initConsumer call to make sure there are no races. It is hard to reason about whether we can give it up or not. This is not a hot code path, so it should be fine to cover it with a lock to simplify reasoning.

Contributor Author

@rleungx @niubell could you please take a look as well? I would appreciate it if you could elaborate a bit more on what I'm missing here and how to achieve that.

Member

I think what @JmPotato means is:

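import "sync"

// consumersInitializers bundles the lock with the slice it protects.
type consumersInitializers struct {
	sync.Mutex
	initializers []func()
}

var ci consumersInitializers

// RegisterConsumer records initConsumer for later re-initialization and
// runs it once right away, all under the lock.
func RegisterConsumer(initConsumer func()) {
	ci.Lock()
	defer ci.Unlock()
	ci.initializers = append(ci.initializers, initConsumer)
	initConsumer()
}

// initRegisteredConsumers re-runs every registered initializer under the lock.
func initRegisteredConsumers() {
	ci.Lock()
	defer ci.Unlock()
	for _, initConsumer := range ci.initializers {
		initConsumer()
	}
}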

Contributor Author

Got it, thx! @JmPotato PTAL one more time.

artem_danilov added 2 commits January 22, 2025 17:18
Signed-off-by: artem_danilov <[email protected]>
Signed-off-by: artem_danilov <[email protected]>
Member

@JmPotato JmPotato left a comment

Overall LGTM. My final question: does a CircuitBreaker have a lifecycle? If we have a scenario where the old one is deprecated and a new CircuitBreaker is created, do we need to remove the corresponding consumer?

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jan 23, 2025
Contributor

ti-chi-bot bot commented Jan 23, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JmPotato

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Contributor

ti-chi-bot bot commented Jan 23, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-01-23 07:34:23.4816742 +0000 UTC m=+338990.812593603: ☑️ agreed by JmPotato.

@ti-chi-bot ti-chi-bot bot added the approved label Jan 23, 2025
Contributor Author

Tema commented Jan 24, 2025

@rleungx could you please approve as well?

Labels
approved dco-signoff: yes Indicates the PR's author has signed the dco. needs-1-more-lgtm Indicates a PR needs 1 more LGTM. ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
4 participants