Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements metrics using armon/go-metrics library #1616

Open
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

Eclion
Copy link

@Eclion Eclion commented Aug 9, 2022

As suggested in the issue #1395, this PR implements a metric system using the armon/go-metrics library along with few metrics.

To achieve such implementation, the metric system was based on the implementation done in Consul and the exposed metrics have been implemented based on the PR #1378.

However, this implementation only partially implements the mentioned metrics:

  • the armon/go-metrics library doesn't support yet histograms so these metrics are not implemented in this PR.
  • to avoid import cycling between the dependency, config and telemetry packages, metrics related to the vault token were not implemented

Resolves #1395

@Eclion Eclion requested a review from a team August 9, 2022 08:57
@hashicorp-cla
Copy link

hashicorp-cla commented Aug 9, 2022

CLA assistant check
All committers have signed the CLA.

@Eclion Eclion force-pushed the armon-gometrics-implementation branch from 7cce0dd to 9fc25e6 Compare August 19, 2022 12:57
@Eclion
Copy link
Author

Eclion commented Aug 23, 2022

Hello @eikenb !
Do you perhaps know someone who may be available to review this PR ?

@c0deaddict
Copy link

Any update on this? Would love to see this merged. We're kind of in the dark now when consul-template fails to render a template. Except when a service using the rendered template eventually breaks, but I would like to know that before hand.

@Oloremo
Copy link

Oloremo commented Apr 18, 2023

Any updates on this?

@cybervedaa
Copy link

+1 for this please. this will go a long way in reliably monitoring consul-template

@roncodingenthusiast roncodingenthusiast self-requested a review June 21, 2023 15:34
Copy link
Contributor

@roncodingenthusiast roncodingenthusiast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 @Eclion Thank you so much for the thorough work that went into making this PR happen.
Happy to work with you and a few others to get this merged/released within the next couple of weeks.

✍🏾 I did a first pass review and i am leaving a few comments for consideration. Let me know if any of them don't make sense to you.

❓ Do you mind updating this PR to the main branch, looks like it is quite behind now so it will be nice to get this updated before we work on merging it in

🔢 Happy to help/assist in anyway you want me to

README.md Outdated Show resolved Hide resolved
telemetry/metrics.go Outdated Show resolved Hide resolved
telemetry/metrics.go Show resolved Hide resolved
config/convert.go Outdated Show resolved Hide resolved
config/convert.go Outdated Show resolved Hide resolved
manager/runner.go Outdated Show resolved Hide resolved
telemetry/telemetry.go Outdated Show resolved Hide resolved
telemetry/telemetry.go Outdated Show resolved Hide resolved
telemetry/telemetry.go Outdated Show resolved Hide resolved
telemetry/telemetry.go Outdated Show resolved Hide resolved
@Eclion
Copy link
Author

Eclion commented Jun 22, 2023

tada @Eclion Thank you so much for the thorough work that went into making this PR happen. Happy to work with you and a few others to get this merged/released within the next couple of weeks.

Hello @roncodingenthusiast !
Thank you very much for your PR!

✍🏾 I did a first pass review and i am leaving a few comments for consideration. Let me know if any of them don't make sense to you.

question Do you mind updating this PR to the main branch, looks like it is quite behind now so it will be nice to get this updated before we work on merging it in

Sure! I will rebase my branch before working on your comments.

1234 Happy to help/assist in anyway you want me to

@Eclion Eclion force-pushed the armon-gometrics-implementation branch from 997b530 to 82c33c4 Compare June 22, 2023 08:29
@Eclion Eclion requested a review from a team as a code owner June 22, 2023 08:29
@Eclion Eclion force-pushed the armon-gometrics-implementation branch from 43166c3 to 80b5f0a Compare June 22, 2023 10:10
@c0deaddict
Copy link

I would very much like for this PR to be merged - and released. Any update on this?

@cybervedaa
Copy link

I'm eagerly waiting on this to be merged as well. Can't wait for this to be merged

@Eclion Eclion force-pushed the armon-gometrics-implementation branch from 750c214 to 011d417 Compare October 20, 2023 12:04
@Eclion Eclion requested review from a team and kisunji and removed request for a team October 20, 2023 12:04
@kisunji kisunji removed their request for review January 5, 2024 21:00
@Eclion Eclion force-pushed the armon-gometrics-implementation branch from 011d417 to ec189f9 Compare April 15, 2024 08:53
@Eclion
Copy link
Author

Eclion commented Apr 15, 2024

fork rebased on current main branch

@hkrutzer
Copy link

Hoping this year will be the year of consul-template metrics. @nathancoleman @jmurret @jm96441n @DanStough I see you have recently done work on this repo, perhaps one of you can give a status update?

Eclion added 3 commits August 23, 2024 17:28
implements the metric system following the implementation
done in Consul and trying to implement the same metrics as
done in the previous metric system (c.f. PR1378:
https://github.com/hashicorp/consul-template/pull/1378/files )
@Eclion Eclion force-pushed the armon-gometrics-implementation branch from ec189f9 to 4caed4f Compare August 23, 2024 15:31
Copy link
Member

@jm96441n jm96441n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall looks good, just a few additional changes and I'm happy to merge

// Merge combines all values in this configuration with the values in the other
// configuration, with values in the other configuration taking precedence.
// Maps and slices are merged, most other values are overwritten.
func (c *TelemetryConfig) Merge(o *TelemetryConfig) *TelemetryConfig {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add tests for this merge function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!
I added some tests as requested

return fmt.Sprintf("&TelemetryConfig{"+
"Disable:%v, "+
"CirconusAPIApp:%s, "+
"CirconusAPIToken:%s, "+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we worried about leaking the API token in logs here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this issue. The output value has been replaced by <empty> and <configured>.

@@ -49,6 +49,14 @@ const (
DefaultContextTimeout = 60 * time.Second
)

func (t Type) String() string {
if t > 2 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: could we have a comment here indicating why 2 or use a named constant?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was used in the upstream PR to default to unknown probably because Nomad was not part of the Type list at that time.
I replaced this section by a switch case, more explicit

@@ -1478,3 +1511,30 @@ func newWatcher(c *config.Config, clients *dep.ClientSet) *watch.Watcher {
RetryFuncNomad: watch.RetryFunc(c.Nomad.Retry.RetryFunc()),
})
}

func recordDependencyCounts(deps map[string]dep.Dependency) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function doesn't return anything or modify the input in any way, what is it meant to do?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section of code from the PR #1378 was used to compute metrics for the histograms.
However, as histograms are yet to be implemented in go-metrics, this code is therefor unneeded.
I will remove it.

Comment on lines 100 to 110
for _, actualMetric := range actualMetrics {
if strings.HasPrefix(actualMetric, "# HELP go_") ||
strings.HasPrefix(actualMetric, "# TYPE go_") ||
strings.HasPrefix(actualMetric, "go_") ||
strings.HasPrefix(actualMetric, "# HELP process_") ||
strings.HasPrefix(actualMetric, "# TYPE process_") ||
strings.HasPrefix(actualMetric, "process_") ||
actualMetric == "" {
continue
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor but this might be a bit more readable

Suggested change
for _, actualMetric := range actualMetrics {
if strings.HasPrefix(actualMetric, "# HELP go_") ||
strings.HasPrefix(actualMetric, "# TYPE go_") ||
strings.HasPrefix(actualMetric, "go_") ||
strings.HasPrefix(actualMetric, "# HELP process_") ||
strings.HasPrefix(actualMetric, "# TYPE process_") ||
strings.HasPrefix(actualMetric, "process_") ||
actualMetric == "" {
continue
}
prefixes := []string{"# HELP go_", "#TYPE go", "go_", "# HELP process_", "# TYPE process_", "process_"}
for _, actualMetric := range actualMetrics {
if slices.ContainsFunc(prefixes, func(p string) bool { return strings.HasPrefix(actualMetric, p) }) ||
actualMetric == "" {
continue
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I agree with it being more readable.

@Eclion
Copy link
Author

Eclion commented Oct 14, 2024

Hello @jm96441n,
Thank you very much for your review!!
I updated the code to address your comments. Could you please review them once you have a bit of time?

@Eclion Eclion requested a review from jm96441n October 14, 2024 12:18
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nomad embeds consul-template as a library rather than using it as a standalone binary. The HTTP server and telemetry sink should be configurable by passing in an existing server and sink.

@Eclion
Copy link
Author

Eclion commented Oct 14, 2024

Nomad embeds consul-template as a library rather than using it as a standalone binary. The HTTP server and telemetry sink should be configurable by passing in an existing server and sink.

Hello @tgross ,

If I am not wrong, there is no need to call telemetry.Init() if metrics.NewGlobal is called in the code somewhere.

So the only needed call, as far as I understand, would be telemetry.CounterTemplatesRendered.Add(1) when a template is rendered.

However, such call is not implemented in the renderer.Render method, used by Nomad, but in the manager.RunnerRun. method, where we can also keep track of which template was rendered.

@tgross
Copy link
Member

tgross commented Oct 14, 2024

However, such call is not implemented in the renderer.Render method, used by Nomad, but in the manager.RunnerRun. method, where we can also keep track of which template was rendered.

The Render call there is only so that Nomad sandbox the rendered. Nomad calls manager.Runner.Start() which in turn calls manager.Runner.Run(). But to your point, you're right only the cli here is calling Init and instantiating the sink. No objections from our end.

@Eclion Eclion requested a review from tgross November 21, 2024 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace OpenTelemetry with armon/go-metrics
10 participants