Implements metrics using armon/go-metrics library #1616

Open
wants to merge 28 commits into
base: main
Changes from 26 commits
Commits
ada5e70
implements metric system using armon gometrics lib
Eclion Aug 4, 2022
6395482
Corrects the telemetry config comments
Eclion Aug 9, 2022
2dce549
prevents crashes due to empty Telemetry config
Eclion Aug 9, 2022
1c1fc87
unmaintained badge
eikenb Jan 9, 2023
a6b7020
we're trying to figure out a maintanence story... stay tuned
eikenb Jan 10, 2023
c2060d9
fixes README typo
Eclion Jun 22, 2023
20aae40
removes unused methods from config.convert
Eclion Jun 22, 2023
b6e1ca4
provides context for method TelemetryConfig.Merge
Eclion Jun 22, 2023
fbdfc8f
reformats metrics label computing in runner
Eclion Jun 22, 2023
b4445ce
simplifies method CounterMetric.Add
Eclion Jun 22, 2023
30d31a3
initializes all counters
Eclion Jun 22, 2023
36b2819
removes debug logging
Eclion Jun 22, 2023
5cdbb6d
uses multierror for metrics sink configuration
Eclion Jun 22, 2023
4742f4b
removes duplicate metrics sink addition
Eclion Jun 22, 2023
4caed4f
removes commented metrics call
Eclion Jun 23, 2023
95f391d
adds armon/go-metrics to allowed depencies for golantci-linter
Eclion Aug 26, 2024
5e7bb54
adds promclient dependency to golangci-lint
Eclion Aug 26, 2024
61e9218
fix(config/telemetry): prevents displaying the circonus API token
Eclion Oct 11, 2024
552ffaa
fix(config/telemetry): uses the correct type for 'CirconusCheckForceM…
Eclion Oct 11, 2024
80287f9
feat(config/telemetry): adds config merge tests
Eclion Oct 11, 2024
7762510
feat(config/telemetry): adds gostring test for telemetry config
Eclion Oct 11, 2024
37fc559
fix(dependency/type): makes use of switch-case for Type's String method
Eclion Oct 11, 2024
81d87ce
fix(manager/runner): removes unneeded recordDependencyCounts method
Eclion Oct 11, 2024
c7b4679
fix(telemetry_test/formatting): applies @jm96441n 's recommendations …
Eclion Oct 11, 2024
1b4810c
fix(config/telemetry): corrects some circonus config json field
Eclion Oct 11, 2024
6dd0f56
feat(telemetry/doc): adds a telemetry config example in the configura…
Eclion Oct 11, 2024
4ae8d02
fix(telemetry/init): extracts the Metrics server init in order to reu…
Eclion Oct 14, 2024
c8f4f0f
fix(telemetry/test): fixes typo in go metrics prefixes
Eclion Oct 14, 2024
4 changes: 3 additions & 1 deletion .golangci.yml
@@ -63,6 +63,7 @@ linters-settings:
# List of allowed packages.
allow:
- $gostd
- github.com/armon/go-metrics
- github.com/BurntSushi/toml
- github.com/Masterminds/sprig/v3
- github.com/davecgh/go-spew/spew
@@ -84,10 +85,11 @@ linters-settings:
- github.com/mitchellh/hashstructure
- github.com/mitchellh/mapstructure
- github.com/pkg/errors
- github.com/prometheus/client_golang/prometheus
- github.com/stretchr/testify/assert
- github.com/stretchr/testify/require
- github.com/coreos/go-systemd

run:
timeout: 10m
concurrency: 4
concurrency: 4
66 changes: 66 additions & 0 deletions README.md
@@ -55,6 +55,7 @@ this functionality might prove useful.
- [Multiple Commands](#multiple-commands)
- [Multi-phase Execution](#multi-phase-execution)
- [Debugging](#debugging)
- [Telemetry](#telemetry)
- [FAQ](#faq)
- [Contributing](#contributing)

@@ -407,6 +408,71 @@ $ consul-template -log-level debug ...
# ...
```

## Telemetry

Consul Template uses the [armon/go-metrics](https://github.com/armon/go-metrics) library to implement its metrics system. It currently supports exporting metrics to the Circonus API, StatsD, Statsite, and DogStatsD servers, and a Prometheus endpoint.

### Key Metrics

These metrics offer insight into Consul Template and capture subprocess activities. The number of dependencies is aggregated from the configured templates, and metrics are collected for a dependency whenever it is updated from its source. This helps correlate upstream changes with downstream actions originating from Consul Template.

Metrics are also collected around template rendering and the execution of template commands. These
metrics indicate the rendering status of a template and how long the commands for a template take,
which provides insight into template performance.

| Metric Name | Labels | Description |
|-|:-:|-|
| `consul-template.dependencies_received` | type=(consul\|vault\|local), id=dependencyString | A counter of dependencies received from monitoring value changes |
| `consul-template.templates_rendered` | id=templateID, status=(rendered\|would\|quiescence) | A counter of templates rendered |
| `consul-template.runner_actions` | action=(start\|stop\|run) | A count of runner actions |
| `consul-template.commands_exec` | status=(success\|error) | The number of commands executed after rendering templates |

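The data model behind the table above can be sketched as a set of counters keyed by metric name plus labels. This is only an illustration using the standard library: `CounterSet`, `Label`, and `seriesKey` are hypothetical names, not the `armon/go-metrics` API or the code in this PR. Adding `0` registers a series up front, which is one way to get the zero-valued `status:error` line shown in the samples further down.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Label is a key/value pair attached to a metric, e.g. status=success.
type Label struct {
	Name, Value string
}

// CounterSet is a hypothetical in-memory store of labeled counters.
// It only illustrates the data model; it is not the go-metrics API.
type CounterSet struct {
	mu     sync.Mutex
	counts map[string]float64
}

// NewCounterSet returns an empty, ready-to-use CounterSet.
func NewCounterSet() *CounterSet {
	return &CounterSet{counts: make(map[string]float64)}
}

// seriesKey builds a stable identity from the metric name and its labels.
// Labels are sorted so that label order does not create distinct series.
func seriesKey(name string, labels []Label) string {
	if len(labels) == 0 {
		return name
	}
	parts := make([]string, 0, len(labels))
	for _, l := range labels {
		parts = append(parts, l.Name+":"+l.Value)
	}
	sort.Strings(parts)
	return name + "|#" + strings.Join(parts, ",")
}

// Add increments the series identified by name and labels.
// Adding 0 registers the series before its first real event.
func (c *CounterSet) Add(v float64, name string, labels ...Label) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[seriesKey(name, labels)] += v
}

// Get returns the current value of a series (0 if it was never touched).
func (c *CounterSet) Get(name string, labels ...Label) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[seriesKey(name, labels)]
}

func main() {
	cs := NewCounterSet()
	// Register both command outcomes so "error" is reported even at zero.
	cs.Add(0, "consul-template.commands_exec", Label{"status", "error"})
	cs.Add(1, "consul-template.commands_exec", Label{"status", "success"})
	cs.Add(1, "consul-template.runner_actions", Label{"action", "run"})
	cs.Add(1, "consul-template.runner_actions", Label{"action", "run"})

	fmt.Println(cs.Get("consul-template.runner_actions", Label{"action", "run"}))
	fmt.Println(cs.Get("consul-template.commands_exec", Label{"status", "error"}))
}
```

Sorting the labels before building the key means `id=…,type=…` and `type=…,id=…` land in the same series, which matches how the sample outputs list labels in a fixed order.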
#### Metrics yet to be implemented

The current metrics were implemented using the [previous metric-related PR](https://github.com/hashicorp/consul-template/pull/1378/files#diff-d980d9aed26114a3414812b58d45770a201c1f29b7f67ddc0ef0891a8f1b7736) as a reference. Because the `armon/go-metrics` library does not yet implement all metric types, histogram metrics could not be implemented.

These metrics are described below:

| Metric Name | Labels | Description |
|-|:-:|-|
| `consul-template.dependencies` | type=(consul\|vault\|local) | The number of dependencies grouped by types |
| `consul-template.templates` | | The number of templates configured |
| `consul-template.commands_exec_time` | id=tmplDestination | The execution time (seconds) of a template command |


### Metric Samples

#### DogStatsD

```
2020-05-05 11:57:46.143979 consul-template.runner_actions:1|c|#action:start
consul-template.runner_actions:2|c|#action:run
consul-template.dependencies_received:1|c|#id:kv.block(hello),type:consul
consul-template.templates_rendered:1|c|#id:aadcafd7f28f1d9fc5e76ab2e029f844,status:rendered
consul-template.commands_exec:1|c|#status:success
consul-template.commands_exec:0|c|#status:error
```
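Each sample line above follows the DogStatsD datagram layout `name:value|type|#tag:value,...`. As a rough aid for checking emitted metrics in a test or a debugging session, the sketch below parses one such line with the standard library. `Sample` and `ParseLine` are hypothetical names introduced here, and the parser expects a bare datagram, so the timestamp that prefixes the first sample line would have to be stripped first.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Sample is a hypothetical decoded form of one DogStatsD datagram.
type Sample struct {
	Name  string
	Value float64
	Type  string // "c" for counters, as in the samples above
	Tags  map[string]string
}

// ParseLine decodes "name:value|type|#k:v,k:v".
// Sections that do not start with '#' (such as a sample rate) are ignored.
func ParseLine(line string) (Sample, error) {
	fields := strings.Split(strings.TrimSpace(line), "|")
	if len(fields) < 2 {
		return Sample{}, fmt.Errorf("malformed line %q", line)
	}
	name, rawValue, ok := strings.Cut(fields[0], ":")
	if !ok || name == "" {
		return Sample{}, fmt.Errorf("missing name or value in %q", line)
	}
	value, err := strconv.ParseFloat(rawValue, 64)
	if err != nil {
		return Sample{}, fmt.Errorf("bad value in %q: %w", line, err)
	}
	s := Sample{Name: name, Value: value, Type: fields[1], Tags: map[string]string{}}
	for _, f := range fields[2:] {
		if !strings.HasPrefix(f, "#") {
			continue
		}
		for _, tag := range strings.Split(f[1:], ",") {
			if tag == "" {
				continue
			}
			// Only the first ':' separates key and value, so tag values
			// such as "kv.block(hello)" are preserved as-is.
			k, v, _ := strings.Cut(tag, ":")
			s.Tags[k] = v
		}
	}
	return s, nil
}

func main() {
	s, err := ParseLine("consul-template.dependencies_received:1|c|#id:kv.block(hello),type:consul")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.Name, s.Value, s.Type, s.Tags["id"], s.Tags["type"])
}
```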

#### Prometheus

```
$ curl localhost:8888/metrics
# HELP consul_template_commands_exec The number of commands executed with labels status=(success|error)
# TYPE consul_template_commands_exec counter
consul_template_commands_exec{status="error"} 0
consul_template_commands_exec{status="success"} 1
# HELP consul_template_dependencies_received A counter of dependencies received with labels type=(consul|vault|local) and id=dependencyString
# TYPE consul_template_dependencies_received counter
consul_template_dependencies_received{id="kv.block(hello)",type="consul"} 1
# HELP consul_template_runner_actions A count of runner actions with labels action=(start|stop|run)
# TYPE consul_template_runner_actions counter
consul_template_runner_actions{action="run"} 2
consul_template_runner_actions{action="start"} 1
# HELP consul_template_templates_rendered A counter of templates rendered with labels id=templateID and status=(rendered|would|quiescence)
# TYPE consul_template_templates_rendered counter
consul_template_templates_rendered{id="aadcafd7f28f1d9fc5e76ab2e029f844",status="rendered"} 1
```
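Comparing the two samples, the Prometheus output replaces the `-` and `.` in names such as `consul-template.commands_exec` with underscores and lists labels in sorted order. The following sketch reproduces that mapping with the standard library. `sanitizeName` and `formatSeries` are hypothetical helpers written for this illustration; they are not the sanitizer used by the Prometheus sink, and the label quoting via `%q` only approximates the exposition-format escaping rules.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sanitizeName replaces every character outside [a-zA-Z0-9_:] with '_'.
// This is an assumed rule that happens to match the samples above.
func sanitizeName(name string) string {
	var b strings.Builder
	for _, r := range name {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z',
			r >= '0' && r <= '9', r == '_', r == ':':
			b.WriteRune(r)
		default:
			b.WriteByte('_')
		}
	}
	return b.String()
}

// formatSeries renders one series as `name{k="v",...} value`,
// with label keys sorted for a deterministic result.
func formatSeries(name string, labels map[string]string, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, fmt.Sprintf("%s=%q", k, labels[k]))
	}

	out := sanitizeName(name)
	if len(pairs) > 0 {
		out += "{" + strings.Join(pairs, ",") + "}"
	}
	return fmt.Sprintf("%s %g", out, value)
}

func main() {
	fmt.Println(formatSeries(
		"consul-template.dependencies_received",
		map[string]string{"type": "consul", "id": "kv.block(hello)"},
		1,
	))
}
```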

## FAQ

**Q: How is this different than confd?**<br>
8 changes: 8 additions & 0 deletions cli.go
@@ -20,6 +20,7 @@ import (
"github.com/hashicorp/consul-template/manager"
"github.com/hashicorp/consul-template/service_os"
"github.com/hashicorp/consul-template/signals"
"github.com/hashicorp/consul-template/telemetry"
"github.com/hashicorp/consul-template/version"
)

Expand Down Expand Up @@ -132,6 +133,13 @@ func (cli *CLI) Run(args []string) int {
}
}()

// Initialize telemetry
tel, err := telemetry.Init(config.Telemetry)
if err != nil {
return logError(err, ExitCodeConfigError)
}
defer tel.Stop()

// Initial runner
runner, err := manager.NewRunner(config, dry)
if err != nil {
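The hunk above initializes telemetry before the runner and defers `tel.Stop()`. A minimal sketch of that lifecycle follows, assuming a handle whose `Stop` is safe to call on a nil or already-stopped value. The `Config` type, its `StatsdAddress` field, and the internals of `Init` and `Telemetry` are hypothetical stand-ins; the real `telemetry` package in this PR is not shown here and may differ.

```go
package main

import (
	"fmt"
	"net"
)

// Config is a hypothetical stand-in for the telemetry configuration.
type Config struct {
	StatsdAddress string // hypothetical field, "host:port"
}

// Telemetry owns the cleanup callbacks of whatever sinks were started.
type Telemetry struct {
	stopFns []func()
	stopped bool
}

// Init validates the configuration and returns a handle. With no sink
// configured it returns a no-op handle, so callers can always defer Stop.
func Init(cfg *Config) (*Telemetry, error) {
	t := &Telemetry{}
	if cfg == nil || cfg.StatsdAddress == "" {
		return t, nil
	}
	if _, _, err := net.SplitHostPort(cfg.StatsdAddress); err != nil {
		return nil, fmt.Errorf("invalid statsd address %q: %w", cfg.StatsdAddress, err)
	}
	// A real sink would open a connection here; this sketch only
	// records a cleanup step.
	t.stopFns = append(t.stopFns, func() { fmt.Println("statsd sink closed") })
	return t, nil
}

// Stop runs cleanup callbacks in reverse order. It tolerates a nil
// receiver and repeated calls.
func (t *Telemetry) Stop() {
	if t == nil || t.stopped {
		return
	}
	t.stopped = true
	for i := len(t.stopFns) - 1; i >= 0; i-- {
		t.stopFns[i]()
	}
}

// run mirrors the shape of the CLI code: init, check the error, defer Stop.
func run(cfg *Config) int {
	tel, err := Init(cfg)
	if err != nil {
		fmt.Println("config error:", err)
		return 1
	}
	defer tel.Stop()

	fmt.Println("runner would start here")
	return 0
}

func main() {
	fmt.Println("exit code:", run(&Config{StatsdAddress: "127.0.0.1:8125"}))
}
```

Returning a usable handle even when nothing is configured keeps the call site free of nil checks, which is one way to avoid crashes on an empty telemetry configuration.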
20 changes: 20 additions & 0 deletions config/config.go
@@ -82,6 +82,9 @@ type Config struct {
// Syslog is the configuration for syslog.
Syslog *SyslogConfig `mapstructure:"syslog"`

// Telemetry is the configuration for collecting and emitting telemetry.
Telemetry *TelemetryConfig `mapstructure:"telemetry"`

// Templates is the list of templates.
Templates *TemplateConfigs `mapstructure:"template"`

@@ -174,6 +177,10 @@ func (c *Config) Copy() *Config {
o.Syslog = c.Syslog.Copy()
}

if c.Telemetry != nil {
o.Telemetry = c.Telemetry.Copy()
}

if c.Templates != nil {
o.Templates = c.Templates.Copy()
}
@@ -265,6 +272,10 @@ func (c *Config) Merge(o *Config) *Config {
r.Syslog = r.Syslog.Merge(o.Syslog)
}

if o.Telemetry != nil {
r.Telemetry = r.Telemetry.Merge(o.Telemetry)
}

if o.Templates != nil {
r.Templates = r.Templates.Merge(o.Templates)
}
@@ -336,6 +347,7 @@ func Parse(s string) (*Config, error) {
"nomad.transport",
"ssl",
"syslog",
"telemetry",
"vault",
"vault.retry",
"vault.ssl",
@@ -494,6 +506,7 @@ func (c *Config) GoString() string {
"ReloadSignal:%s, "+
"FileLog:%#v, "+
"Syslog:%#v, "+
"Telemetry:%#v, "+
"Templates:%#v, "+
"TemplateErrFatal:%#v"+
"Vault:%#v, "+
@@ -513,6 +526,7 @@ func (c *Config) GoString() string {
SignalGoString(c.ReloadSignal),
c.FileLog,
c.Syslog,
c.Telemetry.GoString(),
c.Templates,
c.TemplateErrFatal,
c.Vault,
@@ -561,6 +575,7 @@ func DefaultConfig() *Config {
FileLog: DefaultLogFileConfig(),
Nomad: DefaultNomadConfig(),
Syslog: DefaultSyslogConfig(),
Telemetry: DefaultTelemetryConfig(),
Templates: DefaultTemplateConfigs(),
Vault: DefaultVaultConfig(),
Wait: DefaultWaitConfig(),
@@ -634,6 +649,11 @@ func (c *Config) Finalize() {
}
c.Syslog.Finalize()

if c.Telemetry == nil {
c.Telemetry = DefaultTelemetryConfig()
}
c.Telemetry.Finalize()

if c.Templates == nil {
c.Templates = DefaultTemplateConfigs()
}
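The `config.go` hunks wire `Telemetry` into the same `Copy`, `Merge`, and `Finalize` steps as the other config sections. The sketch below shows how such a section could behave, assuming pointer-typed fields where `nil` means "not set". The two fields, the default prefix, and the helper functions are hypothetical; the PR's actual `TelemetryConfig` is not part of this excerpt. Note that `r.Telemetry.Merge(o.Telemetry)` is called even when `r.Telemetry` is nil, so the method has to tolerate a nil receiver.

```go
package main

import "fmt"

// TelemetryConfig is a hypothetical, reduced version of the config section.
// A nil field means "not set by the user".
type TelemetryConfig struct {
	MetricsPrefix *string // hypothetical field
	StatsdAddress *string // hypothetical field
}

func strPtr(s string) *string { return &s }

// copyString duplicates the pointed-to value so copies never alias.
func copyString(p *string) *string {
	if p == nil {
		return nil
	}
	return strPtr(*p)
}

// Copy returns a deep copy; a nil receiver yields nil.
func (c *TelemetryConfig) Copy() *TelemetryConfig {
	if c == nil {
		return nil
	}
	return &TelemetryConfig{
		MetricsPrefix: copyString(c.MetricsPrefix),
		StatsdAddress: copyString(c.StatsdAddress),
	}
}

// Merge overlays the set fields of o onto a copy of c.
// Both the receiver and the argument may be nil.
func (c *TelemetryConfig) Merge(o *TelemetryConfig) *TelemetryConfig {
	if c == nil {
		return o.Copy()
	}
	r := c.Copy()
	if o == nil {
		return r
	}
	if o.MetricsPrefix != nil {
		r.MetricsPrefix = copyString(o.MetricsPrefix)
	}
	if o.StatsdAddress != nil {
		r.StatsdAddress = copyString(o.StatsdAddress)
	}
	return r
}

// Finalize fills every unset field so later code can dereference safely.
func (c *TelemetryConfig) Finalize() {
	if c.MetricsPrefix == nil {
		c.MetricsPrefix = strPtr("consul-template") // assumed default
	}
	if c.StatsdAddress == nil {
		c.StatsdAddress = strPtr("")
	}
}

func main() {
	var base *TelemetryConfig // section absent from the first config file
	override := &TelemetryConfig{StatsdAddress: strPtr("127.0.0.1:8125")}

	merged := base.Merge(override)
	merged.Finalize()
	fmt.Println(*merged.MetricsPrefix, *merged.StatsdAddress)
}
```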
2 changes: 1 addition & 1 deletion config/syslog.go
@@ -99,7 +99,7 @@ func (c *SyslogConfig) GoString() string {

return fmt.Sprintf("&SyslogConfig{"+
"Enabled:%s, "+
"Facility:%s"+
"Facility:%s, "+
"Name:%s"+
"}",
BoolGoString(c.Enabled),
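The one-line change above adds the missing `, ` after `Facility:%s`. Because the format string is built from adjacent string literals, a forgotten separator silently fuses two fields in the output. The sketch below shows the difference with a simplified struct; plain `bool` and `string` fields and the two function names are assumptions made for brevity and do not match the real `SyslogConfig`.

```go
package main

import "fmt"

// SyslogConfig is a simplified stand-in with plain value fields.
type SyslogConfig struct {
	Enabled  bool
	Facility string
	Name     string
}

// goStringBefore mirrors the format string before the fix:
// no separator between Facility and Name.
func goStringBefore(c SyslogConfig) string {
	return fmt.Sprintf("&SyslogConfig{"+
		"Enabled:%t, "+
		"Facility:%s"+
		"Name:%s"+
		"}",
		c.Enabled, c.Facility, c.Name)
}

// goStringAfter mirrors the format string after the fix.
func goStringAfter(c SyslogConfig) string {
	return fmt.Sprintf("&SyslogConfig{"+
		"Enabled:%t, "+
		"Facility:%s, "+
		"Name:%s"+
		"}",
		c.Enabled, c.Facility, c.Name)
}

func main() {
	c := SyslogConfig{Enabled: true, Facility: "LOCAL0", Name: "consul-template"}
	fmt.Println(goStringBefore(c)) // fields run together: "LOCAL0Name:..."
	fmt.Println(goStringAfter(c))  // fields separated by ", "
}
```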