Promsafe: Strongly-typed safe labels #1598

Open
wants to merge 4 commits into
base: main

@amberpixels amberpixels commented Aug 28, 2024

Promsafe

Introducing the promsafe lib (an optional helper lib, similar to promauto) that lets you use type-safe labels.

Motivation

This PR only covers Counter functionality as an example. If the idea is fine with the community, I'll push further commits expanding the logic to Gauge, Histogram, etc.
For detailed motivation see my comment below

Fixes #1599

Why?

Currently, unsafe labels lead to one of two problems: either an err-handling nightmare, or panicking (if you use "promauto").

Having unsafe labels can lead to following issues:

  • Misspelling
  • Mistyping
  • Misremembering
  • Too many labels
  • Too few labels

With modern Go versions, we can use Go generics to solve these issues.

Examples of how to use it

1. Multi-label mode (safe structs)

type MyCounterLabels struct {
	promsafe.StructLabelProvider
	EventType string
	Success   bool
	Position  uint8 // yes, it's a number, but be careful with high-cardinality labels

	ShouldNotBeUsed string `promsafe:"-"`
}

// Note on performance:
// By default, if no custom logic is provided, MyCounterLabels will use reflect to extract label names and values.
// If performance matters, you can define your own exporting methods:
// Optional! (if not specified, it will fall back to reflect)
func (f MyCounterLabels) ToPrometheusLabels() prometheus.Labels {
	return prometheus.Labels{"event_type": f.EventType, "success": fmt.Sprintf("%v", f.Success), "position": fmt.Sprintf("%d", f.Position)}
}
// Optional! (if not specified, it will fall back to reflect)
func (f MyCounterLabels) ToLabelNames() []string { return []string{"event_type", "success", "position"} }

// Creating a typed counter providing specific labels type
c := promsafe.NewCounterVec[MyCounterLabels](prometheus.CounterOpts{
	Name: "items_counted",
})

// Manually register the counter
if err := prometheus.Register(c.Unsafe()); err != nil {
	log.Fatal("could not register: ", err.Error())
}

// With() ONLY allows you to pass a MyCounterLabels value here
c.With(MyCounterLabels{
	EventType: "request", Success: true, Position: 1,
}).Inc()
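
For readers wondering what the reflect fallback mentioned above might involve, here is a minimal stdlib-only sketch. It is not the code from this PR: `extractLabels` and `snakeCase` are hypothetical helpers, `map[string]string` stands in for `prometheus.Labels`, and the `promsafe.StructLabelProvider` embed is left out so the example has no external dependencies.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
	"unicode"
)

// snakeCase converts a Go field name such as "EventType" to "event_type".
// It is deliberately naive: an acronym like "ID" would become "i_d".
func snakeCase(name string) string {
	var b strings.Builder
	for i, r := range name {
		if unicode.IsUpper(r) {
			if i > 0 {
				b.WriteByte('_')
			}
			b.WriteRune(unicode.ToLower(r))
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// extractLabels is a hypothetical stand-in for a reflect-based fallback.
// It expects a struct value, skips embedded and unexported fields and
// fields tagged `promsafe:"-"`, and stringifies every remaining field.
func extractLabels(v any) map[string]string {
	out := map[string]string{}
	rv := reflect.ValueOf(v)
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if f.Anonymous || !f.IsExported() || f.Tag.Get("promsafe") == "-" {
			continue
		}
		out[snakeCase(f.Name)] = fmt.Sprint(rv.Field(i).Interface())
	}
	return out
}

// myCounterLabels mirrors MyCounterLabels from the example above.
type myCounterLabels struct {
	EventType string
	Success   bool
	Position  uint8

	ShouldNotBeUsed string `promsafe:"-"`
}

func main() {
	labels := extractLabels(myCounterLabels{EventType: "request", Success: true, Position: 1})
	fmt.Println(labels)
}
```

Walking the struct on every call is what makes this path slower than a hand-written ToPrometheusLabels(), which is why the optional methods exist.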

Compatibility with promauto

1. promauto.With call migration

var myReg = prometheus.NewRegistry()

counterOpts := prometheus.CounterOpts{
	Name: "items_counted",
}

// Old unsafe code
// promauto.With(myReg).NewCounterVec(counterOpts, []string{"event_type", "source"})
// becomes:

type MyLabels struct {
	promsafe.StructLabelProvider
	EventType string
	Source    string
}
c := promauto_adapter.With(myReg).NewCounterVec[MyLabels](counterOpts)

c.With(MyLabels{
	EventType: "reservation", Source: "source1",
}).Inc()

Note:

All non-string value types will be automatically converted to strings. We could add reasonable type validation here, so that it only works with fields that are strings/bools/ints.
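
As a rough illustration of the type validation suggested in this note, the check could be done once, when the metric is created. This is only a sketch under assumptions: `validateLabelKinds` is a hypothetical helper that is not part of this PR, and the allowed kinds simply follow the strings/bools/ints list above.

```go
package main

import (
	"fmt"
	"reflect"
)

// validateLabelKinds is a hypothetical helper: it returns an error if the
// given value is not a struct, or if any exported field that is not tagged
// `promsafe:"-"` has a kind other than string, bool, or an integer type.
func validateLabelKinds(v any) error {
	rt := reflect.TypeOf(v)
	if rt == nil || rt.Kind() != reflect.Struct {
		return fmt.Errorf("labels must be a struct, got %T", v)
	}
	for i := 0; i < rt.NumField(); i++ {
		f := rt.Field(i)
		if !f.IsExported() || f.Tag.Get("promsafe") == "-" {
			continue
		}
		switch f.Type.Kind() {
		case reflect.String, reflect.Bool,
			reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
			reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
			// Supported label field kind.
		default:
			return fmt.Errorf("field %s has unsupported kind %s", f.Name, f.Type.Kind())
		}
	}
	return nil
}

type goodLabels struct {
	EventType string
	Success   bool
	Position  uint8
}

type badLabels struct {
	Ratio float64 // floats are rejected in this sketch
}

func main() {
	fmt.Println(validateLabelKinds(goodLabels{}))
	fmt.Println(validateLabelKinds(badLabels{}))
}
```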

@amberpixels amberpixels changed the title Promsafe feature introduced Promsafe: Strongly-typed safe labels Aug 28, 2024
@amberpixels amberpixels force-pushed the feature/promsafe branch 6 times, most recently from 011822d to 80149ab Compare September 2, 2024 10:39
Member

bwplotka commented Sep 3, 2024

Hi! Thanks for innovating here 💪🏽

I presume this is about using generics for type safety of label values, in relation to the defined label names.

Currently, unsafe labels lead to one of two problems: either an err-handling nightmare, or panicking (if you use "promauto").

Can you share the exact requirements behind promsafe? Perhaps that would allow us to decide whether such a package is useful enough to maintain in client_golang, OR existing solutions are enough, OR there is a way to extend existing packages with improvements for the same goals.

For example, how often do you see those err-handling nightmares and panics in practice? Can you share some experience/data?

Generally, what's recommended is hardcoding label values in WithLabelValues, so that by testing the given code path you know immediately if it panics. If you use dynamic values (e.g. a variable) as your values, then it's generally prone to cardinality issues anyway, which is why we experimented with the constrained labels solution.

Thus, let's circle back to the bare-bones requirements we want here 🤗 e.g. generally you should avoid using With. What are the cases we are solving here?

Additionally, performance is important for this increment flow, so it would be nice to check how this applies.

Author

amberpixels commented Sep 4, 2024

Hey. Thanks for the feedback. Let me share the details of my motivation behind the promsafe package.

By err handling nightmare / panics I meant the following cases:

// Counter registration: we're fine with possible panic here :)
myCounter := promauto.NewCounterVec(prometheus.CounterOpts{
    Name: "items_counted",
}, []string{"event_type", "success", "slot" /* 1/2/3/.../9 */})

// But using the counter is where the motivation comes from:

// Using .GetMetricWith*() methods will error if labels are messed up
myCounterWithValues, err := myCounter.GetMetricWith(prometheus.Labels{
    "event_type": "reservation",
    "success":    "true",
    "slot":       "1",
})
if err != nil {
    // TODO: handle error
}
// Same error can happen if using *WithLabelValues version:
// myCounterWithValues, err := myCounter.GetMetricWithLabelValues("reservation", "true", "1")

// To avoid error-handling we can use .With/.WithLabelValues, but it will just panic for the same reasons:
myCounter.WithLabelValues("reservation", "true", "2").Inc()

💡 So from here on, I use "panic" to mean both the panicking of .With* methods and the error handling required by .GetMetricWith* methods.

Why panic? Why does it matter?

Here are several reasons:

  1. Misspelling. You can misspell label names. (Not relevant for WithLabelValues though)
  2. Misremembering. You can forget the name of the label. In case of using WithLabelValues you still need to remember the number of labels and their order (and what they mean)
  3. Missing labels. You can forget a label (both in map of With() or in slice of WithLabelValues())
  4. Extra labels. You can accidentally pass extra labels (both in map of With() or in slice of WithLabelValues())
  5. Manual string conversion can lead to failures as well. E.g. you must know to use fmt.Sprintf("%v", boolValue), and choosing the wrong placeholder instead of "%v" can ruin values.

All of these are possible ways to break code through a panic in .With() or .WithLabelValues(). Let's not spend time and effort in code reviews ensuring that a new usage of "counter inc" doesn't break everything.

Also, one more reason, which is not about failing but about consistency:
6. Type safety allows you to be both less error-prone and more consistent.
E.g. you just pass bool values as label values, and you know they will always be "true"/"false", not "1","0","on","off","yes",...
The same goes for numbers.
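
To make failure modes 1, 3 and 4 concrete, here is a small stdlib-only sketch of the kind of runtime check an untyped label map has to go through. `checkLabels` is a hypothetical stand-in written for this illustration; it is not client_golang's implementation, which may differ in details and error messages.

```go
package main

import "fmt"

// checkLabels is a hypothetical stand-in for the runtime validation an
// untyped label map needs: the number of labels must match the declared
// names, and every declared name must be present.
func checkLabels(declared []string, given map[string]string) error {
	if len(given) != len(declared) {
		return fmt.Errorf("expected %d labels, got %d", len(declared), len(given))
	}
	for _, name := range declared {
		if _, ok := given[name]; !ok {
			return fmt.Errorf("label %q is missing", name)
		}
	}
	return nil
}

func main() {
	declared := []string{"event_type", "success", "slot"}
	cases := []map[string]string{
		// Correct.
		{"event_type": "reservation", "success": "true", "slot": "1"},
		// Misspelled name.
		{"event_typ": "reservation", "success": "true", "slot": "1"},
		// Missing label.
		{"event_type": "reservation", "success": "true"},
		// Extra label.
		{"event_type": "reservation", "success": "true", "slot": "1", "source": "a"},
	}
	for _, c := range cases {
		fmt.Println(checkLabels(declared, c))
	}
}
```

Each of the three broken cases only surfaces when that code path actually runs, which is the point of moving the check to the compiler via a typed struct.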

How it's solved by promsafe?

// Promsafe example:
// Registering a metric with simply providing the type containing labels
type MyCounterLabels struct {
    promsafe.StructLabelProvider
    EventType string
    Success   bool
    Slot      uint8 // yes, it's a number, still should be careful with high-cardinality labels
}
myCounterSafe := promsafe.NewCounterVec[MyCounterLabels](prometheus.CounterOpts{
    Name: "items_counted_detailed",
})

// Calling counter is simple: just provide the filled struct of the dedicated type.
//
// None of the 5 reasons above can cause a panic here. You simply can't mess up the struct.
// With() accepts ONLY this type of struct, you can't send any other struct.
// You don't need to remember the fields and their order. IDE will show you them.
// You can't send more fields.
// You can send fewer fields (but the missing ones can easily be filled with default values, or by other custom non-panicky logic).
// You can't mess up types.
// You're consistent with types.
myCounterSafe.With(MyCounterLabels{
    EventType: "reservation", Success: true, Slot: 1,
}).Inc()

P.S. An issue with the inconsistency of the promsafe version of WithLabelValues()

// One thing that I need to point out here is the inconsistency of the promsafe version of WithLabelValues()

// WithLabelValues() accepts ordered raw strings, which unfortunately breaks the "safety" concept.
// We can't control the order of the given strings, or even how many there are.
// That's why in promsafe, .WithLabelValues() and .GetMetricWithLabelValues() are disabled:
// They are marked deprecated and panic (so using them is strongly discouraged)

@dmvinson

This API is really nice, would love to see this merged. I've already run into a few of the failure modes @amberpixels mentioned in my first few weeks of using this library.

@amberpixels
Author

Small update.

I've pushed some improvements to the API, so it's more consistent and stable.
Also, I've updated the PR description with cleaner and clearer examples, and added a note on the performance issue.

Member

@bwplotka bwplotka left a comment

Thanks for amazing work!

I think this is a great direction, but I'm not sure it's at the stage where we want to claim full stability and maintenance of it in the client_golang v1.

I would like to explore slimmer "adapter" that just offers With(labels T) K method -- it would simplify the code to maintain and allow composability.

Then there is efficiency aspect I would like to understand, given this is a hot path.

Also before committing to any of this we have to ask ourselves what to recommend or deprecate in this place. We are getting to the place where there are many ways of doing the same thing, so would love to decide what to remove, if we think this is the way to go.

To achieve and answer all of this, I wonder:
A) How bad would it be to host promsafe in another repository for incubation period?
B) Is there a room for prometheus/client_golang/exp module which we could version v0.x and put other experimental stuff like Remote API?

// limitations under the License.

// Package promauto_adapter provides compatibility adapter for migration of calls of promauto into promsafe
package promauto_adapter
Member

I would put it in promsafe honestly.

Author

@amberpixels amberpixels Nov 28, 2024

I agree, the promauto_adapter idea is very weak.
But the issue does exist - my intention was to support both APIs: the prometheus and promauto. So if you had either prometheus.NewCounterVec or promauto.NewCounterVec - you can easily switch to promsafe.NewCounterVec (using named import in case of promauto_adapter)

The other approach is - if we want to keep both, and keep them in one package, is to give them different names e.g. promsafe.NewCounterVec makes a Typed version of prometheus.NewCounterVec, but promsafe.NewAutoCounterVec makes a Typed version of promauto.NewCounterVec

Member

I would say in this case we could ONLY support a promauto-like package. If we could, we would deprecate the non-registering definitions TBH.

@amberpixels
Author

Then there is efficiency aspect I would like to understand, given this is a hot path.

There are benchmarks in safe_test.go, and here are the results (on a MacBook Air M3):

BenchmarkCompareCreatingMetric/Prometheus_NewCounterVec-8                1252017               959.5 ns/op
BenchmarkCompareCreatingMetric/Promsafe_(reflect)_NewCounterVec-8         637221              1891 ns/op
BenchmarkCompareCreatingMetric/Promsafe_(fast)_NewCounterVec-8           1000000              1030 ns/op

Using automatic label extraction (via reflection) makes metric creation roughly 2x slower than plain prometheus.NewCounterVec (1891 vs 959.5 ns/op). However, the fast method (manual implementation of ToPrometheusLabels) is only about 7% slower (1030 vs 959.5 ns/op).
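
For anyone who wants a feel for the reflect vs. manual gap without checking out the branch, here is a stdlib-only micro-benchmark sketch. Note the assumptions: it measures only the struct-to-map conversion, not metric creation as in the numbers above; `viaReflect` and `manual` are hypothetical stand-ins, not the functions from safe_test.go; and field names are simply lower-cased for brevity.

```go
package main

import (
	"fmt"
	"reflect"
	"strconv"
	"strings"
	"testing"
)

type labels struct {
	EventType string
	Success   bool
	Position  uint8
}

// viaReflect converts the struct to a label map by walking its fields with
// reflection. Field names are lower-cased to keep the sketch short.
func viaReflect(l labels) map[string]string {
	rv := reflect.ValueOf(l)
	rt := rv.Type()
	out := make(map[string]string, rt.NumField())
	for i := 0; i < rt.NumField(); i++ {
		out[strings.ToLower(rt.Field(i).Name)] = fmt.Sprint(rv.Field(i).Interface())
	}
	return out
}

// manual builds the same map by hand, using strconv instead of fmt.
func manual(l labels) map[string]string {
	return map[string]string{
		"eventtype": l.EventType,
		"success":   strconv.FormatBool(l.Success),
		"position":  strconv.FormatUint(uint64(l.Position), 10),
	}
}

func main() {
	l := labels{EventType: "request", Success: true, Position: 1}

	// testing.Benchmark runs a benchmark function outside of "go test".
	r := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = viaReflect(l)
		}
	})
	m := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			_ = manual(l)
		}
	})

	fmt.Println("reflect:", r)
	fmt.Println("manual: ", m)
}
```

Absolute numbers will vary by machine and Go version, so treat the output as a relative comparison only.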

Also before committing to any of this we have to ask ourselves what to recommend or deprecate in this place. We are getting to the place where there are many ways of doing the same thing, so would love to decide what to remove, if we think this is the way to go.

I completely agree with this.

To achieve and answer all of this, I wonder: A) How bad would it be to host promsafe in another repository for incubation period? B) Is there a room for prometheus/client_golang/exp module which we could version v0.x and put other experimental stuff like Remote API?

I like the idea of a prometheus/client_golang/exp module (or /x/, as is commonly done with Go packages). This approach makes it clear that the contents are experimental while still keeping them close to the main client library.

I would like to explore slimmer "adapter" that just offers With(labels T) K method -- it would simplify the code to maintain and allow composability.

Understood. For a minimal implementation, I see the following setup:
1. A wrapper (custom type) for each metric type (e.g., CounterVec, GaugeVec, HistogramVec, etc.).
2. Only the With[T] method, with no need for GetMetricWith, CurryWith, MustCurryWith, and especially no GetMetricWithLabelValues or WithLabelValues.

So are you OK with such a "slim" adapter? (It would need to be implemented per metric type.)

// NewCounterVec creates a new CounterVec with type-safe labels.
func NewCounterVec[T LabelsProviderMarker](opts prometheus.CounterOpts) *CounterVec[T] {
	emptyLabels := NewEmptyLabels[T]()
	inner := prometheus.NewCounterVec(opts, extractLabelNames(emptyLabels))

	return &CounterVec[T]{inner: inner}
}

// CounterVec is a wrapper around prometheus.CounterVec that allows type-safe labels.
type CounterVec[T LabelsProviderMarker] struct {
	inner *prometheus.CounterVec
}


// With behaves like prometheus.CounterVec.With but with type-safe labels.
func (c *CounterVec[T]) With(labels T) prometheus.Counter {
	return c.inner.With(extractLabelsWithValues(labels))
}

@amberpixels amberpixels force-pushed the feature/promsafe branch 2 times, most recently from 55582e1 to 7828200 Compare November 29, 2024 09:58
@amberpixels
Author

@bwplotka I'd like to continue work on the branch (adding gauge, histogram, tests, etc.) soon, so my main blocker is this question:

Should I wrap it as experimental functionality? If yes, which folder name do you prefer: /x/ or /exp/?

Member

@bwplotka bwplotka left a comment

Happy new year!

This work is amazing, but I have not decided yet whether it deserves an official experimental package or whether you should just put it in a separate repo for now. I'm starting to like this more and more, I think, but I might need more time to play with it.

If we decide for an experiment I would create github.com/prometheus/client_golang/exp module.

Note that there is an ongoing (brave) experiment to switch metric definitions to... yaml (: OpenTelemetry starts this new pattern with https://github.com/open-telemetry/weaver project and with @vesari we want to check if there's room to allow Prometheus metrics to follow the same pattern. The pattern is to define metric details in yaml and auto-generate e.g. client_golang code (as typed as possible) like yours. This has the benefit of unified tooling that uses yaml as the source, plus we won't be constrained by the limitations of generics and reflection. One approach does not cancel the other; there will be cases where defining metrics only in Go is preferred. However, for the typed approach, generation might be one option to consider.

Thanks for this work!

prometheus/promsafe/safe.go
//
// type Labels1 struct {
// promsafe.StructLabelProvider
// Code int
Member

Technically this could be a value with limited elements (const int)

// Method string
// }
//
// func (c Labels2) ToPrometheusLabels() prometheus.Labels {
Member

Here one could implement constraints as well perhaps to contain values (cardinality). We have some "hidden" logic for this here

}

// With behaves like prometheus.CounterVec.With but with type-safe labels.
func (c *CounterVec[T]) With(labels T) prometheus.Counter {
Member

I wonder how this is better or worse than alternatives e.g.

  1. Generating With(code CodeConstant, method MethodConstant)
  2. Builder chain pattern e.g. myCounter.WithCode(code).WithMethod(method).Inc()

(2) feels pretty tempting, but one would need to implement it manually... or we create a generator for it.

Author

I don’t see code generation as inherently better or worse — it’s just another way to approach this problem. Initially, promsafe was designed to provide type-safe labels at runtime without relying on code generation, instead using either reflection (which is slow) or custom interface implementations.

Code generation is a valid alternative for solving the same issue, and whether it’s better or worse depends on the use case. Personally, I wouldn’t use it for 100% of projects or all metrics — sometimes, avoiding code generation is preferable, especially when performance isn’t a major concern.

Between alternatives (1) and (2), I would prefer (1). Once each metric is generated and has its own fully type-safe With() method, it becomes the cleanest possible approach.

(2), despite its appeal, requires implementing a With***() method for every label manually (or generating it as well), and it still isn’t entirely safe. If a With***() method is omitted and the metric isn’t constrained with a default value, a missing label could lead to a panic.
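
To illustrate the concern about alternative (2), here is a minimal stdlib-only sketch of a builder chain. All names (`requestCounter`, `builder`, `WithCode`, `WithMethod`) are hypothetical and exist only for this example; a plain map stands in for the real metric. The point is that the compiler accepts a chain that skips a label, so the omission can only be detected at runtime.

```go
package main

import (
	"errors"
	"fmt"
)

// requestCounter is a hypothetical counter with two labels: code and method.
type requestCounter struct {
	counts map[string]int
}

// builder accumulates label values one call at a time.
type builder struct {
	c      *requestCounter
	code   string
	method string
}

// WithCode starts a chain with the "code" label set.
func (c *requestCounter) WithCode(code string) *builder {
	return &builder{c: c, code: code}
}

// WithMethod sets the "method" label on an existing chain.
func (b *builder) WithMethod(method string) *builder {
	b.method = method
	return b
}

// Inc increments the counter. A skipped label is only noticed here, at
// runtime; this sketch returns an error where a real API might panic.
func (b *builder) Inc() error {
	if b.code == "" || b.method == "" {
		return errors.New("missing label value")
	}
	b.c.counts[b.code+"/"+b.method]++
	return nil
}

func main() {
	c := &requestCounter{counts: map[string]int{}}

	fmt.Println(c.WithCode("200").WithMethod("GET").Inc())
	// Compiles fine, but the "method" label was never set.
	fmt.Println(c.WithCode("200").Inc())
}
```

A single With(MyLabels{...}) call does not have this gap in the same way, because unset fields still carry a well-defined zero value.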

@bwplotka
Member

Here is what I meant as an alternative for type safety (one does not exclude another one) bwplotka/metric-rename-demo#1


// With behaves like prometheus.CounterVec.With but with type-safe labels.
func (c *CounterVec[T]) With(labels T) prometheus.Counter {
return c.inner.With(extractLabelsWithValues(labels))
Member

Another problem here is that we stringify everything, which can be expensive, and this is usually a hot path. We might want to check how hard it is to optimize this internally in client_golang.

Author

Yes, currently prometheus.Labels is defined as map[string]string. This means we are already operating within the hot path of client_golang. Ideally, promsafe should not introduce any additional overhead. In the best-case scenario, a manually implemented ToPrometheusLabels() method on the labels struct can be as simple as constructing a map directly from the struct fields.

@amberpixels
Author

Happy new year!

Hello from 2025 :) Thanks

... switch metric definitions to... yaml (: OpenTelemetry starts this new pattern with https://github.com/open-telemetry/weaver project ...

Thanks for noting this. I get the idea. That's an interesting (and, imho, reasonable) experiment. I agree with your point that if this experiment succeeds, these safe typed metrics should be built and work exactly the same whether they were generated from yaml or built dynamically in code.

@amberpixels
Author

amberpixels commented Jan 31, 2025

Hey @bwplotka

Considering everything we’ve discussed, I’d like to summarize where we stand.

Code generation (e.g., YAML-based) is viable, but existing projects may choose to generate only some metrics or stick to a code-centric approach. That’s why I lean toward a flexible, combined approach (in terms of type safety).

I see a smooth migration from promauto to either promsafe or code-generated metrics as essential: it should be seamless, not disruptive.

So, we have 2 approaches (which can live separately or together).

Context:

Let’s say we work with a counter MyCounter that has three labels:

type MyLabels struct {
    MyInt         int
    MyCustomConst string // "enum"-like string
    MyFloat       float64
}

Approach 1. Code-declared metrics (promsafe)

// With promauto, we typically declare a metric like this:
c := promauto.With(reg).NewCounterVec(opts, labelNames)
//
// Using promsafe, this would be replaced with:
//
c := promsafe.With[MyLabels](reg).NewCounterVec()
// or with the default registry
c := promsafe.NewCounterVec[MyLabels]()
// or even this (to be consistent with your code-generated approach)
c := promsafe.MustNewCounterVec[MyLabels](reg)

// which provides **type-safe methods**, allowing usage like:
c.With(MyLabels{myInt, myCustomConst, myFloat}).Inc()

I don’t think custom-tailored type-safe methods (e.g. c.WithMyLabels(myInt, myCustomConst, myFloat)) are feasible within promsafe without requiring too much boilerplate. However, the required setup for promsafe is minimal:

  • Define a typed struct for labels (e.g. MyLabels)
  • (Optional) Implement .ToPrometheusLabels() for efficient conversion (avoiding reflection).
    • This ensures zero performance overhead compared to standard promauto calls.

Approach 2. Generated Metrics (semconv-like)

// The original declaration:
c := promauto.With(reg).NewCounterVec(opts, labelNames)
//
// Using generated metrics, this would be replaced with:
//
c := myGeneratedMetrics.MustNewMyCounterVec(reg) // where myGeneratedMetrics is a generated user package
// which provides a **fully type-safe interface**, where With* methods are generated:
c.WithLabelValues(myInt, myCustomConst, myFloat).Inc()

To ensure consistency with code-declared type-safe metrics, we could also support:

c.With(MyLabels{myInt, myCustomConst, myFloat})
// We're OK, as all this boilerplate code (structs, methods) is simply generated

Note: here MyLabels would be generated with the fast implementation (as it knows all the types).
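
As a rough idea of what generated code for Approach 2 could look like, here is a stdlib-only sketch. Everything in it is an assumption made for illustration: the generator, the `MyCounterVec` type and its constructor are hypothetical, a plain map stands in for the real prometheus.CounterVec, and `map[string]string` stands in for `prometheus.Labels`.

```go
package main

import (
	"fmt"
	"strconv"
)

// MyLabels is the labels struct a generator could emit.
type MyLabels struct {
	MyInt         int
	MyCustomConst string
	MyFloat       float64
}

// ToPrometheusLabels is the "fast" conversion: the generator knows every
// field type, so no reflection is needed.
func (l MyLabels) ToPrometheusLabels() map[string]string {
	return map[string]string{
		"my_int":          strconv.Itoa(l.MyInt),
		"my_custom_const": l.MyCustomConst,
		"my_float":        strconv.FormatFloat(l.MyFloat, 'g', -1, 64),
	}
}

// counter is a minimal stand-in for a single labeled counter.
type counter struct{ n *float64 }

// Inc adds one to the counter.
func (c counter) Inc() { *c.n++ }

// MyCounterVec is a stand-in for the generated, typed counter vector.
type MyCounterVec struct {
	cells map[MyLabels]*float64
}

// NewMyCounterVec creates an empty vector.
func NewMyCounterVec() *MyCounterVec {
	return &MyCounterVec{cells: map[MyLabels]*float64{}}
}

// With returns the counter for the given typed labels, creating it on
// first use.
func (v *MyCounterVec) With(l MyLabels) counter {
	n, ok := v.cells[l]
	if !ok {
		n = new(float64)
		v.cells[l] = n
	}
	return counter{n: n}
}

// WithLabelValues is the generated positional variant: each parameter is
// typed, so argument count and types are checked by the compiler.
func (v *MyCounterVec) WithLabelValues(myInt int, myCustomConst string, myFloat float64) counter {
	return v.With(MyLabels{MyInt: myInt, MyCustomConst: myCustomConst, MyFloat: myFloat})
}

func main() {
	v := NewMyCounterVec()
	v.WithLabelValues(3, "checkout", 0.5).Inc()
	v.With(MyLabels{MyInt: 3, MyCustomConst: "checkout", MyFloat: 0.5}).Inc()

	key := MyLabels{MyInt: 3, MyCustomConst: "checkout", MyFloat: 0.5}
	fmt.Println(*v.cells[key], key.ToPrometheusLabels())
}
```

Both entry points end up at the same series, which is what would let the struct-based and positional styles coexist.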

Extra notes:

  1. For now, this will work well even with ConstrainedLabels. In the future, the ConstrainedLabels concept could be made part of the metric generation process (default values, min-max validation checks, etc.).
  2. The code-gen approach can even generate chainable .WithMyInt(3) methods, but I think this makes it too complicated.

Conclusion

Both promsafe and semconv / promgen offer type safety in different ways, and they can coexist smoothly.

Would love to hear your thoughts—let me know if you have any concerns or need help with implementations! 🚀

Development

Successfully merging this pull request may close these issues.

Type-safe labels support?
3 participants