Skip to content
forked from tarantool/metrics

Metric collection library for Tarantool, polished with love by Picodata.

License

Notifications You must be signed in to change notification settings

picodata/metrics

 
 

Repository files navigation

Build Status

Metrics

Metrics is a library to collect and expose Tarantool-based applications metrics.

Library includes:

  • four base metric collectors: Counter, Gauge, Histogram, Summary
  • ready to use Tarantool stats collectors built on top of base collectors
  • exporters to expose collected metrics in Prometheus, Graphite and generic JSON format
  • module to integrate into Tarantool Cartridge based applications

Table of contents

Installation

cd ${PROJECT_ROOT}
tt rocks install metrics

Plugins export

In order to easily export metrics to any TSDB, you can use one of the supported export plugins:

or you can write your custom plugin and use it. Hopefully, plugins for other TSDBs will be supported soon.

Metric types

There are four basic metric collectors available: Counter, Gauge, Summary and Histogram. The exact semantics of each metric follows the prometheus metric types.

Counter

Counter is a cummulative metric which value can only be incremented or reset to zero on restart. Counters are useful for accumulating number of events, e.g. requests processed, orders in e-shop. Counter is exposed as a single numerical value.

local metrics = require('metrics')

-- create a counter
local http_requests_total_counter = metrics.counter('http_requests_total')

-- somewhere in HTTP requests middleware:
http_requests_total_counter:inc(1, {method = 'GET'})

Gauge

Gauge is a metric that represents a single numerical value that can be changed arbitrarily. Gauges are useful for capturing a snapshot of the current state, e.g. CPU utilization, number of open connections. Gauge is exposed as a single numerical value.

local metrics = require('metrics')

-- create a gauge
local cpu_usage_gauge = metrics.gauge('cpu_usage', 'CPU usage')

-- register a lazy gauge value update
-- this will be called whenever the export is invoked in any plugins
metrics.register_callback(function()
    local current_cpu_usage = math.random()
    cpu_usage_gauge:set(current_cpu_usage, {app = 'tarantool'})
end)

Histogram

Histogram counts observed values into configurable buckets. Histograms are useful for tracking request latencies, processing time. Histogram is exposed as multiple numerical values:

  • the total count of observed events
  • the total sum of observed values
  • counters of observed events per bucket
local metrics = require('metrics')

-- create a histogram
local http_requests_latency_hist = metrics.histogram(
    'http_requests_latency', 'HTTP requests total', {2, 4, 6})

-- somewhere in the HTTP requests middleware:
local latency = math.random(1, 10)
http_requests_latency_hist:observe(latency)

Summary

Summary aggregates observed values into configurable quantiles. Summaries are useful as a service level indicator (e.g. SLAs, SLOs). Summary is exposed as multiple numerical values:

  • the total count of observed events
  • the total sum of observed values
  • number of observed events per quantile
local metrics = require('metrics')

-- create a summary with a sliding window of 5 age buckets and 60s bucket lifetime
local http_requests_latency = metrics.summary(
    'http_requests_latency', 'HTTP requests total',
    {[0.5]=0.01, [0.9]=0.01, [0.99]=0.01},
    {max_age_time = 60, age_buckets_count = 5}
)

-- somewhere in the HTTP requests middleware:
local latency = math.random(1, 10)
http_requests_latency:observe(latency)

Instance health check

In production environments Tarantool Cluster usually has a large number of so called "routers", Tarantool instances that handle input load and it is required to evenly distribute the load. Various load-balancers are used for this, but any load-balancer have to know which "routers" are ready to accept the load at that very moment. Metrics library has a special plugin that creates an http handler that can be used by the load-balancer to check the current state of any Tarantool instance. If the instance is ready to accept the load, it will return a response with a 200 status code, if not, with a 500 status code.

Next steps

See:

Contribution

Feel free to send Pull Requests. To increase the chance of having your pull request accepted, make sure it follows these guidelines:

  • Title and description matches the implementation.
  • Code follows styleguide.
  • The pull request closes one or more of related issues. If not, please add an issue first.
  • The pull request contains necessary tests that verify the intended behavior.
  • The pull request contains a CHANGELOG note and documentation update if needed.

Your pull request will be reviewed in 3-5 days.

Contacts

If you have questions, please ask it on StackOverflow or contact us in Telegram:

Credits

We would like to thank Prometheus for a great API that we brusquely borrowed.

About

Metric collection library for Tarantool, polished with love by Picodata.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Lua 99.1%
  • Other 0.9%