Enable Vault Prometheus #454

Merged
merged 1 commit into from
Jul 30, 2024
4 changes: 3 additions & 1 deletion doc/ARCHITECTURE.md
@@ -250,7 +250,9 @@ over the local interface, and hence doesn't work remotely.

## Prometheus

pgagroal has support for [Prometheus](https://prometheus.io/) when the `metrics` port is specified.

**Note:** Prometheus memory must be initialized with care in every program that uses it. For example, functions like `pgagroal_init_prometheus()` and `pgagroal_init_prometheus_cache()` should only be invoked if `metrics` is greater than 0.
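The rule in the note can be sketched as follows. This is only an illustration under stated assumptions, not pgagroal's startup code: `struct sketch_config`, `sketch_init_prometheus()` and `sketch_start_metrics()` are hypothetical stand-ins, and the 64-byte allocation is arbitrary. Only the `metrics > 0` condition comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down configuration: only the metrics port matters here. */
struct sketch_config
{
   int metrics; /* The metrics port, 0 = disabled */
};

/* Hypothetical stand-in for an initializer such as pgagroal_init_prometheus(). */
static int
sketch_init_prometheus(size_t* p_size, void** p_mem)
{
   assert(p_size != NULL && p_mem != NULL);

   *p_size = 64; /* Arbitrary size, for illustration only */
   *p_mem = calloc(1, *p_size);

   return *p_mem != NULL ? 0 : 1;
}

/* Allocate Prometheus memory only when a metrics port is configured. */
static int
sketch_start_metrics(const struct sketch_config* config, size_t* p_size, void** p_mem)
{
   *p_size = 0;
   *p_mem = NULL;

   if (config->metrics <= 0)
   {
      return 0; /* Metrics disabled: nothing to initialize */
   }

   return sketch_init_prometheus(p_size, p_mem);
}
```

With `metrics = 0` the sketch leaves the pointer `NULL` and the size at zero; with a port such as `2501` it allocates the memory.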

The module serves two endpoints

5 changes: 5 additions & 0 deletions doc/VAULT.md
@@ -27,6 +27,10 @@ The available keys and their accepted values are reported in the table below.
|----------|---------|------|----------|-------------|
| host | | String | Yes | The bind address for pgagroal-vault |
| port | | Int | Yes | The bind port for pgagroal-vault |
| metrics | 0 | Int | No | The metrics port (disable = 0) |
| metrics_cache_max_age | 0 | String | No | The number of seconds to keep a Prometheus (metrics) response in cache. If set to zero, caching is disabled. Can be a string with a suffix, like `2m` to indicate 2 minutes |
| metrics_cache_max_size | 256k | String | No | The maximum amount of data to keep in cache when serving Prometheus responses. Changes require restart. This parameter determines the size of memory allocated for the cache even if `metrics_cache_max_age` or `metrics` are disabled. Its value, however, is taken into account only if `metrics_cache_max_age` is set to a non-zero value. Supports suffixes: 'B' (bytes), the default if omitted, 'K' or 'KB' (kilobytes), 'M' or 'MB' (megabytes), 'G' or 'GB' (gigabytes).|
| authentication_timeout | 5 | Int | No | The number of seconds the process will wait for valid credentials |
| log_type | console | String | No | The logging type (console, file, syslog) |
| log_level | info | String | No | The logging level, any of the (case insensitive) strings `FATAL`, `ERROR`, `WARN`, `INFO` and `DEBUG` (which can be more specific, as `DEBUG1` through `DEBUG5`). A debug level greater than 5 will be set to `DEBUG5`. Unrecognized values set the log_level to `INFO` |
| log_path | pgagroal.log | String | No | The log file location. Can be a strftime(3) compatible string. |
@@ -36,6 +40,7 @@ The available keys and their accepted values are reported in the table below.
| log_mode | append | String | No | Append to or create the log file (append, create) |
| log_connections | `off` | Bool | No | Log connects |
| log_disconnections | `off` | Bool | No | Log disconnects |
| hugepage | `try` | String | No | Huge page support (`off`, `try`, `on`) |
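The suffix rules described for `metrics_cache_max_size` can be sketched as a small parser. This is a hypothetical helper, not the parser used by pgagroal-vault: the name `sketch_parse_size()` is invented, and it assumes that suffixes are case insensitive (the default is written `256k`) and that K, M and G are powers of 1024.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a size such as "256k", "1MB" or "512" into a number of bytes.
 * Assumptions (not confirmed by the text): suffixes are case insensitive
 * and K/M/G are powers of 1024. Overflow of the product is not checked.
 * Returns 0 on success, 1 on malformed input.
 */
static int
sketch_parse_size(const char* str, unsigned long long* bytes)
{
   char* end = NULL;
   unsigned long long value;
   unsigned long long multiplier = 1;
   int unit;

   if (str == NULL || bytes == NULL || !isdigit((unsigned char)str[0]))
   {
      return 1;
   }

   errno = 0;
   value = strtoull(str, &end, 10);
   if (errno != 0)
   {
      return 1;
   }

   unit = toupper((unsigned char)*end);
   if (unit == 'K' || unit == 'M' || unit == 'G')
   {
      multiplier = unit == 'K' ? 1024ULL : (unit == 'M' ? 1024ULL * 1024 : 1024ULL * 1024 * 1024);
      end++;
      if (toupper((unsigned char)*end) == 'B')
      {
         end++; /* Accept the two-letter forms KB, MB and GB */
      }
   }
   else if (unit == 'B')
   {
      end++; /* Plain bytes */
   }

   if (*end != '\0')
   {
      return 1; /* Trailing garbage or unknown suffix */
   }

   *bytes = value * multiplier;
   return 0;
}
```

Under these assumptions the default `256k` parses to 262144 bytes, and a value without a suffix is taken as bytes.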

## [main]

22 changes: 22 additions & 0 deletions doc/man/pgagroal_vault.conf.5.rst
@@ -33,6 +33,22 @@ host
port
The bind port for pgagroal-vault. Mandatory

metrics
The metrics port. Default is 0 (disabled)

metrics_cache_max_age
The number of seconds to keep a Prometheus (metrics) response in cache.
If set to zero, caching is disabled. Can be a string with a suffix, like ``2m`` to indicate 2 minutes.
Default is 0 (disabled)

metrics_cache_max_size
The maximum amount of data to keep in cache when serving Prometheus responses. Changes require restart.
This parameter determines the size of memory allocated for the cache even if ``metrics_cache_max_age`` or
``metrics`` are disabled. Its value, however, is taken into account only if ``metrics_cache_max_age`` is set
to a non-zero value. Supports suffixes: ``B`` (bytes), the default if omitted, ``K`` or ``KB`` (kilobytes),
``M`` or ``MB`` (megabytes), ``G`` or ``GB`` (gigabytes).
Default is 256k

log_type
The logging type (console, file, syslog). Default is console

@@ -67,6 +83,12 @@ log_connections
log_disconnections
Log disconnects. Default is off

authentication_timeout
The number of seconds the process will wait for valid credentials. Default is 5

hugepage
Huge page support. Default is try

The options for the main section are

host
2 changes: 2 additions & 0 deletions doc/manual/dev-02-architecture.md
@@ -254,6 +254,8 @@ over the local interface, and hence doesn't work remotely.

pgagroal has support for [Prometheus](https://prometheus.io/) when the `metrics` port is specified.

**Note:** Prometheus memory must be initialized with care in every program that uses it. For example, functions like `pgagroal_init_prometheus()` and `pgagroal_init_prometheus_cache()` should only be invoked if `metrics` is greater than 0.

The module serves two endpoints

* `/` - Overview of the functionality (`text/html`)
26 changes: 26 additions & 0 deletions doc/manual/user-11-prometheus.md
@@ -233,3 +233,29 @@ Number of sockets the client used
## pgagroal_self_sockets

Number of sockets used by pgagroal itself

[**pgagroal-vault**][pgagroal-vault] has the following [Prometheus][prometheus] metrics.

## pgagroal_vault_logging_info

The number of INFO statements

## pgagroal_vault_logging_warn

The number of WARN statements

## pgagroal_vault_logging_error

The number of ERROR statements

## pgagroal_vault_logging_fatal

The number of FATAL statements

## pgagroal_vault_client_sockets

Number of sockets the client used

## pgagroal_vault_self_sockets

Number of sockets used by pgagroal-vault itself
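For orientation, a metric like the ones listed above is usually rendered in the Prometheus text exposition format as a `# HELP` line, a `# TYPE` line and a sample line. The following is a minimal sketch of that rendering; `sketch_format_metric()` is a hypothetical helper, and the metric type that pgagroal-vault actually reports is not stated in the text, so it is passed in as a parameter.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Render one metric in the Prometheus text exposition format.
 * Returns the number of characters written, or -1 if the buffer is too small.
 * The metric type (e.g. "gauge" or "counter") is supplied by the caller.
 */
static int
sketch_format_metric(char* out, size_t size, const char* name, const char* help,
                     const char* type, unsigned long value)
{
   int n;

   assert(out != NULL && name != NULL && help != NULL && type != NULL);

   n = snprintf(out, size, "# HELP %s %s\n# TYPE %s %s\n%s %lu\n",
                name, help, name, type, name, value);

   return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A caller would invoke it once per metric, for instance with `pgagroal_vault_logging_info` and its description, and append the result to the HTTP response body.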
5 changes: 5 additions & 0 deletions doc/manual/user-12-vault.md
@@ -29,6 +29,9 @@ The available keys and their accepted values are reported in the table below.
|----------|---------|------|----------|-------------|
| host | | String | Yes | The bind address for pgagroal-vault |
| port | | Int | Yes | The bind port for pgagroal-vault |
| metrics | 0 | Int | No | The metrics port (disable = 0) |
| metrics_cache_max_age | 0 | String | No | The number of seconds to keep a Prometheus (metrics) response in cache. If set to zero, caching is disabled. Can be a string with a suffix, like `2m` to indicate 2 minutes |
| metrics_cache_max_size | 256k | String | No | The maximum amount of data to keep in cache when serving Prometheus responses. Changes require restart. This parameter determines the size of memory allocated for the cache even if `metrics_cache_max_age` or `metrics` are disabled. Its value, however, is taken into account only if `metrics_cache_max_age` is set to a non-zero value. Supports suffixes: 'B' (bytes), the default if omitted, 'K' or 'KB' (kilobytes), 'M' or 'MB' (megabytes), 'G' or 'GB' (gigabytes).|
| log_type | console | String | No | The logging type (console, file, syslog) |
| log_level | info | String | No | The logging level, any of the (case insensitive) strings `FATAL`, `ERROR`, `WARN`, `INFO` and `DEBUG` (which can be more specific, as `DEBUG1` through `DEBUG5`). A debug level greater than 5 will be set to `DEBUG5`. Unrecognized values set the log_level to `INFO` |
| log_path | pgagroal.log | String | No | The log file location. Can be a strftime(3) compatible string. |
@@ -38,6 +41,8 @@ The available keys and their accepted values are reported in the table below.
| log_mode | append | String | No | Append to or create the log file (append, create) |
| log_connections | `off` | Bool | No | Log connects |
| log_disconnections | `off` | Bool | No | Log disconnects |
| authentication_timeout | 5 | Int | No | The number of seconds the process will wait for valid credentials |
| hugepage | `try` | String | No | Huge page support (`off`, `try`, `on`) |
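The `metrics_cache_max_age` value accepts a suffix such as `2m`. A sketch of how such a value might be converted to seconds is shown below. `sketch_parse_age()` is a hypothetical helper, and the set of suffixes (`s`, `m`, `h`, `d`) is an assumption: the table only confirms `m` for minutes.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Convert an age such as "2m" or "90" into seconds.
 * Assumed suffixes: s (seconds, also the default), m, h, d.
 * Overflow of the product is not checked.
 * Returns 0 on success, 1 on malformed input.
 */
static int
sketch_parse_age(const char* str, unsigned int* seconds)
{
   char* end = NULL;
   unsigned long value;
   unsigned long multiplier = 1;

   if (str == NULL || seconds == NULL || !isdigit((unsigned char)str[0]))
   {
      return 1;
   }

   errno = 0;
   value = strtoul(str, &end, 10);
   if (errno != 0)
   {
      return 1;
   }

   switch (tolower((unsigned char)*end))
   {
      case '\0':
         break; /* No suffix: plain seconds */
      case 's':
         end++;
         break;
      case 'm':
         multiplier = 60;
         end++;
         break;
      case 'h':
         multiplier = 3600;
         end++;
         break;
      case 'd':
         multiplier = 86400;
         end++;
         break;
      default:
         return 1; /* Unknown suffix */
   }

   if (*end != '\0')
   {
      return 1; /* Trailing characters after the suffix */
   }

   *seconds = (unsigned int)(value * multiplier);
   return 0;
}
```

With this sketch `2m` becomes 120 seconds, and `0` stays 0, which the table describes as caching disabled.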

## [main]

41 changes: 41 additions & 0 deletions doc/tutorial/04_prometheus.md
@@ -59,3 +59,44 @@ It is also possible to get an explaination of what is the meaning of each metric
```
http://localhost:2346/
```

## Prometheus metrics for pgagroal-vault
This tutorial will show you how to do a basic [Prometheus](https://prometheus.io/){:target="_blank"} setup for [**pgagroal-vault**](https://github.com/agroal/pgagroal).

**pgagroal-vault** is able to provide a set of metrics about what is happening within the vault, so that a Prometheus instance can collect them and help you monitor the vault activities.

### Change the pgagroal-vault configuration

In order to enable the export of the metrics, you need to add the `metrics` option to the main `pgagroal_vault.conf` configuration. The value of this setting is the TCP/IP port number that Prometheus will use to grab the exported metrics.

Add a line like the following to `/etc/pgagroal/pgagroal_vault.conf` by editing the file with your editor of choice:

```
metrics = 2501
```

Place it within the `[pgagroal-vault]` section, like

```
[pgagroal-vault]
...
metrics = 2501
```

This will bind the TCP/IP port number `2501` to the metrics export.

See [the pgagroal-vault configuration settings](https://github.com/agroal/pgagroal/blob/master/doc/VAULT.md#pgagroal-vault) with particular regard to `metrics`, `metrics_cache_max_age` and `metrics_cache_max_size` for more details.

### Get Prometheus metrics

Once **pgagroal-vault** is running you can access the metrics with a browser at the pgagroal-vault address, specifying the `metrics` port number and routing to the `/metrics` page. For example, point your web browser at:

```
http://localhost:2501/metrics
```

It is also possible to get an explanation of the meaning of each metric by pointing your web browser at:

```
http://localhost:2501/
```
48 changes: 34 additions & 14 deletions src/include/pgagroal.h
@@ -371,10 +371,28 @@ struct prometheus_cache
} __attribute__ ((aligned (64)));

/** @struct
* Defines the Prometheus metrics
* Defines the common Prometheus metrics
*/
struct prometheus
{
// logging
atomic_ulong logging_info; /**< Logging: INFO */
atomic_ulong logging_warn; /**< Logging: WARN */
atomic_ulong logging_error; /**< Logging: ERROR */
atomic_ulong logging_fatal; /**< Logging: FATAL */

// internal connections
atomic_int client_sockets; /**< The number of sockets the client used */
atomic_int self_sockets; /**< The number of sockets used by pgagroal itself */

} __attribute__ ((aligned (64)));

/** @struct
* Defines the Main Prometheus metrics
*/
struct main_prometheus
Collaborator

I don't get the point in splitting the prometheus structure into two parts; it seems useless to me: if metrics > 0 prometheus is active, so there is no advantage in memory allocation. Also, while vault_configuration added fields around configuration, here you are narrowing the structure only to the fields you need.
While I don't dislike this approach in total, it seems counterintuitive to have prometheus not being the main metrics structure, as I would expect at a glance. I would rather make a prometheus_logging structure, a prometheus_socket one and have prometheus wrap the above, so that prometheus_vault can handle only the former two.

Collaborator Author

> if metrics > 0 prometheus is active, so there is no advantage in memory allocation.

True. The aim of splitting prometheus is to separate the memory allocation for the different programs (main/vault), as there are several fields that the vault does not use but that would still be allocated memory.

> Also, while vault_configuration added fields around configuration, here you are narrowing the structure only to the fields you need.

As of now I have added just the required fields; more fields can be added as and when required.

> While I don't dislike this approach in total, it seems counterintuitive to have prometheus not being the main metrics structure, as I would expect at a glance. I would rather make a prometheus_logging structure, a prometheus_socket one and have prometheus wrap the above, so that prometheus_vault can handle only the former two.

It can be done that way too. What I had in mind was that if, in the future, the project integrates other service programs like the vault that need Prometheus, they can directly inherit the common fields (common to all services/servers) while adding their specific fields to their own prometheus structure.

Collaborator

> It can be done that way too. What I had in mind was that if, in the future, the project integrates other service programs like the vault that need Prometheus, they can directly inherit the common fields (common to all services/servers) while adding their specific fields to their own prometheus structure.

This is clear, and is a good approach, but I don't think that common is the right name for the structure. It seems that, so far, you need only a subset of the structure for the vault, so I'm not sure the effort of splitting the structure is worth it. I suspect you will find the need to add more fields here and there from time to time, so at this stage I would not split the structure. I certainly would not name it common, since that makes me think of common fields, while so far there are only "basic" fields like logging counters.
@jesperpedersen what do you think?

Collaborator

prometheus_base is an alternative
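For comparison, the alternative layout proposed earlier in this thread (a `prometheus_logging` structure and a `prometheus_socket` structure, wrapped by `prometheus`) could look roughly like the sketch below. This is one reading of the suggestion, not code from the PR: the member names `logging` and `sockets`, the single main-only field, and the helper `sketch_count_info()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Logging counters, usable by every program */
struct prometheus_logging
{
   atomic_ulong logging_info;
   atomic_ulong logging_warn;
   atomic_ulong logging_error;
   atomic_ulong logging_fatal;
};

/* Socket counters, usable by every program */
struct prometheus_socket
{
   atomic_int client_sockets;
   atomic_int self_sockets;
};

/* Main metrics structure: wraps both groups and adds main-only fields */
struct prometheus
{
   struct prometheus_logging logging;
   struct prometheus_socket sockets;
   atomic_ulong session_time_sum; /* One main-only field, as an example */
};

/* Vault metrics structure: only the two shared groups */
struct prometheus_vault
{
   struct prometheus_logging logging;
   struct prometheus_socket sockets;
};

/* Shared helper: works on the logging group of either program */
static void
sketch_count_info(struct prometheus_logging* logging)
{
   assert(logging != NULL);
   atomic_fetch_add(&logging->logging_info, 1);
}
```

In this layout `prometheus` stays the main metrics structure, and shared code takes a pointer to the group it needs rather than to a base structure.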

{
struct prometheus prometheus_base; /**< Common base class */
atomic_ulong session_time[HISTOGRAM_BUCKETS]; /**< The histogram buckets */
atomic_ulong session_time_sum; /**< Total session time */

@@ -390,11 +408,6 @@ struct prometheus
atomic_ulong connection_flush; /**< The number of flush calls */
atomic_ulong connection_success; /**< The number of success calls */

atomic_ulong logging_info; /**< Logging: INFO */
atomic_ulong logging_warn; /**< Logging: WARN */
atomic_ulong logging_error; /**< Logging: ERROR */
atomic_ulong logging_fatal; /**< Logging: FATAL */

/**< The number of connection awaiting due to `blocking_timeout` */
atomic_ulong connections_awaiting[NUMBER_OF_LIMITS];
atomic_ulong connections_awaiting_total;
@@ -413,15 +426,20 @@ struct prometheus
atomic_ullong network_sent; /**< The bytes sent by clients */
atomic_ullong network_received; /**< The bytes received from servers */

atomic_int client_sockets; /**< The number of sockets the client used */
atomic_int self_sockets; /**< The number of sockets used by pgagroal itself */

atomic_ulong server_error[NUMBER_OF_SERVERS]; /**< The number of errors for a server */
atomic_ulong failed_servers; /**< The number of failed servers */
struct prometheus_connection prometheus_connections[]; /**< The number of prometheus connections (FMA) */

} __attribute__ ((aligned (64)));

/** @struct
* Defines the Vault Prometheus metrics
*/
struct vault_prometheus
{
struct prometheus prometheus_base;
} __attribute__ ((aligned (64)));
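The "common base class" idea used by `main_prometheus` and `vault_prometheus` relies on the base structure being the first member, so that a pointer to either wrapper can also be read as a pointer to the base. A reduced sketch of that pattern follows; the `sketch_*` names and the shortened field lists are hypothetical and only mirror the shape of the structures above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Reduced stand-in for the shared base structure */
struct sketch_base
{
   atomic_ulong logging_info;
   atomic_int self_sockets;
};

/* Reduced stand-in for the main metrics: base first, then main-only fields */
struct sketch_main
{
   struct sketch_base prometheus_base;
   atomic_ulong session_time_sum;
};

/* Reduced stand-in for the vault metrics: base only */
struct sketch_vault
{
   struct sketch_base prometheus_base;
};

/* The base must be the first member for the cast below to be valid */
static_assert(offsetof(struct sketch_main, prometheus_base) == 0, "base must come first");
static_assert(offsetof(struct sketch_vault, prometheus_base) == 0, "base must come first");

/* Shared code: counts an INFO statement without knowing which program owns the memory */
static void
sketch_logging_info(void* shmem)
{
   struct sketch_base* base = (struct sketch_base*)shmem;

   atomic_fetch_add(&base->logging_info, 1);
}
```

Because C guarantees that a pointer to a structure, suitably converted, points to its first member, the same helper can serve both programs as long as new fields are added after `prometheus_base`.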

/** @struct
* Defines the common configurations between pgagroal and vault
*/
@@ -430,6 +448,7 @@ struct configuration
char configuration_path[MAX_PATH]; /**< The configuration path */
char host[MISC_LENGTH]; /**< The host */
int port; /**< The port */
int authentication_timeout; /**< The authentication timeout in seconds */

// Logging
int log_type; /**< The logging type */
@@ -443,6 +462,12 @@ struct configuration
char log_line_prefix[MISC_LENGTH]; /**< The logging prefix */
atomic_schar log_lock; /**< The logging lock */
char default_log_path[MISC_LENGTH]; /**< The default logging path */

// Prometheus
unsigned char hugepage; /**< Huge page support */
int metrics; /**< The metrics port */
unsigned int metrics_cache_max_age; /**< Number of seconds to cache the Prometheus response */
unsigned int metrics_cache_max_size; /**< Number of bytes max to cache the Prometheus response */
};

/** @struct
@@ -469,9 +494,6 @@ struct main_configuration
char admins_path[MAX_PATH]; /**< The admins path */
char superuser_path[MAX_PATH]; /**< The superuser path */

int metrics; /**< The metrics port */
unsigned int metrics_cache_max_age; /**< Number of seconds to cache the Prometheus response */
unsigned int metrics_cache_max_size; /**< Number of bytes max to cache the Prometheus response */
int management; /**< The management port */
bool gracefully; /**< Is pgagroal in gracefully mode */

@@ -503,7 +525,6 @@ struct main_configuration
int validation; /**< Validation mode */
int background_interval; /**< Background validation timer in seconds */
int max_retries; /**< The maximum number of retries */
int authentication_timeout; /**< The authentication timeout in seconds */
int disconnect_client; /**< Disconnect client if idle for more than the specified seconds */
bool disconnect_client_force; /**< Force a disconnect client if active for more than the specified seconds */
char pidfile[MAX_PATH]; /**< File containing the PID */
@@ -514,7 +535,6 @@ struct main_configuration
bool nodelay; /**< Use NODELAY */
bool non_blocking; /**< Use non blocking */
int backlog; /**< The backlog for listen */
unsigned char hugepage; /**< Huge page support */
bool tracker; /**< Tracker support */
bool track_prepared_statements; /**< Track prepared statements (transaction pooling) */

13 changes: 13 additions & 0 deletions src/include/prometheus.h
@@ -64,12 +64,25 @@ extern "C" {
void
pgagroal_prometheus(int fd);

/**
* Create a prometheus instance for vault
* @param fd The client descriptor
*/
void
pgagroal_vault_prometheus(int fd);

/**
* Initialize prometheus shmem
*/
int
pgagroal_init_prometheus(size_t* p_size, void** p_shmem);

/**
* Initialize prometheus shmem for vault
*/
int
pgagroal_vault_init_prometheus(size_t* p_size, void** p_shmem);

/**
* Add session time information
* @param time The time