Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable Vault Prometheus #454

Merged
merged 1 commit into from
Jul 30, 2024
Merged

Conversation

ashu3103
Copy link
Collaborator

@ashu3103 ashu3103 commented Jul 24, 2024

Enable Prometheus in vault

About this commit

  • Bind an HTTP server in pgagroal-vault at port metrics in configuration to track server activities
  • Added some fields like - metrics, hugepage, metric_cache_max_age and metric_cache_max_size in the common configuration
  • Added an additional check is_prometheus_enabled on all the prometheus functions and code blocks which basically checks whether the metrics port is active or not.
  • Based on the metric port value we allocate memory to prometheus structure and prometheus_cache

Future work

  • Segregate the prometheus structure for main and vault similar to configuration for optimal memory management.
  • prometheus should be initiated only when metric port is active --> Document it properly somewhere.

@jesperpedersen @fluca1978 PTAL.


void
pgagroal_prometheus(int client_fd)
{
if (!is_prometheus_enabled())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should do this after the variable definitions

void
pgagroal_vault_prometheus(int client_fd)
{
if (!is_prometheus_enabled())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@@ -244,6 +312,11 @@ pgagroal_init_prometheus(size_t* p_size, void** p_shmem)
void
pgagroal_prometheus_session_time(double time)
{
if (!is_prometheus_enabled())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@@ -330,6 +403,11 @@ pgagroal_prometheus_session_time(double time)
void
pgagroal_prometheus_connection_error(void)
{
if (!is_prometheus_enabled())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

@@ -340,6 +418,11 @@ pgagroal_prometheus_connection_error(void)
void
pgagroal_prometheus_connection_kill(void)
{
if (!is_prometheus_enabled())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same

data = pgagroal_append(data, " </p>\n");
data = pgagroal_append(data, " <h2>pgagroal_self_sockets</h2>\n");
data = pgagroal_append(data, " <p>\n");
data = pgagroal_append(data, " Number of sockets used by pgagroal itself\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pgagroal-vault

@@ -503,6 +656,11 @@ pgagroal_prometheus_auth_user_error(void)
void
pgagroal_prometheus_client_wait_add(void)
{
if (!is_prometheus_enabled())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, all of them...

data = pgagroal_append(data, " <p>\n");
data = pgagroal_append(data, " <a href=\"/metrics\">Metrics</a>\n");
data = pgagroal_append(data, " </p>\n");
data = pgagroal_append(data, " <h2>pgagroal_logging_info</h2>\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be called pgagroal_vault_XYZ


prometheus = (struct prometheus*)prometheus_shmem;

data = pgagroal_append(data, "#HELP pgagroal_logging_info The number of INFO logging statements\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pgagroal_vault_logging_info


data = pgagroal_append(data, "#HELP pgagroal_client_sockets Number of sockets the client used\n");
data = pgagroal_append(data, "#TYPE pgagroal_client_sockets gauge\n");
data = pgagroal_append(data, "pgagroal_client_sockets ");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pgagroal_vault_XYZ

@jesperpedersen
Copy link
Collaborator

Remember the tutorial, man page and manual as well

@ashu3103
Copy link
Collaborator Author

Remember the tutorial, man page and manual as well

Yeah working on it parallely

@jesperpedersen
Copy link
Collaborator

Maybe enable metrics by default in the pgagroal_vault.conf as well

@jesperpedersen
Copy link
Collaborator

I think that your "Future" section should be easy as well

@jesperpedersen
Copy link
Collaborator

prometheus should be initiated only when metric port is active --> Document it properly somewhere.

It is documented in the configuration files - users doesn't need to know about the internals

@ashu3103
Copy link
Collaborator Author

prometheus should be initiated only when metric port is active --> Document it properly somewhere.

It is documented in the configuration files - users doesn't need to know about the internals

Yeah, but for new developers it might be required

@ashu3103 ashu3103 force-pushed the prometheus-active branch from 6bebd7a to 67ecc0d Compare July 25, 2024 18:25
@ashu3103
Copy link
Collaborator Author

@jesperpedersen The suggested changes have been integrated PTAL.

@@ -2267,6 +2824,7 @@ pgagroal_init_prometheus_cache(size_t* p_size, void** p_shmem)

error:
// disable caching

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the empty line and leave the comment near the code

static bool
is_prometheus_enabled(void)
{
struct configuration* config = (struct configuration*)shmem;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test against nulls?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not required as we initiate prometheus memory allocation only if config->metrics > 0.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not required as we initiate prometheus memory allocation only if config->metrics > 0.

Well, but if we export the function to somewhere else, we could need some extra (redundant) checks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will add the NULL check for prometheus to avoid any unlikely cases that may occur.

@@ -742,20 +742,23 @@ main(int argc, char** argv)
errx(1, "Failed to start logging");
}

if (pgagroal_init_prometheus(&prometheus_shmem_size, &prometheus_shmem))
if (config->common.metrics > 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we export is_prometheus_active function and use it so the code is consistent?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah can be done, but won't it waste some memory and computation as config is already there in the program it will again define struct configuration in the check function.

However, as far as consistency is concerned it can be done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah can be done, but won't it waste some memory and computation as config is already there in the program it will again define struct configuration in the check function.

The only waste I can see is the function call, but I don't think we are optimizing at this level right now.
It could also be a macro.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only waste I can see is the function call, but I don't think we are optimizing at this level right now.

Okay 👍

It could also be a macro.

Not sure how a macro could solve the issue, can you please elaborate a little.

Also, if we add NULL check of prometheus in is_prometheus_enabled() function (as per the aforementioned change request) then we can't use it in this scenario as it will always fail the test (since prometheus is checked before it is being initialised), and the only test which should be checked is metrics > 0. So I believe the current setup is ok.

What do you think about it? @fluca1978.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, if we add NULL check of prometheus in is_prometheus_enabled() function (as per the aforementioned change request) then we can't use it in this scenario as it will always fail the test (since prometheus is checked before it is being initialised), and the only test which should be checked is metrics > 0. So I believe the current setup is ok.

Indeed having the test against nulls will fail in this case, so better leave it off the is_prometheus_enabled. Sorry, I was running out of coffee.

@@ -1044,14 +1047,14 @@ main(int argc, char** argv)
ev_periodic_start (main_loop, &rotate_frontend_password);
}

if (config->metrics > 0)
if (config->common.metrics > 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -379,6 +435,25 @@ main(int argc, char** argv)
errx(1, "Failed to start logging");
}

if (config->common.metrics > 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@@ -445,21 +529,64 @@ main(int argc, char** argv)

start_vault_io();

if (config->common.metrics > 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

/** @struct
* Defines the Main Prometheus metrics
*/
struct main_prometheus
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get the point in splitting the prometheus structure into two parts, it seems to me it is useless: if metrics > 0 prometheus is active, so there is no advantage in memory allocation. Also, while vault_configuration added fields around configuration, here you are narrowing the structure only to the fields you need.
While I don't dislike in total this approach, it seems counterintuitive to have prometheus not being the main metrics structure as I would at glange expect. I would rather make a prometheus_logging structure, a prometheus_socket one and have prometheus to wrap the above, so that prometheus_vault can handle only the former twos.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if metrics > 0 prometheus is active, so there is no advantage in memory allocation.

True, the aim of splitting prometheus is to separate the memory space allocation for different programs (main/vault) as there as several fields which are not utilized by vault but are still allocated extra memory for unused fields.

Also, while vault_configuration added fields around configuration, here you are narrowing the structure only to the
fields you need.

As of now I have added just the required fields and more fields can be added as and when required.

While I don't dislike in total this approach, it seems counterintuitive to have prometheus not being the main metrics
structure as I would at glange expect. I would rather make a prometheus_logging structure, a prometheus_socket
one and have prometheus to wrap the above, so that prometheus_vault can handle only the former twos.

Can be done that way also. The thing i kept in my mind was that if in future if the project integrates other such service programs like vault, that will need prometheus then it can directly inherit the common fields (common for all services/servers) while keep on adding its specific fields in its prometheus structure.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be done that way also. The thing i kept in my mind was that if in future if the project integrates other such service programs like vault, that will need prometheus then it can directly inherit the common fields (common for all services/servers) while keep on adding its specific fields in its prometheus structure.

This is clear, and is a good approach, but I don't think that common is the right name for the structure. It seems, so far, you need only a subset of the structure for the vault, so I'm not sure if this effort in splitting the structure is worth. I suspect you will find the need to add more fields here and there from time to time, so at this stage I would not split the structure. Surely, I will not name it as common, since that makes me think at common fields, while so far there are only "basic" fields like logging counters.
@jesperpedersen what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prometheus_base is an alternative

/** @struct
* Defines the Main Prometheus metrics
*/
struct main_prometheus
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prometheus_base is an alternative

@jesperpedersen
Copy link
Collaborator

I think it looks good but prometheus_common or prometheus_base is probably better as a name

@ashu3103 ashu3103 force-pushed the prometheus-active branch from 67ecc0d to bb3253a Compare July 30, 2024 04:30
@jesperpedersen jesperpedersen merged commit bb3253a into agroal:master Jul 30, 2024
2 checks passed
@jesperpedersen
Copy link
Collaborator

Merged.

Thanks for your contribution !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants