No performance data in data_bin after centreon-purge cron since new restart of centreon-broker #1874

Open
1 of 2 tasks
rmorandell-pgum opened this issue Nov 18, 2024 · 3 comments

@rmorandell-pgum

Versions

centreon-plugins-selinux-0.0.8-083704.el8.x86_64
centreon-clib-23.10.11-1.el8.x86_64
centreon-widget-live-top10-memory-usage-23.10.0-1.el8.noarch
centreon-gorgone-23.10.10-1.el8.noarch
centreon-23.10.17-1.el8.noarch
centreon-broker-cbd-23.10.11-1.el8.x86_64
centreon-database-23.10.17-1.el8.noarch
centreon-plugin-Applications-Monitoring-Centreon-Poller-20241010-130148.el8.noarch
centreon-license-manager-23.10.2-1.el8.noarch
centreon-widget-engine-status-23.10.0-1.el8.noarch
centreon-plugin-Applications-Monitoring-Centreon-Map4-Jmx-20241010-130148.el8.noarch
centreon-web-23.10.17-1.el8.noarch
centreon-engine-selinux-23.10.11-1.el8.x86_64
centreon-widget-single-metric-23.10.0-1.el8.noarch
centreon-plugin-Applications-Protocol-Ldap-20241010-130148.el8.noarch
centreon-auto-discovery-server-23.10.4-1.el8.noarch
centreon-it-edition-extensions-23.10.4-1.el8.noarch
centreon-widget-grid-map-23.10.0-1.el8.noarch
centreon-plugin-Operatingsystems-Windows-Snmp-20241010-130148.el8.noarch
centreon-pp-manager-23.10.3-1.el8.noarch
centreon-plugin-Virtualization-VMWare-daemon-3.3.2-1.el8.noarch
centreon-common-23.10.17-1.el8.noarch
centreon-widget-hostgroup-monitoring-23.10.0-1.el8.noarch
centreon-widget-live-top10-cpu-usage-23.10.0-1.el8.noarch
centreon-trap-23.10.17-1.el8.noarch
centreon-plugin-Operatingsystems-Linux-Snmp-20241010-130148.el8.noarch
centreon-connector-ssh-23.10.11-1.el8.x86_64
centreon-central-23.10.17-1.el8.noarch
centreon-web-selinux-23.10.17-1.el8.noarch
centreon-broker-selinux-23.10.11-1.el8.x86_64
centreon-license-manager-common-23.10.2-1.el8.noarch
centreon-engine-daemon-23.10.11-1.el8.x86_64
centreon-widget-service-monitoring-23.10.0-1.el8.noarch
centreon-widget-tactical-overview-23.10.0-1.el8.noarch
centreon-plugin-Applications-Protocol-Http-20241010-130148.el8.noarch
centreon-plugin-Applications-Protocol-Dns-20241010-130148.el8.noarch
centreon-connector-perl-23.10.11-1.el8.x86_64
centreon-widget-host-monitoring-23.10.1-1.el8.noarch
centreon-broker-23.10.11-1.el8.x86_64
centreon-widget-httploader-23.10.0-1.el8.noarch
centreon-plugin-Applications-Monitoring-Centreon-Database-20241010-130148.el8.noarch
centreon-plugin-Applications-Databases-Mysql-20241010-130148.el8.noarch
centreon-gorgone-centreon-config-23.10.10-1.el8.noarch
centreon-poller-23.10.17-1.el8.noarch
centreon-gorgoned-selinux-23.10.10-1.el8.noarch
centreon-common-selinux-23.10.17-1.el8.noarch
centreon-perl-libs-23.10.17-1.el8.noarch
centreon-widget-servicegroup-monitoring-23.10.0-1.el8.noarch
centreon-plugin-Network-Cisco-Standard-Snmp-20241010-130148.el8.noarch
centreon-connector-23.10.11-1.el8.x86_64
centreon-widget-global-health-23.10.0-1.el8.noarch
centreon-plugin-Applications-Monitoring-Centreon-Central-20241010-130148.el8.noarch
centreon-broker-core-23.10.11-1.el8.x86_64
centreon-plugin-Hardware-Printers-Generic-Snmp-20241010-130148.el8.noarch
centreon-engine-23.10.11-1.el8.x86_64
centreon-plugin-Applications-Protocol-Ftp-20241010-130148.el8.noarch
centreon-broker-cbmod-23.10.11-1.el8.x86_64
centreon-widget-graph-monitoring-23.10.0-1.el8.noarch
centreon-plugin-Hardware-Ups-Standard-Rfc1628-Snmp-20241010-130148.el8.noarch

Operating System

Oracle Linux 8

How the component has been installed and versions

  • From sources, from packages
  • components versions

Version: --

Additional environment details (AWS, VirtualBox, physical, etc.):

Description

For a few days now, centreon-broker has been hitting a connection problem right after the data purge cron, which runs at 2 am, and it logs an error message every 15 seconds.
From this point on, no more performance data is written to data_bin or to the logs table.
All other data is written without problems.
A simple restart of the broker solves the problem.
I have also tried deleting the oldest partition of data_bin via SQL, with the same result: within a few seconds I get the same problem in the broker.

Logs

/var/log/centreon/centreon-purge.log

[Fri, 15 Nov 24 02:00:01 +0100] PURGE STARTED
[Fri, 15 Nov 24 02:00:01 +0100] Purging table data_bin...
[Fri, 15 Nov 24 02:00:01 +0100] Partition will be delete p20240518
[Fri, 15 Nov 24 02:00:01 +0100] Partitions deleted
[Fri, 15 Nov 24 02:00:01 +0100] Table data_bin purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging table logs...
[Fri, 15 Nov 24 02:00:01 +0100] Table logs purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging table log_archive_host...
[Fri, 15 Nov 24 02:00:01 +0100] Table log_archive_host purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging table log_archive_service...
[Fri, 15 Nov 24 02:00:01 +0100] Table log_archive_service purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging table comments...
[Fri, 15 Nov 24 02:00:01 +0100] Table comments purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging table downtimes...
[Fri, 15 Nov 24 02:00:01 +0100] Table downtimes purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging table log_action...
[Fri, 15 Nov 24 02:00:01 +0100] Table log_action purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging index_data...
[Fri, 15 Nov 24 02:00:01 +0100] index_data purged
[Fri, 15 Nov 24 02:00:01 +0100] Purging log_action_modification...
[Fri, 15 Nov 24 02:00:01 +0100] log_action_modification purged
[Fri, 15 Nov 24 02:00:01 +0100] PURGE COMPLETED
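
To line the purge up against the broker errors, it helps to pull the dropped partition and its timestamp out of the purge log. The following is a minimal sketch, assuming the log keeps the exact line format shown above; the function name `dropped_partitions` is hypothetical and not part of Centreon.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Fri, 15 Nov 24 02:00:01 +0100] Partition will be delete p20240518
PURGE_LINE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<msg>.*)$")
PARTITION_MSG = re.compile(r"^Partition will be delete (?P<name>\S+)$")


def dropped_partitions(lines):
    """Return (timestamp, partition_name) for each partition the purge reports dropping."""
    found = []
    for line in lines:
        m = PURGE_LINE.match(line.strip())
        if not m:
            continue  # not a timestamped purge log line
        p = PARTITION_MSG.match(m.group("msg"))
        if p:
            # Timestamp format assumed from the log excerpt above.
            ts = datetime.strptime(m.group("ts"), "%a, %d %b %y %H:%M:%S %z")
            found.append((ts, p.group("name")))
    return found


sample = [
    "[Fri, 15 Nov 24 02:00:01 +0100] PURGE STARTED",
    "[Fri, 15 Nov 24 02:00:01 +0100] Purging table data_bin...",
    "[Fri, 15 Nov 24 02:00:01 +0100] Partition will be delete p20240518",
    "[Fri, 15 Nov 24 02:00:01 +0100] Partitions deleted",
]

for ts, name in dropped_partitions(sample):
    print(ts.isoformat(), name)
```

In this excerpt the only partition dropped is p20240518, at 02:00:01, a few seconds before the first broker error.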

central-broker-master.log

[2024-11-14T15:52:10.557+01:00] [core] [info] New incoming connection 'central-broker-master-input-2'
[2024-11-14T15:52:10.558+01:00] [core] [info] multiplexing: 'central-broker-master-input-2' starts with 0 in queue and the queue file is disable
[2024-11-14T15:52:10.560+01:00] [core] [info] Available connections cleaning: Operation canceled
[2024-11-14T15:52:10.560+01:00] [core] [warning] Poller '' with id 0 already known as connected. Replacing it with ''
[2024-11-14T15:52:10.560+01:00] [core] [info] New incoming connection 'central-broker-master-input-3'
[2024-11-14T15:52:10.560+01:00] [core] [info] multiplexing: 'central-broker-master-input-3' starts with 0 in queue and the queue file is disable
[2024-11-15T02:00:08.584+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:00:08.584+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:00:23.572+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:00:23.573+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:00:38.575+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:00:38.576+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:00:53.572+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:00:53.573+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:01:08.572+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:01:08.573+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:01:23.573+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:01:23.574+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:01:38.574+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:01:38.575+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:01:53.573+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:01:53.575+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
[2024-11-15T02:02:08.574+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0
[2024-11-15T02:02:08.575+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0
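
The 15-second rhythm of the errors can be checked directly from the broker log. This is a rough sketch, assuming the timestamp and message format shown in the excerpt above; `error_intervals` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Only the "connection fail to execute statement" lines are counted,
# so each failure is measured once rather than twice.
ERROR_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[sql\] \[error\] connection fail to execute statement"
)


def error_intervals(lines):
    """Return the seconds elapsed between consecutive failed-statement errors."""
    stamps = []
    for line in lines:
        m = ERROR_LINE.match(line.strip())
        if m:
            stamps.append(datetime.fromisoformat(m.group("ts")))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


sample = [
    "[2024-11-15T02:00:08.584+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0",
    "[2024-11-15T02:00:08.584+01:00] [sql] [error] mysql_connection 0x7f914c020b00 attempts 2: error:  errno=0",
    "[2024-11-15T02:00:23.572+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0",
    "[2024-11-15T02:00:38.575+01:00] [sql] [error] connection fail to execute statement 0x7f914c020b00: error:  errno=0",
]

print(error_intervals(sample))
```

Note that every error line carries an empty error text and errno=0, so the log on its own does not say why the statement failed.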

The broker configuration is standard:

[image: screenshot of the broker configuration]

My workaround is to restart the broker daemon after the purge cron at 02:00 and the partitioning cron at 04:00.

What could be the problem here?

Thanks for your help.

@rmorandell-pgum
Author

I found the problem: the latest MariaDB update, 10.5.27, causes it. After a downgrade, everything works again.
Same issue on another system: it works with 10.5.26 but not with 10.5.27.
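
For anyone who wants to check a server quickly, here is a small sketch that compares a version string, as returned by `SELECT VERSION()`, against the versions reported in this thread. It assumes only what is stated above (10.5.27 affected, 10.5.26 fine); other MariaDB branches may or may not be affected, and the names `KNOWN_BAD` and `is_reported_affected` are hypothetical.

```python
import re

# Assumption: only the version reported as broken in this thread is listed.
# Extend this set if other releases turn out to behave the same way.
KNOWN_BAD = {(10, 5, 27)}


def parse_mariadb_version(version_string):
    """Extract (major, minor, patch) from a string like '10.5.27-MariaDB'."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if not m:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return tuple(int(part) for part in m.groups())


def is_reported_affected(version_string):
    """True if the version matches one reported as problematic in this thread."""
    return parse_mariadb_version(version_string) in KNOWN_BAD


for candidate in ("10.5.26-MariaDB", "10.5.27-MariaDB"):
    print(candidate, is_reported_affected(candidate))
```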

@BenoitPoulet
Contributor

@rmorandell-pgum I got the same problem on RH 9 with the same version of MariaDB.
Thank you for the report.

@bouda1
Collaborator

bouda1 commented Nov 22, 2024

Hi,

The latest MariaDB version contains a bug, or at least a big change in the way it handles errors.

When mysql_stmt_execute() fails, it returns a non-zero integer.
To find out what the error was, we call mysql_stmt_errno().

But with the latest version, this function returns 0, which means "no error". This causes trouble in the broker.
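
The mismatch described above can be modelled in a few lines. This is only an illustration of the logic, written in Python rather than against the C client API, and it is not the broker's actual fix: `classify_execute` and `ExecuteOutcome` are hypothetical names, and the two integers stand in for the values returned by mysql_stmt_execute() and mysql_stmt_errno().

```python
from dataclasses import dataclass


@dataclass
class ExecuteOutcome:
    failed: bool             # the execute call itself reported failure
    errno: int               # error code reported afterwards
    errno_is_reliable: bool  # False when failure and errno disagree


def classify_execute(return_code: int, stmt_errno: int) -> ExecuteOutcome:
    """Treat the return code as authoritative; flag errno=0 on a failed call."""
    failed = return_code != 0
    # A failed call with errno 0 is the contradictory case described above:
    # the statement did not run, yet the error code says "no error".
    contradictory = failed and stmt_errno == 0
    return ExecuteOutcome(failed=failed, errno=stmt_errno,
                          errno_is_reliable=not contradictory)


# Failure with a meaningful error code (2006 is "server has gone away").
print(classify_execute(1, 2006))
# Failure with errno 0, as seen in the broker log in this issue.
print(classify_execute(1, 0))
```

The point of the sketch is that code which decides how to react only from the error code has nothing to act on when that code is 0, which matches the endless retries with errno=0 in the log.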

New versions with a workaround should be published very soon for 23.04, 23.10, 24.04 and 24.10.
Thanks.
