Cleanup failed systemd services during sync_workers
A systemd service or container deployment can fail and stop running
(e.g. CrashLoopBackOff) while the deployment/service still exists in
the runtime environment.

Add the ability for these failed services to be cleaned up during
MiqServer's sync_workers loop.
agrare committed Aug 11, 2020
1 parent 484cf2c commit 146df10
Showing 3 changed files with 71 additions and 2 deletions.
18 changes: 18 additions & 0 deletions app/models/miq_server/worker_management/monitor.rb
@@ -8,6 +8,7 @@ module MiqServer::WorkerManagement::Monitor
  include_concern 'Start'
  include_concern 'Status'
  include_concern 'Stop'
  include_concern 'Systemd'
  include_concern 'SystemLimits'
  include_concern 'Validation'

@@ -45,6 +46,8 @@ def worker_not_responding(w)
  end

  def sync_workers
    cleanup_failed_workers

    result = {}
    MiqWorkerType.worker_class_names.each do |class_name|
      begin
@@ -63,6 +66,21 @@ def sync_workers
    result
  end

  def cleanup_failed_workers
    if podified?
    elsif systemd?
      cleaup_failed_systemd_services
    end
  end

  def podified?
    MiqEnvironment::Command.is_podified?
  end

  def systemd?
    MiqEnvironment::Command.supports_systemd?
  end

  def clean_worker_records
    worker_deleted = false
    miq_workers.each do |w|
2 changes: 0 additions & 2 deletions app/models/miq_server/worker_management/monitor/quiesce.rb
@@ -36,8 +36,6 @@ def quiesce_workers_loop
    miq_workers.each do |w|
      if w.containerized_worker?
        w.delete_container_objects
      elsif w.systemd_worker?
        w.stop_systemd_worker
      else
        stop_worker(w)
      end
53 changes: 53 additions & 0 deletions app/models/miq_server/worker_management/monitor/systemd.rb
@@ -0,0 +1,53 @@
module MiqServer::WorkerManagement::Monitor::Systemd
  extend ActiveSupport::Concern

  def systemd_manager
    @systemd_manager ||= begin
      require "dbus/systemd"
      DBus::Systemd::Manager.new
    end
  end

  def cleaup_failed_systemd_services
    failed_service_names = systemd_failed_miq_services.map { |service| service[:name] }
    systemd_manager.DisableUnitFiles(failed_service_names, false)
  end

  def systemd_miq_service_base_names
    @systemd_miq_service_base_names ||= begin
      MiqWorkerType.worker_class_names.map(&:constantize).map(&:service_base_name)
    end
  end

  def systemd_failed_miq_services
    miq_services(systemd_failed_services)
  end

  def systemd_all_miq_services
    miq_services(systemd_services)
  end

  def miq_services(services)
    services.select { |unit| systemd_miq_service_base_names.include?(systemd_service_base_name(unit)) }
  end

  def systemd_service_name(unit)
    File.basename(unit[:name], ".*")
  end

  def systemd_service_base_name(unit)
    systemd_service_name(unit).split("@").first
  end

  def systemd_failed_services
    systemd_services.select { |service| service[:active_state] == "failed" }
  end

  def systemd_services
    systemd_units.select { |unit| File.extname(unit[:name]) == ".service" }
  end

  def systemd_units
    systemd_manager.units
  end
end
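
The selection logic in the new Systemd concern can be tried without D-Bus. The sketch below is a standalone illustration, not code from this commit: it applies the same three filters (unit is a `.service`, `active_state` is `"failed"`, base name before `@` belongs to a known worker) to hand-written unit hashes. The base names, unit names, and the `failed_miq_service_names` helper are hypothetical; only the `:name` and `:active_state` keys and the `File.basename`/`File.extname`/`split("@")` parsing mirror the diff.

```ruby
# Hypothetical worker base names. In the commit these come from
# MiqWorkerType.worker_class_names.map(&:constantize).map(&:service_base_name).
MIQ_SERVICE_BASE_NAMES = %w[manageiq-generic manageiq-ui].freeze

# Hand-written stand-ins for the unit hashes returned by systemd_manager.units.
SAMPLE_UNITS = [
  {:name => "manageiq-generic@68400a7e.service", :active_state => "failed"},
  {:name => "manageiq-ui@cfe2c489.service",      :active_state => "active"},
  {:name => "sshd.service",                      :active_state => "failed"},
  {:name => "manageiq.slice",                    :active_state => "active"}
].freeze

# "manageiq-generic@68400a7e.service" -> "manageiq-generic@68400a7e"
def service_name(unit)
  File.basename(unit[:name], ".*")
end

# "manageiq-generic@68400a7e" -> "manageiq-generic"
def service_base_name(unit)
  service_name(unit).split("@").first
end

# Keep only failed .service units whose base name is a known worker,
# and return their full unit names.
def failed_miq_service_names(units, base_names)
  units
    .select { |unit| File.extname(unit[:name]) == ".service" }
    .select { |unit| unit[:active_state] == "failed" }
    .select { |unit| base_names.include?(service_base_name(unit)) }
    .map { |unit| unit[:name] }
end

puts failed_miq_service_names(SAMPLE_UNITS, MIQ_SERVICE_BASE_NAMES).inspect
```

With this sample data only the failed `manageiq-generic@...` instance is selected: the failed `sshd.service` is skipped because its base name is not a worker base name, and the `.slice` unit is skipped because it is not a service. In the commit, the resulting names are what gets passed to `DisableUnitFiles`.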
