Action "slavelag" for master-slave replication (legacy) #156

Open: heuri wants to merge 1 commit into base: master

heuri commented May 28, 2015

Added the action "slavelag" to determine how far a slave lags behind its master in a master-slave replication setup (legacy).

Currently the command rs.printSlaveReplicationInfo() is not available through pymongo, so we determine the lag from the output of the serverStatus command, issued as ('serverStatus', 1), ('repl', 2).
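
As a rough illustration of the lookup described above, the sketch below reads the lag from a serverStatus-style document. The function name `slave_lag` and the sample document are made up for this example; the key names (`repl`, `sources`, `lagSeconds`, `host`) are the ones the check accesses, and the input is assumed to be the dict that pymongo returns for the command.

```python
def slave_lag(server_status):
    """Return (lag_seconds, master_host) from a serverStatus document.

    `server_status` is assumed to be the dict returned by a call such as
    con.admin.command(SON([('serverStatus', 1), ('repl', 2)])).
    Raises KeyError or IndexError if the node reports no replication source.
    """
    source = server_status['repl']['sources'][0]
    return int(source['lagSeconds']), source['host']


# Hand-written sample with only the fields the check reads (not real server output).
sample = {'repl': {'sources': [{'host': 'db-master:27017', 'lagSeconds': 12}]}}

lag, master = slave_lag(sample)
print("%d secs behind '%s'" % (lag, master))
```

Keeping the parsing separate from the database call makes it possible to exercise the logic without a running mongod.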


vkhatri commented May 30, 2015

@heuri thank you for adding it to the plugin. How about adding perf data to it? This check throws an exception if lagSeconds is missing; perhaps print the node type (primary, arbiter, etc.) in the message instead?

patch:

--- check_mongodb.py    2015-05-30 05:52:17.603794569 +0000
+++ check_mongodb.py.me 2015-05-30 05:47:36.707526065 +0000
@@ -253,7 +253,7 @@
     elif action == "replset_quorum":
         return check_replset_quorum(con, perf_data)
     elif action == "slavelag":
-        return check_slavelag(con, warning, critical)
+        return check_slavelag(con, warning, critical, perf_data)
     else:
         return check_connect(host, port, warning, critical, perf_data, user, passwd, conn_time)

@@ -1237,7 +1237,7 @@
         primary_status = 1
     return check_levels(primary_status, warning, critical, message)

-def check_slavelag(con, warning, critical):
+def check_slavelag(con, warning, critical, perf_data):
     warning = warning or 30
     critical = critical or 60

@@ -1251,11 +1251,12 @@
         try:
           seconds = int(data['repl']['sources'][0]['lagSeconds'])
           master = data['repl']['sources'][0]['host']
+          message = "Master Slave Replication " + str(seconds) + " secs behind '" + master + "' (master)."
         except (KeyError, IndexError):
-          raise Exception("Missing lagSeconds-Value, not running with --slave?")
-
-        message = "Master Slave Replication " + str(seconds) + " secs behind '" + master + "' (master)."
+          seconds = 0
+          message = "Missing lagSeconds-Value, not running with --slave or connected to a master/primary node"

+        message += performance_data(perf_data, [(seconds, "slavelag", warning, critical)])
         return check_levels(seconds, warning, critical, message)

     except Exception, e:
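
The behaviour the patch aims for can be sketched in isolation as follows. This is not the plugin's code: `perf_data` and `level` are hypothetical stand-ins for the plugin's `performance_data` and `check_levels` helpers, whose real implementations are not shown in this thread, and the function takes the serverStatus dict directly instead of a connection.

```python
def perf_data(value, label, warning, critical):
    # Hypothetical stand-in: Nagios-style perfdata "label=value;warn;crit".
    return " |%s=%s;%s;%s" % (label, value, warning, critical)


def level(value, warning, critical):
    # Hypothetical stand-in: map a value to a Nagios exit code.
    if value >= critical:
        return 2  # CRITICAL
    if value >= warning:
        return 1  # WARNING
    return 0      # OK


def check_slavelag(data, warning=None, critical=None, with_perf=False):
    warning = warning or 30
    critical = critical or 60
    try:
        source = data['repl']['sources'][0]
        seconds = int(source['lagSeconds'])
        message = "Master Slave Replication %d secs behind '%s' (master)." % (
            seconds, source['host'])
    except (KeyError, IndexError):
        # No lag reported: fall back to 0 instead of raising.
        seconds = 0
        message = ("Missing lagSeconds-Value, not running with --slave "
                   "or connected to a master/primary node")
    if with_perf:
        message += perf_data(seconds, "slavelag", warning, critical)
    return level(seconds, warning, critical), message


print(check_slavelag({'repl': {'sources': [{'host': 'm', 'lagSeconds': 45}]}},
                     with_perf=True))
```

One consequence of the fallback is that a node without a lagSeconds value is reported as OK with a lag of 0, so the message text is the only hint that no lag was measured.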


heuri commented Jun 2, 2015

@vkhatri great idea, works like a charm. Yes, we could determine the node status by checking con.admin.command("replSetGetStatus"), but then we would also need a version switch between >= 2.0 and < 2.0.
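
A minimal sketch of what such a lookup might involve, under stated assumptions: the `myState` codes below are the documented replica set member states, while `version_tuple` and `node_type` are hypothetical helper names. The thread does not say what differs between versions, so the version comparison is shown only as the building block a switch would need. The version string is assumed to come from something like `con.server_info()['version']`.

```python
# Subset of replica set member states as reported in replSetGetStatus "myState".
REPLSET_STATES = {
    0: 'STARTUP',
    1: 'PRIMARY',
    2: 'SECONDARY',
    3: 'RECOVERING',
    7: 'ARBITER',
    8: 'DOWN',
}


def version_tuple(version):
    # "2.0.4" -> (2, 0); only major and minor are compared here.
    return tuple(int(part) for part in version.split('.')[:2])


def node_type(status):
    # `status` is assumed to be the dict returned by
    # con.admin.command("replSetGetStatus").
    return REPLSET_STATES.get(status.get('myState'), 'UNKNOWN')


print(node_type({'myState': 1}))
print(version_tuple('1.8.5') >= (2, 0))
```

Note that replSetGetStatus only succeeds on nodes started as replica set members, so a legacy master-slave node would need its own error handling around the call.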
