Stat pulling issues from US-8-150W #45

Open
UntestedEngineer opened this issue Jun 3, 2022 · 31 comments

Comments

@UntestedEngineer

UntestedEngineer commented Jun 3, 2022

Until recently I was using an older version of these templates and scripts (version 1.0, I think), back when there was a single mca-dump-short.sh file and everything worked without issues. I upgraded to the current templates and shell scripts, but I now appear to have communication issues with one model: US-8-150W. I upgraded because I intend to move Zabbix into a K3s cluster and need the UNIFI_CHECK_TIMEOUT macro.

I have several APs and other US switches that have no issues pulling data. All of my Unifi devices run the latest firmware that is available (as of this posting). Controller is also running the latest official software available.

Zabbix version: 6.0.4 (Non-container)

I continuously observe the following message in zabbix_server.log related to both of my US-8-150W:

1236:20220603:123802.511 item "Basement AP Switch 2:mca-dump-short.sh["-d","{HOST.CONN}", "-u", "{$UNIFI_USER}", "-i", "{$UNIFI_SSH_PRIV_KEY_PATH}", "-t", "SWITCH", "-p", "{$UNIFI_SSHPASS_PASSWORD_PATH}", "-o", "{$UNIFI_CHECK_TIMEOUT}" ]" became not supported: Timeout while executing a shell script.
1263:20220603:123803.800 Failed to execute command "/usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' '100.99.252.11' '-u' 'admin' '-i' '/.ssh/zabbix/zb_id_rsa' '-t' 'SWITCH_FEATURE_DISCOVERY' '-p' '{$UNIFI_SSHPASS_PASSWORD_PATH}' '-o' '20'": Timeout while executing a shell script.

Under the discovery rules in the web front end for Unifi Switch I observe:

Preprocessing failed for: {"mcaDumpError":"Error", "reason":"kex_exchange_identification: read: Connection reset by peer." }

  1. Failed: cannot extract value from json by path "$[?(@.has_temperature=="true")]": cannot parse as a valid JSON object: invalid control character in string data at: '
    " }'

Preprocessing failed for: /usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 108: /usr/bin/expect: No such file or dir...

  1. Failed: cannot extract value from json by path "$.port_table": cannot parse as a valid JSON object: invalid object format, expected opening character '{' or '[' at: '/usr/lib/zabbix/externalscripts/mca-dump-short.sh: line 108: /usr/bin/expect: No such file or directory
    {"mcaDumpError":"Error", "reason":"time out wit

Preprocessing failed for: {"mcaDumpError":"Error", "reason":"kex_exchange_identification: read: Connection reset by peer." }

  1. Failed: cannot extract value from json by path "$[?(@.power=="true")]": cannot parse as a valid JSON object: invalid control character in string data at: '
    " }'

I have increased my ZABBIX_TIMEOUT to 10 seconds in the zabbix_server.conf file and also increased the macro variable for UNIFI_CHECK_TIMEOUT to 20 with no difference in results.

Other US switches I have with no issues:

  • US-16-XG
  • US-8-60W
  • US-XG-6PoE

Other UAPs I have with no issues:

  • UAP-nanoHD
  • UAP-HD

I have tried removing and re-adding the hosts, unlinking and removing the templates and then re-importing them, and re-copying the .sh files into the externalscripts directory.

I am sure the shell script has changed quite a bit since version 1.0 but only the US-8-150W devices are giving me these issues.

@patricegautier
Owner

I think the root of the issue is

Preprocessing failed for: {"mcaDumpError":"Error", "reason":"kex_exchange_identification: read: Connection reset by peer." }

What happens if you run directly from the zabbixServer:

/usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' '100.99.252.11' '-u' 'admin' '-i' '/.ssh/zabbix/zb_id_rsa' '-t' 'SWITCH_FEATURE_DISCOVERY' '-o' '20'

?

@UntestedEngineer
Author

I ran this command nearly a dozen+ times against both switches and the JSON string was returned every time.

sudo -u zabbix /usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' '100.99.252.11' '-u' 'admin' '-i' '/.ssh/zabbix/zb_id_rsa' '-t' 'SWITCH_FEATURE_DISCOVERY' '-o' '20'
[{"power":true,"total_power_consumed_key_name":"total_power_consumed","max_power_key_name":"max_power","max_power":140,"percent_power_consumed_key_name":"percent_power_consumed","has_eth1":false,"has_temperature":true,"temperature_key_name":"temperature","overheating_key_name":"overheating","has_fan":false,"fan_level_key_name":"fan_level"}]
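A note on telling these cases apart: the manual run above returns a JSON array, whereas the failing scheduled runs quoted earlier return either an error object containing "mcaDumpError" or raw shell error text. The sketch below is illustrative only and is not part of mca-dump-short.sh; the function name classify_output is hypothetical. It classifies a captured output string along those lines.

```shell
#!/bin/sh
# Hypothetical helper: classify one captured output string.
# Assumption: failures are reported as a JSON object that contains the
# key "mcaDumpError", as in the messages quoted in this thread.
classify_output() {
    case "$1" in
        *'"mcaDumpError"'*) echo "error" ;;     # error object from the script
        '['*|'{'*)          echo "data" ;;      # looks like a JSON payload
        *)                  echo "not-json" ;;  # e.g. a shell error message
    esac
}

classify_output '[{"power":true,"has_temperature":true}]'
classify_output '{"mcaDumpError":"Error", "reason":"timeout (143)" }'
classify_output '/usr/bin/expect: No such file or directory'
```

Running the same check on output captured under the zabbix user and on output captured by the server can show quickly whether the two really see the same result.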

@patricegautier
Owner

.. and you are running this straight from the zabbix server, no zabbix proxy involved?

@patricegautier
Owner

and you said this was not a containerized zabbix, but just double checking..

@UntestedEngineer
Author

UntestedEngineer commented Jun 4, 2022 via email

@patricegautier
Owner

So the mca-dump-short invocation you are issuing from the command line is, I think, exactly the same as the one the zabbix server issues when monitoring a device.. yet one fails.

I don't think it's the switch model, I have a US-8-150W and it works fine.

What's the exact value in {$UNIFI_CHECK_TIMEOUT} ?

@patricegautier
Owner

also what's the platform that the zabbix server is running on?

@UntestedEngineer
Author

UntestedEngineer commented Jun 4, 2022

NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Zabbix version: 6.0.4
PostgreSQL 14.3
Apache2

Running a virtual machine sitting on top of vCenter/ESXi 7.0.3e

image

image

1266:20220604:150801.736 Failed to execute command "/usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' '100.99.252.11' '-u' 'admin' '-i' '/.ssh/zabbix/zb_id_rsa' '-t' 'SWITCH' '-p' '{$UNIFI_SSHPASS_PASSWORD_PATH}' '-o' '20'": Timeout while executing a shell script.
1261:20220604:150802.009 Failed to execute command "/usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' '100.99.252.11' '-u' 'admin' '-i' '/.ssh/zabbix/zb_id_rsa' '-t' 'SWITCH_DISCOVERY' '-p' '{$UNIFI_SSHPASS_PASSWORD_PATH}' '-o' '20'": Timeout while executing a shell script.
1238:20220604:150802.576 item "Basement AP Switch 2:mca-dump-short.sh["-d","{HOST.CONN}", "-u", "{$UNIFI_USER}", "-i", "{$UNIFI_SSH_PRIV_KEY_PATH}", "-t", "SWITCH", "-p", "{$UNIFI_SSHPASS_PASSWORD_PATH}", "-o", "{$UNIFI_CHECK_TIMEOUT}" ]" became not supported: Timeout while executing a shell script.
1259:20220604:150803.005 Failed to execute command "/usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' '100.99.252.11' '-u' 'admin' '-i' '/.ssh/zabbix/zb_id_rsa' '-t' 'SWITCH_FEATURE_DISCOVERY' '-p' '{$UNIFI_SSHPASS_PASSWORD_PATH}' '-o' '20'": Timeout while executing a shell script.

@UntestedEngineer
Author

UntestedEngineer commented Jun 5, 2022

So I did some more investigation into the templates themselves, and for some reason the output from these specific switches fails the JSON preprocessing step:

Preprocessing failed for: {"mcaDumpError":"Error", "reason":"kex_exchange_identification: Connection closed by remote host." }

  1. Failed: cannot extract value from json by path "$[?(@.power=="true")]": cannot parse as a valid JSON object: invalid control character in string data at: '
    " }'

Preprocessing failed for: {"mcaDumpError":"Error", "reason":"kex_exchange_identification: Connection closed by remote host." }

  1. Failed: cannot extract value from json by path "$[?(@.has_temperature=="true")]": cannot parse as a valid JSON object: invalid control character in string data at: '
    " }'

If I remove the JSON Preprocessing from the discovery rule in the template I can pull the data:
Temperature Discovery:
JSONPath: $[?(@.has_temperature=="true")]

POE Discovery:
JSONPath: $[?(@.power=="true")]

The previous version of your templates did not have this JSON Preprocessing check. It's strange because all of the other switches with temperature and POE can read the JSON data. For the switches that do not have POE I disable that discovery and for the switches that do not have a FAN I disable that discovery rule.
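The "invalid control character in string data" message suggests that the "reason" string in the error object ends with a raw line break before the closing quote, which would make the object invalid JSON regardless of the JSONPath applied to it. A minimal sketch of building an error object that always parses, assuming a hypothetical helper json_error (this is not the code in mca-dump-short.sh):

```shell
#!/bin/sh
# Hypothetical helper: emit an error object whose "reason" is safe to
# embed in a JSON string.
json_error() {
    # Drop ASCII control characters (CR and LF included), then escape
    # backslashes and double quotes.
    reason=$(printf '%s' "$1" | tr -d '\000-\037' | sed 's/\\/\\\\/g; s/"/\\"/g')
    printf '{"mcaDumpError":"Error","reason":"%s"}\n' "$reason"
}

# An ssh error message with a trailing carriage return, as a stand-in
# for whatever control character ended up in the real message.
json_error "$(printf 'kex_exchange_identification: read: Connection reset by peer.\r\n')"
```

Note that the underlying problem in the failing runs is still the SSH error itself; a well-formed error object would only make the preprocessing message easier to read.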

@patricegautier
Owner

Please try the latest commit - it changes mca-dump-short to the initial code path when -o is not explicitly specified and changes the switch templates to not use -o. Let's see if that unblocks it

@UntestedEngineer
Author

UntestedEngineer commented Jun 5, 2022

Appear to be having the same issue:

image

@patricegautier
Owner

Ok one more try.. please update to the latest and set the {$UNIFI_VERBOSE_SSH} to "-vvv"

it should cause SSH to output a whole lot of debug info in /tmp/

Let's see if we can get a clue..

@UntestedEngineer
Author

Seems to be some type of key exchange issue? It's very strange because the Unifi controller has one SSH fingerprint for all of my devices, and the automated interval shell script works on every other Unifi AP and switch (except these two). I can manually invoke the switch discovery and switch feature discovery without issue:

OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f 31 Mar 2020
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug2: resolve_canonicalize: hostname 100.99.252.10 is address
debug2: ssh_connect_direct
debug1: Connecting to 100.99.252.10 [100.99.252.10] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 4999 ms remain after connect
debug1: identity file /.ssh/zabbix/zb_id_rsa type 0
debug1: identity file /.ssh/zabbix/zb_id_rsa-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.5
debug1: Remote protocol version 2.0, remote software version dropbear_2020.81
debug1: no match: dropbear_2020.81
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to 100.99.252.10:22 as 'admin'
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,ext-info-c
debug2: host key algorithms: ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,ssh-ed25519-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-rsa-cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com,ssh-ed25519,sk-ssh-ed25519@openssh.com,rsa-sha2-512,rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,kexguess2@matt.ucc.asn.au
debug2: host key algorithms: rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr
debug2: MACs ctos: hmac-sha1,hmac-sha2-256
debug2: MACs stoc: hmac-sha1,hmac-sha2-256
debug2: compression ctos: none
debug2: compression stoc: none
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: rsa-sha2-256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY


OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f 31 Mar 2020
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: include /etc/ssh/ssh_config.d/*.conf matched no files
debug1: /etc/ssh/ssh_config line 21: Applying options for *
debug2: resolve_canonicalize: hostname 100.99.252.11 is address
debug2: ssh_connect_direct
debug1: Connecting to 100.99.252.11 [100.99.252.11] port 22.
debug2: fd 3 setting O_NONBLOCK
debug1: fd 3 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 4999 ms remain after connect
debug1: identity file /.ssh/zabbix/zb_id_rsa type 0
debug1: identity file /.ssh/zabbix/zb_id_rsa-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.5
debug1: Remote protocol version 2.0, remote software version dropbear_2020.81
debug1: no match: dropbear_2020.81
debug2: fd 3 setting O_NONBLOCK
debug1: Authenticating to 100.99.252.11:22 as 'admin'
debug3: send packet: type 20
debug1: SSH2_MSG_KEXINIT sent
debug3: receive packet: type 20
debug1: SSH2_MSG_KEXINIT received
debug2: local client KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512,diffie-hellman-group14-sha256,ext-info-c
debug2: host key algorithms: ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,ssh-ed25519-cert-v01@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,rsa-sha2-512-cert-v01@openssh.com,rsa-sha2-256-cert-v01@openssh.com,ssh-rsa-cert-v01@openssh.com,ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com,ssh-ed25519,sk-ssh-ed25519@openssh.com,rsa-sha2-512,rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes192-ctr,aes256-ctr,aes128-gcm@openssh.com,aes256-gcm@openssh.com
debug2: MACs ctos: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: MACs stoc: umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-sha1
debug2: compression ctos: none,zlib@openssh.com,zlib
debug2: compression stoc: none,zlib@openssh.com,zlib
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug2: peer server KEXINIT proposal
debug2: KEX algorithms: curve25519-sha256,curve25519-sha256@libssh.org,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1,kexguess2@matt.ucc.asn.au
debug2: host key algorithms: rsa-sha2-256,ssh-rsa
debug2: ciphers ctos: chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr
debug2: ciphers stoc: chacha20-poly1305@openssh.com,aes128-ctr,aes256-ctr
debug2: MACs ctos: hmac-sha1,hmac-sha2-256
debug2: MACs stoc: hmac-sha1,hmac-sha2-256
debug2: compression ctos: none
debug2: compression stoc: none
debug2: languages ctos:
debug2: languages stoc:
debug2: first_kex_follows 0
debug2: reserved 0
debug1: kex: algorithm: curve25519-sha256
debug1: kex: host key algorithm: rsa-sha2-256
debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
debug3: send packet: type 30
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY

@patricegautier
Owner

patricegautier commented Jun 7, 2022 via email

@UntestedEngineer
Author

UntestedEngineer commented Jun 8, 2022 via email

@patricegautier
Owner

patricegautier commented Jun 8, 2022 via email

@patricegautier
Owner

I have since run into other RSA related issues.. You might want to try the latest which now invokes ssh with

-o PubkeyAcceptedKeyTypes=+ssh-rsa -o HostKeyAlgorithms=+ssh-rsa
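For a manual test with the same options, the command line can be assembled as below. This is only a sketch: build_ssh_cmd is a hypothetical helper that prints the command instead of connecting, and the key path, user, and address are the ones quoted earlier in this thread. OpenSSH 8.5 and later name the first option PubkeyAcceptedAlgorithms and keep the old name as an alias.

```shell
#!/bin/sh
# Hypothetical helper: print an ssh command line that re-enables ssh-rsa
# for both public key authentication and host key verification.
build_ssh_cmd() {
    key=$1
    user=$2
    host=$3
    printf 'ssh -i %s -o PubkeyAcceptedKeyTypes=+ssh-rsa -o HostKeyAlgorithms=+ssh-rsa %s@%s\n' \
        "$key" "$user" "$host"
}

build_ssh_cmd /.ssh/zabbix/zb_id_rsa admin 100.99.252.11
```

Adding -vvv to the printed command and running it as the zabbix user should produce the same kind of debug trace as the logs above.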

@kenbshinn

Hey all, I wanted to see if you guys had an update on this. I am seeing a similar issue on my end.

@patricegautier
Owner

@kenbshinn and this is with the latest version of mca-dump-short.sh? Can you run the same command zabbix issues from the command line from the zabbix server successfully?

@kenbshinn

I just saw there is a newer version of mca-dump-short.sh. Let me pull that copy and I will give it a shot.

@kenbshinn

I just tried it with the new version of mca-dump-short.sh and I am getting the same results as before.

I am just getting power statistics, but no port information. It is weird since I have a UDMP SE and that seems to work fine.

Let me know if there is anything else you want me to try.

@kenbshinn

So the command I am running from the zabbix server is:

sudo -u zabbix /usr/lib/zabbix/externalscripts/mca-dump-short.sh '-d' 'ip' '-u' 'user' '-i' '/zabbix/zabbix/zb_id_rsa' '-t' 'SWITCH_FEATURE_DISCOVERY'

And that comes back with the statistics I mentioned earlier about power, temp, etc.

I decided to change it from SWITCH_FEATURE_DISCOVERY to just SWITCH and I appear to be seeing port statistics in the readout, but when I tried to run SWITCH_FEATURE it appeared to time out.

I then realized I had taken out the -o for the timeout. After adding it back in and running the command a few times, I got the timeout message and also a few of these:
{ "reason":"Error remote invoking mca-dump-short: Could not create directory /var/lib/zabbix/.ssh (No such file or directory).Failed to add the host to the list of known hosts (/var/lib/zabbix/.ssh/known_hosts).", "time":"Wed Jan 11 06:55:22 PM UTC 2023", "device":"ip", "mcaDumpError":"Error" }

not sure if any of this helps

@patricegautier
Owner

patricegautier commented Jan 11, 2023 via email

@kenbshinn

I created the directory and that seems to be working fine.

I am still seeing:
{ "reason":"Error remote invoking mca-dump-short: ", "time":"Wed Jan 11 07:41:27 PM UTC 2023", "device":"192.168.1.50", "mcaDumpError":"Error" }
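The earlier "Could not create directory /var/lib/zabbix/.ssh" error means ssh had nowhere to store known_hosts for the zabbix user. A minimal sketch of creating that directory with the restrictive permissions ssh conventionally expects, assuming a hypothetical helper ensure_ssh_dir that takes the home directory as an argument so it can be tried on a scratch directory first:

```shell
#!/bin/sh
# Hypothetical helper: create HOME_DIR/.ssh with an empty known_hosts file.
ensure_ssh_dir() {
    home_dir=$1
    mkdir -p "$home_dir/.ssh" || return 1
    chmod 700 "$home_dir/.ssh" || return 1
    touch "$home_dir/.ssh/known_hosts" || return 1
    chmod 600 "$home_dir/.ssh/known_hosts"
}

# Try it on a scratch directory before touching the real home directory.
scratch=$(mktemp -d)
ensure_ssh_dir "$scratch" && ls -la "$scratch/.ssh"
rm -rf "$scratch"
```

On the server the argument would be /var/lib/zabbix, and the resulting directory has to be owned by the user that runs the external scripts (zabbix in this thread).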

@kenbshinn

Also I am using SSH key

@kenbshinn

kenbshinn commented Jan 11, 2023

@patricegautier Question: I am relatively new to all this, but from looking at the mca-dump-short.sh file I noticed that on line 404 you have the connection timeout hard-coded to 5. Does this supersede the macro set in Zabbix, or the -o option when I run the script manually?

@patricegautier
Owner

it does.. have you tried upping it to see if it makes a difference..

Also take a look in /tmp/mcaDumpShort.err please..

@kenbshinn

I tried upping that value to 10 or 20, no noticeable change.

Here is what I am seeing in /tmp/mcaDumpShort.err:


Wed Jan 11 07:51:59 PM UTC 2023 192.168.1.50
{ "reason":"Error remote invoking mca-dump-short: ", "time":"Wed Jan 11 07:51:59 PM UTC 2023", "device":"192.168.1.50", "mcaDumpError":"Error" }


Wed Jan 11 07:52:43 PM UTC 2023 192.168.1.50
{ "reason":"timeout (143)", "time":"Wed Jan 11 07:52:43 PM UTC 2023", "device":"192.168.1.50", "mcaDumpError":"Error" }

Now please forgive me if I am off base here, but when I run the command from my Zabbix server without -o I do not get the timeout message. With -o, roughly every 5th or 6th attempt starts timing out, and the timeouts last for 3 or 4 attempts.

What would happen, if we were to remove the -o timeout from being used on the devices I am having these issues on?
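On the "timeout (143)" entry: assuming the number in parentheses is a process exit status, 143 is 128 + 15, which is how a shell reports a process that was ended by SIGTERM (signal 15). That would be consistent with the command being terminated when the -o limit expired. A small sketch for decoding such a status, using a hypothetical helper describe_status:

```shell
#!/bin/sh
# Hypothetical helper: turn a process exit status into a short description.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Shells report "ended by signal N" as 128 + N.
        echo "ended by signal $((status - 128))"
    elif [ "$status" -eq 124 ]; then
        # Exit status used by GNU timeout(1) when the time limit is hit.
        echo "time limit reached"
    else
        echo "exited with status $status"
    fi
}

describe_status 143

# Demonstration: a child shell that sends SIGTERM to itself.
status=0
sh -c 'kill -TERM $$' || status=$?
describe_status "$status"
```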

@patricegautier
Owner

you would just get a timeout in zabbix.. Let's try this:

• in your zabbix server conf, usually /etc/zabbix/zabbix_server.conf, add:

Timeout=30

• and then in zabbix set a global macro under Administration > General > Macros:

{$UNIFI_CHECK_TIMEOUT} to 25

and let's see if that does it..
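The two values are related: the script's own limit ({$UNIFI_CHECK_TIMEOUT}, passed as -o) presumably has to stay below the server's Timeout, otherwise the server gives up on the external check first and logs "Timeout while executing a shell script". A small sketch for checking that relationship, assuming a hypothetical helper check_timeouts that reads the Timeout line from a config file and falls back to 3 seconds, the Zabbix default, when the line is absent:

```shell
#!/bin/sh
# Hypothetical helper: check_timeouts CONF_FILE CHECK_TIMEOUT
# Compares the script timeout with the server Timeout found in CONF_FILE.
check_timeouts() {
    conf=$1
    check=$2
    # Take the last uncommented "Timeout=N" line (a choice of this sketch).
    server=$(sed -n 's/^[[:space:]]*Timeout[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | tail -n 1)
    server=${server:-3}
    if [ "$check" -lt "$server" ]; then
        echo "ok: script timeout $check < server Timeout $server"
    else
        echo "warning: script timeout $check >= server Timeout $server"
        return 1
    fi
}

# Demonstration on a scratch config file.
conf=$(mktemp)
echo "Timeout=10" > "$conf"
check_timeouts "$conf" 20 || true   # the combination tried earlier in this thread
echo "Timeout=30" > "$conf"
check_timeouts "$conf" 25           # the combination suggested above
rm -f "$conf"
```

On a real server the first argument would be /etc/zabbix/zabbix_server.conf, and the server needs a restart after Timeout is changed.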

@kenbshinn

@patricegautier that seems to have worked. I am still getting an error in the latest data section, but I am not seeing any errors in the zabbix log anymore and the data appears to be populating.

Thank you for your help with this! I really appreciate it.

@patricegautier
Owner

Great - so what's the error in the data section?
