Mattermost unstable for the past 2 weeks #401

Closed
bernardgut opened this issue May 2, 2023 · 3 comments

Comments

@bernardgut

Hello

We are running mattermost-team for a small team of 3 people in our local Kubernetes cluster and are very happy with it. However, in the past 2 weeks the instance has become very unstable, with many restarts and some downtime that sometimes lasts hours, at around the same time of day. After investigating, we found the following in the logs:

{"timestamp":"2023-05-02 08:33:35.174 Z","level":"error","msg":"plugin process exited","caller":"plugin/hclog_adapter.go:79","plugin_id":"com.mattermost.apps","wrapped_extras":"pathplugins/com.mattermost.apps/server/dist/plugin-linux-amd64pid87errorsignal: killed"}
{"timestamp":"2023-05-02 08:33:36.979 Z","level":"error","msg":"Failed to install prepackaged plugin","caller":"app/plugin.go:967","path":"/mattermost/prepackaged_plugins/mattermost-plugin-apps-v1.2.0-linux-amd64.tar.gz","error":"Failed to install extracted prepackaged plugin /mattermost/prepackaged_plugins/mattermost-plugin-apps-v1.2.0-linux-amd64.tar.gz: installExtractedPlugin: Unable to restart plugin on upgrade., unable to start plugin: com.mattermost.apps: timeout while waiting for plugin to start"}

Then the pod restarts. Sometimes it works; sometimes it fails on restart for a few more cycles. Each time it takes up to 5 minutes to restart, and it either fails or works (until it randomly fails again).
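
To check whether the crashes really cluster around the same time of day, the JSON log lines can be scanned for `plugin process exited` entries. The following is only a minimal sketch: the helper names `plugin_exits` and `exits_by_hour` are hypothetical, and it assumes every log line has the same shape as the two entries quoted above (one JSON object per line, with `timestamp`, `msg`, `plugin_id` and `wrapped_extras` fields).

```python
import json
import re
from collections import Counter

# Matches the "signal: killed" fragment seen in the wrapped_extras field above.
SIGNAL_RE = re.compile(r"signal: (\w+)")


def plugin_exits(lines):
    """Yield (timestamp, plugin_id, signal) for each 'plugin process exited' entry.

    Lines that are empty, not valid JSON, or not JSON objects are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(entry, dict):
            continue
        if entry.get("msg") != "plugin process exited":
            continue
        match = SIGNAL_RE.search(entry.get("wrapped_extras", ""))
        signal = match.group(1) if match else None
        yield entry.get("timestamp", ""), entry.get("plugin_id", ""), signal


def exits_by_hour(lines):
    """Count plugin exits per hour of day, based on the timestamp field."""
    counts = Counter()
    for timestamp, _plugin_id, _signal in plugin_exits(lines):
        # Assumed timestamp shape: "2023-05-02 08:33:35.174 Z"; characters 11-12 are the hour.
        counts[timestamp[11:13]] += 1
    return counts
```

Feeding it the pod's log lines (for example from a saved log file) gives a per-hour count. A spike in a single hour would support the "same time of day" observation, and the extracted signal shows how the plugin process died.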

Thank you.

@clouedoc

It looks like something is getting killed; why?

If it's because of an installation timeout, this might be caused by permission issues; #410 may be worth a look.
Otherwise, if you can find out which signal killed the process, that might be useful to you. Is it the OS killing the pod?
Maybe you are running into an OOM error.
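
One way to test the OOM theory is to look at the last termination state Kubernetes recorded for each container in the pod. This is a rough sketch, not a definitive diagnostic: `last_terminations` and `fetch_pod` are hypothetical helper names, the pod name and namespace are placeholders you would supply, and it assumes `kubectl` is installed and configured for the cluster.

```python
import json
import subprocess


def last_terminations(pod):
    """Return the last recorded termination of each container in a pod object.

    `pod` is the parsed JSON of a Pod, as printed by `kubectl get pod -o json`.
    A reason of "OOMKilled" (usually with exit code 137) points at the memory limit.
    """
    findings = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated")
        if not terminated:
            continue  # this container has no recorded previous termination
        findings.append(
            {
                "container": status.get("name"),
                "reason": terminated.get("reason"),
                "exit_code": terminated.get("exitCode"),
                "restarts": status.get("restartCount"),
            }
        )
    return findings


def fetch_pod(name, namespace):
    """Fetch a pod as a dict via kubectl (name and namespace are placeholders)."""
    result = subprocess.run(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `last_terminations(fetch_pod(...))` with your own pod name and namespace lists the recorded reasons. Note that this only reflects the container itself being terminated. If only a child process inside the container (such as a plugin) was killed, it may not show up here, and the node's kernel log would be the place to look.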

@bernardgut
Author

It was due to the hardware the pod was running on: the storage had issues. Thanks. You can close this.

@clouedoc

Close it yourself, I don't have access, but you do since you created the issue ;)
