Troubleshooting ScholarsArchive Breakage
Ack! Looks like something is wrong with ScholarsArchive@OSU! What do we do?
- Keep calm.
There are a few common issues that can arise in SA@OSU, and the solution is often simpler than it first looks.
- First off, what is the issue?
Is it failing on ingest? Is it not responding? Or is it something else?
If SA@OSU is failing on ingest, check the logs. `tail -f production.log` is a good way to follow a log while you watch what happens. Next, try to ingest a work. You'll most likely be met with an Ack! message, and that's fine. Follow the logs and look for any 500 errors that occur.
If you see an `LDP::Conflict`, that is a good sign: it usually means the NOID minter is handing out an identifier that already exists in Fedora, which the next steps fix.
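One hedged way to watch for those errors live (the log path assumes a standard Rails layout under the app's release directory; adjust to your deploy):

```sh
# Follow the Rails log and surface server errors as they scroll by.
# The path is an assumption; substitute your deploy's actual log location.
tail -f /opt/scholars-archive/current/log/production.log \
  | grep --line-buffered -E "500|Conflict"
```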
- Fire up the rails console
- Run `[::Noid::Rails::Service.new.minter.mint, ::Noid::Rails::Service.new.minter.mint]`
- Try to ingest a work again (a sketch of the console session follows below).
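A minimal sketch of that console session, assuming a Capistrano-style deploy (the `cd` path is a placeholder; adjust to wherever the app actually lives):

```sh
# Open a production Rails console; the path below is an assumption.
cd /opt/scholars-archive/current
RAILS_ENV=production bundle exec rails console

# Inside the console, mint a couple of NOIDs to advance the minter's state
# past the identifier that is raising the LDP::Conflict, then exit:
#   [::Noid::Rails::Service.new.minter.mint, ::Noid::Rails::Service.new.minter.mint]
```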
Last resort, if nothing else is working:
- Kill sidekiq and puma, and either let monit restart them or restart them yourself via the shared/app.sh or shared/sidekiq.sh scripts (a sketch follows below).
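A rough sketch of that last resort (the process patterns and script paths mirror the ones named above, but are assumptions about this particular host):

```sh
# Kill the workers and the web server; monit should respawn them on its own.
pkill -f sidekiq
pkill -f puma

# Or bring them back by hand with the deploy's shared scripts
# (assumed to live under the Capistrano shared directory):
# bash /opt/scholars-archive/shared/app.sh
# bash /opt/scholars-archive/shared/sidekiq.sh
```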
Look to see if there are multiple puma processes running on the production server; you can do this by running `ps -ef | grep puma`. If there are multiple processes running and ScholarsArchive intermittently goes down, try restarting monit, which manages the puma processes: `sudo /usr/bin/monit restart all`. Let the puma processes be killed and a new one spin up.
If this happens, make sure tomcat is running as well with `ps -ef | grep tomcat`. If no results come back, restart tomcat, because that's what contains Fedora: run `sudo /sbin/service tomcat restart` and let tomcat spin up.
This should resolve any memory issues caused by too many puma processes being spun up and killed.
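Those checks, collected into one runnable sequence (the commands are the ones given above; the `[p]uma` bracket trick just keeps grep from matching its own process):

```sh
# Look for duplicate puma processes and confirm tomcat (which hosts Fedora).
ps -ef | grep '[p]uma'
ps -ef | grep '[t]omcat'

# If puma looks wrong, restart everything monit manages:
sudo /usr/bin/monit restart all

# If tomcat isn't running, bring it back up:
sudo /sbin/service tomcat restart
```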
NOTE: THIS CAN HAPPEN AFTER A MANUAL DEPLOY, SO MAKE SURE YOU RESTART MONIT AFTER A DEPLOY TO MAKE SURE IT'S RUNNING CLEAN.
- For web server errors: parse puma.log and puma.err.log.
- For app errors: parse production.log (see the examples below).
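For example (the log locations are assumed to sit under the deploy's shared/log directory; adjust to your layout):

```sh
# Recent web-server errors (assumed path):
tail -n 100 /opt/scholars-archive/shared/log/puma.err.log

# Recent app-level failures; Rails logs these as "Completed 500":
grep "Completed 500" /opt/scholars-archive/shared/log/production.log | tail -n 20
```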