Troubleshooting
Liz Krznarich edited this page Nov 3, 2023 · 4 revisions
ROR services are very stable and rarely experience issues or downtime. When problems do occur, the table below lists common causes and the actions to take.
Issue | Common causes | Action(s) |
---|---|---|
ROR API is down | Elasticsearch at 100% CPU usage | Typically nothing; the app recovers on its own when traffic subsides. If it does not recover, force a restart by deploying to ror-api or (in an emergency) with an AWS CLI request such as `aws ecs update-service --force-new-deployment --cluster CLUSTER_NAME --service SERVICE_NAME`. If this happens repeatedly because of traffic from specific IPs, block those IPs by adding them to the `blacklist_ips_prod` variable in Terraform Cloud and triggering a manual Terraform run. |
ror-site won’t deploy | Dependency issues | Review the Actions log and check the dependencies pulled in during the Actions run. Trigger the deployment again if needed. |
No app logs from ECS containers (not an issue in itself, but it makes troubleshooting hard) | Nginx logs are not being forwarded (bug in Phusion Passenger: https://github.com/phusion/passenger-docker/issues/224) | SSH to the container (see below) and restart nginx-log-forwarder |
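
The emergency restart in the table above can be wrapped in a small helper so the command is assembled the same way every time. This is only a sketch: the function names `build_force_deploy_command` and `force_deploy` are hypothetical, `CLUSTER_NAME` and `SERVICE_NAME` remain placeholders for the real values, and it assumes the AWS CLI is installed and configured with suitable credentials. By default it only prints the command and does not run it.

```python
import shlex
import subprocess
from typing import List


def build_force_deploy_command(cluster: str, service: str) -> List[str]:
    """Return the AWS CLI argument list that forces a new ECS deployment."""
    if not cluster or not service:
        raise ValueError("cluster and service names are required")
    return [
        "aws", "ecs", "update-service",
        "--force-new-deployment",
        "--cluster", cluster,
        "--service", service,
    ]


def force_deploy(cluster: str, service: str, dry_run: bool = True) -> str:
    """Print the command; execute it only when dry_run is False."""
    command = build_force_deploy_command(cluster, service)
    printable = shlex.join(command)
    print(printable)
    if not dry_run:
        # Assumption: the AWS CLI is on PATH and credentials are configured.
        subprocess.run(command, check=True)
    return printable


if __name__ == "__main__":
    # Placeholders only; substitute the real cluster and service names.
    force_deploy("CLUSTER_NAME", "SERVICE_NAME")
```

Keeping `dry_run=True` as the default means the command can be reviewed before anything is restarted; pass `dry_run=False` only once the names are confirmed.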
If there are issues with the data release process, it's possible to delete and recreate the Elasticsearch index from a data dump:
- SSH to the running ECS container (see the Bastion host entry in 1Password).
- Run the setup command, passing the filename of the data dump you want to index (without the file extension). The file must exist in ror-data.
python manage.py setup v1.0-2022-03-17-ror-data
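
Because the setup command expects the dump name without its file extension, it is easy to pass the wrong argument. The sketch below shows one way to derive that argument from a filename. The helper names `dump_name` and `build_setup_command` are hypothetical, and the extension list is an assumption about how dumps are stored in ror-data; adjust it to match the actual files.

```python
from typing import List

# Assumed archive extensions; adjust to match the files actually in ror-data.
KNOWN_EXTENSIONS = (".zip", ".json")


def dump_name(filename: str) -> str:
    """Strip any directory part and one known extension from a dump filename."""
    name = filename.strip().rsplit("/", 1)[-1]
    for ext in KNOWN_EXTENSIONS:
        if name.endswith(ext):
            return name[: -len(ext)]
    # No known extension: the name is already in the form the command expects.
    # (A generic "strip the last suffix" would break here, since the version
    # prefix "v1.0" itself contains a dot.)
    return name


def build_setup_command(filename: str) -> List[str]:
    """Return the argument list for the setup management command."""
    return ["python", "manage.py", "setup", dump_name(filename)]


if __name__ == "__main__":
    print(" ".join(build_setup_command("v1.0-2022-03-17-ror-data.zip")))
```

Only extensions in the explicit list are removed, so a name that is already extension-free passes through unchanged.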