Troubleshooting

ROR services are very stable and rarely experience issues or downtime.

Common issues

Issue	Common causes	Action(s)
ROR API is down	Elasticsearch at 100% CPU usage	Typically nothing; the app recovers on its own when traffic subsides. If it does not recover, force a restart by deploying to ror-api or (in an emergency) with an AWS CLI request such as aws ecs update-service --force-new-deployment --cluster CLUSTER_NAME --service SERVICE_NAME. If this happens repeatedly because of traffic from specific IPs, block them by adding them to the blacklist_ips_prod variable in Terraform Cloud and triggering a manual Terraform run.
ror-site won’t deploy	Dependency issues	Review the Actions log and check the dependencies pulled in during the Actions run. Trigger the deployment again if needed.
No app logs from ECS containers (not an issue in itself, but it makes troubleshooting hard)	Nginx logs are not being forwarded (bug in Phusion Passenger: https://github.com/phusion/passenger-docker/issues/224)	SSH to the container (see below) and restart nginx-log-forwarder.
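
The emergency restart from the first row can be wrapped in a small helper so the command is reviewed before it runs. This is a minimal sketch, not an existing ROR script: the function name force_redeploy and the RUN switch are hypothetical, and CLUSTER_NAME and SERVICE_NAME are the same placeholders used in the table. The follow-up aws ecs wait services-stable call is an optional addition that blocks until the service settles.

```shell
# Sketch only: force_redeploy and RUN are illustrative names, not part of ror-api.
# By default the command is printed (dry run); set RUN=1 to execute it.
force_redeploy() {
  cluster="$1"
  service="$2"
  if [ "${RUN:-0}" = "1" ]; then
    # Start a fresh deployment of the current task definition,
    # then wait until ECS reports the service as stable again.
    aws ecs update-service --force-new-deployment \
      --cluster "$cluster" --service "$service" &&
      aws ecs wait services-stable --cluster "$cluster" --services "$service"
  else
    echo "aws ecs update-service --force-new-deployment --cluster $cluster --service $service"
  fi
}

# Dry run with the placeholder names from the table.
force_redeploy CLUSTER_NAME SERVICE_NAME
```

Once the printed command looks right, the same call with RUN=1 set in the environment performs the restart.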

If the data release process fails, you can delete and recreate the Elasticsearch index from a data dump.

SSH to the running ECS container (see the Bastion host entry in 1Password).
Run the setup command, passing the filename of the data dump you want to index (without the file extension). The file must exist in ror-data.
```
python manage.py setup v1.0-2022-03-17-ror-data
```
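
Because the setup command expects the dump name without its extension, a small wrapper can strip the extension before calling it. This is a rough sketch under assumptions: the helper names dump_name and reindex and the RUN switch are hypothetical, and the .zip extension is assumed for illustration only.

```shell
# Sketch only: dump_name, reindex and RUN are illustrative names.
# Strip an assumed ".zip" suffix; names without it pass through unchanged.
dump_name() {
  printf '%s\n' "${1%.zip}"
}

# Print the setup command by default (dry run); set RUN=1 to execute it
# from the directory that contains manage.py.
reindex() {
  name="$(dump_name "$1")"
  if [ "${RUN:-0}" = "1" ]; then
    python manage.py setup "$name"
  else
    echo "python manage.py setup $name"
  fi
}

reindex v1.0-2022-03-17-ror-data.zip
```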