Cohort-middleware provides a set of web-services (endpoints) for:
- providing information about cohorts to which a user has authorized access (Atlas DB cohorts as defined in Fence/Arborist?)
- getting clinical attribute values for a given cohort (aka CONCEPT values in Atlas/OMOP jargon)
- providing patient-level clinical attribute values matrix for use in backend workflows, like GWAS workflows (e.g. https://github.com/uc-cdis/vadc-genesis-cwl)
The cohorts and their clinical attribute values are retrieved from connected OHDSI/CDM/Atlas databases via SQL queries.
OpenAPI documentation available here.
The YAML file for the OpenAPI documentation is found in the openapis folder.
Overview of cohort-middleware and its connected systems:
Execute the following command to get help:
go run main.go -h
To just start with the default "development" settings:
go run main.go
See the example config file in the ./config/ folder.
The code currently assumes that the data it queries lives in 2 separate databases: the "atlas" schema on one database, and the "results" and "cdm" schemas together on another. In practice, these databases could even come from different vendors/engines (e.g. one "sql server" and one "postgres"). For that reason, the code contains no queries that join tables in "atlas" directly with tables in "results" or "cdm".
Below is an overview of the schemas and respective tables.
DB Instance1:
===============================
SCHEMA atlas
===============================
TABLE atlas.source
TABLE atlas.source_daimon
TABLE atlas.cohort_definition
DB Instance2:
===============================
SCHEMA results
===============================
TABLE results.COHORT
===============================
SCHEMA omop
===============================
TABLE omop.person
TABLE omop.observation
TABLE omop.concept
VIEW omop.observation_continuous
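Because the "atlas" schema and the "results"/"omop" schemas may sit on different database engines, anything that needs data from both has to be combined in application code rather than in a single SQL join. The sketch below illustrates that pattern in Python, using two in-memory SQLite databases as stand-ins for the two instances. The column names (id, name, cohort_definition_id, subject_id), the sample rows, and the helper cohort_sizes_with_names are assumptions made for illustration; they are not the actual cohort-middleware schema or code.

```python
import sqlite3


def make_atlas_db():
    """Stand-in for DB Instance1 (the "atlas" schema). Columns are assumed."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cohort_definition (id INTEGER PRIMARY KEY, name TEXT)")
    db.executemany(
        "INSERT INTO cohort_definition VALUES (?, ?)",
        [(1, "small cohort"), (2, "large cohort")],
    )
    return db


def make_results_db():
    """Stand-in for DB Instance2 (the "results" schema). Columns are assumed."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE cohort (cohort_definition_id INTEGER, subject_id INTEGER)")
    db.executemany(
        "INSERT INTO cohort VALUES (?, ?)",
        [(1, 10), (2, 10), (2, 11), (2, 12)],
    )
    return db


def cohort_sizes_with_names(atlas_db, results_db):
    """Combine cohort names and cohort sizes without a cross-database join."""
    # Query 1: runs only against the "atlas" database.
    names = dict(atlas_db.execute("SELECT id, name FROM cohort_definition"))
    # Query 2: runs only against the "results" database.
    sizes = dict(
        results_db.execute(
            "SELECT cohort_definition_id, COUNT(*) FROM cohort "
            "GROUP BY cohort_definition_id"
        )
    )
    # The "join" happens here, in application code.
    return [
        {"id": cohort_id, "name": name, "size": sizes.get(cohort_id, 0)}
        for cohort_id, name in sorted(names.items())
    ]


rows = cohort_sizes_with_names(make_atlas_db(), make_results_db())
```

The cost of this approach is one query per database plus an in-memory merge; the benefit is that neither query depends on the other database's SQL dialect.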
Set up the local Atlas DB by running the init_db.sh script in the ./tests folder:
cd tests/setup_local_db/
./init_db.sh
Test this setup by trying the following curl commands.
JSON summary data endpoints:
curl http://localhost:8080/sources | python -m json.tool
curl "http://localhost:8080/cohortdefinition-stats/by-source-id/1/by-team-project?team-project=test" | python -m json.tool
curl http://localhost:8080/concept/by-source-id/1 | python -m json.tool
curl -d '{"ConceptIds":[2000000324,2000006885]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept/by-source-id/1 | python -m json.tool
curl -d '{"ConceptTypes":["Measurement","Person"]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept/by-source-id/1/by-type | python -m json.tool
curl http://localhost:8080/concept-stats/by-source-id/1/by-cohort-definition-id/3/breakdown-by-concept-id/2000007027 | python3 -m json.tool
curl -d '{"variables": [{"variable_type": "concept", "concept_id": 2000006885}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept-stats/by-source-id/1/by-cohort-definition-id/3/breakdown-by-concept-id/2000007027 | python3 -m json.tool
CSV data endpoints:
curl -d '{"variables":[{"variable_type": "concept", "concept_id": 2000000324},{"variable_type": "concept", "concept_id": 2000006885},{"variable_type": "concept", "concept_id": 2000007027},{"variable_type": "custom_dichotomous", "cohort_ids": [1, 2]}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/cohort-data/by-source-id/1/by-cohort-definition-id/3
curl -d '{"variables":[{"variable_type": "concept", "concept_id": 2000000324},{"variable_type": "concept", "concept_id": 2000006885},{"variable_type": "concept", "concept_id": 2000007027},{"variable_type": "custom_dichotomous", "provided_name": "test123", "cohort_ids": [1, 99]}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/concept-stats/by-source-id/1/by-cohort-definition-id/3/breakdown-by-concept-id/2000007027/csv
Histogram endpoint:
curl -d '{"variables":[{"variable_type": "custom_dichotomous", "cohort_ids": [1, 4]}]}' -H "Content-Type: application/json" -X POST http://localhost:8080/histogram/by-source-id/1/by-cohort-definition-id/4/by-histogram-concept-id/2000006885
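The request bodies in the curl commands above share one shape: a "variables" list whose entries are either a concept (identified by concept_id) or a custom dichotomous variable (identified by a pair of cohort_ids, optionally with a provided_name). As a minimal sketch, the Python below builds such a body and the matching POST request without sending it. The helper names (concept_variable, dichotomous_variable, build_post_request) are hypothetical; only the JSON field names and the URL are taken from the examples above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # local development server, as in the curl examples


def concept_variable(concept_id):
    """Variable entry that refers to a single concept id."""
    return {"variable_type": "concept", "concept_id": concept_id}


def dichotomous_variable(cohort_ids, provided_name=None):
    """Variable entry defined by two cohort ids, with an optional name."""
    variable = {"variable_type": "custom_dichotomous", "cohort_ids": list(cohort_ids)}
    if provided_name is not None:
        variable["provided_name"] = provided_name
    return variable


def build_post_request(path, variables):
    """Build (but do not send) a JSON POST request for the given endpoint path."""
    body = json.dumps({"variables": variables}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same request as the histogram curl example above.
request = build_post_request(
    "/histogram/by-source-id/1/by-cohort-definition-id/4/by-histogram-concept-id/2000006885",
    [dichotomous_variable([1, 4])],
)

# To actually send it (requires the local server to be running):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```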
For deployment in Gen3, simply use the kube-setup-cohort-middleware script:
gen3 kube-setup-cohort-middleware
The script will use the ohdsi database credentials and will result in a cohort-middleware-g3auto Kubernetes secret.
If any changes need to be made to the settings, find the .yaml config file:
find ~ -type f -path '*/g3auto/cohort-middleware/*.yaml'
and remove that file first, before running the gen3 kube-setup command above.
To roll cohort-middleware (e.g. after a version update), a full kube-setup-cohort-middleware run is not required:
gen3 roll cohort-middleware
This will take care of all the secrets via g3auto.
Example:
curl -H "Content-Type: application/json" -H "$(cat auth.txt)" https://<server-url-here>/sources | python -m json.tool
Note that the <server-url-here> in the examples above needs to be replaced, and the ids used (by-source-id/2, by-cohort-definition-id/3) need to be replaced with real values from your environment. The main additions in these curl commands are the use of https and the extra -H "$(cat auth.txt)". More is explained in the subsections below.
Troubleshooting steps when using manifest.json based deployment:
- check /home/<your-machine-name>/cdis-manifest/<your-machine-name>/manifest.json to make sure the desired image name and tag for cohort-middleware are present. Do not edit this file directly on the server, but make a PR with changes if needed.
- regarding gen3 roll cohort-middleware, see also https://github.com/uc-cdis/cloud-automation/blob/master/kube/services/cohort-middleware/cohort-middleware-deploy.yaml, which is used directly by the gen3 roll command (see https://github.com/uc-cdis/cloud-automation/blob/master/gen3/bin/roll.sh).
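The manifest check can also be scripted. The sketch below assumes that manifest.json keeps image references in a top-level "versions" object keyed by service name, with values of the form <image>:<tag>; verify this against your own manifest before relying on it. The function name, the sample manifest content, and the tag shown are hypothetical.

```python
import json


def cohort_middleware_image(manifest_text, service="cohort-middleware"):
    """Return (image_name, tag) for the service, or None if it is not listed.

    Assumes a top-level "versions" mapping of service name to "<image>:<tag>".
    """
    manifest = json.loads(manifest_text)
    image = manifest.get("versions", {}).get(service)
    if image is None:
        return None
    # Split on the last ":" so that the registry path stays intact.
    name, _, tag = image.rpartition(":")
    return (name, tag) if name else (image, "")


# Made-up manifest content, for illustration only.
sample = '{"versions": {"cohort-middleware": "quay.io/cdis/cohort-middleware:1.0.0"}}'
found = cohort_middleware_image(sample)
missing = cohort_middleware_image('{"versions": {}}')
```

In practice, manifest_text would be read from the manifest.json path given above, for example with pathlib.Path(...).read_text().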
Go to https:// and then to "Login"->"Profile"->"Create API key". Download the JSON to your local computer.
Run (e.g. if the downloaded JSON file is called credentials.json):
export SERVER_NAME=<your-server-name-here>
curl -d "$(cat credentials.json)" -X POST -H "Content-Type: application/json" https://${SERVER_NAME}/user/credentials/api/access_token
Save the token from the response in a file, e.g. auth.txt. Then try, for example:
curl -H "Content-Type: application/json" -H "Authorization: bearer $(cat auth.txt)" https://${SERVER_NAME}/cohort-middleware/sources | python -m json.tool
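The same two steps can be sketched in Python. This assumes that the token endpoint responds with a JSON object containing an access_token field, and that the bearer token is passed exactly as in the curl command above; both points should be checked against your environment. The function names are hypothetical, and nothing is sent over the network until urlopen is called.

```python
import json
import urllib.request


def build_token_request(server_name, credentials):
    """Build the POST request that exchanges the API key for an access token."""
    url = f"https://{server_name}/user/credentials/api/access_token"
    body = json.dumps(credentials).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_access_token(response_text):
    """Pull the token out of the response (assumes an "access_token" field)."""
    return json.loads(response_text)["access_token"]


def build_sources_request(server_name, token):
    """Build the authenticated GET request for the /sources endpoint."""
    url = f"https://{server_name}/cohort-middleware/sources"
    return urllib.request.Request(
        url,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"bearer {token}",
        },
    )


# Usage sketch (needs a reachable server and a real credentials.json):
# with open("credentials.json") as f:
#     credentials = json.load(f)
# with urllib.request.urlopen(build_token_request(SERVER_NAME, credentials)) as r:
#     token = extract_access_token(r.read().decode("utf-8"))
# with urllib.request.urlopen(build_sources_request(SERVER_NAME, token)) as r:
#     print(json.dumps(json.loads(r.read()), indent=4))
```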
Find the pod(s):
kubectl get pods --all-namespaces | grep cohort-middleware
or:
kubectl get pods -l app=cohort-middleware
Then run:
kubectl logs <pod-name-here>
or
kubectl logs -f -l app=cohort-middleware
See also https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-running-pods
Get help from "PE team":
- PE team = Platform Engineering team = GPE project Jira ticket = #gen3-devops-oncall (slack channel)
If networking changes are necessary:
If proxy changes are necessary:
Other config related to network policies:
To push a new generic Docker Hub image to Quay (like a specific version of Golang), use something like this in Slack:
@qa-bot run-jenkins-job gen3-self-service-push-dockerhub-img-to-quay jenkins {"SOURCE":"python:3.10-alpine","TARGET":"quay.io/cdis/python:3.10-alpine-master"}
Or use the self-service page:
The result will be a new image pushed to quay.io that we can start using in our Dockerfile, like:
FROM quay.io/cdis/golang:1.18-bullseye