Commit v2.0-2.1.5
Signed-off-by: david.perkins <[email protected]>
david.perkins committed Sep 2, 2021
1 parent d2e4bc7 commit 23329a3
Showing 81 changed files with 5,235 additions and 1,597 deletions.
18 changes: 12 additions & 6 deletions README.md
@@ -3,6 +3,8 @@ The Alvearie Health Record Ingestion service: a common 'Deployment Ready Compone

This repo contains the code for the Management API of the HRI, which uses [IBM Functions](https://cloud.ibm.com/docs/openwhisk?topic=cloud-functions-getting-started) (Serverless built on [OpenWhisk](https://openwhisk.apache.org/)) with [Golang](https://golang.org/doc/). In short, this repo defines an API and maps endpoints to Golang executables packaged into 'actions'. IBM Functions takes care of standing up an API Gateway, executing & scaling the actions, and transmitting data between them. [mgmt-api-manifest.yml](mgmt-api-manifest.yml) defines the actions, the API, and the mapping between them. A separate OpenAPI specification is maintained in [Alvearie/hri-api-spec](https://github.com/Alvearie/hri-api-spec) for external users' reference. Please note: any changes to this (RESTful) Management API for the HRI require changes in both the hri-api-spec repo and this hri-mgmt-api repo.

This version is compatible with HRI `v2.1`.

## Communication
* Please [join](https://alvearie.io/contributions/requestSlackAccess) our Slack channel for further questions: [#health-record-ingestion](https://alvearie.slack.com/archives/C01GM43LFJ6)
* Please see recent contributors or [maintainers](MAINTAINERS.md)
@@ -45,13 +47,16 @@ rm src/exec
## CI/CD
Since this application must be deployed using IBM Functions in an IBM Cloud account, there isn't a way to launch and test the API & actions locally. So, we have set up GitHub Actions to automatically deploy every branch to its own IBM Functions namespace in our IBM Cloud account and run integration tests. All branches share common Elasticsearch and Event Streams instances. Once a branch is deployed, you can perform manual testing against its namespace. You can also use the IBM Functions UI or IBM Cloud CLI to modify the actions or API in your namespace. When the GitHub branch is deleted, the associated IBM Functions namespace is also automatically deleted.

### Releases
Creating a GitHub tag triggers a build that packages everything into a Docker image used to deploy the Management API. See [docker/README.md](docker/README.md) for more details.

### Docker image build
- Images are published on every `develop` branch build with the tag `<branch>-timestamp`.
+ Images are published on every `develop` branch build with the tag `develop-timestamp`.

## Code Overview

### IBM Function Actions - Golang Mains
- For each API endpoint, there is a Golang executable packaged into an IBM Functions 'action' to service the requests. There are several `.go` files in the base `src/` directory, one for each action and no others, each of which defines `func main()`. If you're familiar with Golang, you might be asking how there can be multiple files with different definitions of `func main()`. The Makefile takes care of compiling each one into a separate executable, and each file includes a [Build Constraint](https://golang.org/pkg/go/build/#hdr-Build_Constraints) to exclude it from unit tests. This also means these files are not unit tested and thus are kept as small as possible. Each one sets up any required clients and then calls an implementation method in a subpackage. They also use `common.actionloopmin.Main()` to implement the OpenWhisk [action loop protocol](https://github.com/apache/openwhisk-runtime-go/blob/main/docs/ACTION.md).
+ For each API endpoint, there is a Golang executable packaged into an IBM Functions 'action' to service the requests. There are several `.go` files in the base `src/` directory, one for each action and no others, each of which defines `func main()`. If you're familiar with Golang, you might be asking how there can be multiple files with different definitions of `func main()`. The Makefile takes care of compiling each one into a separate executable, and each file includes a [Build Constraint](https://golang.org/pkg/go/build/#hdr-Build_Constraints) to exclude it from unit tests. This also means these files are not unit tested and thus are kept as small as possible. Each one sets up any required clients and then calls an implementation method in a subpackage. They also use `common.actionloopmin.Main()` to implement the OpenWhisk [action loop protocol](https://github.com/apache/openwhisk-runtime-go/blob/master/docs/ACTION.md).

The compiled binaries have to be named `exec` and put in a zip file. Additionally, an `exec.env` file has to be included, which contains the name of the Docker runtime image to use when running the action. All the zip files are written to the `build` directory when running `make`.
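The one-main-per-action layout described above can be sketched as follows. This is an illustrative simplification, not the repo's real code: the build-tag name, the `handle` helper, and the bare stdin/stdout loop are assumptions standing in for `common.actionloopmin.Main()` and the subpackage implementation methods.

```go
//go:build !unittest
// +build !unittest

// Sketch of one action's main. The build tag (the name "unittest" is an
// illustrative assumption) excludes this file from unit-test builds, which
// is how several files can each define their own func main().
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// handle stands in for the implementation method in a subpackage;
// the real actions delegate request handling there.
func handle(params map[string]interface{}) map[string]interface{} {
	return map[string]interface{}{"ok": true, "paramCount": len(params)}
}

func main() {
	// Simplified stand-in for the action loop protocol: read one JSON
	// params object from stdin and write one JSON result to stdout.
	var params map[string]interface{}
	_ = json.NewDecoder(os.Stdin).Decode(&params) // absent input leaves params nil
	out, _ := json.Marshal(handle(params))
	fmt.Println(string(out))
}
```

Compiling such a file to a binary named `exec` and zipping it with an `exec.env` file mirrors the packaging step `make` performs.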

@@ -77,14 +82,15 @@ The goal is to have 90% code coverage with unit tests. The build automatically p
The API that this repo implements is defined in [Alvearie/hri-api-spec](https://github.com/Alvearie/hri-api-spec) using OpenAPI 3.0. There are automated Dredd tests to make sure the implemented API meets the spec. If there are changes to the API, make them to the specification repo using a branch with the same name. Then the Dredd tests will run against the modified API specification.

### Authentication & Authorization
- All endpoints (except the health check) require an OAuth 2.0 JWT bearer access token per [RFC8693](https://tools.ietf.org/html/rfc8693) in the `Authorization` header field. The Tenant and Stream endpoints require IAM tokens, but the Batch endpoints require a token with HRI and Tenant scopes for authorization. The Batch token issuer is configurable via a bound parameter, and must be OIDC compliant because the code dynamically uses the OIDC-defined well-known endpoints to validate tokens. Integration and testing have already been completed with App ID, the standard IBM Cloud solution.
+ All endpoints (except the health check) require an OAuth 2.0 JWT bearer access token per [RFC8693](https://tools.ietf.org/html/rfc8693) in the `Authorization` header field. The Tenant and Stream endpoints require IAM tokens, but the Batch endpoints require a token with HRI and Tenant scopes for authorization. The Batch token issuer is configurable via a bound parameter, and must be OIDC compliant because the code dynamically uses the OIDC-defined well-known endpoints to validate tokens. Integration and testing have already been completed with [App ID](https://cloud.ibm.com/docs/appid), the standard IBM Cloud solution.

Batch JWT access token scopes:
- - hri_data_integrator - Data Integrators can create, get, and change the status of batches, but only ones that they created.
- - hri_consumer - Consumers can list and get Batches
+ - hri_data_integrator - Data Integrators can create, get, and call 'sendComplete' and 'terminate' endpoints for batches, but only ones that they created.
+ - hri_consumer - Consumers can list and get batches.
+ - hri_internal - For internal processing, can call batch 'processingComplete' and 'fail' endpoints.
- tenant_<tenantId> - provides access to this tenant's batches. This scope must use the prefix 'tenant_'. For example, if a data integrator tries to create a batch by making an HTTP POST call to `tenants/24/batches`, the token must contain scope `tenant_24`, where the `24` is the tenantId.

- The scopes claim must contain one or more of the HRI roles ("hri_data_integrator", "hri_consumer") as well as the tenant id of the tenant being accessed.
+ The scopes claim must contain one or more of the HRI roles ("hri_data_integrator", "hri_consumer", "hri_internal") as well as the tenant id of the tenant being accessed.
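The scope rule described above can be sketched in Go as follows. The function name, the space-delimited scope string, and the claim handling are assumptions for illustration; the real authorization code in this repo may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// authorized checks the rule described above: the scopes claim must hold
// at least one HRI role plus the tenant_<tenantId> scope.
func authorized(scopes, tenantID string) bool {
	hasRole, hasTenant := false, false
	for _, s := range strings.Fields(scopes) { // OAuth scopes are space-delimited
		switch s {
		case "hri_consumer", "hri_data_integrator", "hri_internal":
			hasRole = true
		case "tenant_" + tenantID:
			hasTenant = true
		}
	}
	return hasRole && hasTenant
}

func main() {
	// A POST to tenants/24/batches needs an HRI role plus scope tenant_24.
	fmt.Println(authorized("hri_data_integrator tenant_24", "24")) // true
	fmt.Println(authorized("hri_consumer tenant_12", "24"))        // false
}
```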

## Contribution Guide
Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct, and the process for submitting pull requests to us.
16 changes: 14 additions & 2 deletions deploy.sh
@@ -4,6 +4,8 @@
#
# SPDX-License-Identifier: Apache-2.0

# redirect stderr to stdout
exec 2>&1

set -eo pipefail

echo "CLOUD_API_KEY: ****"
@@ -16,7 +18,7 @@ echo "ELASTIC_SVC_ACCOUNT: $ELASTIC_SVC_ACCOUNT"
echo "KAFKA_INSTANCE: $KAFKA_INSTANCE"
echo "KAFKA_SVC_ACCOUNT: $KAFKA_SVC_ACCOUNT"
echo "OIDC_ISSUER: $OIDC_ISSUER"
- echo "JWT_AUDIENCE_ID: $JWT_AUDIENCE_ID"
+ echo "VALIDATION: $VALIDATION"

# determine if IBM Cloud CLI is already installed
set +e > /dev/null 2>&1
@@ -55,9 +57,19 @@ fi
ibmcloud fn deploy --manifest mgmt-api-manifest.yml

echo "Building OpenWhisk Parameters"
- params="$(cat <<EOF
# set config parameters, all of them have to be set in the same command
if [ -z "$VALIDATION" ] || [ "$VALIDATION" != true ]; then
echo "Setting Validation to false"
VALIDATION=false
else
echo "Setting Validation to true"
VALIDATION=true
fi

+ params="$(cat << EOF
{
"issuer": "$OIDC_ISSUER",
"validation": $VALIDATION,
"jwtAudienceId": "$JWT_AUDIENCE_ID"
}
EOF
3 changes: 1 addition & 2 deletions docker/Dockerfile
@@ -5,7 +5,6 @@
# Use IBM's standard base image
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.4


# yum is required by the IBM CLI installer and it tries to install git, curl, docker, kubectl, and helm
# The docker install fails and causes problems. We don't need it, so this adds a dummy file to trick
# the installer into thinking that docker is installed.
@@ -34,7 +33,7 @@ COPY build \

WORKDIR mgmt-api-release

- # Setup flink user
+ # Setup hri user
RUN groupadd -g 1000 hri && \
useradd --shell /bin/bash -u 1000 -g 1000 -m hri && \
chown -R hri:hri /mgmt-api-release
9 changes: 4 additions & 5 deletions docker/README.md
@@ -13,24 +13,23 @@ There are several environment variables that must be set in the container.
| ELASTIC_SVC_ACCOUNT | Name of Elasticsearch service ID |
| KAFKA_INSTANCE | Name of Event Streams (Kafka) instance |
| KAFKA_SVC_ACCOUNT | Name of Event Streams (Kafka) service ID |
- | TOOLCHAIN_ID | ID of the Toolchain for publishing results to Insights |
- | Logical_App_Name | Application Name smoke test results are published under in Insights |
+ | VALIDATION | Whether to deploy the Management API with Validation, e.g. 'true', 'false' |
| OIDC_ISSUER | The base URL of the OIDC issuer to use for OAuth authentication (e.g. `https://us-south.appid.cloud.ibm.com/oauth/v4/<tenantId>`) |
| APPID_PREFIX | (Optional) Prefix string prepended to the AppId applications and roles created during deployment |
- | SET_UP_APPID | (Optional) defaults to true. Set to false if you do not want the App ID set-up described [above](#using-app-id-for-oidc-authentication) enabled. |
+ | SET_UP_APPID | (Optional) defaults to true. Set to false if you do not want the App ID set-up enabled. |

## Implementation Details

The image entrypoint is `run.sh`, which:
1. sets some environment variables
1. logs into the IBM Cloud CLI
1. calls `elastic.sh`
1. calls `appid.sh`
1. calls `deploy.sh`

`elastic.sh` turns off automatic index creation and sets the default template for batch indexes. These are idempotent actions, so they can be executed multiple times.

- `appid.sh` creates HRI application as well as HRI Consumer and HRI Data Integrator roles in AppId.
+ `appid.sh` creates HRI and HRI Internal applications and HRI Internal, HRI Consumer, and HRI Data Integrator roles in AppId.

`deploy.sh` deploys the Management API to IBM Functions and runs smoke tests (by calling the health check endpoint).

65 changes: 55 additions & 10 deletions docker/appid.sh
@@ -2,6 +2,8 @@
#
# SPDX-License-Identifier: Apache-2.0

#!/bin/bash

# Exit on errors
set -e

@@ -12,18 +14,17 @@ echo "issuer:$issuer"
# Get IAM Token
# Note, in this command and many below, the response is gathered and then sent to jq via echo (rather than piping directly) because if you pipe the response
# directly to jq, the -f flag to fail if the curl command fails will not terminate the script properly.
echo
echo "Requesting IAM token"
response=$(curl -X POST -sS 'https://iam.cloud.ibm.com/identity/token' -d "grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=${CLOUD_API_KEY}")
iamToken=$(echo $response | jq -r '.access_token // "NO_TOKEN"')
if [ $iamToken = "NO_TOKEN" ]; then
echo "the_curl_response: $response"
echo "Error getting IAM Token! Exiting!"
exit 1
exit 1
fi

# Create application
echo
echo
echo "Creating HRI provider application"
# Do not fail script if this call fails. We need to check if it failed because of a CONFLICT, in which case the script will exit 0.
hriApplicationName="${APPID_PREFIX}HRI Management API"
@@ -49,13 +50,12 @@ if [ -z $hriApplicationId ]; then
response=$(curl -X GET -sS "${issuer}/applications" -H "Authorization: Bearer ${iamToken}")
hriApplicationId=$(echo $response | jq -r --arg name "$hriApplicationName" '.applications[] | select(.name == $name) | .clientId')

echo
echo "hriApplicationId: $hriApplicationId"
if [ -z $hriApplicationId ]; then
echo "Failed to get existing HRI Management API application ID! Unable to set JWT_AUDIENCE_ID!"
echo "the_curl_response: $response"
exit 1
fi
echo
echo "Setting JWT_AUDIENCE_ID to existing HRI Management API ID: $hriApplicationId"
echo $hriApplicationId > JWT_AUDIENCE_ID
exit 0
@@ -69,16 +69,34 @@

# Assign scopes to application
echo
- echo "Assigning hri_consumer and hri_data_integrator scopes to HRI provider application"
+ echo "Assigning hri_internal, hri_consumer and hri_data_integrator scopes to HRI provider application"
curl -X PUT -sS -f "${issuer}/applications/${hriApplicationId}/scopes" -H "Content-Type: application/json" -H "Authorization: Bearer ${iamToken}" -d @- << EOF
{
- "scopes": [ "hri_consumer", "hri_data_integrator"]
+ "scopes": [ "hri_internal", "hri_consumer", "hri_data_integrator"]
}
EOF

# Create roles
echo
echo
echo "Creating roles for each of the created scopes"
response=$(curl -X POST -sS "${issuer}/roles" -H "Authorization: Bearer ${iamToken}" -H "Content-Type: application/json" -d @- << EOF
{
"name": "${APPID_PREFIX}HRI Internal",
"description": "HRI Internal Role",
"access": [ {
"application_id": "${hriApplicationId}",
"scopes": [ "hri_internal" ]
} ]
}
EOF
)
internalRoleId=$(echo $response | jq -r '.id // "REQUEST_FAILED"')
if [ $internalRoleId = "REQUEST_FAILED" ]; then
echo "Error Creating role: HRI Internal!"
echo "the_curl_response: $response"
exit 1
fi

response=$(curl -X POST -sS "${issuer}/roles" -H "Authorization: Bearer ${iamToken}" -H "Content-Type: application/json" -d @- << EOF
{
"name": "${APPID_PREFIX}HRI Consumer",
@@ -94,7 +112,7 @@ consumerRoleId=$(echo $response | jq -r '.id // "REQUEST_FAILED"')
if [ $consumerRoleId = "REQUEST_FAILED" ]; then
echo "Error Creating role: HRI Consumer Role!"
echo "the_curl_response: $response"
exit 1
exit 1
fi

response=$(curl -X POST -sS "${issuer}/roles" -H "Authorization: Bearer ${iamToken}" -H "Content-Type: application/json" -d @- << EOF
@@ -112,7 +130,34 @@
if [ $dataIntegratorRoleId = "REQUEST_FAILED" ]; then
echo "Error Creating role: HRI Data Integrator Role!"
echo "the_curl_response: $response"
exit 1
exit 1
fi

# Create HRI Internal application.
echo
echo "Creating HRI Internal application"
response=$(curl -X POST -sS "${issuer}/applications" -H "Authorization: Bearer ${iamToken}" -H 'Content-Type: application/json' -d @- << EOF
{
"name": "${APPID_PREFIX}HRI Internal",
"type": "regularwebapp"
}
EOF
)
internalApplicationId=$(echo $response | jq -r '.clientId // "REQUEST_FAILED"')
if [ $internalApplicationId = "REQUEST_FAILED" ]; then
echo "Error Creating HRI Internal application!"
echo "the_curl_response: $response"
exit 1
fi

# Assign roles to internal application.
echo
echo "Assigning internal and consumer roles to HRI Internal application"
curl -X PUT -sS -f "${issuer}/applications/${internalApplicationId}/roles" -H "Authorization: Bearer ${iamToken}" -H "Content-Type: application/json" -d @- << EOF
{
"roles":{
"ids":["${internalRoleId}", "${consumerRoleId}"]
}}
EOF

exit 0
13 changes: 6 additions & 7 deletions docker/elastic.sh
@@ -23,22 +23,21 @@ echo "ES baseUrl: ${baseUrl/:\/\/*@/://}"
rtn=0

# set auto-index creation off
echo
echo "Setting ElasticSearch auto index creation to false"
- curl -X PUT -sS -f $baseUrl/_cluster/settings -H 'Content-Type: application/json' -d'
+ curl -sS -f -X PUT $baseUrl/_cluster/settings -H 'Content-Type: application/json' -d'
{
"persistent": { "action.auto_create_index": "false" }
- }' || { echo 'Setting ElasticSearch auto index creation failed!' ; rtn=1; }
+ }' || { echo -e 'Setting ElasticSearch auto index creation failed!' ; rtn=1; }

# upload batches index template
echo
- echo "Setting ElasticSearch Batches index template"
- curl -X PUT -sS -f $baseUrl/_index_template/batches -H 'Content-Type: application/json' -d '@batches.json' ||
+ echo -e "Setting ElasticSearch Batches index template"
+ curl -sS -f -X PUT $baseUrl/_index_template/batches -H 'Content-Type: application/json' -d '@batches.json' ||
{
- echo -e '\nSetting ElasticSearch Batches index template failed!' ; rtn=1;
+ echo -e 'Setting ElasticSearch Batches index template failed!' ; rtn=1;
}

echo
- echo "ElasticSearch configuration complete"
+ echo -e "ElasticSearch configuration complete"

exit $rtn
1 change: 1 addition & 0 deletions docker/template.env
@@ -1,3 +1,4 @@
# Environment Variables for local testing
IBM_CLOUD_API_KEY=<apikey>
IBM_CLOUD_REGION=ibm:yp:us-south
RESOURCE_GROUP=MY_RESOURCE_GROUP
30 changes: 25 additions & 5 deletions document-store/index-templates/batches.json
@@ -17,26 +17,46 @@
},
"recordCount": {
"type": "long",
- "index": "false"
+ "index": false
},
"expectedRecordCount": {
"type": "long",
"index": false
},
"actualRecordCount": {
"type": "long",
"index": false
},
"topic": {
"type": "keyword",
- "index": "false"
+ "index": false
},
"dataType": {
"type": "keyword",
- "index": "false"
+ "index": false
},
"startDate": {
"type": "date"
},
"endDate": {
"type": "date",
- "index": "false"
+ "index": false
},
"metadata": {
"type": "object",
- "enabled": "false"
+ "enabled": false
},
"invalidThreshold": {
"type": "long",
"index": false
},
"invalidRecordCount": {
"type": "long",
"index": false
},
"failureMessage": {
"type": "text",
"index": false
}
}
}
2 changes: 1 addition & 1 deletion elastic-cert64
@@ -16,4 +16,4 @@ H+6i04hA9TkKT6ooLwMPc1LYYzqDljEkfKlLIPWCkOAozD3cyc26pV/35nG7WzAF
xw7S3jAyB3WcJDlWlSWGTn58w3EHxzVXvKT6Y9eAdKp4SjUHyVFsL5xtSyjH8zpF
pZKK8wWNUwgWQ66MNh8Ckq732JZ+so6RAfb4BbNj45I3s9fuZSYlvjkc5/+da3Ck
Rp6anX5N6yIrzhVmAgefjQdBztYzdfPhsJBkS/TDnRmk
-----END CERTIFICATE-----
-----END CERTIFICATE-----