diff --git a/performance-testing/k6/results/results.md b/performance-testing/k6/results/results.md
deleted file mode 100644
index 301d69e..0000000
--- a/performance-testing/k6/results/results.md
+++ /dev/null
@@ -1,37 +0,0 @@
-# Benchmark/ Baseline #
-
-## Environment Details ##
-All tests were performed on AWS cloud on a g4dn.2xlarge ec2 instance. Please see [here](https://aws.amazon.com/ec2/instance-types/g4/) for more details
-
-## Results ##
-### 5 sequential associations ###
-#### Description ####
-Send through the same study 5 times, with a 90 second gap to get average metric for a known study, environment and MAP (liver-seg) set up.
-
-#### Metrics ####
-Metrics are shown in seconds.
-
-| Test | MIG (Payload processed elapsed) | Workflow Manager (Workflow Instance Created) | Workflow Manager (Task Dispatched) | Task Manager (Plugin Started) | Argo (liver-seg execution) |
-| ------------ | ------------------------------- | -------------------------------------------- | ---------------------------------- | ----------------------------- | -------------------------- |
-| Association1 | 32.8835657 | | | | 60 |
-| Association2 | 29.9969251 | | | | 60 |
-| Association3 | 29.7551576 | | | | 60 |
-| Association4 | 28.8816108 | | | | 61 |
-| Association5 | 27.1322536 | | | | 62 |
-| Average | 29.72990256 | | | | 60.6 |
-
-
-### 5 parallel associations ###
-
-#### Description ####
-Send through the same study 5 times in parallel, and gather metrics for a known study, environment and MAP (liver-seg) set up.
-
-#### Metrics ####
-| Test | MIG (Payload processed elapsed) | Workflow Manager (Workflow Instance Created) | Workflow Manager (Task Dispatched) | Task Manager (Plugin Started) | Argo (liver-seg execution) |
-| ------------ | ------------------------------- | -------------------------------------------- | ---------------------------------- | ----------------------------- | -------------------------- |
-| Association1 | 46.6967919 | | | | 61 |
-| Association2 | 67.1354584 | | | | 60 |
-| Association3 | 72.6302684 | | | | 61 |
-| Association4 | 86.4762341 | | | | 63 |
-| Association5 | 95.7458654 | | | | 63 |
-| Average | 73.73692364 | | | | 61.6 |
diff --git a/performance-testing/results/results.md b/performance-testing/results/results.md
new file mode 100644
index 0000000..c5a3b41
--- /dev/null
+++ b/performance-testing/results/results.md
@@ -0,0 +1,155 @@
+# Introduction #
+This report documents the baseline and load tests against the AIDE. It shows comparisons of baseline and load tests across an AWS cloud environment (SIT) and performant on-premise Pre Prod environment. It also lists any conclusions and identifies any necessary follow-up actions.
+
+# Environment Details #
+
+## AWS Cloud (SIT) Specification ##
+
+| Node | Specification |
+|-----------|----------------------------|
+| SIT-Head1 | 4 vCPU, 16GB ram, 0 GPUs |
+| SIT-Head2 | 4 vCPU, 16GB ram, 0 GPUs |
+| SIT-DGX | 8 vCPUs, 32GB ram, 1 GPU's |
+
+## On-premise Pre Prod Environment ##
+
+| Node | Specification |
+|-----------|----------------------------|
+| PreProd-Head1 | 48 vCPU, 252GB ram, 1 GPUs |
+| PreProd-Head2 | 48 vCPU, 252GB ram, 0 GPUs |
+| PreProd-Head3 | 48 vCPUs, 252GB ram, 1 GPU's |
+
+
+# Data #
+| Modality | Details |
+| -------- | ------------------------- |
+| RF | \- 1 slice
\- 1MB |
+| US | \- 7 slices
\- 17MB |
+| MR | \- 5 slices
\- 1MB |
+| CT | \- 324 slices
\- 167MB |
+
+# Applications #
+The following dummy applications were published to stress the GPU and CPU. These were written using [stress](https://linux.die.net/man/1/stress) and [gpu-burn](https://github.com/wilicc/gpu-burn)
+
+| Application Name | Specification | Modality |
+| ---------------- | ------------------------------------------------------------------------ | --------- |
+| Small | CPU: 2
GPU: Access to all
RAM: 1GB
Execution time: 10 seconds | RF |
+| Medium | CPU: 8
GPU: Access to all
RAM: 10GB
Execution time: 30 seconds | US and MR |
+| Large | CPU: 12
GPU: Access to all
RAM: 16GB
Execution time: 60 seconds | CT |
+
+# Test Types #
+
+## Baseline ##
+Single transactions to performance reference point which can be used as a basis for performance comparison
+
+## Load Average ##
+Realistic expected usage levels to determine its response time, resource usage, and reliability using GSTT imaging throughput data in an average 1 hour period.
+
+## Load Peak ##
+Realistic expected usage levels to determine its response time, resource usage, and reliability using GSTT imaging throughput data in an peak 1 hour period.
+
+## Stress ##
+Uplift of peak load by 25%
+
+# Throughput #
+
+## Peak 1 Hour ##
+
+| Modality | Transactions | Model executions |
+| ---------- | ------------ | ---------------- |
+| X-ray | 120 | 120 |
+| Ultrasound | 50 | 5 |
+| CT | 30 | 21 |
+| MRI | 25 | 17.5 |
+
+## Avg 1 Hour ##
+
+| Modality | Transactions | Model executions |
+| ---------- | ------------ | ---------------- |
+| X-ray | 60 | 60 |
+| Ultrasound | 28 | 2.8 |
+| CT | 10 | 7 |
+| MRI | 13 | 9.1 |
+
+## Stress 1 Hour ##
+
+| Modality | Transactions | Model executions |
+| ---------- | ------------ | ---------------- |
+| X-ray | 180 | 180 |
+| Ultrasound | 75 | 7.5 |
+| CT | 45 | 31.5 |
+| MRI | 37.5 | 26.25 |
+
+
+# KPI and Measurements #
+
+| KPI | Details | Query Params |
+| ----------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- |
+| DICOM Payload Processed | How long it took between an association being made to Informatics Gateway, the instances being saved to MinIO and a WorkflowRequestEvent being generated | ServiceName: Monai.Deploy.InformaticsGateway AND "Payload took" |
+| Task Dispatched | How long it took for the WorkflowRequestEvent to be consumed by the WorkflowManager, a workflow to be triggered and a TaskDispatchEvent to be generated | ServiceName: Monai.Deploy.WorkflowManager AND messageDescription: WorkflowRequestEvent AND durationMilliseconds > 0 |
+| Task Created | How long it took for the TaskDispatchEvent to be consumed by the TaskManager and create a Task | ServiceName: Monai.Deploy.WorkflowManager.TaskManager AND messageType: TaskDispatchEvent AND durationMilliseconds > 0 |
+| Task Update | How long it took for the TaskManager to publish a TaskUpdateEvent, the WorkflowManager to consume the event and update the WorkflowInstance | ServiceName: Monai.Deploy.WorkflowManager AND messageDescription: TaskUpdateEvent AND durationMilliseconds > 0 |
+| Argo | How long it took for Argo to run the application requested. This includes time from the pod being scheduled and then a TaskCallbackEvent being published | Taken from Argo |
+| End To End | Indicative time of the end to end processing of a workflow from dicom association to workflow completion. | Time from Task Update timestamp - (DICOM Payload Process timestamp - processed time) |
+
+# Cloud Execution #
+
+## Details ##
+Baseline tests were executed on SIT to validate the cloud environment to compare pre-prod tests against to understand the performance improvements based on specifications.
+
+## Results ##
+### Baseline ###
+#### Description ####
+Send through the same study 5 times, with a 90 second gap to get average metric for a known study, environment and MAP (liver-seg) set up.
+
+#### Metrics ####
+| | DICOM Payload Processed | DICOM Payload Processed | Task Dispatched | Task Dispatched | Task Created | Task Created | Task Update | Argo | Argo | Argo | End to End |
+| -------- | ----------------------- | ----------------------- | --------------- | --------------- | ------------- | ------------ | ------------- | --------- | ------------- | --------- | ---------- |
+| Modality | Average | Max | Average | Max | Average | Max | Average | Max | Average (min) | Max (min) | Indicative |
+| CT | 01:11 | 01:24 | 14.5 | 20.4 | 2.3 | 2.9 | 0.8 | 1.7 | 01:57 | 02:04 | 03:21 |
+| MR | 13.6 | 34.5 | 6.7 | 13.6 | 4.9 | 10 | 1 | 1.5 | 01:24 | 01:32 | 01:23 |
+| US | 6.2 | 6.8 | 2.6 | 3.3 | 2.8 | 3.9 | 0.7 | 1.5 | 01:14 | 01:15 | 01:25 |
+| RF | 5.8 | 9.5 | 11.3 | 23.6 | 30 | 107.7 | 1.1 | 2.9 | 01:06 | 01:37 | 00:58 |
+
+# On-Premise Execution #
+
+## Details ##
+Baseline, Load and Stress tests were executed on on-premise to understand the performance of MONAI-Deploy and AIDE on target production hardware and validate against throughput and metrics.
+
+## Results ##
+### Baseline 1 ###
+#### Description ####
+Send through the same study 5 times, with a 90 second gap to get average metric for a known study, environment and MAP (liver-seg) set up.
+
+#### Metrics ####
+| | DICOM Payload Processed | DICOM Payload Processed | Task Dispatched | Task Dispatched | Task Created | Task Created | Task Update | Task Update | Argo | Argo | End to End |
+| ------------------------------------------------------------ | ----------------------- | ----------------------- | --------------- | --------------- | ------------- | ------------ | ------------- | ----------- | ------------- | --------- | ---------- |
+| Modality | Average (sec) | Max (sec) | Average (sec) | Max (sec) | Average (sec) | Max (sec) | Average (sec) | Max (sec) | Average (min) | Max (min) | Indicative |
+| CT ("{{ context.dicom.series.all('0008','0060') }} == 'CT'") | 34.4 | 36.2 | 11.6 | 12.5 | 1.1 | 1.2 | 0.4 | 0.9 | 01:54 | 02:07 | 02:30 |
+| CT ("{{ context.dicom.series.any('0008','0060') }} == 'CT'") | 34.2 | 37.8 | 12.3 | 13.7 | 1.2 | 1.6 | 0.7 | 1 | 01:53 | 02:03 | N/A |
+| MR | 1.1 | 1.4 | 0.7 | 1.1 | 1.2 | 1.5 | 0.6 | 0.8 | 01:06 | 01:10 | 01:18 |
+| US | 1.7 | 2.3 | 1.1 | 1.3 | 0.9 | 1.6 | 0.6 | 1 | 01:10 | 01:17 | 01:07 |
+| RF | 0.7 | 1.2 | 0.7 | 1.1 | 0.9 | 1.3 | 0.8 | 1 | 00:55 | 00:58 | 00:42 |
+| CT (executing Small app & no conditional logic) | 34.9 | 37.9 | 10.6 | 10.8 | 2.08 | 6.3 | 0.9 | 1.9 | 01:06 | 01:13 | 01:47 |
+| RF (no conditional logic) | 0.7 | 0.9 | 0.8 | 1.4 | 1.3 | 1.8 | 0.7 | 1.3 | 00:55 | 01:00 | 00:52 |
+
+### Baseline 2 ###
+#### Description ####
+Retest of the MIG following a change to how it was saving data to MinIO.
+
+| | DICOM Payload Processed | DICOM Payload Processed |
+| -------- | ----------------------- | ----------------------- |
+| Modality | Average (sec) | Max (sec) |
+| CT | 14 | 16.2 |
+
+### Load (Avg) ###
+#### Description ####
+#### Metrics ####
+
+### Load (Peak) ###
+#### Description ####
+#### Metrics ####
+
+### Stress ###
+#### Description ####
+#### Metrics ####