From be0f062b5d2e247617c5ad55dfe62b07d5ae0039 Mon Sep 17 00:00:00 2001
From: Andrew Truong <itsandrewtruong@gmail.com>
Date: Fri, 15 Mar 2024 14:40:21 -0400
Subject: [PATCH] Update README.md

---
 .../README.md                                    | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/jobs/deploy_to_nvidia_nemo_inference_microservice/README.md b/jobs/deploy_to_nvidia_nemo_inference_microservice/README.md
index 6f62968..668b111 100644
--- a/jobs/deploy_to_nvidia_nemo_inference_microservice/README.md
+++ b/jobs/deploy_to_nvidia_nemo_inference_microservice/README.md
@@ -14,11 +14,16 @@ Deployment time varies by model and machine type. The base Llama2-7b config take
 
 ## User Quickstart
 
-1. Create a queue if you don't have one already, and launch an agent:
+1. Create a queue if you don't have one already. See the example queue config below.
+   1. Set `gpus` to the specific GPUs you want to use, or to `all` to use all of them.
+   2. Set `runtime` to `nvidia`.
+   
+   ![image](https://github.com/wandb/launch-jobs/assets/15385696/d349e37a-ce1d-48b3-992f-1b4b617efa19)
+2. Launch an agent on your GPU machine:
    ```bash
    wandb launch-agent -e $ENTITY -p $PROJECT -q $QUEUE
    ```
-2. Submit the deployment job with your desired configs from the [Launch UI](https://wandb.ai/launch). See `configs/` for examples.
+3. Submit the deployment job with your desired configs from the [Launch UI](https://wandb.ai/launch). See `configs/` for examples.
    1. You can also submit via the CLI:
       ```bash
       wandb launch -d gcr.io/playground-111/deploy-to-nemo:latest \
@@ -27,7 +32,12 @@ Deployment time varies by model and machine type. The base Llama2-7b config take
         -q $QUEUE \
         -c $CONFIG_JSON_FNAME
       ```
-3. You can track the deployment process in the Launch UI. Once complete, you can immediately curl the endpoint to test the model. The model name is always `ensemble`.
+      ![image](https://github.com/wandb/launch-jobs/assets/15385696/8bc95b7a-94a6-453e-9c87-f6b25a567604)
+      
+4. You can track the deployment process in the Launch UI.
+   ![image](https://github.com/wandb/launch-jobs/assets/15385696/49ca8391-689e-4cb7-9ba9-b5691f2cc7aa)
+   
+5. Once complete, you can immediately curl the endpoint to test the model. The model name is always `ensemble`.
    ```bash
     #!/bin/bash
     curl -X POST "http://0.0.0.0:9999/v1/completions" \