diff --git a/README.md b/README.md
index 2c85f3b47..78291df4c 100644
--- a/README.md
+++ b/README.md
@@ -99,8 +99,8 @@ In summary:
| High Level APIs | Sample use | Arguments |
|-----------------|------------|-------------------|
-| QEfficient.cloud.infer | [click here](#1-use-qefficientcloudinfer) | model_name : $\color{green} {Mandatory}$ num_cores : $\color{green} {Mandatory}$ device_group : $\color{green} {Mandatory}$batch_size : Optional [Default-1] prompt_len : Optional [Default-32] ctx_len : Optional [Default-128]mxfp6 : Optional hf_token : Optional cache_dir : Optional ["cache_dir" in current working directory]prompt : Optinoal [Default-"My name is"] |
-| QEfficient.cloud.execute | [click here](#2-use-of-qefficientcloudexcute) | model_name : $\color{green} {Mandatory}$ device_group : $\color{green} {Mandatory}$qpc_path : $\color{green} {Mandatory}$prompt : Optional [Default-"My name is"] cache_dir : Optional ["cache_dir" in current working directory]hf_token : Optional |
+| QEfficient.cloud.infer | [click here](#1-use-qefficientcloudinfer) | model_name : $\color{green} {Mandatory}$ num_cores : $\color{green} {Mandatory}$ device_group : $\color{green} {Mandatory}$ batch_size : Optional [Default-1] prompt_len : Optional [Default-32] ctx_len : Optional [Default-128] mxfp6 : Optional hf_token : Optional cache_dir : Optional ["cache_dir" in current working directory] prompt : Optional prompts_txt_file_path : Optional *only one of prompt or prompts_txt_file_path should be passed* |
+| QEfficient.cloud.execute | [click here](#2-use-of-qefficientcloudexcute) | model_name : $\color{green} {Mandatory}$ device_group : $\color{green} {Mandatory}$ qpc_path : $\color{green} {Mandatory}$ cache_dir : Optional ["cache_dir" in current working directory] hf_token : Optional prompt : Optional prompts_txt_file_path : Optional *only one of prompt or prompts_txt_file_path should be passed* |
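+
+Since only one of prompt or prompts_txt_file_path may be passed, batched runs read their inputs from a plain text file. Below is a minimal sketch of such a file, assuming one prompt per line; the file name matches the examples folder, and the middle prompt is purely illustrative:
+
+```bash
+# Create a prompts file with one prompt per line; three lines pair with
+# --batch_size 3 in the infer example further down.
+cat > examples/prompts.txt << 'EOF'
+My name is
+The capital of France is
+def fibonacci(n):
+EOF
+```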
### 1. Use QEfficient.cloud.infer
@@ -116,7 +116,7 @@ This is the single e2e python api in the library, which takes model_card name as
python -m QEfficient.cloud.infer --help
python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 16 --device_group [0] --prompt "My name is" --mos 1 --aic_enable_depth_first
-# If executing for batch size>1, pass path of txt file with input prompts, Example below
+# If executing with batch_size > 1, pass the path of a txt file containing input prompts; see the example below. A sample txt file (prompts.txt) is present in the examples folder.
python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 3 --prompt_len 32 --ctx_len 128 --num_cores 16 --device_group [0] --prompts_txt_file_path examples/prompts.txt --mxfp6 --mos 1 --aic_enable_depth_first
```
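+
+Once infer finishes, the compiled QPC can be located on disk. A quick check, assuming the default output layout shown in the execute example below (qeff_models/<model_name>/...):
+
+```bash
+# Path layout assumed from the execute example in this README.
+ls qeff_models/gpt2/qpc_16cores_1BS_32PL_128CL_1devices_mxfp6/qpcs/
+```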
@@ -128,7 +128,7 @@ Once we have compiled the QPC, we can now use the precompiled QPC in execute API
python -m QEfficient.cloud.execute --model_name gpt2 --qpc_path qeff_models/gpt2/qpc_16cores_1BS_32PL_128CL_1devices_mxfp6/qpcs/ --prompt "Once upon a time in" --device_group [0]
```
-We can also enable MQ, just based on the number of devices. Based on the "--device_group" as input it will create TS config on the fly. If "--device-group [0,1]" it will create TS config for 2 devices and use it for compilation, if "--device-group 0" then TS compilation is skipped and single soc execution is enabled.
+We can also enable MQ, based simply on the number of devices: from the "--device_group" input, a TS (Tensor Slicing) config is created on the fly. If "--device_group [0,1]" is passed, a TS config for 2 devices is created and used for compilation; if "--device_group [0]" is passed, TS compilation is skipped and single-SoC execution is enabled.
```bash
python -m QEfficient.cloud.infer --model_name Salesforce/codegen-2B-mono --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 16 --device_group [0,1] --prompt "def fibonacci(n):" --mos 2 --aic_enable_depth_first
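+
+# Hypothetical 4-device variant, assuming the TS config simply follows the
+# device list (see the Tensor Slicing column in the table below):
+# python -m QEfficient.cloud.infer --model_name Salesforce/codegen-2B-mono --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 16 --device_group [0,1,2,3] --prompt "def fibonacci(n):" --mos 4 --aic_enable_depth_first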
@@ -145,7 +145,7 @@ python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 1 --prompt_len 3
| High Level APIs | Single SoC | Tensor Slicing |
|-----------------|------------|-------------------|
-| QEfficient.cloud.infer | python -m QEfficient.cloud.infer --model_name $\color{green} {model}$ --batch_size 8 --prompt_len 128 --ctx_len 1024 --num_cores 16 --device-group [0] --prompt "My name is" --mxfp6 --hf_token $\color{green}{xyz}$ --mos 1 --aic_enable_depth_first | python -m QEfficient.cloud.infer --model_name $\color{green}{model}$ --batch_size 8 --prompt_len 128 --ctx_len 1024--num_cores 16 --device-group [0,1,2,3] --prompt "My name is" --mxfp6 --hf_token $\color{green}{xyz}$ --mos 4 --aic_enable_depth_first |
+| QEfficient.cloud.infer | python -m QEfficient.cloud.infer --model_name $\color{green} {model}$ --batch_size 1 --prompt_len 128 --ctx_len 1024 --num_cores 16 --device_group [0] --prompt "My name is" --mxfp6 --hf_token $\color{green}{xyz}$ --mos 1 --aic_enable_depth_first | python -m QEfficient.cloud.infer --model_name $\color{green}{model}$ --batch_size 1 --prompt_len 128 --ctx_len 1024 --num_cores 16 --device_group [0,1,2,3] --prompt "My name is" --mxfp6 --hf_token $\color{green}{xyz}$ --mos 4 --aic_enable_depth_first |
| QEfficient.cloud.execute | python -m QEfficient.cloud.execute --model_name $\color{green}{model}$ --device_group [0] --qpc_path $\color{green}{path}$ --prompt "My name is" --hf_token $\color{green}{xyz}$ | python -m QEfficient.cloud.execute --model_name $\color{green}{model}$ --device_group [0,1,2,3] --qpc_path $\color{green}{path}$ --prompt "My name is" --hf_token $\color{green}{xyz}$ |
:memo: Replace $\color{green}{model}$, $\color{green}{path}$ and $\color{green}{xyz}$ with the preferred model card name, QPC path and HF token, respectively.
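+
+For instance, substituting gpt2, the QPC path from the execute example above, and a token stand-in (<hf_token> is deliberately left as a placeholder):
+
+```bash
+python -m QEfficient.cloud.execute --model_name gpt2 --device_group [0] --qpc_path qeff_models/gpt2/qpc_16cores_1BS_32PL_128CL_1devices_mxfp6/qpcs/ --prompt "My name is" --hf_token <hf_token>
+```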