Commit 81a3163: Update README.md

Signed-off-by: Mamta Singh <[email protected]>
quic-mamta authored May 21, 2024
1 parent 8712b87 commit 81a3163

Showing 1 changed file (README.md) with 5 additions and 5 deletions.
In summary:

| High Level APIs | Sample use | Arguments |
|-----------------|------------|-------------------|
| QEfficient.cloud.infer | [click here](#1-use-qefficientcloudinfer) | <li>model_name : $\color{green} {Mandatory}$</li> <li>num_cores : $\color{green} {Mandatory}$</li> <li>device_group : $\color{green} {Mandatory}$</li><li>batch_size : Optional [Default-1]</li> <li>prompt_len : Optional [Default-32]</li> <li>ctx_len : Optional [Default-128]</li><li>mxfp6 : Optional </li> <li>hf_token : Optional </li><li>cache_dir : Optional ["cache_dir" in current working directory]</li><li>prompt : Optional</li><li>prompts_txt_file_path : Optional</li><li>*pass exactly one of prompt or prompts_txt_file_path, not both*</li> |
| QEfficient.cloud.execute | [click here](#2-use-of-qefficientcloudexcute) | <li>model_name : $\color{green} {Mandatory}$</li> <li>device_group : $\color{green} {Mandatory}$</li><li>qpc_path : $\color{green} {Mandatory}$</li> <li>cache_dir : Optional ["cache_dir" in current working directory]</li><li>hf_token : Optional </li><li>prompt : Optional</li><li>prompts_txt_file_path : Optional</li> <li>*pass exactly one of prompt or prompts_txt_file_path, not both*</li> |
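As a quick illustration, a minimal invocation of QEfficient.cloud.infer that supplies only the three mandatory arguments (every optional argument falls back to its default listed above) might look like the sketch below; gpt2 here is just a placeholder model card name.

```bash
# Minimal sketch: only the mandatory arguments are passed;
# batch_size, prompt_len, ctx_len, prompt, etc. fall back to their defaults.
python -m QEfficient.cloud.infer --model_name gpt2 --num_cores 16 --device_group [0]
```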


### 1. Use QEfficient.cloud.infer
This is the single end-to-end Python API in the library: it takes the model card name as input, along with the other arguments listed in the table above, and compiles and executes the model in one go.

```bash
python -m QEfficient.cloud.infer --help
python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 16 --device_group [0] --prompt "My name is" --mos 1 --aic_enable_depth_first

# If executing with batch_size > 1, pass the path of a txt file with input prompts, as in the example below; a sample file (prompts.txt) is present in the examples folder.

python -m QEfficient.cloud.infer --model_name gpt2 --batch_size 3 --prompt_len 32 --ctx_len 128 --num_cores 16 --device_group [0] --prompts_txt_file_path examples/prompts.txt --mxfp6 --mos 1 --aic_enable_depth_first
```
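The repository already ships examples/prompts.txt; purely as a hypothetical sketch (assuming the file holds one input prompt per line, matching the batch_size 3 example above), such a file could be created like this:

```bash
# Hypothetical illustration only: the real examples/prompts.txt is provided in the repo.
# Assumes one input prompt per line, one line per batch entry.
cat > examples/prompts.txt <<'EOF'
My name is
Once upon a time in
def fibonacci(n):
EOF
```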
Once we have compiled the QPC, we can reuse the precompiled QPC with the execute API to run new prompts:

```bash
python -m QEfficient.cloud.execute --model_name gpt2 --qpc_path qeff_models/gpt2/qpc_16cores_1BS_32PL_128CL_1devices_mxfp6/qpcs/ --prompt "Once upon a time in" --device_group [0]
```
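Per the argument table above, execute also accepts prompts_txt_file_path in place of prompt (only one of the two may be passed). A sketch reusing the same precompiled QPC with the sample prompts file might be:

```bash
# Sketch derived from the argument table: --prompts_txt_file_path replaces --prompt.
python -m QEfficient.cloud.execute --model_name gpt2 --qpc_path qeff_models/gpt2/qpc_16cores_1BS_32PL_128CL_1devices_mxfp6/qpcs/ --prompts_txt_file_path examples/prompts.txt --device_group [0]
```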

We can also enable MQ execution based purely on the number of devices: a TS (tensor slicing) config is created on the fly from the "--device-group" input. For example, "--device-group [0,1]" creates a TS config for 2 devices and uses it for compilation, while "--device-group [0]" skips TS compilation and enables single-SoC execution.

```bash
python -m QEfficient.cloud.infer --model_name Salesforce/codegen-2B-mono --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 16 --device-group [0,1] --prompt "def fibonacci(n):" --mos 2 --aic_enable_depth_first
```

| High Level APIs | Single SoC | Tensor Slicing |
|-----------------|------------|-------------------|
| QEfficient.cloud.infer | python -m QEfficient.cloud.infer --model_name $\color{green}{model}$ --batch_size 1 --prompt_len 128 --ctx_len 1024 --num_cores 16 --device-group [0] --prompt "My name is" --mxfp6 --hf_token $\color{green}{xyz}$ --mos 1 --aic_enable_depth_first | python -m QEfficient.cloud.infer --model_name $\color{green}{model}$ --batch_size 1 --prompt_len 128 --ctx_len 1024 --num_cores 16 --device-group [0,1,2,3] --prompt "My name is" --mxfp6 --hf_token $\color{green}{xyz}$ --mos 4 --aic_enable_depth_first |
| QEfficient.cloud.execute | python -m QEfficient.cloud.execute --model_name $\color{green}{model}$ --device_group [0] --qpc_path $\color{green}{path}$ --prompt "My name is" --hf_token $\color{green}{xyz}$ | python -m QEfficient.cloud.execute --model_name $\color{green}{model}$ --device_group [0,1,2,3] --qpc_path $\color{green}{path}$ --prompt "My name is" --hf_token $\color{green}{xyz}$ |

:memo: Replace $\color{green}{model}$, $\color{green}{path}$ and $\color{green}{xyz}$ with the preferred model card name, QPC path and HF token respectively.
