>Note: The repository defaults to the master branch. To build version 4.2, check out the branch r4.2.
```bash
git checkout r4.2
```
### 2.2.1. Preparing third party repositories
Build setup downloads the ZenDNN, AOCL BLIS, and FBGEMM repos into the `third_party` folder. Alternatively, it can use local copies of ZenDNN, AOCL BLIS, and FBGEMM, which is useful for day-to-day development, where a developer may want to build against recent versions of these repositories. Build setup switches between local and remote copies of ZenDNN, AOCL BLIS, and FBGEMM through the environment variables `ZENDNN_PT_USE_LOCAL_ZENDNN`, `ZENDNN_PT_USE_LOCAL_BLIS`, and `ZENDNN_PT_USE_LOCAL_FBGEMM`; set the corresponding variable to 1 to use a local copy of that repository. For the local setting, the source repositories should be cloned in the same directory in which `ZenDNN_PyTorch_Plugin` is cloned. The folder structure may look like below.
```
<parent folder>
```
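As an example, the three switches could be exported before running the build (a sketch; set only the variables for the repositories you have cloned locally):

```bash
# Use the local clone of ZenDNN instead of the downloaded copy
export ZENDNN_PT_USE_LOCAL_ZENDNN=1
# Likewise for AOCL BLIS and FBGEMM
export ZENDNN_PT_USE_LOCAL_BLIS=1
export ZENDNN_PT_USE_LOCAL_FBGEMM=1
```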
For CNN models, set `dynamic=False` when calling `torch.compile` as below:
```python
model = torch.compile(model, backend='zentorch', dynamic=False)
with torch.no_grad():
    output = model(input)
```
For Hugging Face NLP models, optimize them as below:
```python
model = torch.compile(model, backend='zentorch')
with torch.no_grad():
    output = model(input)
```
For Hugging Face LLM models, optimize them as below:

1. If output is generated through a direct call to `model`, optimize it as below:
   ```python
   model = torch.compile(model, backend='zentorch')
   with torch.no_grad():
       output = model(input)
   ```
2. If output is generated through a call to `model.forward`, optimize it as below:
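A self-contained sketch of this pattern, using a toy `torch.nn.Linear` as a hypothetical stand-in for the model; with _zentorch_ installed, pass `backend='zentorch'` to `torch.compile` exactly as in the examples above (the sketch omits it so it runs without the plugin):

```python
import torch

# Toy stand-in for the real model (hypothetical)
model = torch.nn.Linear(4, 4)

# Compile only the forward method; with zentorch installed, add
# backend='zentorch' to this torch.compile call.
model.forward = torch.compile(model.forward)

with torch.no_grad():
    output = model.forward(torch.randn(1, 4))
```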
The default level of logs is **WARNING** for both cpp and python sources, but it can be changed.
>INFO: Since all OPs implemented in _zentorch_ are registered with torch using the TORCH_LIBRARY() and TORCH_LIBRARY_IMPL() macros in bindings, the PyTorch profiler can be used without any modifications to measure the op level performance.
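For instance, the standard PyTorch profiler can wrap the compiled model's inference. The snippet below is a generic sketch with a toy module standing in for a model compiled with the `'zentorch'` backend; with _zentorch_, its registered ops would appear in the table alongside the `aten::` entries:

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Toy model standing in for one compiled with backend='zentorch' (hypothetical)
model = torch.nn.Linear(8, 8)
inp = torch.randn(2, 8)

with torch.no_grad():
    with profile(activities=[ProfilerActivity.CPU]) as prof:
        model(inp)

# Per-op timings; ops registered via TORCH_LIBRARY show up here as well
table = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
print(table)
```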
## 4.3 Support for `TORCH_COMPILE_DEBUG`
PyTorch offers a debugging toolbox that comprises a built-in stats and trace function. This functionality facilitates the display of the time spent by each compilation phase, output code, output graph visualization, and IR dump. `TORCH_COMPILE_DEBUG` invokes this debugging tool that allows for better problem-solving while troubleshooting the internal issues of TorchDynamo and TorchInductor. This functionality works for the models optimized using _zentorch_, so it can be leveraged to debug these models as well. To enable this functionality, users can either set the environment variable `TORCH_COMPILE_DEBUG=1` or specify the environment variable with the runnable file (e.g., test.py) as input.
```bash
# test.py contains model optimized by torch.compile with 'zentorch' as backend
TORCH_COMPILE_DEBUG=1 python test.py
```

zentorch v4.2.0 is supported with ZenDNN v4.2. Please see the **Tuning Guidelines**.
# 6. Additional Utilities: