Edit mixed precision pages in feature guide. #3755

dwelsch-esi · 2025-01-28T17:51:04Z

No description provided.

Signed-off-by: Dave Welsch <[email protected]>

dwelsch-esi · 2025-01-28T21:56:43Z

Docs/featureguide/mixed precision/mmp.rst

+Prerequisites
+-------------
+
+Manual mixed precision is supported only on PyTorch models.


Is this true? I assumed so because I didn't see APIs for TensorFlow or ONNX.

dwelsch-esi · 2025-01-28T21:58:53Z

Docs/featureguide/mixed precision/mmp.rst

-* Change the precision of all the layers in the model of a certain type
-* Change the precision of model input tensors (or only a subset of input tensors)
-* Change the precision of model output tensors (or only a subset of output tensors)
+* A leaf layer


How is a leaf layer defined?

dwelsch-esi · 2025-01-28T22:01:20Z

Docs/featureguide/mixed precision/mmp.rst

+        Not supported.
+
+
+The ``apply`` call generates a report detailing how the request was inferred, propagated to other layers, and eventually realized.


Where is the report saved?

dwelsch-esi · 2025-01-28T22:06:12Z

Docs/featureguide/mixed precision/amp.rst

-    Layer Groups are defined as a group of layers grouped together based on certain rules.
-    This helps in reducing search space over which the mixed precision algorithm operates.
-    It also ensures that we search only over the valid bit-width settings for parameters and activations.
+Layer Groups are defined based on certain rules.


What are the rules and how are they defined?

dwelsch-esi · 2025-01-28T22:08:07Z

Docs/featureguide/mixed precision/amp.rst


-    .. image:: ../../images/pareto.png
-        :width: 900px
+An example of a Pareto list:


How is this list different from the accuracy list in the previous phase?
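
For context on the question, here is a minimal sketch of what a Pareto list usually means: from a set of candidates, keep only those that no other candidate beats on both cost and accuracy. The `pareto_front` helper and the (relative bit-ops, accuracy) tuples are hypothetical, not taken from the guide or the library, and the numbers are made up for illustration. Whether the guide's Pareto list is built this way is exactly what the page should state.

```python
from typing import List, Tuple

# Hypothetical candidate: (relative bit-ops cost, accuracy). Values are illustrative only.
Candidate = Tuple[float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that are not beaten on both cost and accuracy."""
    front: List[Candidate] = []
    best_acc = float("-inf")
    # Visit cheapest first; among equal costs, the most accurate first.
    for cost, acc in sorted(candidates, key=lambda c: (c[0], -c[1])):
        # Keep a point only if it is more accurate than every cheaper point seen so far.
        if acc > best_acc:
            front.append((cost, acc))
            best_acc = acc
    return front


candidates = [(1.0, 0.76), (0.5, 0.74), (0.6, 0.70), (0.25, 0.61), (0.3, 0.60)]
front = pareto_front(candidates)
```

Under this reading, an accuracy list would hold every evaluated candidate, while the Pareto list would hold only the subset worth choosing from.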

dwelsch-esi · 2025-01-28T22:09:26Z

Docs/featureguide/mixed precision/amp.rst


-Use Cases
-=========
+Conversion operations (convert ops) are introduced in the mixed-precision model for transition between ops with different activation bit widths or data types (float vs int). Convert ops contribute to the inference time along with bit-operations of ops.


I assume that 'convert op' is jargon for 'conversion operation'. Is that the case?
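
If that reading is right, the paragraph could be illustrated with a minimal sketch that walks a linear chain of ops and counts the places where adjacent activation precisions differ, each of which would need a convert op. The `OpPrecision` type and `count_convert_ops` helper are hypothetical and assume a simple sequential model; they are not the library's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OpPrecision:
    """Hypothetical record of one op's activation precision."""
    name: str
    bitwidth: int
    dtype: str  # "int" or "float"


def count_convert_ops(ops: List[OpPrecision]) -> int:
    """Count transitions where bit width or data type changes between consecutive ops."""
    return sum(
        1
        for prev, cur in zip(ops, ops[1:])
        if (prev.bitwidth, prev.dtype) != (cur.bitwidth, cur.dtype)
    )


chain = [
    OpPrecision("conv1", 8, "int"),
    OpPrecision("conv2", 8, "int"),
    OpPrecision("conv3", 16, "int"),    # bit width changes: 8 -> 16
    OpPrecision("fc", 16, "float"),     # data type changes: int -> float
]
num_converts = count_convert_ops(chain)
```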

Edit mixed precision pages in feature guide.

b152ca5

Signed-off-by: Dave Welsch <[email protected]>
