diff --git a/README.md b/README.md index e4ce1c4e04438..f7844e0236087 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ -> 🚀 We are granting pilot access to **Ivy's Compiler and Transpiler** +> 🚀 We are granting pilot access to **Ivy's Tracer and Transpiler** > to some users, [join the waitlist](https://console.unify.ai/) if you > want to test them out! @@ -131,8 +131,8 @@ deploy systems. Feel free to head over to the docs for the full API reference, but the functions you'd most likely want to use are: ``` python -# Compiles a function into an efficient fully-functional graph, removing all wrapping and redundant code -ivy.compile() +# Traces an efficient fully-functional graph from a function, removing all wrapping and redundant code +ivy.trace_graph() # Converts framework-specific code to a different framework ivy.transpile() @@ -142,8 +142,8 @@ ivy.unify() ``` These functions can be used eagerly or lazily. If you pass the necessary -arguments for function tracing, the compilation/transpilation step will -happen instantly (eagerly). Otherwise, the compilation/transpilation +arguments for function tracing, the tracing/transpilation step will +happen instantly (eagerly). Otherwise, the tracing/transpilation will happen only when the returned function is first invoked.
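The eager-versus-lazy behaviour described in the README hunk above can be sketched in plain Python. This is only an illustration of the dispatch pattern — `trace_graph` here is a hypothetical stand-in, not Ivy's actual implementation:

```python
# Hypothetical stand-in for ivy.trace_graph(), illustrating only the
# eager/lazy dispatch described above -- not Ivy's real implementation.
def trace_graph(fn, args=None):
    def traced_fn(*call_args):
        return fn(*call_args)

    traced_fn.traced = False

    if args is not None:
        # Eager: example arguments were supplied, so "trace" immediately.
        traced_fn.traced = True
        return traced_fn

    def lazy_fn(*call_args):
        # Lazy: defer "tracing" until the returned function is first invoked.
        if not traced_fn.traced:
            traced_fn.traced = True
        return traced_fn(*call_args)

    lazy_fn.inner = traced_fn
    return lazy_fn
```

Either way the returned callable computes the same result; only the moment at which tracing happens differs.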
``` python diff --git a/docs/overview/contributing/error_handling.rst b/docs/overview/contributing/error_handling.rst index be30a9796d38c..2ddfa6be2c4ca 100644 --- a/docs/overview/contributing/error_handling.rst +++ b/docs/overview/contributing/error_handling.rst @@ -26,7 +26,7 @@ This section, "Error Handling" aims to assist you in navigating through some com E with_out=False, E instance_method=False, E test_gradients=False, - E test_compile=None, + E test_trace=None, E as_variable=[False], E native_arrays=[False], E container=[False], @@ -65,7 +65,7 @@ This section, "Error Handling" aims to assist you in navigating through some com E with_out=False, E instance_method=False, E test_gradients=True, - E test_compile=None, + E test_trace=None, E as_variable=[False], E native_arrays=[False], E container=[False], @@ -129,7 +129,7 @@ This section, "Error Handling" aims to assist you in navigating through some com E with_out=False, E instance_method=False, E test_gradients=False, - E test_compile=None, + E test_trace=None, E as_variable=[False], E native_arrays=[False], E container=[False], diff --git a/docs/overview/deep_dive/containers.rst b/docs/overview/deep_dive/containers.rst index 13521ec772f17..bfcc94e048bbe 100644 --- a/docs/overview/deep_dive/containers.rst +++ b/docs/overview/deep_dive/containers.rst @@ -252,8 +252,8 @@ There may be some compositional functions which are not implicitly nestable for One such example is the :func:`ivy.linear` function which is not implicitly nestable despite being compositional. This is because of the use of special functions like :func:`__len__` and :func:`__list__` which, among other functions, are not nestable and can't be made nestable. But we should try to avoid this, in order to make the flow of computation as intuitive to the user as possible. -When compiling the code, the computation graph is **identical** in either case, and there will be no implications on performance whatsoever. 
-The implicit nestable solution may be slightly less efficient in eager mode, as the leaves of the container are traversed multiple times rather than once, but if performance is of concern then the code should always be compiled in any case. +When tracing the code, the computation graph is **identical** in either case, and there will be no implications on performance whatsoever. +The implicit nestable solution may be slightly less efficient in eager mode, as the leaves of the container are traversed multiple times rather than once, but if performance is of concern then the code should always be traced in any case. The distinction is only really relevant when stepping through and debugging with eager mode execution, and for the reasons outlined above, the preference is to keep compositional functions implicitly nestable where possible. **Shared Nested Structure** diff --git a/docs/overview/deep_dive/ivy_frontends.rst b/docs/overview/deep_dive/ivy_frontends.rst index 8214423af7e3b..ac7f1aab1ea14 100644 --- a/docs/overview/deep_dive/ivy_frontends.rst +++ b/docs/overview/deep_dive/ivy_frontends.rst @@ -92,12 +92,12 @@ The former set of functions map very closely to the API for the Accelerated Line The latter set of functions map very closely to NumPy's well known API. In general, all functions in the :mod:`jax.numpy` namespace are themselves implemented as a composition of the lower-level functions in the :mod:`jax.lax` namespace. -When transpiling between frameworks, the first step is to compile the computation graph into low level python functions for the source framework using Ivy's graph compiler, before then replacing these nodes with the associated functions in Ivy's frontend API. +When transpiling between frameworks, the first step is to trace a computation graph of low level python functions for the source framework using Ivy's tracer, before then replacing these nodes with the associated functions in Ivy's frontend API. 
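The two-step transpilation process just described (trace a graph of low-level source-framework ops, then swap each node for its counterpart in Ivy's frontend API) can be sketched as follows. The graph representation, op names, and mapping table are purely illustrative assumptions, not Ivy internals:

```python
# Illustrative only: a traced graph modelled as an ordered list of low-level
# source-framework op names, plus a (hypothetical) table mapping each op to
# its Ivy frontend equivalent.
FRONTEND_EQUIVALENT = {
    "jax.lax.add": "ivy.functional.frontends.jax.lax.add",
    "jax.lax.mul": "ivy.functional.frontends.jax.lax.mul",
}

def replace_nodes(traced_ops):
    # Second step of transpilation: swap every node in the traced graph
    # for the associated function in the frontend API.
    return [FRONTEND_EQUIVALENT[op] for op in traced_ops]
```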
Given that all jax code can be decomposed into :mod:`jax.lax` function calls, when transpiling JAX code it should always be possible to express the computation graph as a composition of only :mod:`jax.lax` functions. Therefore, arguably these are the *only* functions we should need to implement in the JAX frontend. -However, in general we wish to be able to compile a graph in the backend framework with varying levels of dynamicism. +However, in general we wish to be able to trace a graph in the backend framework with varying levels of dynamicism. A graph of only :mod:`jax.lax` functions chained together in general is more *static* and less *dynamic* than a graph which chains :mod:`jax.numpy` functions together. -We wish to enable varying extents of dynamicism when compiling a graph with our graph compiler, and therefore we also implement the functions in the :mod:`jax.numpy` namespace in our frontend API for JAX. +We wish to enable varying extents of dynamicism when creating a graph with our tracer, and therefore we also implement the functions in the :mod:`jax.numpy` namespace in our frontend API for JAX. Thus, both :mod:`lax` and :mod:`numpy` modules are created in the JAX frontend API. We start with the function :func:`lax.add` as an example. diff --git a/docs/overview/deep_dive/superset_behaviour.rst b/docs/overview/deep_dive/superset_behaviour.rst index 5e232c7ceabd3..8a68e61eed23d 100644 --- a/docs/overview/deep_dive/superset_behaviour.rst +++ b/docs/overview/deep_dive/superset_behaviour.rst @@ -47,7 +47,7 @@ We've already explained that we should not duplicate arguments in the Ivy functi Does this mean, provided that the proposed argument is not a duplicate, that we should always add this backend-specific argument to the Ivy function? The answer is **no**. When determining the superset, we are only concerned with the pure **mathematics** of the function, and nothing else. 
-For example, the :code:`name` argument is common to many TensorFlow functions, such as `tf.concat `_, and is used for uniquely identifying parts of the compiled computation graph during logging and debugging. +For example, the :code:`name` argument is common to many TensorFlow functions, such as `tf.concat `_, and is used for uniquely identifying parts of the traced computation graph during logging and debugging. This has nothing to do with the mathematics of the function, and so is *not* included in the superset considerations when implementing Ivy functions. Similarly, in NumPy the argument :code:`subok` controls whether subclasses of the :class:`numpy.ndarray` class should be permitted, which is included in many functions, such as `numpy.ndarray.astype `_. Finally, in JAX the argument :code:`precision` is quite common, which controls the precision of the return values, as used in `jax.lax.conv `_ for example. @@ -129,8 +129,8 @@ The following would be a much better solution: return res You will notice that this implementation involves more lines of code, but this should not be confused with added complexity. -All Ivy code should be graph compiled for efficiency, and in this case all the :code:`if` and :code:`else` statements are removed, and all that remains is the backend functions which were executed. -This new implementation will be compiled to a graph of either one, three, four, or six functions depending on the values of :code:`beta` and :code:`threshold`, while the previous implementation would *always* compile to six functions. +All Ivy code should be traced for efficiency, and in this case all the :code:`if` and :code:`else` statements are removed, and all that remains is the backend functions which were executed. +This new implementation will be traced to a graph of either one, three, four, or six functions depending on the values of :code:`beta` and :code:`threshold`, while the previous implementation would *always* trace to six functions.
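The branch-elimination point above can be made concrete with a toy recording "tracer" — a sketch only, since Ivy's tracer records real backend functions rather than this stand-in. Only the operations actually executed for the given argument values end up in the graph; the :code:`if`/:code:`else` scaffolding itself never does:

```python
import math

def softplus_like(x, beta=None, threshold=None):
    # Toy stand-in: record each "backend op" that actually executes.
    # The op names and branching are illustrative, not Ivy's softplus.
    graph = []
    if beta is not None and beta != 1:
        x = x * beta
        graph.append("mul")
    x = math.log1p(math.exp(x))
    graph.append("softplus")
    if beta is not None and beta != 1:
        x = x / beta
        graph.append("div")
    if threshold is not None:
        graph.append("where")  # stand-in for the thresholding op
    return x, graph
```

Calling with the defaults records a single op, while supplying both `beta` and `threshold` records four: the recorded graph size depends on the argument values, exactly as the paragraph above describes.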
This does mean we do not adopt the default values used by PyTorch, but that's okay. Implementing the superset does not mean adopting the same default values for arguments, it simply means equipping the Ivy function with the capabilities to execute the superset of behaviours. @@ -167,7 +167,7 @@ With regards to both of these points, Ivy provides the generalized superset impl However, as discussed above, :func:`np.logical_and` also supports the :code:`where` argument, which we opt to **not** support in Ivy. This is because the behaviour can easily be created as a composition like so :code:`ivy.where(mask, ivy.logical_and(x, y), ivy.zeros_like(mask))`, and we prioritize the simplicity, clarity, and function uniqueness in Ivy's API in this case, which comes at the cost of reduced runtime efficiency for some functions when using a NumPy backend. -However, in future releases our automatic graph compilation and graph simplification processes will alleviate these minor inefficiencies entirely from the final computation graph, by fusing multiple operations into one at the API level where possible. +However, in future releases our automatic graph tracing and graph simplification processes will alleviate these minor inefficiencies entirely from the final computation graph, by fusing multiple operations into one at the API level where possible. 
Maximizing Usage of Native Functionality ---------------------------------------- diff --git a/docs/overview/design.rst b/docs/overview/design.rst index ea32c66512596..a8cc4e382338b 100644 --- a/docs/overview/design.rst +++ b/docs/overview/design.rst @@ -29,7 +29,7 @@ If that sounds like you, feel free to check out the `Deep Dive`_ section after y | back-end functional APIs ✅ | Ivy functional API ✅ | Framework Handler ✅ -| Ivy Compiler 🚧 +| Ivy Tracer 🚧 | | (b) `Ivy as a Transpiler `_ | front-end functional APIs 🚧 diff --git a/docs/overview/design/building_blocks.rst b/docs/overview/design/building_blocks.rst index 3adcc9c4287b6..249e48050e006 100644 --- a/docs/overview/design/building_blocks.rst +++ b/docs/overview/design/building_blocks.rst @@ -355,26 +355,26 @@ A good example is :func:`ivy.lstm_update`, as shown: We *could* find and wrap the functional LSTM update methods for each backend framework which might bring a small performance improvement, but in this case there are no functional LSTM methods exposed in the official functional APIs of the backend frameworks, and therefore the functional LSTM code which does exist for the backends is much less stable and less reliable for wrapping into Ivy. Generally, we have made decisions so that Ivy is as stable and scalable as possible, minimizing dependencies to backend framework code where possible with minimal sacrifices in performance. -Graph Compiler 🚧 +Tracer 🚧 ----------------- “What about performance?” I hear you ask. This is a great point to raise! With the design as currently presented, there would be a small performance hit every time we call an Ivy function by virtue of the added Python wrapping. -One reason we created the graph compiler was to address this issue. +One reason we created the tracer was to address this issue. -The compiler takes in any Ivy function, backend function, or composition, and returns the computation graph using the backend functional API only. 
+The tracer takes in any Ivy function, backend function, or composition, and returns the computation graph using the backend functional API only. The dependency graph for this process looks like this: .. image:: https://github.com/unifyai/unifyai.github.io/blob/main/img/externally_linked/design/compiler_dependency_graph.png?raw=true :align: center :width: 75% -Let's look at a few examples, and observe the compiled graph of the Ivy code against the native backend code. +Let's look at a few examples, and observe the traced graph of the Ivy code against the native backend code. First, let's set our desired backend as PyTorch. -When we compile the three functions below, despite the fact that each -has a different mix of Ivy and PyTorch code, they all compile to the same graph: +When we trace the three functions below, despite the fact that each +has a different mix of Ivy and PyTorch code, they all trace to the same graph: +----------------------------------------+-----------------------------------------+-----------------------------------------+ |.. code-block:: python |.. code-block:: python |.. code-block:: python | @@ -393,7 +393,7 @@ has a different mix of Ivy and PyTorch code, they all compile to the same graph: | x = ivy.array([[1., 2., 3.]]) | x = torch.tensor([[1., 2., 3.]]) | x = ivy.array([[1., 2., 3.]]) | | | | | | # create graph | # create graph | # create graph | -| graph = ivy.compile_graph( | graph = ivy.compile_graph( | graph = ivy.compile_graph( | +| graph = ivy.trace_graph( | graph = ivy.trace_graph( | graph = ivy.trace_graph( | | pure_ivy, x) | pure_torch, x) | mix, x) | | | | | | # call graph | # call graph | # call graph | @@ -408,7 +408,7 @@ For all existing ML frameworks, the functional API is the backbone that underpin This means that under the hood, any code can be expressed as a composition of ops in the functional API. The same is true for Ivy. 
Therefore, when compiling the graph with Ivy, any higher-level classes or extra code which does not directly contribute towards the computation graph is excluded. -For example, the following 3 pieces of code all compile to the exact same computation graph as shown: +For example, the following 3 pieces of code all result in the exact same computation graph when traced as shown: +----------------------------------------+-----------------------------------------+-----------------------------------------+ |.. code-block:: python |.. code-block:: python |.. code-block:: python | @@ -427,9 +427,9 @@ For example, the following 3 pieces of code all compile to the exact same comput | | -1, 1, (3, 3)) | -1, 1, (3, 3)) | | # input | b = ivy.zeros((3,)) | b = ivy.zeros((3,)) | | x = ivy.array([1., 2., 3.]) | | | -| | # compile graph | # compile graph | -| # compile graph | graph = ivy.compile_graph( | graph = ivy.compile_graph( | -| net.compile_graph(x) | clean, x, w, b) | unclean, x, w, b) | +| | # trace graph | # trace graph | +| # trace graph | graph = ivy.trace_graph( | graph = ivy.trace_graph( | +| net.trace_graph(x) | clean, x, w, b) | unclean, x, w, b) | | | | | | # execute graph | # execute graph | # execute graph | | net(x) | graph(x, w, b) | graph(x, w, b) | @@ -439,8 +439,8 @@ For example, the following 3 pieces of code all compile to the exact same comput :align: center :width: 75% -This compilation is not restricted to just PyTorch. -Let's take another example, but compile to Tensorflow, NumPy, and JAX: +This tracing is not restricted to just PyTorch. +Let's take another example, but trace to Tensorflow, NumPy, and JAX: +------------------------------------+ |.. 
code-block:: python | @@ -454,7 +454,7 @@ Let's take another example, but compile to Tensorflow, NumPy, and JAX: | x = ivy.array([[1., 2., 3.]]) | | y = ivy.array([[2., 3., 4.]]) | | # create graph | -| graph = ivy.compile_graph( | +| graph = ivy.trace_graph( | | ivy_func, x, y) | | | | # call graph | @@ -486,13 +486,13 @@ Jax: :width: 75% | -The example above further emphasizes that the graph compiler creates a computation graph consisting of backend functions, not Ivy functions. -Specifically, the same Ivy code compiles to different graphs depending on the selected backend. -However, when compiling native framework code, we are only able to compile a graph for that same framework. -For example, we cannot take torch code and compile this into tensorflow code. +The example above further emphasizes that the tracer creates a computation graph consisting of backend functions, not Ivy functions. +Specifically, the same Ivy code is traced to different graphs depending on the selected backend. +However, when compiling native framework code, we are only able to trace a graph for that same framework. +For example, we cannot take torch code and trace this into tensorflow code. However, we can transpile torch code into tensorflow code (see `Ivy as a Transpiler `_ for more details). -The graph compiler does not compile to C++, CUDA, or any other lower level language. +The tracer is not a compiler and does not compile to C++, CUDA, or any other lower level language. It simply traces the backend functional methods in the graph, stores this graph, and then efficiently traverses this graph at execution time, all in Python. Compiling to lower level languages (C++, CUDA, TorchScript etc.) 
is supported for most backend frameworks via :func:`ivy.compile`, which wraps backend-specific compilation code, for example: @@ -524,6 +524,6 @@ Therefore, the backend code can always be run with maximal efficiency by compili **Round Up** -Hopefully, this has painted a clear picture of the fundamental building blocks underpinning the Ivy framework, being the backend functional APIs, Ivy functional API, backend handler, and graph compiler 🙂 +Hopefully, this has painted a clear picture of the fundamental building blocks underpinning the Ivy framework, being the Backend functional APIs, Ivy functional API, Backend handler, and Tracer 😄 Please reach out on `discord `_ if you have any questions! diff --git a/docs/overview/design/ivy_as_a_framework.rst b/docs/overview/design/ivy_as_a_framework.rst index bf1201048a94b..fd88a46f8113c 100644 --- a/docs/overview/design/ivy_as_a_framework.rst +++ b/docs/overview/design/ivy_as_a_framework.rst @@ -1,7 +1,7 @@ Ivy as a Framework ================== -On the `Building Blocks `_ page, we explored the role of the backend functional APIs, the Ivy functional API, the framework handler, and the graph compiler. +On the `Building Blocks `_ page, we explored the role of the Backend functional APIs, the Ivy functional API, the Backend handler, and the Tracer. These are parts labeled as (a) in the image below. On the `Ivy as a Transpiler `_ page, we explained the role of the backend-specific frontends in Ivy, and how these enable automatic code conversions between different ML frameworks. 
diff --git a/docs/overview/design/ivy_as_a_framework/ivy_stateful_api.rst b/docs/overview/design/ivy_as_a_framework/ivy_stateful_api.rst index 3c6574b884d04..c98bb5e860de5 100644 --- a/docs/overview/design/ivy_as_a_framework/ivy_stateful_api.rst +++ b/docs/overview/design/ivy_as_a_framework/ivy_stateful_api.rst @@ -427,18 +427,18 @@ The implementation is as follows: def __init__(self, lr=1e-4, beta1=0.9, beta2=0.999, epsilon=1e-07, inplace=None, - stop_gradients=True, compile_on_next_step=False, + stop_gradients=True, trace_on_next_step=False, dev=None): ivy.Optimizer.__init__( self, lr, inplace, stop_gradients, True, - compile_on_next_step, dev) + trace_on_next_step, dev) self._beta1 = beta1 self._beta2 = beta2 self._epsilon = epsilon self._mw = None self._vw = None self._first_pass = True - self._should_compile = False + self._should_trace = False # Custom Step diff --git a/docs/overview/design/ivy_as_a_transpiler.rst b/docs/overview/design/ivy_as_a_transpiler.rst index 50dd33d747ada..a7497d5b2f6ec 100644 --- a/docs/overview/design/ivy_as_a_transpiler.rst +++ b/docs/overview/design/ivy_as_a_transpiler.rst @@ -1,7 +1,7 @@ Ivy as a Transpiler =================== -On the `Building Blocks `_ page, we explored the role of the backend functional APIs, the Ivy functional API, the backend handler, and the graph compiler. +On the `Building Blocks `_ page, we explored the role of the Backend functional APIs, the Ivy functional API, the Backend handler, and the Tracer. These parts are labelled (a) in the image below. Here, we explain the role of the backend-specific frontends in Ivy, and how these enable automatic code conversions between different ML frameworks. 
@@ -164,11 +164,11 @@ Again, by chaining these methods together, we can now call :func:`tf.math.cumpro x = torch.tensor([[0., 1., 2.]]) ret = tf.math.cumprod(x, -1) -Role of the Graph Compiler 🚧 +Role of the Tracer 🚧 ----------------------------- -The very simple example above worked well, but what about even more complex PyTorch code involving Modules, Optimizers, and other higher level objects? This is where the graph compiler plays a vital role. -The graph compiler can convert any code into its constituent functions at the functional API level for any ML framework. +The very simple example above worked well, but what about even more complex PyTorch code involving Modules, Optimizers, and other higher level objects? This is where the tracer plays a vital role. +The tracer can convert any code into its constituent functions at the functional API level for any ML framework. For example, let’s take the following PyTorch code and run it using JAX: @@ -192,7 +192,7 @@ For example, let’s take the following PyTorch code and run it using JAX: We cannot simply :code:`import ivy.frontends.torch` in place of :code:`import torch` as we did in the previous examples. This is because the Ivy frontend only supports the functional API for each framework, whereas the code above makes use of higher level classes through the use of the :mod:`torch.nn` namespace. -In general, the way we convert code is by first compiling the code into its constituent functions in the core API using Ivy’s graph compiler, and then we convert this executable graph into the new framework. +In general, the way we convert code is by first decomposing the code into its constituent functions in the core API using Ivy’s tracer, and then we convert this executable graph into the new framework. For the example above, this would look like: .. 
code-block:: python @@ -200,11 +200,11 @@ For the example above, this would look like: import jax import ivy - jax_graph = ivy.compile_graph(net, x).to_backend('jax') + jax_graph = ivy.trace_graph(net, x).to_backend('jax') x = jax.numpy.array([1., 2., 3.]) jax_graph(x) -However, when calling :func:`ivy.compile_graph` the graph only connects the inputs to the outputs. +However, when calling :func:`ivy.trace_graph` the graph only connects the inputs to the outputs. Any other tensors or variables which are not listed in the inputs are treated as constants in the graph. In this case, this means the learnable weights in the Module will be treated as constants. This works fine if we only care about running inference on our graph post-training, but this won’t enable training of the Module in JAX. @@ -219,15 +219,15 @@ In order to convert a model from PyTorch to JAX, we first must convert the :clas net = ivy.to_ivy_module(net) In its current form, the :class:`ivy.Module` instance thinly wraps the PyTorch model into the :class:`ivy.Module` interface, whilst preserving the pure PyTorch backend. -We can compile this network into a graph using Ivy’s graph compiler like so: +We can trace a graph of this network using Ivy’s tracer like so: .. code-block:: python - net = net.compile_graph() + net = net.trace_graph() In this case, the learnable weights are treated as inputs to the graph rather than constants. -Now, with a compiled graph under the hood of our model, we can call :meth:`to_backend` directly on the :class:`ivy.Module` instance to convert it to any backend of our choosing, like so: +Now, with a traced graph under the hood of our model, we can call :meth:`to_backend` directly on the :class:`ivy.Module` instance to convert it to any backend of our choosing, like so: ..
code-block:: python diff --git a/docs/overview/faq.rst b/docs/overview/faq.rst index e74df1b21dff7..6cb113df6a2f7 100644 --- a/docs/overview/faq.rst +++ b/docs/overview/faq.rst @@ -38,17 +38,17 @@ TensorFlow and PyTorch do allow dynamic sizes, but only on certain backends. Dynamic sizes require a dynamic memory manager, which CPUs/GPUs have, but XLA currently doesn't. How does Ivy deal with all of this? -**A:** Ivy assumes dynamic shapes are supported, but an error will be thrown if/when the function is compiled with dynamic shapes enabled, but the backend does not support dynamic shapes in the compiled graph. -For now, fully framework-agnostic compiled graphs are only possible for static graphs. +**A:** Ivy assumes dynamic shapes are supported, but an error will be thrown if/when a function is traced with dynamic shapes enabled and the backend does not support dynamic shapes in the traced graph. +For now, fully framework-agnostic traced graphs are only possible for static graphs. Type and Shape Checking ----------------------- **Q:** What kind of type system does Ivy use? Does it do shape-checking of tensors? If so, how does it handle dynamic sizes? The gold standard here is a fully dependent type system, but this is very rare, with the exception of `dex`_. -**A:** The checks performed during graph compilation will remain backend-specific. -The function :func:`ivy.compile` wraps the backend compilation functions, for example :func:`jax.jit`, :func:`tf.function`, :func:`torch.jit.script` and :func:`torch.jit.trace`. +**A:** The checks performed during compilation will remain backend-specific. +The function :func:`ivy.compile` wraps the backend tracing functions, for example :func:`jax.jit`, :func:`tf.function`, :func:`torch.jit.script` and :func:`torch.jit.trace`.
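The wrapping relationship mentioned in the FAQ answer above — :func:`ivy.compile` delegating to each backend's native routine — might look roughly like the following dispatch. This is an assumed structure for illustration only, not Ivy's actual source:

```python
def compile_native(fn, backend, example_inputs=None, **kwargs):
    # Hypothetical dispatcher: forward to the selected backend's own
    # compilation/tracing entry point.
    if backend == "jax":
        import jax
        return jax.jit(fn, **kwargs)
    if backend == "tensorflow":
        import tensorflow as tf
        return tf.function(fn, **kwargs)
    if backend == "torch":
        import torch
        return torch.jit.trace(fn, example_inputs)
    raise ValueError(f"no native compiler registered for backend {backend!r}")
```

Under a design like this, whether shape checks run at all is entirely up to the chosen backend routine, which is why the checks remain backend-specific.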
+For some backends, shape-checking will be performed during the tracing phase and for others it will not. GPU handling ------------ @@ -62,7 +62,7 @@ Model Deployment **Q:** Does Ivy support model deployment? **A:** Yes, Ivy will support efficient model deployment. -However, currently this feature is not yet supported as the graph compiler module is still under development, and will be released soon with ivy version 1.2.0. +However, currently this feature is not yet supported as the tracer module is still under development, and will be released soon with ivy version 1.2.0. Dynamic Control Flow @@ -78,9 +78,9 @@ How will Ivy handle dynamic control flow? Will Ivy parse python ASTs? **A:** For now, Ivy will not support dynamic control flow by parsing ASTs. -The dynamism of :code:`for` loops and :code:`while` loops will be ignored during compilation, and just the static trace which chains the array operations performed during the forward pass at compile time will be preserved. +The dynamism of :code:`for` loops and :code:`while` loops will be ignored during tracing, and just the static trace which chains the array operations performed during the forward pass at tracing time will be preserved. -However, Ivy will support the compilation of looping and branching methods such as :code:`lax.scan`, :code:`lax.while`, :code:`tf.while`, :code:`tf.cond` etc. +However, Ivy will support the tracing of looping and branching methods such as :code:`lax.scan`, :code:`lax.while`, :code:`tf.while`, :code:`tf.cond` etc. In cases where there is not an associated compilable method in other backends, we will strive to implement this as a composition of existing compilable operations. If such a composition is not possible, then we will instead convert these to compositions of pure Python :code:`for`, :code:`while` and :code:`if` statements (when using a PyTorch backend for example). @@ -121,7 +121,7 @@ We’re very happy in either case! 
Support for Functions --------------------- -**Q:** Is it possible to compile tensor code into a reusable and differentiable function? If you can't, then it will be difficult to apply any fancy kernel fusion algorithms, and you can expect to lose a lot of performance. +**Q:** Is it possible to trace tensor code into a reusable and differentiable function? If you can't, then it will be difficult to apply any fancy kernel fusion algorithms, and you can expect to lose a lot of performance. What about higher-order operations, like :code:`jax.vmap` and :code:`jax.pmap`? **A:** Most functions in Ivy are *primary* functions, which are generally implemented as light wrapping around a near-identical backend-specific function, which itself will likely map to an efficient kernel. @@ -137,7 +137,7 @@ Alternative Data Structures **Q:** Will Ivy support data structures such as tuples, dictionaries, lists etc.? For example, JAX code is full of them. **A:** We will of course support these structures in pure python code, but we will not support backend-specific alternative compilable data structures. -While Ivy will not provide an interface to these data structures directly, Ivy code can easily supplement JAX code which does contain these data structures, and both can be compiled together without issue. +While Ivy will not provide an interface to these data structures directly, Ivy code can easily supplement JAX code which does contain these data structures, and both can be traced together without issue. Ivy can act as a supplementary framework if/when some of the more unique backend-specific data structures are required. Custom Operations diff --git a/docs/overview/get_started.rst b/docs/overview/get_started.rst index 9d891f143e5c6..42b3e1e1f12c3 100644 --- a/docs/overview/get_started.rst +++ b/docs/overview/get_started.rst @@ -3,8 +3,8 @@ Get Started .. 
- If you want to use **Ivy's compiler and transpiler**, make sure to follow the - :ref:`setting up instructions for the API key ` + If you want to use **Ivy's tracer and transpiler**, make sure to follow the + :ref:`setting up instructions for the API key ` after installing Ivy! @@ -56,10 +56,10 @@ the `Contributing - Setting Up `_ page, where OS-specific and IDE-specific instructions and video tutorials to install Ivy are available! -Ivy's compiler and transpiler +Ivy's tracer and transpiler ----------------------------- -To use Ivy's compiler and transpiler, you'll need an **API key**. We are starting to +To use Ivy's tracer and transpiler, you'll need an **API key**. We are starting to grant pilot access to certain users, so you can `join the waitlist `_ if you want to get one! @@ -84,8 +84,8 @@ For reference, this would be equivalent to: Issues and Questions ~~~~~~~~~~~~~~~~~~~~ -If you find any issue or bug while using the compiler and/or the transpiler, please -raise an `issue in GitHub `_ and add the ``compiler`` +If you find any issue or bug while using the tracer and/or the transpiler, please +raise an `issue in GitHub `_ and add the ``tracer`` or the ``transpiler`` label accordingly. A member of the team will get back to you ASAP! Otherwise, if you haven't found a bug but want to ask a question, suggest something, or get help diff --git a/docs/overview/glossary.rst b/docs/overview/glossary.rst index a7fec1b41f195..e00facf819e3b 100644 --- a/docs/overview/glossary.rst +++ b/docs/overview/glossary.rst @@ -30,10 +30,10 @@ All of these new words can get confusing! We've created a glossary to help nail A wrapper function around native compiler functions, which uses lower level compilers such as XLA to compile to lower level languages such as C++, CUDA, TorchScript, etc. Graph Compiler - Graph compilers map the high-level computational graph coming from frameworks to operations that are executable on a specific device. 
+ Graph Compilers map the high-level computational graph coming from frameworks to operations that are executable on a specific device. - Ivy Graph Compiler - Ivy's Graph Compiler traces the graph as a composition of functions in the functional API in Python. + Ivy Tracer + Ivy's Tracer creates a graph as a composition of functions in the functional API in Python. Ivy Functional API Is used for defining complex models, the Ivy functional API does not implement its own backend but wraps around other frameworks functional APIs and brings them into alignment. diff --git a/docs/overview/one_liners.rst b/docs/overview/one_liners.rst index 0b11527b0b132..e3c53cbff6e47 100644 --- a/docs/overview/one_liners.rst +++ b/docs/overview/one_liners.rst @@ -4,10 +4,10 @@ One liners .. grid:: 1 1 3 3 :gutter: 4 - .. grid-item-card:: ``ivy.compile()`` - :link: one_liners/compile.rst + .. grid-item-card:: ``ivy.trace_graph()`` + :link: one_liners/trace.rst - Compiles a ``Callable`` or set of them into an Ivy graph. + Traces a ``Callable`` or set of them into an Ivy graph. .. grid-item-card:: ``ivy.transpile()`` :link: one_liners/transpile.rst @@ -25,6 +25,6 @@ One liners :hidden: :maxdepth: -1 - one_liners/compile.rst + one_liners/trace.rst one_liners/transpile.rst one_liners/unify.rst diff --git a/docs/overview/one_liners/compile.rst b/docs/overview/one_liners/trace.rst similarity index 69% rename from docs/overview/one_liners/compile.rst rename to docs/overview/one_liners/trace.rst index 98d3cfd826a3a..05000be5870d2 100644 --- a/docs/overview/one_liners/compile.rst +++ b/docs/overview/one_liners/trace.rst @@ -1,35 +1,35 @@ -``ivy.compile()`` -================= +``ivy.trace_graph()`` +===================== .. - ⚠️ **Warning**: The compiler and the transpiler are not publicly available yet, so certain parts of this doc won't work as expected as of now! 
+ ⚠️ **Warning**: The tracer and the transpiler are not publicly available yet, so certain parts of this doc won't work as expected as of now! When we call an Ivy function, there is always a small performance hit due to added Python wrapping. This overhead becomes increasingly noticeable when we use large -models with multiple function calls. The Graph Compiler improves the performance of +models with multiple function calls. The Tracer improves the performance of Ivy by removing the extra wrapping around each function call. -The Graph Compiler takes in any Ivy function, framework-specific (backend) function, +The Tracer takes in any Ivy function, framework-specific (backend) function, or composition of both, and produces a simplified executable computation graph composed of functions from the backend functional API only, which results in: -- Simplified code: The Graph Compiler simplifies the code by removing all the wrapping +- Simplified code: The Tracer simplifies the code by removing all the wrapping and functions that don't contribute to the output: print statements, loggers, etc. -- Improved performance: The compiled graph has no performance overhead due to Ivy's +- Improved performance: The created graph has no performance overhead due to Ivy's function wrapping, likewise, redundant operations from the original function are also removed, increasing its overall performance. -Compiler API +Tracer API ------------ -.. py:function:: ivy.compile(*objs, stateful = None, arg_stateful_idxs = None, kwarg_stateful_idxs = None, to = None, include_generators = True, array_caching = True, return_backend_compiled_fn = False, static_argnums = None, static_argnames = None, args = None, kwargs = None,) +.. 
py:function:: ivy.trace_graph(*objs, stateful = None, arg_stateful_idxs = None, kwarg_stateful_idxs = None, to = None, include_generators = True, array_caching = True, return_backend_traced_fn = False, static_argnums = None, static_argnames = None, args = None, kwargs = None,) - Compiles a ``Callable`` or set of them into an Ivy graph. If ``args`` or ``kwargs`` are specified, + Traces a ``Callable`` or set of them into an Ivy graph. If ``args`` or ``kwargs`` are specified, compilation is performed eagerly, otherwise, compilation will happen lazily. - :param objs: Callable(s) to compile and create a graph of. + :param objs: Callable(s) to trace and create a graph of. :type objs: ``Callable`` :param stateful: List of instances to be considered stateful during the graph compilation. :type stateful: ``Optional[List]`` @@ -37,14 +37,14 @@ Compiler API :type arg_stateful_idxs: ``Optional[List]`` :param kwarg_stateful_idxs: Keyword arguments to be considered stateful during the graph compilation. :type kwarg_stateful_idxs: ``Optional[List]`` - :param to: Backend that the graph will be compiled to. If not specified, the current backend will be used. + :param to: Backend that the graph will be traced to. If not specified, the current backend will be used. :type to: ``Optional[str]`` :param include_generators: Include array creation/generation functions as part of the graph. :type include_generators: ``bool`` :param array_caching: Cache the constant arrays that appear as arguments to the functions in the graph. :type array_caching: ``bool`` - :param return_backend_compiled_fn: Whether to apply the native compilers, i.e. tf.function, after ivy's compilation. - :type return_backend_compiled_fn: ``bool`` + :param return_backend_traced_fn: Whether to apply the native compilers, e.g. ``tf.function``, after Ivy's tracing. + :type return_backend_traced_fn: ``bool`` :param static_argnums: For jax's jit compilation.
:type static_argnums: ``Optional[Union[int, Iterable[int]]]`` :param static_argnames: For jax's jit compilation. @@ -54,12 +54,12 @@ Compiler API :param kwargs: Keyword arguments for obj. :type kwargs: ``Optional[dict]`` :rtype: ``Union[Graph, LazyGraph, ivy.Module, ModuleType]`` - :return: A compiled ``Graph`` or a non-initialized ``LazyGraph``. If the object is an ``ivy.Module``, the forward pass will be compiled and the same module will be returned. If the object is a ``ModuleType``, the function will return a copy of the module with every method lazily compiled. + :return: A ``Graph`` or a non-initialized ``LazyGraph``. If the object is an ``ivy.Module``, the forward pass will be traced and the same module will be returned. If the object is a ``ModuleType``, the function will return a copy of the module with every method lazily traced. -Using the compiler +Using the tracer ------------------ -To use the ``ivy.compile()`` function, you need to pass a callable object and the corresponding inputs +To use the ``ivy.trace_graph()`` function, you need to pass a callable object and the corresponding inputs to the function. Let's start with a simple function: @@ -81,10 +81,10 @@ Let's start with a simple function: x = ivy.array([1, 2, 3]) y = ivy.array([2, 3, 4]) - # Compile the function - compiled_fn = ivy.compile(fn, args=(x, y)) + # Trace the function + traced_fn = ivy.trace_graph(fn, args=(x, y)) -In this case, the compiled graph would be: +In this case, the created graph would be: .. image:: https://raw.githubusercontent.com/unifyai/unifyai.github.io/main/img/externally_linked/compiler/figure1.png @@ -93,49 +93,49 @@ From the graph, we can observe that: 1. As ``x`` and ``y`` are the only variables used when calculating the returned value ``z``, the non-contributing variable(s), ``k`` was not included in the graph. Function calls that don't contribute to the output like the ``print`` function were also excluded. -2. 
As we set the backend to ``torch`` during the compilation process, the traced functions are torch functions, and the input and output types are torch tensors. 3. The tensor shape in the graph only indicates the shape of the inputs the graph was - traced with. The compiler doesn't impose additional restrictions on the shape or + traced with. The tracer doesn't impose additional restrictions on the shape or datatype of the input array(s). .. code-block:: python # Original set of inputs - out = compiled_fn(x, y) + out = traced_fn(x, y) # Inputs of different shape a = ivy.array([[1., 2.]]) b = ivy.array([[2., 3.]]) # New set of inputs - out = compiled_fn(a, b) + out = traced_fn(a, b) Eager vs lazy Compilation ~~~~~~~~~~~~~~~~~~~~~~~~~ -The graph compiler runs the original function under the hood and tracks its computation -to create the compiled graph. The **eager compilation** method traces the graph in the -corresponding function call with the specified inputs before we use the compiled +The Tracer runs the original function under the hood and tracks its computation +to create the graph. The **eager compilation** method traces the graph in the +corresponding function call with the specified inputs before we use the traced function. -Instead of compiling functions before using them, Ivy also allows you to compile the +Instead of tracing functions before using them, Ivy also allows you to trace the function dynamically. This can be done by passing only the function to the -compile method and not including the function arguments. In this case, the output will be a +trace method and not including the function arguments. In this case, the output will be a ``LazyGraph`` instead of a ``Graph`` instance.
When this ``LazyGraph`` object is first invoked with -function arguments, it compiles the function and returns the output of the compiled +function arguments, it traces the function and returns the output of the traced function. Once the graph has been initialized, calls to the ``LazyGraph`` object will -use the compiled function to compute the outputs directly. +use the traced function to compute the outputs directly. .. code-block:: python - # Compile the function eagerly (compilation happens here) - eager_graph = ivy.compile(fn, args=(x, y)) + # Trace the function eagerly (tracing happens here) + eager_graph = ivy.trace_graph(fn, args=(x, y)) - # Compile the function lazily (compilation does not happen here) - lazy_graph = ivy.compile(fn) + # Trace the function lazily (tracing does not happen here) + lazy_graph = ivy.trace_graph(fn) - # Compile and return the output + # Trace and return the output out = lazy_graph(x, y) To sum up, lazy compilation enables you to delay the compilation process until you have @@ -144,12 +144,12 @@ compiling libraries, where it’s not feasible to provide valid arguments for ev function call. Now let's look at additional functionalities that you can find in the -compiler. +tracer. Array caching ~~~~~~~~~~~~~ -The compiler is able to cache constant arrays and their operations through the +The tracer is able to cache constant arrays and their operations through the ``array_caching`` flag, reducing computation time after compilation. .. code-block:: python import ivy ivy.set_backend("torch") x = ivy.array([1.]) def fn(x): a = ivy.array([[1., 2., 3.]]) b = ivy.array([[1., 2., 3.], [1., 2., 3.]]) z = x ** (a + b) return z - comp_func = ivy.compile(fn, args=(x,)) + comp_func = ivy.trace_graph(fn, args=(x,)) -When calling ``ivy.compile()``, the ``array_caching`` argument is set to ``True`` by +When calling ``ivy.trace_graph()``, the ``array_caching`` argument is set to ``True`` by default, which returns the following graph. ..
image:: https://raw.githubusercontent.com/unifyai/unifyai.github.io/main/img/externally_linked/compiler/figure2.png @@ -196,7 +196,7 @@ are included as nodes or "baked" into the graph. z = x ** a return z + torch.rand([1]) - comp_func = ivy.compile(fn, include_generators=True, args=(x,)) + comp_func = ivy.trace_graph(fn, include_generators=True, args=(x,)) Returns: @@ -215,7 +215,7 @@ And instead, z = x * a return z + torch.rand([1]) - comp_func = ivy.compile(fn, include_generators=False, args=(x,)) + comp_func = ivy.trace_graph(fn, include_generators=False, args=(x,)) Returns: @@ -241,32 +241,32 @@ arbitrary classes using the ``stateful`` parameters. cont = ivy.Container(x=x) args = (cont.cont_deep_copy(), x) - comp_func = ivy.compile(fn, arg_stateful_idxs=[[0]], args=args) + comp_func = ivy.trace_graph(fn, arg_stateful_idxs=[[0]], args=args) .. image:: https://raw.githubusercontent.com/unifyai/unifyai.github.io/main/img/externally_linked/compiler/figure6.png Sharp bits ---------- -As some parts of the graph compiler are still under development, there are some sharp +As some parts of the Tracer are still under development, there are some sharp bits to take into account when using it. All of these points are WIP, so they'll be removed soon! -1. **Dynamic control flow**: The compiled graph is built using function tracing at the +1. **Dynamic control flow**: The created graph is built using function tracing at the moment, so dynamic control flow such as conditional branches or conditional loops will not be registered correctly. As an example, if there is a while loop in your code that depends on a changing value, the number of iterations in the final graph will be the same as the number of iterations performed with the input passed to the - compile function. -2. **Non-framework-specific code**: As the compiler traces the function using the + trace function. +2. 
**Non-framework-specific code**: As the tracer traces the function using the functional API of the underlying framework, any piece of code inside the model that is not from the said framework will not be correctly registered, this includes other frameworks' code (such as NumPy statements inside a torch model) or Python statements such as len(). 3. **Incorrectly cached parts of the graph**: There are certain cases where compilation can succeed but hide some cached parts of the graph which shouldn't really be cached. - To check this, it's recommended to compile with a noise array of the same shape and - then check if the output of the original function and the compiled graph with another + To check this, it's recommended to trace with a noise array of the same shape and + then check if the output of the original function and the traced graph with another input is the same. If you find out that the graph is not right, feel free to open an `issue `_ with a minimal example and we'll look into it! @@ -274,7 +274,7 @@ removed soon! Examples -------- -Below, we compile a ResNet50 model from +Below, we trace a ResNet50 model from `Hugging Face `_ and use it to classify the breed of a cat. @@ -306,15 +306,15 @@ Normally, we would then feed these inputs to the model itself without compiling with torch.no_grad(): logits = model(**inputs).logits -With ivy, you can compile your model to a computation graph for increased performance. +With Ivy, you can trace your model to a computation graph for increased performance. .. code-block:: python - # Compiling the model - compiled_graph = ivy.compile(model, args=(**inputs,)) + # Tracing the model + traced_graph = ivy.trace_graph(model, kwargs=inputs) - # Using the compiled function - logits = compiled_graph(**inputs).logits + # Using the traced function + logits = traced_graph(**inputs).logits Time for the final output of our computation graph.
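The trace.rst changes above describe the Tracer as running the function once, tracking its computation, and dropping anything that does not contribute to the output (unused variables, ``print`` calls). As a conceptual aid, that mechanism can be sketched in plain Python. This is a toy illustration only, not Ivy's implementation: it records arithmetic on scalars via operator overloading, then prunes dead operations by walking back from the output, mirroring why ``k`` and the ``print`` call never appear in the graph.

```python
class Node:
    """One recorded operation in the toy graph."""
    def __init__(self, op, inputs, value):
        self.op, self.inputs, self.value = op, inputs, value

class Proxy:
    """Wraps a plain value and records every supported op applied to it."""
    def __init__(self, value, tape, node=None):
        self.value, self.tape = value, tape
        self.node = node if node is not None else Node("input", [], value)

    def _record(self, op, other, result):
        other_node = other.node if isinstance(other, Proxy) else Node("const", [], other)
        node = Node(op, [self.node, other_node], result)
        self.tape.append(node)
        return Proxy(result, self.tape, node)

    def __add__(self, other):
        rhs = other.value if isinstance(other, Proxy) else other
        return self._record("add", other, self.value + rhs)

    def __mul__(self, other):
        rhs = other.value if isinstance(other, Proxy) else other
        return self._record("mul", other, self.value * rhs)

def trace(fn, *args):
    """Run fn once on proxies; return the op tape and the output proxy."""
    tape = []
    out = fn(*[Proxy(a, tape) for a in args])
    return tape, out

def contributing(node, acc=None):
    """Walk back from the output: only ops reachable from it are kept."""
    acc = set() if acc is None else acc
    if node.op not in ("input", "const"):
        acc.add(node)
    for inp in node.inputs:
        contributing(inp, acc)
    return acc

def fn(x, y):
    z = x * y    # contributes to the output
    k = x + 2    # recorded during the run, but dead code
    print("side effects never enter the graph")
    return z + y

tape, out = trace(fn, 3, 4)
live = contributing(out.node)
# out.value == 16; three ops were recorded, two survive pruning
```

A real tracer records framework tensor ops rather than scalar arithmetic, and replays the pruned graph without re-executing the Python body, but the record-then-prune shape is the same.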
diff --git a/docs/overview/one_liners/transpile.rst b/docs/overview/one_liners/transpile.rst index 701be359e3165..ecd435b7b7362 100644 --- a/docs/overview/one_liners/transpile.rst +++ b/docs/overview/one_liners/transpile.rst @@ -1,9 +1,9 @@ ``ivy.transpile()`` -================= +=================== .. - ⚠️ **Warning**: The compiler and the transpiler are not publicly available yet, so certain parts of this doc won't work as expected as of now! + ⚠️ **Warning**: The tracer and the transpiler are not publicly available yet, so certain parts of this doc won't work as expected as of now! Ivy's Transpiler converts a function written in any framework into your framework of @@ -24,10 +24,10 @@ want to use to research, develop, or deploy systems. So if you want to: Ivy's Transpiler is definitely the tool for the job 🔧 -To convert the code, it traces a computational graph using the Graph Compiler and +To convert the code, it traces a computational graph using the Tracer and leverages Ivy's frontends and backends to link one framework to another. After swapping each function node in the computational graph with their equivalent Ivy frontend -functions, the compiler removes all the wrapping in the frontends and replaces them with the native +functions, the tracer removes all the wrapping in the frontends and replaces them with the native functions of the target framework. @@ -61,7 +61,7 @@ Transpiler API Using the transpiler -------------------- -Similar to the ``ivy.compile`` function, ``ivy.unify`` and ``ivy.transpile`` can be used +Similar to the ``ivy.trace_graph`` function, ``ivy.unify`` and ``ivy.transpile`` can be used eagerly and lazily. If you pass the necessary arguments, the function will be called instantly, otherwise, transpilation will happen the first time you invoke the function with the proper arguments.
@@ -178,7 +178,7 @@ another, at the moment we support ``torch.nn.Module`` when ``to="torch"``, Sharp bits ---------- -In a similar fashion to the compiler, the transpiler is under development and we are +In a similar fashion to the tracer, the transpiler is under development and we are still working on some rough edges. These include: 1. **Keras model subclassing**: If a model is transpiled to keras, the resulting @@ -195,15 +195,15 @@ still working on some rough edges. These include: 3. **Haiku transform with state**: As of now, we only support the transpilation of transformed Haiku modules, this means that ``transformed_with_state`` objects will not be correctly transpiled. -4. **Array format between frameworks**: As the compiler outputs a 1-to-1 mapping of the - compiled function, the format of the tensors is preserved when transpiling from a +4. **Array format between frameworks**: As the tracer outputs a 1-to-1 mapping of the + traced function, the format of the tensors is preserved when transpiling from a framework to another. As an example, if you transpile a convolutional block from PyTorch (which uses ``N, C, H, W``) to TensorFlow (which uses ``N, H, W, C``) and want to use it as part of a bigger (TensorFlow) model, you'll need to include a permute statement for the inference to be correct. -Keep in mind that the transpiler uses the graph compiler under the hood, so the -:ref:`sharp bits of the compiler ` +Keep in mind that the transpiler uses the Tracer under the hood, so the +:ref:`sharp bits of the tracer ` apply here as well! Examples diff --git a/docs/overview/one_liners/unify.rst b/docs/overview/one_liners/unify.rst index a07ac2fbf5b40..687ab07293f1f 100644 --- a/docs/overview/one_liners/unify.rst +++ b/docs/overview/one_liners/unify.rst @@ -1,9 +1,9 @@ ``ivy.unify()`` -================ +=============== .. - ⚠️ **Warning**: The compiler and the transpiler are not publicly available yet, so certain parts of this doc won't work as expected as of now!
+ ⚠️ **Warning**: The tracer and the transpiler are not publicly available yet, so certain parts of this doc won't work as expected as of now! Ivy's Unify function is an alias for ``ivy.transpile(..., to="ivy", ...)``. You can learn more about the transpiler in the `transpile() `_ page. diff --git a/docs/overview/related_work/what_does_ivy_add.rst b/docs/overview/related_work/what_does_ivy_add.rst index 14a407d24a751..da6dc8a94ddb5 100644 --- a/docs/overview/related_work/what_does_ivy_add.rst +++ b/docs/overview/related_work/what_does_ivy_add.rst @@ -51,11 +51,11 @@ It therefore extends what is possible in any of the specific individual framewor Graph Tracers ------------- -Ivy’s `Graph Compiler <../one_liners/compile>`_ exhibits similar properties to many of the framework-specific graph tracers. -Ivy’s graph compiler employs function tracing for computing the graph, and uses this graph as an intermediate representation during the transpilation process. -Of all the graph tracers, Ivy’s graph compiler is most similar to `torch.fx`_. +Ivy’s `Tracer <../one_liners/trace>`_ exhibits similar properties to many of the framework-specific graph tracers. +Ivy’s tracer employs function tracing for computing the graph, and uses this graph as an intermediate representation during the transpilation process. +Of all the graph tracers, Ivy’s tracer is most similar to `torch.fx`_. This is because :code:`torch.fx` also operates entirely in Python, without deferring to lower level languages for tracing and extracting the computation graph or the intermediate representation. -The main difference is that Ivy’s graph compiler is fully framework-agnostic; Ivy’s compiler is able to compile graphs from any framework, while framework-specific compilers are of course bound to their particular framework.
+The main difference is that Ivy’s tracer is fully framework-agnostic; Ivy’s tracer is able to trace graphs from any framework, while framework-specific tracers are of course bound to their particular framework. Exchange Formats ----------------
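As a closing conceptual note, the eager-versus-lazy behavior that the renamed trace.rst describes for ``LazyGraph`` (defer tracing until the wrapper is first called, then reuse the traced result on every later call) is a framework-independent pattern. The sketch below is hypothetical and uses made-up names: ``LazyGraph`` here is not Ivy's class, and ``fake_tracer`` is a stand-in for any expensive trace step.

```python
class LazyGraph:
    """Defers the expensive trace step until the wrapper is first called,
    then reuses the traced result. Illustrative only, not Ivy's API."""

    def __init__(self, fn, tracer):
        self._fn = fn
        self._tracer = tracer   # callable: (fn, *args) -> traced callable
        self._traced = None

    @property
    def initialized(self):
        return self._traced is not None

    def __call__(self, *args):
        if self._traced is None:          # first invocation: trace now
            self._traced = self._tracer(self._fn, *args)
        return self._traced(*args)        # later invocations: reuse

# A stand-in "tracer" that just counts how often tracing runs.
trace_count = 0
def fake_tracer(fn, *args):
    global trace_count
    trace_count += 1
    return fn                             # pretend fn is the traced graph

lazy = LazyGraph(lambda a, b: a + b, fake_tracer)
assert not lazy.initialized               # nothing traced yet
first = lazy(1, 2)                        # tracing happens here
second = lazy(3, 4)                       # reuses the traced function
# first == 3, second == 7, and fake_tracer ran exactly once
```

The design choice this illustrates is why lazy tracing suits library code: valid example arguments are often unavailable at definition time, so the trace is postponed until a caller supplies real inputs.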