Questions on writing a custom interpreter #10663

jatentaki · 2022-05-11T12:09:08Z

Hello, I'm trying to implement a custom interpreter that casts intermediate values between full and half precision, so that performant half-precision primitives are used where available without compromising the results of brittle ops like reduce_sum or exp. I give more context here. I'm following the guide here, and my code is visible here. I have two main questions.

Firstly, how to handle non-unary functions? The tutorial says "we handle unary functions so we don't worry about flattening/unflattening". I can't make this assumption and I can't figure out where to get info about the output tree shape, such that I can tree_unflatten the outputs of my routine.

Secondly, how do I handle subexpressions? I assume it should somehow be possible to recursively apply amp to the subfuns obtained in the line subfuns, bind_params = eqn.primitive.get_bind_params(eqn.params), but those are actually of type linear_util.WrappedFun, which confuses me a bit. Currently, when I apply my code to a function transformed with jit, my interpreter has no effect.

Finally, I would like to ask for feedback on this idea in general. In the linked Flax discussion thread I got the feedback that implementing AMP as a JAX transform, rather than in Flax specifically, is a good idea. That said, I recently found a Haiku implementation which seems to be Haiku-specific and rather complicated to use compared to a functional transform. Is there a reason the authors chose to implement it in a Haiku-specific way? Note: I am aware that this project also needs a loss/gradient scaling method, but that is already provided in Flax.

Thank you for your help.

jakevdp (Maintainer) · 2022-05-11T16:27:24Z

This is complicated and not well documented, and unfortunately there's not really any one-size-fits-all solution (it depends a lot on what you want your transform to do). But here's one example of flattening arguments to a function before passing it to a custom jaxpr evaluator: https://github.com/google/jax/blob/1b0be5095a62064820301d1fa25f3c38596e1ae2/jax/experimental/sparse/transform.py#L394-L402

One way to proceed is roughly this:

1. Call tree_flatten on your input arguments to get the flattened arguments and in_tree.
2. Convert these flattened args to flattened abstract values.
3. Call flatten_fun with in_tree to get the out_tree callback.
4. Call trace_to_jaxpr_dynamic to get the flattened out avals.
5. Evaluate the jaxpr (with your custom jaxpr interpreter) on the flattened inputs to get the flattened outputs.
6. Plug these and out_tree() into tree_unflatten to reconstruct the results.
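
As a rough illustration of the steps above, here is a minimal sketch that uses only public APIs instead of the internal flatten_fun and trace_to_jaxpr_dynamic helpers: jax.make_jaxpr with return_shape=True does the flattening and tracing and also reports the output pytree. This is a substitution made for the example, not the approach used in the linked sparse transform. The names eval_jaxpr, run_with_interpreter and model are hypothetical, and the interpreter simply re-binds every primitive unchanged; a real transform would put its per-primitive logic in that loop.

```python
import jax
import jax.numpy as jnp
from jax import tree_util


def eval_jaxpr(jaxpr, consts, *args):
    """Hypothetical minimal interpreter: re-binds each primitive unchanged."""
    env = {}

    def read(var):
        # Literals carry their value in `.val`; ordinary variables live in env.
        return var.val if hasattr(var, "val") else env[var]

    env.update(zip(jaxpr.constvars, consts))
    env.update(zip(jaxpr.invars, args))
    for eqn in jaxpr.eqns:
        invals = [read(v) for v in eqn.invars]
        # A custom transform would dispatch on eqn.primitive here.
        subfuns, bind_params = eqn.primitive.get_bind_params(eqn.params)
        outvals = eqn.primitive.bind(*subfuns, *invals, **bind_params)
        if not eqn.primitive.multiple_results:
            outvals = [outvals]
        for var, val in zip(eqn.outvars, outvals):
            env[var] = val
    return [read(v) for v in jaxpr.outvars]


def run_with_interpreter(fun, *args):
    # Trace on the original pytree arguments. The jaxpr itself is flat;
    # out_shape mirrors the pytree structure of fun's return value.
    closed_jaxpr, out_shape = jax.make_jaxpr(fun, return_shape=True)(*args)
    out_tree = tree_util.tree_structure(out_shape)
    # Flatten the inputs in the same order make_jaxpr used.
    flat_args = tree_util.tree_leaves(args)
    # Evaluate the flat jaxpr on the flat inputs.
    flat_out = eval_jaxpr(closed_jaxpr.jaxpr, closed_jaxpr.consts, *flat_args)
    # Rebuild the structured result.
    return tree_util.tree_unflatten(out_tree, flat_out)


def model(params, x):
    # Non-unary function with pytree input and pytree output.
    h = jnp.dot(x, params["w"]) + params["b"]
    return {"out": jnp.tanh(h), "norm": jnp.sum(h ** 2)}


params = {"w": jnp.ones((3, 2)), "b": jnp.zeros(2)}
x = jnp.ones((4, 3))
result = run_with_interpreter(model, params, x)
```

Here result has the same dict structure as model(params, x), so the caller never sees the flattened representation.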

Let me know if anything is unclear!

sharadmv (Collaborator) · 2022-05-17T01:55:46Z

Adding on to Jake's answer:

Handling higher-order primitives like xla_call and xla_pmap is a bit tricky. Since the internals are rapidly changing, I can't provide a super concrete solution that is future-proof. I can point you to code that is generally kept up-to-date though.

The high-level idea is to wrap a recursive call to your interpreter in lu.wrap_init and pass that into the higher-order primitive. Oryx has a couple examples of handling these primitives:

- The effect_handler interpreter: see the call_jaxpr logic in eval_jaxpr_with_state, default_call_interpreter_rule, and the other call rules.
- The propagate interpreter: this is a more complex interpreter that goes forward and backwards but still has call rules that recursively call propagate. See call_rule for an example that also has flattening/unflattening in the recursive calls.
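
To make the recursion idea concrete, here is a small sketch for a jaxpr-based interpreter. It does not use lu.wrap_init or reproduce the Oryx code. Instead it swaps in a simpler technique: when it meets a jit equation, it re-wraps a recursive call to the interpreter in the public jax.jit. It assumes that jit appears as an equation whose primitive is named "pjit" (or "jit" in newer releases) with a ClosedJaxpr stored under params["jaxpr"]; that is an internal detail that may change. The names RULES, interpret and swap are hypothetical, the add-to-mul rule is only there to make the effect visible, and other higher-order primitives (scan, cond, custom_jvp, ...) are not handled.

```python
from functools import partial

import jax
import jax.numpy as jnp
from jax import lax

# Hypothetical rewrite rule, for illustration only: evaluate add as mul.
RULES = {lax.add_p: lambda x, y: [lax.mul(x, y)]}

# Assumption: names under which jit equations appear in a jaxpr.
JIT_NAMES = ("pjit", "jit")


def interpret(jaxpr, consts, *args):
    env = {}

    def read(var):
        # Literals carry their value in `.val`; ordinary variables live in env.
        return var.val if hasattr(var, "val") else env[var]

    env.update(zip(jaxpr.constvars, consts))
    env.update(zip(jaxpr.invars, args))
    for eqn in jaxpr.eqns:
        invals = [read(v) for v in eqn.invars]
        if eqn.primitive.name in JIT_NAMES:
            # Recurse into the inner jaxpr, keeping a jit boundary around it.
            sub = eqn.params["jaxpr"]
            outvals = jax.jit(partial(interpret, sub.jaxpr, sub.consts))(*invals)
        elif eqn.primitive in RULES:
            outvals = RULES[eqn.primitive](*invals)
        else:
            subfuns, bind_params = eqn.primitive.get_bind_params(eqn.params)
            outvals = eqn.primitive.bind(*subfuns, *invals, **bind_params)
            if not eqn.primitive.multiple_results:
                outvals = [outvals]
        for var, val in zip(eqn.outvars, outvals):
            env[var] = val
    return [read(v) for v in jaxpr.outvars]


def swap(fun):
    # Simplification: flat array arguments and a single array output.
    def wrapped(*args):
        args = [jnp.asarray(a) for a in args]
        closed = jax.make_jaxpr(fun)(*args)
        (out,) = interpret(closed.jaxpr, closed.consts, *args)
        return out
    return wrapped


@jax.jit
def inner(a, b):
    return a + b


def outer(a, b):
    return inner(a, b) + b


swapped = swap(outer)(2.0, 5.0)
```

With the rule applied both outside and inside the jitted inner function, the two additions in outer are evaluated as multiplications. Without the JIT_NAMES branch, the jit equation would be bound as an opaque unit and the addition inside inner would be left untouched.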

jhn-nt · 2025-01-06T11:20:45Z

Hi all,

I am a novice to JAX and I am trying to learn its inner machinery. First, let me say that I am a huge fan of your work!
Thanks for sharing it with the community.

I am writing because I am interested in developing a "final style" custom interpreter, as suggested here.
I went through the code of Oryx, sparse and autodidax, but I am having a hard time understanding how to deal with pjit, specifically how to make my tracer work without using jax.disable_jit().

Here is a simple example I am working on, where I want to interpret an addition as a multiplication:

from jax import core, lax, grad, jit 
from jax.interpreters import partial_eval
import jax



def fun(a,b):
    return a+b

class SwapTracer(core.Tracer):
    def __init__(self,trace,value):
        self._trace=trace
        self.value=value

    @property
    def aval(self):
        return core.get_aval(self.value)
    
    def full_lower(self):
        return self


class SwapTrace(core.Trace):
    def __init__(self,parent_trace):
        self.parent_trace=parent_trace

    def process_primitive(self, primitive, tracers, params):
        print(f"Primitive: {primitive}")
        print(f"Tracers: {tracers}")
        print(f"Params: {params}")
        invals=[tracer.value if isinstance(tracer,SwapTracer) else tracer for tracer in tracers]

        print(f"Invals: {invals}")
        if primitive is lax.add_p:
            with core.set_current_trace(self.parent_trace):
                outvals=lax.mul_p.bind(*invals,**params)
        else:
            outvals=primitive.bind_with_trace(self.parent_trace,invals,params)
        print(f"Outvals: {outvals}")

        if primitive.multiple_results:
            out_tracers=[SwapTracer(self,outval) for outval in outvals]
        else:
            out_tracers=SwapTracer(self,outvals)

        print(f"Outtracers: {out_tracers}\n")
        return out_tracers 
    
    def process_call(self, call_primitive, f, tracers, params):
        print(f"Call Primitive: {call_primitive}")
        print(f"Tracers: {tracers}")
        print(f"Params: {params}")

        invals=[tracer.value if isinstance(tracer,SwapTracer) else tracer for tracer in tracers]
        outvals=call_primitive.bind(swap(f),*invals,**params)
        return SwapTracer(self,outvals)
    



def swap(f):
    def wrapped(*args):
        with core.take_current_trace() as parent_trace:
            print(f"Parent:{parent_trace}")
            trace=SwapTrace(parent_trace)
            print(f"Current:{trace}\n")
            in_tracers=[SwapTracer(trace,arg) for arg in args]
            with core.set_current_trace(trace):
                out_tracers=f(*in_tracers)
        return  out_tracers.value
    return wrapped
     

# This works
with jax.disable_jit(disable=True):
    result=swap(fun)(1.,5.)
print(f"Result without pjit: {result}\n")

# How to intercept PJIT?
with jax.disable_jit(disable=False):
    result=swap(fun)(1.,5.)
print(f"Result with pjit: {result}\n")

Apologies if the question is trivial, but I cannot seem to find a way around it.

Thank you very much.

Best
Giovanni
