
[enh] Named arguments for macros #13

Open
Technologicat opened this issue Oct 2, 2018 · 14 comments

Comments

@Technologicat

Technologicat commented Oct 2, 2018

On some occasions, being able to pass named arguments to macros would be useful.

Use case, related to the multilambda macro in unpythonic (rackety lambda with implicit begin, for Python):

from multilambda import macros, λ

myadd = λ(x, y)[print(x, y), x + y]
assert myadd(2, 3) == 5

echo = λ(x='hi there')[print(x), x]  # doesn't work, needs named arg support

(Usage, implementation.)

For the same use case, *args and **kwargs support would be really nice. :)
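For readers without the macro at hand, the intended semantics can be sketched in plain Python; the tuple-indexing trick below stands in for `begin` (evaluate everything, return the last value):

```python
# Plain-Python sketch of what λ(x, y)[print(x, y), x + y] is meant to
# expand to: a lambda whose body evaluates each expression in order
# and returns the last one, via a tuple indexed at [-1].
myadd = lambda x, y: (print(x, y), x + y)[-1]
assert myadd(2, 3) == 5

# The missing feature: a default value for an argument.
echo = lambda x='hi there': (print(x), x)[-1]
assert echo() == 'hi there'
```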

Thoughts?

[edit] update link.
[edit2] these links are now obsolete; the silly λ macro has been removed.

@azazel75
Owner

azazel75 commented Oct 3, 2018

I don't see a compelling argument here... just "would be really nice"?

@Technologicat
Author

The argument is that currently, it's not possible to do certain things - such as, in the use case above, defining a λ macro that allows default values for its arguments, or accepts *args and/or **kwargs.

The point of this macro is to reduce the amount of typing - not for lambda itself, but shortening lambda arg0, ...: begin(body0, ...) to λ(arg0, ...)[body0, ...] - so that there is no need to type out the begin(), making it more convenient to write multi-expression lambdas.

The example is perhaps a bit silly in that I have no idea if I'll ever use this particular macro in production code - it's so unpythonic that it borders on the limit of good taste. At least I won't use it if there is no way to give defaults to arguments, and it doesn't support *args or **kwargs. :)

I can try to find a better example where the feature is needed; this is just where I first noticed the missing feature, so I thought I'd open a ticket for discussion before I forget.

@Technologicat
Author

I don't know if it makes the use case any more compelling, but Technologicat/unpythonic@b51337b makes λ a first-class citizen that can have not only multiple body expressions, but also its own local variables.

To become really useful, λ would need to handle default values for arguments, and *args and **kwargs - i.e. have all the args-handling features of the regular lambda - but currently this can't be done.

I think I could take a look at what it would take to support named args in MacroPy (to support default values for args in λ), but *args and **kwargs are probably better left out until there is no need to support Python 3.4.

@Technologicat
Author

Technologicat commented Oct 5, 2018

Here, I made a first cut of this: Technologicat@653b2d2

At least all tests still pass, so I probably didn't break much :3

Here's also an updated λ that uses the new mechanism, for declaring default values: Technologicat/unpythonic@4e2a28c

How it works:

  • There is a new magic entry in **kw, called kwargs. It gets the named arguments given to the macro invocation (if it is a Call), as an OrderedDict. (A better name is welcome, but OTOH kwargs is the usual pythonic partner of args.) Each key is a str, each value an AST node.

  • The old kwargs field in MacroData was only used to save the assignment target of a with block; this is now saved in the new field extrakws. This split was done to isolate the user-given named args from the **kw magic dictionary, which MacroPy uses internally.

  • OrderedDict was chosen because the keywords field of a Call is a list; there may be important information encoded in the ordering, so we should preserve it.
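As a minimal sketch (plain stdlib ast, not the actual MacroPy patch), this is the kind of structure the proposed kwargs magic would expose:

```python
import ast
from collections import OrderedDict

# The keywords field of an ast.Call is a list, so building an
# OrderedDict from it preserves the source order of the named args.
call = ast.parse("mac(x='hi', y=2)", mode="eval").body
assert isinstance(call, ast.Call)
kwargs = OrderedDict((kw.arg, kw.value) for kw in call.keywords)
assert list(kwargs) == ["x", "y"]             # ordering preserved
assert isinstance(kwargs["y"], ast.Constant)  # values are AST nodes (3.8+)
```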

As for *args and **kwargs, my suggestion is to ignore that part of my original post; the problem disappears once we upgrade the requirements to Python 3.5. Then the extra given arguments are absorbed into specially formatted items in the args and keywords fields of a Call, and with the present addition, the mechanism we now have can handle both.

Anyway, named args for macros are now here - thoughts?

[edit] clarify why OrderedDict; mention data types of key and value.
[edit2] fix silly mistakes.

@Technologicat
Author

Re-checking PEP 448, the proposed first-cut solution does need a small revisit after upgrading to Python 3.5, because multiple * and ** items may then appear in the same call.

Multiple * pose no problem. The code that already exists in MacroPy can handle the Starred nodes just like any other arg, and let the macro do what it wants with them, since the macro asked for arguments. ;)

Supporting multiple ** requires perhaps dropping the OrderedDict currently proposed here, and just passing through a list of keyword items. Then let the macro do what it wants with them.
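The 3.5 situation is easy to see with plain ast (a sketch, not the patch itself): each ** unpacking becomes a keyword node whose arg is None, so passing the raw keyword list through keeps them all distinguishable:

```python
import ast

# PEP 448 allows several ** unpackings in one call; in the AST each
# one is a keyword node with arg=None, in source order, so passing
# the keyword list through loses nothing.
call = ast.parse("mac(**a, x=1, **b)", mode="eval").body
assert [kw.arg for kw in call.keywords] == [None, "x", None]
```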

@Technologicat
Author

Ah, well, second cut:

Technologicat@ddd9d75
Technologicat/unpythonic@80af4b8

Ditched the OrderedDict in favor of just passing through the list of keyword objects. The advantage, besides better 3.5 support, is that (in PG's famous? words) the abstraction is so thin it's practically transparent - the user can now read the Green Tree Snakes docs to understand what the magic kwargs contains.

Now, thoughts? :)

@azazel75
Owner

azazel75 commented Oct 5, 2018

Thanks Juha,

but your solution, and the whole situation in general, leaves me more perplexed... I've opened the door to calling the macro with any positional argument or keyword via transform() (even if it still has to be refined so that the parameters injected by the machinery are protected against being shadowed by those specified by the user), and you propose to augment this somewhat arcane way of passing arguments (that is, what the args parameter to the macro now is, for me) that the macro implementer has to parse on his own, maybe with the added complication of supporting multiple runtime versions... It doesn't seem the right thing to do.

I would rather prefer passing those parameters and keywords as real Python objects, not as AST trees, but when expansion happens there is nothing running yet, so this would work only for parameters and keywords bound to literals or pure expressions... I need to think it over and to see some real example... in isolation.

Anyway, please open a PR with your code. I mean, move your commits to another branch (one per PR) and open a PR from it, or your code will not be commentable.

Please also post here some example of your lambda macro, with comments, so that I can understand what it's meant to do without reading a ton of code.

@Technologicat
Author

Thanks for the heads-up, I'll make my code commentable and follow up with a PR for discussion.

IMHO, args being an AST is a feature, not a bug; as you said, expansion happens before run-time, so nothing exists yet. Leaving it to each macro to decide what to do with the input ASTs sounds to me like it's exactly within the job description of a macro.

Why some form of args - as a minimal example, consider:

@macros.expr
def let(tree, args, **kw):  # args: sequence of ast.Tuple: (k1, v1), (k2, v2), ..., (kn, vn)
    names  = [k.id for k, _ in (a.elts for a in args)]
    values = [v for _, v in (a.elts for a in args)]
    lam = q[lambda: ast_literal[tree]]
    lam.args.args = [arg(arg=x) for x in names]
    return q[ast_literal[lam](ast_literal[values])]

@macros.expr
def letseq(tree, args, **kw):
    if not args:
        return tree
    first, *rest = args
    return let.transform(letseq.transform(tree, *rest), first)

Note how .transform killed all the boilerplate in writing those macro definitions, which is excellent. Usage:

let((x, 1),
    (y, 2))[
      print(x + y)]

letseq((x, 1),
       (x, x+1))[
         print(x)]
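To make the expansion concrete, here is a hypothetical stand-in for what let should expand to, built with plain stdlib ast (no MacroPy; names are illustrative):

```python
import ast

# let((x, 1), (y, 2))[x + y] should expand to the immediately
# applied lambda (lambda x, y: x + y)(1, 2); build that by hand.
bindings = [("x", ast.Constant(1)), ("y", ast.Constant(2))]
body = ast.parse("x + y", mode="eval").body

lam = ast.Lambda(
    args=ast.arguments(
        posonlyargs=[], args=[ast.arg(arg=n) for n, _ in bindings],
        vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None,
        defaults=[]),
    body=body)
call = ast.Call(func=lam, args=[v for _, v in bindings], keywords=[])
tree = ast.fix_missing_locations(ast.Expression(call))
assert eval(compile(tree, "<let>", "eval")) == 3
```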

The new identifiers are declared as bare names - being able to do this relies on the fact that the input is an AST. Sure, we could place the bindings at the beginning of the tree:

let[((x, 1),
     (y, 2)),
      print(x + y)]

but a separate bindings section looks more readable.

Why some form of named args - it would let us do this:

let(x=1,
    y=2)[
      print(x + y)]

which looks more pythonic. It also allows neat new stuff like args with default values in λ, but I now think let is overall a better example.

Finally, this particular kwargs hack fixes an asymmetry in the API; if you wrote mac(x=5)[...], which is very pythonic, it would be silently ignored, whereas mac(5) would place the 5 into args (as a Num).

Whether an args-like arcane mechanism is needed at all is another question. If it can be axed altogether, that would simplify things. I didn't see this angle before.

In the context of the new .transform, do you have a proposal (idea, not code) on how to handle named args from the use site? Normal run-time code obviously won't call mac.transform(tree, *args, **kwargs); it will invoke the macro as mac(a0, ..., an, k0=v0, ..., km=vm)[...] (hypothetical syntax if named args are allowed).

I think some isolation is needed; it is perfectly valid to define a let variable called gen_sym or similar, and it shouldn't conflict with MacroPy internals. Similarly, the let should not always bind a gen_sym, just because that name happens to exist in **kw. For let (and similarly for λ), there needs to be a way to tell apart user-given vs. MacroPy internal kwargs. Shadowing is only a partial solution, ideally both definitions (if present) should be accessible in the macro code.

@Technologicat
Author

Technologicat commented Oct 6, 2018

Since I promised "neat new stuff", here's also an example on λ (all safeties stripped):

@macros.expr
def λ(tree, args, kwargs, **kw):  # <-- requires the kwargs hack
    withdefault_names = [k.arg for k in kwargs]
    defaults = [k.value for k in kwargs]
    names = [k.id for k in args] + withdefault_names
    newtree = do.transform(tree)
    lam = q[lambda: ast_literal[newtree]]
    lam.args.args = [arg(arg=x) for x in names]
    lam.args.defaults = defaults  # for the last n args
    return lam

@macros.expr
def do(tree, **kw):
    ... # beside the point; see unpythonic.syntax

Usage:

echo = λ(myarg="hello")[print(myarg),
                        myarg]
assert echo() == "hello"
assert echo("hi") == "hi"

count = let((x, 0))[
          λ()[x << x + 1,
              x]]
assert count() == 1
assert count() == 2

myadd = λ(x, y)[print("myadding", x, y),
                localdef(tmp << x + y),
                print("result is", tmp),
                tmp]
assert myadd(2, 3) == 5

The essential point is, kwargs is used to capture keyword nodes from the use site, where arg is the name and value is the AST node representing the value. These can be abused as an args-with-defaults declaration. (In a call, named args after positionals; in a function declaration, args with defaults last. Isomorphic, or close enough.)
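The trick can be demonstrated outside MacroPy with plain stdlib ast: take the keyword nodes from a call and reuse them as argument names plus defaults on a generated lambda (a sketch; the names are illustrative, not MacroPy API):

```python
import ast

# Keyword nodes from a macro-style call...
call = ast.parse("λ(myarg='hello')", mode="eval").body
names = [kw.arg for kw in call.keywords]
defaults = [kw.value for kw in call.keywords]

# ...become args-with-defaults on a generated lambda; in an
# ast.arguments, defaults attach to the last len(defaults) args.
lam = ast.Lambda(
    args=ast.arguments(
        posonlyargs=[], args=[ast.arg(arg=n) for n in names],
        vararg=None, kwonlyargs=[], kw_defaults=[], kwarg=None,
        defaults=defaults),
    body=ast.Name(id="myarg", ctx=ast.Load()))
tree = ast.fix_missing_locations(ast.Expression(lam))
echo = eval(compile(tree, "<λ>", "eval"))
assert echo() == "hello"
assert echo("hi") == "hi"
```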

No *args or **kwargs support yet, but in 3.5, not difficult to add. Just sanity-check there is at most one of * and ** each, and check placement w.r.t. other args. Extending this slightly could give support also for only-by-name args.

[edit] The count example requires let from unpythonic.syntax; the one posted above is there called simple_let and doesn't support assignments. Supporting an "assignment expression" requires some trickery which is beside the point here.

@Technologicat
Author

I just obsoleted my λ; this is much more pythonic, not to mention less brittle:

@macros.block
def multilambda(tree, **kw):
    @Walker
    def transform(tree, *, stop, **kw):
        if type(tree) is not Lambda or type(tree.body) is not List:
            return tree
        bodys = Tuple(elts=tree.body.elts, ctx=Load())
        bodys = copy_location(bodys, tree)
        stop()  # don't recurse over what do[] does
        bodys = transform.recurse(bodys)  # but recurse over user code
        tree.body = do.transform(bodys)
        return tree
    yield transform.recurse(tree)

Usage:

with multilambda:
    echo = lambda x: [print(x), x]
    assert echo("hi there") == "hi there"

    count = let((x, 0))[
              lambda: [x << x + 1,
                       x]]
    assert count() == 1
    assert count() == 2

    t = lambda: [[1, 2]]
    assert t() == [1, 2]

The pythonic let use case still stands; there named arguments would be useful.
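The same rewrite can be sketched with a stock ast.NodeTransformer (no MacroPy, no do; the [-1] tuple-indexing trick stands in for the real do machinery, so local variables are not supported):

```python
import ast

class MultiLambda(ast.NodeTransformer):
    """Rewrite lambdas whose body is a list literal into a tuple
    indexed at [-1]: evaluate every element, return the last one."""
    def visit_Lambda(self, node):
        self.generic_visit(node)  # handle nested lambdas too
        if isinstance(node.body, ast.List):
            template = ast.parse("(None,)[-1]", mode="eval").body
            template.value.elts = node.body.elts  # splice in the body
            node.body = template
        return node

tree = ast.parse('lambda x: [print(x), x]', mode="eval")
tree = ast.fix_missing_locations(MultiLambda().visit(tree))
echo = eval(compile(tree, "<multilambda>", "eval"))
assert echo("hi there") == "hi there"
```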

@catb0t

catb0t commented Oct 8, 2018

sorry -- this has little to do with the interesting discussion of late, but i wonder about the contrived code example in the first comment

i don't understand how this works in 2 ways, even in custom MacroPy or unpythonic.

x and y are not known names at the point λ is called. the comment about needing named arguments is later in the code, so surely λ(x, y) works but i don't know how.

myadd = λ(x, y)[print(x, y), x + y]

λ is a function called with two un-assigned symbols which returns a function-like callable indexable object, this is fine.

but also, print(x, y) is unavoidably evaluated as soon as it is seen by the Python interpreter, so the slice / indexing object becomes [None, x + y].

Perhaps from multilambda import macros, λ puts all code after it in a big try...except NameError: block?

And the import also enables some "lazy-loading" feature of the Python interpreter so that print(x, y) is not evaluated to None immediately? Or did you mean to write lambda: print(x, y)?

@Technologicat
Author

Technologicat commented Oct 8, 2018

Cat: it's a macro thing. :)

Roughly speaking, a macro intercepts and transforms code before the rest of the interpreter even sees it. It just needs to be valid syntactically, so that Python's parser can convert the source code text to an AST. MacroPy then hands over (relevant parts of) this AST to macros, to be transformed into a new AST. Normal run-time interpretation starts only after all macros have "run" (been expanded). This gives some flexibility normal code doesn't have.

The λ is a macro; it looks like a function call, but it's subtly different. The [...] are part of the macro syntax in MacroPy; they delimit the body, i.e. the main stuff that goes in. The (...) delimit macro arguments (args) - these are also ASTs, just placed inside (...) instead of [...].

Both args and body are sent to the same "call" of the macro. Hence, λ(arg0, ...)[body0, ...] is just one operation, not two. The undefined names are never seen by the interpreter - they are transformed into argument names in a lambda.

Now, unpythonic does a bit of magic - the body of λ gets wrapped with an unpythonic.seq.do, which (in its normal runtime code part) takes a list of regular old Python lambdas, and runs them one by one. (There are some technical details to support variables local to the "do", beside the point here.)

The "do" macro, which is what the λ macro actually inserts, then makes this a bit easier to use, by taking the code entered by the user, and wrapping each item in a lambda - automatically - so that execution is delayed until the underlying unpythonic.seq.do actually runs.

Hope this helps :)
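The "only has to be syntactically valid" point is easy to check with the stdlib ast module:

```python
import ast

# Undefined names are no problem at parse time; λ(x, y)[...] is just
# a subscript of a call, and nothing in it gets evaluated.
tree = ast.parse("λ(x, y)[print(x, y), x + y]", mode="eval")
node = tree.body
assert isinstance(node, ast.Subscript)   # the [body] part
assert isinstance(node.value, ast.Call)  # the λ(args) part
# No NameError, and print was never called.
```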

[edit] fix text formatting

@catb0t

catb0t commented Oct 8, 2018

Yes, it helps very much! I sort of thought Python tries to resolve names at parse time and complain at runtime, but this is all very interesting to learn :)

@Technologicat
Author

Cat: AFAIK, Python basically resolves everything at runtime. Only reserved words such as import and def always mean what we expect them to; almost anything else can be overridden (either by rebinding the original or shadowing it by something more local) from anywhere at any time. :)

I've sometimes tripped over this myself, when writing a context manager, declaring

    def __exit__(self, type, value, traceback):
        ...

and then wondered why a call to the built-in type() from inside that method fails to work. :)
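A minimal demonstration of that shadowing:

```python
# Inside __exit__, the parameter named 'type' shadows the builtin;
# on a clean exit the context manager protocol passes None for it.
class CM:
    def __enter__(self):
        return self
    def __exit__(self, type, value, traceback):
        self.shadowed = type  # None here, not the builtin <class 'type'>
        return False

with CM() as cm:
    pass
assert cm.shadowed is None
```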
