Migrate `argparse` CLI definition to a `pydantic` basis for most important commands #438

simeoncarstens · 2024-01-10T16:17:13Z

As described in #433, we would like to add an HTTP API for looper. The challenge is keeping the HTTP API and the CLI in sync. In a call, @nsheff, @zz1874 and I decided to address this by replacing the argparse-based definition of CLI commands and arguments with one based on pydantic models: several libraries (for example, pydantic-argparse, tyro and clipstick) can build type-checked CLI parsers from pydantic.BaseModel definitions.

So we agreed that our first task towards an HTTP API for looper is to redefine the most important CLI commands (looper {run,runp,check}) via pydantic models. Building an HTTP API that consumes data compliant with these models is then straightforward.

For now, we assume we will be using pydantic-argparse, and, as agreed with @nsheff, will build pydantic models that are compatible with it. This issue and comments below track the progress of this task.


simeoncarstens · 2024-01-11T08:41:21Z

We (@zz1874, I) wrote a hacky demo script that uses a CLI based on pydantic-argparse to run the hello_looper example. You can find it on a branch in our fork of the repository. It serves as a proof-of-concept of how to define CLI arguments in a pydantic model.

simeoncarstens · 2024-01-11T09:54:37Z

Going forward, I propose (this has not yet been discussed with either @nsheff or @zz1874) to organize the pydantic-based CLI rewrite as follows:

At the core, we introduce a dataclass called Argument, which holds all the information we need about a CLI argument / flag. For now, that includes:

argument name (e.g. "command-extra"),
type (e.g., str),
default value (e.g., ""),
which commands support that argument (e.g. ("run", "runp", "rerun"), or possibly treat commands as an enum).

We then create a data structure (a tuple, or an enum) that holds an Argument instance for each argument there is. We can then draw from this pool of argument definitions to dynamically create pydantic models for each command via pydantic.create_model(). This approach allows us to define arguments only once and reuse these definitions easily for different commands.
I am working on a first implementation of this approach in the following branch: https://github.com/tweag/looper/tree/tweag/pydantic-command-models
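
The idea of a single pool of Argument definitions can be sketched roughly as follows. To stay dependency-free, this sketch uses dataclasses.make_dataclass where the proposal would call pydantic.create_model(); the pool contents ("dry-run", "flags") and the helper create_command_model are hypothetical and only serve as illustration, not as the implementation on the branch.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Argument:
    """Everything we need to know about one CLI argument or flag."""

    name: str                  # CLI name, e.g. "command-extra"
    type: type                 # Python type, e.g. str
    default: Any               # default value, e.g. ""
    commands: Tuple[str, ...]  # commands that support this argument


# Hypothetical pool: every argument is defined exactly once.
ARGUMENTS = (
    Argument("command-extra", str, "", ("run", "runp", "rerun")),
    Argument("dry-run", bool, False, ("run", "runp", "rerun")),
    Argument("flags", str, "", ("check",)),
)


def create_command_model(command: str):
    """Build a model class holding only the arguments `command` supports.

    Stand-in for pydantic.create_model(): same idea, standard library only.
    """
    specs = [
        (arg.name.replace("-", "_"), arg.type, field(default=arg.default))
        for arg in ARGUMENTS
        if command in arg.commands
    ]
    return make_dataclass(f"{command.capitalize()}Model", specs)


RunModel = create_command_model("run")
CheckModel = create_command_model("check")
run_field_names = [f.name for f in fields(RunModel)]
check_field_names = [f.name for f in fields(CheckModel)]
```

Because each command model is derived from the same pool, adding a command to an argument's `commands` tuple is the only change needed to expose that argument there.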

nsheff · 2024-01-16T16:22:08Z

Given that we're running into issues with packages that don't support pydantic 2.0 (see here for example: pepkit/pipestat#127), and that pydantic-argparse appears to be at a dead-end, unresponsive as to whether pydantic 2.0 will ever be supported (SupImDos/pydantic-argparse#48), we might want to rethink using pydantic-argparse, and instead, it might make more sense to just roll our own argparser from the Argument objects, for example, with something like this:

https://stackoverflow.com/questions/72741663/argument-parser-from-a-pydantic-model
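
A minimal sketch of what "rolling our own argparser from the Argument objects" could look like, using only the standard library. The Argument fields follow the earlier proposal in this thread; the `help` field, the sample arguments ("dry-run", "limit") and the function build_parser are assumptions made for illustration.

```python
import argparse
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Argument:
    name: str
    type: type
    default: Any
    commands: Tuple[str, ...]
    help: str = ""  # assumption: not part of the original field list


# Hypothetical argument pool.
ARGUMENTS = (
    Argument("command-extra", str, "", ("run", "runp")),
    Argument("dry-run", bool, False, ("run", "runp")),
    Argument("limit", int, None, ("run",)),
)


def build_parser(arguments, commands):
    """Create one subparser per command and attach the arguments it supports."""
    parser = argparse.ArgumentParser(prog="looper-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for command in commands:
        sub = subparsers.add_parser(command)
        for arg in arguments:
            if command not in arg.commands:
                continue
            flag = f"--{arg.name}"
            if arg.type is bool:
                # Boolean arguments become on/off switches.
                sub.add_argument(flag, action="store_true",
                                 default=arg.default, help=arg.help)
            else:
                sub.add_argument(flag, type=arg.type,
                                 default=arg.default, help=arg.help)
    return parser


parser = build_parser(ARGUMENTS, ("run", "runp"))
run_args = parser.parse_args(["run", "--dry-run", "--limit", "2"])
runp_args = parser.parse_args(["runp"])
```

Here `run_args` carries `limit`, while `runp_args` does not, because "limit" lists only "run" in its `commands`.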

nsheff · 2024-01-16T19:01:01Z

See also: https://github.com/edornd/argdantic

donaldcampbelljr · 2024-02-29T21:16:07Z

> Given that we're running into issues with packages that don't support pydantic 2.0 (see here for example: pepkit/pipestat#127), and that pydantic-argparse appears to be at a dead-end, unresponsive as to whether pydantic 2.0 will ever be supported (SupImDos/pydantic-argparse#48), we might want to rethink using pydantic-argparse, and instead, it might make more sense to just roll our own argparser from the Argument objects, for example, with something like this:

I just ran into this issue with pydantic > 2.0.0 support.

Recently, we started requiring pephubclient >=0.4.0 (see #453, 1c8a8ad).

As I continue migrating the arguments to the pydantic models, I am now running into dependency issues because pephubclient requires pydantic>2.5.0 but pydantic-argparse depends on pydantic<2.0.0:

ERROR: Cannot install looper and pydantic-argparse==0.8.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    pephubclient 0.4.0 depends on pydantic>2.5.0
    pydantic-argparse 0.8.0 depends on pydantic<2.0.0 and >=1.9.0

Unfortunately, I've gotten quite far in finishing the migration before realizing this issue (I realized it when I merged some downstream changes upstream).
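
Why the resolver gives up can be seen with a small sketch: no pydantic version satisfies both constraints from the error message at once. The version comparison below is a simplification (plain integer tuples, not full PEP 440 handling), and the candidate list is just a sample.

```python
def parse_version(version):
    """Turn "2.5.3" into (2, 5, 3); simplified, release segments only."""
    return tuple(int(part) for part in version.split("."))


def pephubclient_accepts(version):
    # pephubclient 0.4.0 depends on pydantic>2.5.0
    return parse_version(version) > (2, 5, 0)


def pydantic_argparse_accepts(version):
    # pydantic-argparse 0.8.0 depends on pydantic<2.0.0 and >=1.9.0
    return (1, 9, 0) <= parse_version(version) < (2, 0, 0)


# Sample of pydantic versions on both sides of the 2.0 boundary.
candidates = ["1.9.0", "1.10.13", "2.0.0", "2.5.0", "2.5.3", "2.6.1"]
compatible = [
    v for v in candidates
    if pephubclient_accepts(v) and pydantic_argparse_accepts(v)
]
```

The two accepted ranges do not overlap, so `compatible` stays empty regardless of which versions are sampled.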

donaldcampbelljr · 2024-03-01T15:35:09Z

There is a fork of pydantic-argparse that uses pydantic2: https://github.com/anastasds/pydantic2-argparse

I was able to implement it here and resolve dependency issues:
066e661

Interestingly, the fork (and now my looper implementation) requires importing the still-available pydantic.v1 API:

import pydantic.v1 as pydantic

donaldcampbelljr · 2024-03-01T19:05:26Z

Another issue as I rework the tests:

For arguments that can be given a list (e.g. selecting or excluding based on multiple flags), the argparser cannot seem to handle a list:

--looper-config .looper.yaml --exc-flag "failed" "running" run --dry-run

I've tried different permutations of the syntax as well as changing the Field type but am still striking out.

simeoncarstens · 2024-03-04T16:25:54Z

@donaldcampbelljr interesting issue! I investigated a bit and the problem seems to be that pydantic-argparse doesn't know how to distinguish between list elements (such as "failed" and "running") and the subcommand ("run").
In fact, a related issue can be demonstrated with just the usual argparse Python module:

from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument("--somelist", nargs="+")
parser.add_argument("-v", "-voo")
subparsers = parser.add_subparsers(title="subcommands")
cmd_parser = subparsers.add_parser("cmd")
cmd_parser.add_argument("--somestr", type=str)

args = parser.parse_args()

yields

$ python argtest.py --somelist a b cmd --somestr "Asdf"
usage: argtest.py [-h] [--somelist SOMELIST [SOMELIST ...]] [-v V] {cmd} ...
argtest.py: error: argument {cmd}: invalid choice: 'Asdf' (choose from 'cmd')

One way to "fix" ( 🙄 ) this is to use a non-list-valued argument before the subcommand, for example:

$ python argtest.py --somelist a b -v bla cmd --somestr "Asdf"

which parses correctly.
But unfortunately it seems like a bug fix in pydantic[2]-argparse would be required for even that workaround to work 🙁 I'd be happy to look into how to fix this!
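
For anyone who wants to reproduce both cases without a shell, here is the same parser wrapped in a function and fed explicit argument lists. The helper names build_parser and try_parse are only for this sketch.

```python
from argparse import ArgumentParser


def build_parser():
    parser = ArgumentParser(prog="argtest.py")
    parser.add_argument("--somelist", nargs="+")
    parser.add_argument("-v", "-voo")
    subparsers = parser.add_subparsers(title="subcommands")
    cmd_parser = subparsers.add_parser("cmd")
    cmd_parser.add_argument("--somestr", type=str)
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejects argv."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse prints its usage/error message and exits on failure.
        return None


# The list swallows "cmd", so the subcommand is never recognized.
failing = try_parse(["--somelist", "a", "b", "cmd", "--somestr", "Asdf"])

# A non-list option between the list and the subcommand ends the list.
working = try_parse(
    ["--somelist", "a", "b", "-v", "bla", "cmd", "--somestr", "Asdf"]
)
```

In the first call, `nargs="+"` greedily consumes "cmd" as a third list element; in the second, `-v` terminates the list before the subcommand is reached.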

donaldcampbelljr · 2024-03-04T17:12:33Z

@simeoncarstens Interesting! Thank you for taking a look. I was also digging into it this morning. I noticed that the compute argument (also a list) does appear to be working, but it is added to the Run parser and thus comes after the run command:

looper --looper-config .looper.yaml run --dry-run --compute PARTITION=standard cores='32' mem='32000'

this also works (moving the --dry-run after the list):

looper --looper-config .looper.yaml run --compute PARTITION=standard cores='32' mem='32000' --dry-run

I was thinking that one workaround would be to add all the shared arguments to the individual commands so that the list arguments can be used after the command parser. Not ideal but it might be necessary for this setup.

donaldcampbelljr · 2024-03-04T19:20:54Z

So, I tried out my proposed suggestion here: 9fdf85a

It does seem to work. However, the position of the arguments (do they come before or after the command?) is still a bit opaque, e.g.:

looper --looper-config .looper.yaml  --output-dir "your/output/dir" run --dry-run --compute  mem='32000' --exc-flag "failed" "running"

I think I will proceed to place all of the arguments such that they come after the command. This will make the syntax similar to the previous CLI such that it will begin with looper run.

Is there any reason we should not structure it in this way? We will have copies of the shared arguments among the parsers, but I believe that is the only downside.
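
In plain argparse, one way to place shared arguments after the command without writing them out once per subcommand is a parent parser. The sketch below illustrates that mechanism only; it is not how pydantic2-argparse builds its parsers, and the parser layout is an assumption.

```python
import argparse

# Shared list-valued arguments, defined once. add_help=False avoids a
# duplicate -h/--help when this parser is used as a parent.
shared = argparse.ArgumentParser(add_help=False)
shared.add_argument("--exc-flag", nargs="+", default=[])
shared.add_argument("--sel-flag", nargs="+", default=[])

parser = argparse.ArgumentParser(prog="looper-sketch")
parser.add_argument("--looper-config")
subparsers = parser.add_subparsers(dest="command", required=True)

# Each subcommand inherits the shared arguments via `parents`.
run = subparsers.add_parser("run", parents=[shared])
run.add_argument("--dry-run", action="store_true")
check = subparsers.add_parser("check", parents=[shared])

args = parser.parse_args(
    ["--looper-config", ".looper.yaml",
     "run", "--dry-run", "--exc-flag", "failed", "running"]
)
```

Since the list now follows the subcommand, nothing after it can be mistaken for the subcommand name.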

simeoncarstens · 2024-03-05T10:52:18Z

Another option is to monkey-patch pydantic2-argparse. For example, we could require all list arguments to be given as comma-separated strings, and then add the following in cli_pydantic.py right after the imports:

# Imported at module level so that `utils` is already defined when the
# return annotation in the signature below is evaluated.
import pydantic2_argparse.utils as utils

def parse_field(
    parser: argparse.ArgumentParser,
    field: pydantic.fields.ModelField,
) -> Optional[utils.pydantic.PydanticValidator]:
    """Adds container pydantic field to argument parser.

    Args:
        parser (argparse.ArgumentParser): Argument parser to add to.
        field (pydantic.fields.ModelField): Field to be added to parser.

    Returns:
        Optional[utils.pydantic.PydanticValidator]: Possible validator method.
    """

    class SplitArgs(argparse.Action):
        def __call__(self, parser, namespace, values, option_string=None):
            setattr(namespace, self.dest, values.split(','))

    # Add Container Field
    parser.add_argument(
        utils.arguments.name(field),
        action=SplitArgs,
        help=utils.arguments.description(field),
        dest=field.alias,
        metavar=field.alias.upper(),
        required=bool(field.required),
    )

    # Construct and Return Validator
    return utils.pydantic.as_validator(field, lambda v: v)

pydantic2_argparse.argparse.parser.parsers.container.parse_field = parse_field

This is ugly, but would have the advantage that the actual command / argument hierarchy is maintained. If we do this, then of course the mandatory comma separation (e.g. --exc-flag=running,failed) would need to be very aggressively documented in the help string.
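
The comma-splitting action can be tried in isolation with plain argparse. This sketch leaves out the pydantic2-argparse patching entirely; the parser layout around SplitArgs is an assumption for demonstration.

```python
import argparse


class SplitArgs(argparse.Action):
    """Store a single comma-separated string as a list of strings."""

    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, values.split(","))


parser = argparse.ArgumentParser(prog="looper-sketch")
# The option takes exactly one value, so it cannot swallow the subcommand.
parser.add_argument("--exc-flag", action=SplitArgs, default=[])
subparsers = parser.add_subparsers(dest="command")
run = subparsers.add_parser("run")
run.add_argument("--dry-run", action="store_true")

args = parser.parse_args(["--exc-flag=running,failed", "run", "--dry-run"])
```

Because the option consumes one token only, the subcommand that follows is parsed as usual.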

simeoncarstens · 2024-03-05T13:11:04Z

Putting the majority of shared arguments after the subcommand would also be possible, of course. As you say, there's then a duplication of arguments across various subcommands. But then again, it might even be preferred if you had a good reason to do so for the existing CLI. Note that then some of our changes would need to be reversed. In the end, I think it's as you and the users prefer 🙂

donaldcampbelljr · 2024-03-29T18:54:46Z

Removing likely-solved after today's discussion. Cause: lack of short form arguments.

This was brought up previously in a PR:
#448 (comment)

But to reiterate here: pydantic-argparse/pydantic2-argparse does not support short-form arguments at this time and that is currently undesirable.

We will hold on releasing the next version of Looper (1.8.0) until we incorporate the short-form arguments.
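
For reference, this is what short-form arguments look like in plain argparse: each option is registered with both a short and a long option string. The specific short letters below are hypothetical, and this sketch says nothing about how the feature was later added on top of pydantic2-argparse.

```python
import argparse

parser = argparse.ArgumentParser(prog="looper-sketch")
subparsers = parser.add_subparsers(dest="command", required=True)
run = subparsers.add_parser("run")

# Short and long option strings registered for the same destination.
run.add_argument("-d", "--dry-run", action="store_true")
run.add_argument("-l", "--limit", type=int)

short_form = parser.parse_args(["run", "-d", "-l", "2"])
long_form = parser.parse_args(["run", "--dry-run", "--limit", "2"])
```

Both invocations produce the same namespace, since argparse derives the destination name from the long option string.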

donaldcampbelljr · 2024-04-10T19:50:49Z

It appears that clipstick does allow short arguments and uses Pydantic 2:
https://sander76.github.io/clipstick/usage.html#keyword-arguments

I believe it was mentioned in a meeting that we decided not to use clipstick for some reason, but I cannot seem to track down my notes on why that was decided.

simeoncarstens · 2024-04-11T08:04:54Z

I'm sorry about the late realization that short arguments are not supported by pydantic-argparse! We probably should have pointed that out more explicitly, rather than only in a comment on a PR 🙁
As for clipstick, I think what I didn't like about it was this (quote from the documentation):

> Next to that I wanted to try and build my own parser instead of using Argparse because… why not.

That doesn't exactly inspire confidence in me, but that's just my opinion.

nsheff · 2024-04-11T13:08:18Z

Yes, clipstick doesn't use argparse, which I considered a fatal downside, since we use argparse in all our other projects so we're familiar with it.

donaldcampbelljr · 2024-05-15T19:38:33Z

Ok, with the latest PR I've added the short arguments. Marking this as likely solved.

simeoncarstens mentioned this issue Jan 12, 2024

Discussion: add HTTP API for looper #433

Open

4 tasks

simeoncarstens mentioned this issue Jan 18, 2024

First iteration of a CLI based on pydantic models that allows to run the hello_looper example #440

Merged

donaldcampbelljr added a commit that referenced this issue Feb 26, 2024

begin adding more pydantic args: rerun and runp #438

19a24c2

donaldcampbelljr added a commit that referenced this issue Feb 27, 2024

Add table, report, destroy, check #438

d3f5505

donaldcampbelljr added a commit that referenced this issue Feb 27, 2024

add remaining commands #438

ca77d58

donaldcampbelljr added a commit that referenced this issue Feb 27, 2024

check for none values during pydantic arg parsing #438

10cce80

donaldcampbelljr mentioned this issue Feb 27, 2024

Adding pydantic argparse cli and Refactor Looper Tests #472

Merged

4 tasks

donaldcampbelljr added a commit that referenced this issue Mar 4, 2024

Add exc_flag and sel_flag to Run command as a poc #438

9fdf85a

donaldcampbelljr added a commit that referenced this issue Mar 4, 2024

Move all optional arguments to be under each command as appropriate #438

debb45c

donaldcampbelljr added a commit that referenced this issue Mar 5, 2024

Re-add optional commands to ensure manual tests pass, refactor #438

0589a03

donaldcampbelljr added this to the v1.8.0 milestone Mar 27, 2024

donaldcampbelljr added the likely-solved label Mar 27, 2024

donaldcampbelljr removed the likely-solved label Mar 29, 2024

donaldcampbelljr mentioned this issue May 15, 2024

Workaround for shortform arguments #489

Merged

donaldcampbelljr added the likely-solved label May 15, 2024

donaldcampbelljr closed this as completed Jun 6, 2024
