Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Generate expr method signatures, docs #3600

Merged
merged 86 commits into from
Oct 12, 2024
Merged

feat: Generate expr method signatures, docs #3600

merged 86 commits into from
Oct 12, 2024

Conversation

dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Sep 19, 2024

Closes #3563

Description

This PR builds on the work of #3466, and provides accurate signatures for alt.expr methods.

Benefits

Originally listed in #3563 (comment)

  • Raising a TypeError at the time of expr definition
    • Rather than silently passing until validated within a ChartType
    • Importantly, this is within python and not javascript
  • Providing a meaningful traceback, identifying both the expr and the missing argument
  • Opens the door for defining default argument values
    • E.g. in waterfall_chart, there are repeat uses of "" for the elseValue
    • If this is a common case, we could use that as a default and it would be clear by the signature alone
  • The current docstrings - which refer to parameters by name - would be easier to understand
    • Right now, you'd need to refer to Vega Expressions to determine the order they must appear in
  • The change would not add any new requirements to currently valid specs
    • Therefore, suitable as a MINOR version feature

Screenshots

image
image
image

Implementation

In #3563 I proposed authoring these changes manually.
However, after revisiting a conversation I had with @jonmmease I thought it would be worth at least trying to generate the definitions.

I found that https://vega.github.io/vega/docs/expressions/ seemed to be the only place they are documented in a single place.
From my understanding, this document is manually maintained but updates occur infrequently.

There was enough consistency in the markup used to work out how various parameters aligned with python syntax https://docs.python.org/3/tutorial/controlflow.html#special-parameters.

I also used this opportunity to clean up the docs themselves, and get them closer to the rest of altair.
There is definitely more that could be explored on that front, and some of the changes here could be utilised to improve the existing generated docs.

Related

Tasks

Deferred

Trying my best to keep the scope of the PR focused.
Working on this sparked a lot of ideas, so I've collected those here to possibly explore in the future:

  • Inline code -> block code (comment)
  • Also parsing properties from expressions.md
    • Wouldn't be that much work to add, but not a great deal of value
    • We're missing MAX_VALUE, MIN_VALUE
  • tools.schemapi.vega_expr.py -> tools.vega_expr.py
    • The module is unrelated to the vega-lite schema
    • I think the review will be simpler if it stays here for now
  • Soft-deprecating camelCase expr methods, replace w/ snake_case
    • Pretty sure adding an _ExprMeta.__getattr__ would make this possible
    • The current names would remain accessible (but undocumented)

(Early) Preview

Prior to 5e75051 (#3600) there wasn't any generated source on this branch.
Keeping these screenshots for a quick referenceuntil I've resolved some test issues.

Screenshots

image

image

image

- Only the most common case
- no docs
- No method body
Currently just collects the pieces, but doesn't render the final str
Also renamed constant `EXPR_ANNOTATION` -> `INPUT_ANNOTATION`
- Contains all the functionality
- Needs a lot of tidying up
- Uses the same config as `indent_docstring`
- Can't truly be `numpydoc` though without a parameters section
@dangotbanned
Copy link
Member Author

dangotbanned commented Sep 27, 2024

Is that something that is expected? Or was it naive to expect that something as such could work?

@mattijn I managed to get all of these working correctly:

# ruff: noqa: B018
import altair as alt

# import altair.expr as ae
from altair import expr
from altair import expr as ae
from altair.expr import expr as expr_cls

alt.expr.scale
expr.scale
ae.scale
expr_cls.scale

image

Edit

AFAIK this behavior isn't something I'd expect to have changed vs main - but would be interesting if so

@mattijn
Copy link
Contributor

mattijn commented Sep 28, 2024

Thanks! It works as expected, but that also means it surface the following issue:
This works

import polars as pl
import altair as alt
from altair import expr as ae

df = pl.DataFrame(
    {
        "x": [0.32, 0.86, 0.10, 0.30, 0.47, 0.0, 1.0, 0.91, 0.88, 0.12],
        "color": list("ACABBCABBA"),
    }
)

param_array = alt.param(name="my_array", value=["B", "A", "C"])
alt.Chart(df).mark_tick(thickness=10).encode(
    x=alt.X("x"), 
    color=alt.Color("color").scale(domain=param_array)
).add_params(param_array)
image

But this will not:

param_array = alt.param(name="my_array", value=["B", "A", "C"])
param_sort = alt.param(name="my_sorted_array", expr=ae.sort(param_array))
alt.Chart(df).mark_tick(thickness=10).encode(
    x=alt.X("x"), 
    color=alt.Color("color").scale(domain=param_sort)
).add_params(param_array, param_sort)
Javascript Error: Unrecognized function: sort
This usually means there's a typo in your chart specification. See the javascript console for the full traceback.

And while ae.sort(param_array) is now recognised as valid, since it is now auto-generated from the docs (great!), it is not yet released as part of the current accepted Vega version within Altair therefore raising an JavaScript error.

The compiled vega-lite specification looks correctly though:

{
  "config": {"view": {"continuousWidth": 300, "continuousHeight": 300}},
  "data": {"name": "data-1d2849971aacb1c06bb7718af005021c"},
  "mark": {"type": "tick", "thickness": 10},
  "encoding": {
    "color": {
      "field": "color",
      "scale": {"domain": {"expr": "my_sorted_array"}},
      "type": "nominal"
    },
    "x": {"field": "x", "type": "quantitative"}
  },
  "params": [
    {"name": "my_array", "value": ["B", "A", "C"]},
    {"name": "my_sorted_array", "expr": "sort(my_array)"}
  ],
  "$schema": "https://vega.github.io/schema/vega-lite/v5.20.1.json",
  "datasets": {
    "data-1d2849971aacb1c06bb7718af005021c": [
      {"x": 0.32, "color": "A"},
      {"x": 0.86, "color": "C"},
      {"x": 0.1, "color": "A"},
      {"x": 0.3, "color": "B"},
      {"x": 0.47, "color": "B"},
      {"x": 0, "color": "C"},
      {"x": 1, "color": "A"},
      {"x": 0.91, "color": "B"},
      {"x": 0.88, "color": "B"},
      {"x": 0.12, "color": "A"}
    ]
  }
}

@dangotbanned
Copy link
Member Author

But this will not:

...

And while ae.sort(param_array) is now recognised as valid, since it is now auto-generated from the docs (great!), it is not yet released as part of the current accepted Vega version within Altair therefore raising an JavaScript error.

The compiled vega-lite specification looks correctly though:

{
  "config": {"view": {"continuousWidth": 300, "continuousHeight": 300}},
  "data": {"name": "data-1d2849971aacb1c06bb7718af005021c"},
  "mark": {"type": "tick", "thickness": 10},
  "encoding": {
    "color": {
      "field": "color",
      "scale": {"domain": {"expr": "my_sorted_array"}},
      "type": "nominal"
    },
    "x": {"field": "x", "type": "quantitative"}
  },
  "params": [
    {"name": "my_array", "value": ["B", "A", "C"]},
    {"name": "my_sorted_array", "expr": "sort(my_array)"}
  ],
  "$schema": "https://vega.github.io/schema/vega-lite/v5.20.1.json",
  "datasets": {
    "data-1d2849971aacb1c06bb7718af005021c": [
      {"x": 0.32, "color": "A"},
      {"x": 0.86, "color": "C"},
      {"x": 0.1, "color": "A"},
      {"x": 0.3, "color": "B"},
      {"x": 0.47, "color": "B"},
      {"x": 0, "color": "C"},
      {"x": 1, "color": "A"},
      {"x": 0.91, "color": "B"},
      {"x": 0.88, "color": "B"},
      {"x": 0.12, "color": "A"}
    ]
  }
}

Well spotted @mattijn thank you!

I'll follow this up in #3600 (comment)

Not fully convinced this is a reliable alterntive to manually maintaining

#3600 (comment)
…sion`

- Found this to be a much more reliable source
- Won't require manual syncing
- Can easily be replaced in the future with a public function/attribute accessible in `vl_convert` itself

#3600 (comment)
@dangotbanned dangotbanned marked this pull request as draft October 7, 2024 11:47
@dangotbanned
Copy link
Member Author

Note

@mattijn I'm moving into draft to utilize #3633 and simplify #3600 (comment)

@dangotbanned dangotbanned marked this pull request as ready for review October 7, 2024 13:12
Copy link
Contributor

@mattijn mattijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR. Looks good now!

@dangotbanned
Copy link
Member Author

Thanks for this PR. Looks good now!

Thanks for your help with this PR @mattijn

One last thing I'd like to check before merging, from the Deferred section of the description

  • tools.schemapi.vega_expr.py -> tools.vega_expr.py
    • The module is unrelated to the vega-lite schema
    • I think the review will be simpler if it stays here for now

I can either:

  1. Leave the path unchanged
  2. Move vega_expr.py in a follow-up PR
  3. Move vega_expr.py now and then merge

@mattijn
Copy link
Contributor

mattijn commented Oct 12, 2024

Option 3 is OK 👍

@dangotbanned dangotbanned enabled auto-merge (squash) October 12, 2024 12:59
@dangotbanned dangotbanned merged commit 6cb633f into main Oct 12, 2024
25 checks passed
@dangotbanned dangotbanned deleted the vega-expr-gen branch October 12, 2024 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve expr method signatures
2 participants