Convert/ extend Signature field from a string to structured data. #249
For how to store the signature, we may want to look into https://pypi.org/project/griffe/ |
In [1]: import griffe
...:
...: papyri = griffe.load("papyri")
In [2]: papyri.functions
Out[2]:
{'_intro': <Function('_intro', 228, 235)>,
'ingest': <Function('ingest', 238, 266)>,
'install': <Function('install', 272, 401)>,
'relink': <Function('relink', 404, 414)>,
'pack': <Function('pack', 477, 481)>,
'bootstrap': <Function('bootstrap', 484, 492)>,
'ascii': <Function('ascii', 512, 519)>,
'serve': <Function('serve', 522, 527)>,
'serve_static': <Function('serve_static', 530, 544)>,
'browse': <Function('browse', 547, 551)>,
'build_parser': <Function('build_parser', 554, 574)>,
'open': <Function('open', 577, 590)>,
'sample_async_function': <Function('sample_async_function', 593, 594)>}
In [3]: sample_async_function = papyri.functions["sample_async_function"]
In [4]: vars(sample_async_function)
Out[4]:
{'name': 'sample_async_function',
'lineno': 593,
'endlineno': 594,
'docstring': None,
'parent': <Module(PosixPath('papyri/__init__.py'))>,
'members': {},
'labels': {'async'},
'imports': {},
'exports': None,
'aliases': {},
'runtime': True,
'_lines_collection': None,
'_modules_collection': None,
'parameters': <griffe.dataclasses.Parameters at 0x1075b75e0>,
'returns': None,
'decorators': [],
'setter': None,
'deleter': None,
'overloads': None}
In [5]: ingest_function = papyri.functions["ingest"]
In [6]: ingest_function.parameters['check'].as_dict()
Out[6]:
{'name': 'check',
'annotation': Name(source='bool', full='bool'),
'kind': <ParameterKind.positional_or_keyword: 'positional or keyword'>,
'default': 'False'}
In [7]: ingest_function.parameters._parameters_dict
Out[7]:
{'paths': <griffe.dataclasses.Parameter at 0x107599400>,
'check': <griffe.dataclasses.Parameter at 0x1075994c0>,
'relink': <griffe.dataclasses.Parameter at 0x10759b1c0>,
'dummy_progress': <griffe.dataclasses.Parameter at 0x10759b2b0>}
In [8]: ingest_function.parameters['check'].as_dict()
Out[8]:
{'name': 'check',
'annotation': Name(source='bool', full='bool'),
'kind': <ParameterKind.positional_or_keyword: 'positional or keyword'>,
'default': 'False'}
I think the first step could be to return the |
Yes, I think that should work. I think you can also use
And that should give you a JSON representation of the function. It seems the type annotations are still strings, though; we can take care of that later.
Note that Griffe is based on AST manipulation, whereas Papyri actually imports the objects. So there might be some |
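For comparison, here is a minimal sketch of the import-based route using only the standard library's `inspect` module. The helper name `signature_to_dict`, the sample function, and the field layout are assumptions made for illustration; this is not Papyri's or Griffe's actual output.

```python
import inspect
import json


def signature_to_dict(func):
    """Describe an imported callable as plain, JSON-serializable data."""
    sig = inspect.signature(func)
    empty = inspect.Parameter.empty

    def as_text(annotation):
        # Keep annotations as strings for now (hypothetical simplification).
        if isinstance(annotation, str):
            return annotation
        return getattr(annotation, "__qualname__", repr(annotation))

    params = []
    for p in sig.parameters.values():
        params.append(
            {
                "name": p.name,
                "annotation": None if p.annotation is empty else as_text(p.annotation),
                "kind": p.kind.name,
                "default": None if p.default is empty else repr(p.default),
            }
        )
    return {
        "name": func.__name__,
        "is_async": inspect.iscoroutinefunction(func),
        "parameters": params,
    }


async def sample(paths, check: bool = False, *, relink=False):
    """Hypothetical function, used only to exercise the helper."""


data = signature_to_dict(sample)
print(json.dumps(data, indent=2))
```

Because the object is imported, defaults are real values (stringified here with `repr`), whereas an AST-based tool only sees the source text.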
Hello, chiming in :) For values (actual values of attributes or default values of parameters), Griffe indeed simply unparses the AST into a string. For type annotations, however, it builds complete "expressions", which are recursive lists of strings (delimiters like Storing them like this allows rendering signatures with links for each part of each type annotation. In the future we might store values the same way. |
Update: Griffe changed its way of storing expressions. They're now a tree of proper objects representing Python code (basically |
Here's how griffe represents the annotation
"annotation": {
"left": {
"left": {
"name": "list",
"cls": "ExprName"
},
"slice": {
"name": "int",
"cls": "ExprName"
},
"cls": "ExprSubscript"
},
"operator": "|",
"right": {
"name": "int",
"cls": "ExprName"
},
"cls": "ExprBinOp"
},
Reconstructing this would be harder work. It would be simpler to just store the annotation as a string, but this would break compatibility if we wanted to add the structured form in the future. Maybe we can store the string form as But I'm not clear just how much we want to be compatible with griffe. Also, it would probably be useful if griffe also stored the string form of the annotation in addition to the AST form. |
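One way to keep both forms side by side can be sketched with the standard library's `ast` module (Python 3.9+ assumed, for `ast.unparse` and the simplified `Subscript.slice`). The `cls` names mirror the JSON shown above, which appears to correspond to `list[int] | int`; the function names and the `"source"` / `"expr"` keys are hypothetical, and only names, subscripts, and `|` are handled.

```python
import ast


def expr_to_dict(node):
    """Convert a small subset of annotation syntax into nested dicts."""
    if isinstance(node, ast.Name):
        return {"cls": "ExprName", "name": node.id}
    if isinstance(node, ast.Subscript):
        return {
            "cls": "ExprSubscript",
            "left": expr_to_dict(node.value),
            "slice": expr_to_dict(node.slice),
        }
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.BitOr):
        return {
            "cls": "ExprBinOp",
            "left": expr_to_dict(node.left),
            "operator": "|",
            "right": expr_to_dict(node.right),
        }
    # Anything else is out of scope for this sketch.
    raise NotImplementedError(type(node).__name__)


def annotation_to_dict(source):
    """Store the string form next to the structured form (hypothetical keys)."""
    tree = ast.parse(source, mode="eval").body
    return {"source": ast.unparse(tree), "expr": expr_to_dict(tree)}


stored = annotation_to_dict("list[int] | int")
print(stored["source"])
print(stored["expr"]["cls"])
```

A consumer that only needs text can read `"source"` and ignore `"expr"`, so the structured part can be added or extended later without breaking it.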
It's a bit more readable if we put the class name at the top of each dict:
"annotation": {
"cls": "ExprBinOp",
"left": {
"cls": "ExprSubscript",
"left": {
"cls": "ExprName",
"name": "list"
},
"slice": {
"cls": "ExprName",
"name": "int"
}
},
"operator": "|",
"right": {
"cls": "ExprName",
"name": "int"
}
}
I'm not familiar with the Papyri project, so I may suggest things that are not acceptable to you; please forgive me 🙂
If you actually depend on Griffe, you can work directly with expressions. Calling You can iterate on expressions, recursively, in depth-first order, to process specific types of sub-expressions (subscripts, attributes, function calls, etc.) if you need to. With the previous example, your (recursive) iteration would yield:
[
ExprBinOp(...),
ExprSubscript(...),
ExprName("list"),
"[",
ExprName("int"),
"]",
"|",
ExprName("int"),
]
If you iterate with
[
ExprName("list"),
"[",
ExprName("int"),
"]",
"|",
ExprName("int"),
]
And that's exactly how Griffe outputs strings from expressions:
def __str__(self):
    return "".join(elem if isinstance(elem, str) else elem.name for elem in self.iterate(flat=True))
You can dump and reload expressions to/from JSON:
import json
from griffe.encoders import json_decoder
annotation = json.dumps({
"cls": "ExprBinOp",
"left": {
"cls": "ExprSubscript",
"left": {
"cls": "ExprName",
"name": "list"
},
"slice": {
"cls": "ExprName",
"name": "int",
}
},
"operator": "|",
"right": {
"cls": "ExprName",
"name": "int"
}
})
json.loads(annotation, object_hook=json_decoder)
...though in this example we don't have the
If you're only looking for how to store annotations without having to depend on Griffe, then I'll just share my experience with you: previously I stored them as simple lists of punctuation and names (the flat example above). Well, it is not enough once you start having more demanding rendering options, such as:
Ultimately, instead of special-casing names, then attributes, then subscripts, then ..., I went all in and now store an actual AST, with just the necessary additional info ( It sure is a bit more... chunky, but it's far more robust, powerful, and flexible 😄 Getting the N-th item of an optional tuple before was super prone to errors, now it's a breeze. |
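For anyone who wants to try the idea without depending on Griffe, here is a small standard-library imitation of the round trip: nested dicts are decoded into node objects with a `json` `object_hook`, and a flat iteration rebuilds the string. The classes `Name`, `Subscript`, `BinOp` and the `decode` / `render` helpers are hypothetical stand-ins, not Griffe's own classes or decoder.

```python
import json
from dataclasses import dataclass


@dataclass
class Name:
    name: str

    def iterate(self):
        yield self


@dataclass
class Subscript:
    left: object
    slice: object

    def iterate(self):
        # Flat, depth-first order: left[slice]
        yield from self.left.iterate()
        yield "["
        yield from self.slice.iterate()
        yield "]"


@dataclass
class BinOp:
    left: object
    operator: str
    right: object

    def iterate(self):
        yield from self.left.iterate()
        yield self.operator
        yield from self.right.iterate()


CLASSES = {"ExprName": Name, "ExprSubscript": Subscript, "ExprBinOp": BinOp}


def decode(dct):
    """object_hook: turn any dict carrying a known "cls" key into a node."""
    cls = CLASSES.get(dct.get("cls"))
    if cls is None:
        return dct
    return cls(**{k: v for k, v in dct.items() if k != "cls"})


def render(expr):
    """Join punctuation strings and node names, as in the flat example above."""
    return "".join(e if isinstance(e, str) else e.name for e in expr.iterate())


payload = """
{"cls": "ExprBinOp",
 "left": {"cls": "ExprSubscript",
          "left": {"cls": "ExprName", "name": "list"},
          "slice": {"cls": "ExprName", "name": "int"}},
 "operator": "|",
 "right": {"cls": "ExprName", "name": "int"}}
"""
expr = json.loads(payload, object_hook=decode)
names = [e.name for e in expr.iterate() if not isinstance(e, str)]
print(render(expr), names)
```

`json.loads` calls the hook on the innermost dicts first, so by the time a parent dict is decoded its `left`, `slice`, and `right` values are already nodes.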
The Griffe JSON is already pretty similar to the JSON generated by myst_serialize. Here's Griffe for
def f(x1, /, x2, x3=None, *, x4, x5=None): pass
"parameters": [
{
"name": "x1",
"annotation": null,
"kind": "positional-only",
"default": null
},
{
"name": "x2",
"annotation": null,
"kind": "positional or keyword",
"default": null
},
{
"name": "x3",
"annotation": null,
"kind": "positional or keyword",
"default": "None"
},
{
"name": "x4",
"annotation": null,
"kind": "keyword-only",
"default": null
},
{
"name": "x5",
"annotation": null,
"kind": "keyword-only",
"default": "None"
}
],
And here's what papyri already gives
"parameters": [
{
"annotation": null,
"default": null,
"kind": "POSITIONAL_ONLY",
"name": "x1",
"type": "ParameterNode"
},
{
"annotation": null,
"default": null,
"kind": "POSITIONAL_OR_KEYWORD",
"name": "x2",
"type": "ParameterNode"
},
{
"annotation": null,
"default": "None",
"kind": "POSITIONAL_OR_KEYWORD",
"name": "x3",
"type": "ParameterNode"
},
{
"annotation": null,
"default": null,
"kind": "KEYWORD_ONLY",
"name": "x4",
"type": "ParameterNode"
},
{
"annotation": null,
"default": "None",
"kind": "KEYWORD_ONLY",
"name": "x5",
"type": "ParameterNode"
}
],
The biggest difference is that papyri is using the variable name of the
(side note: can someone explain to me why an "int enum" has variables assigned to strings, and how you can actually access those?) |
(wow) Looks like each new value gets replaced by an incremented int, while attaching the string to its
def __new__(cls, description):
    value = len(cls.__members__)  # when first value (POSITIONAL_ONLY) is declared, value is 0
    member = int.__new__(cls, value)  # so member is IntEnum(0) (or something like that??)
    member._value_ = value  # its value is 0
    member.description = description  # its description is 'positional-only'
    return member  # return and proceed to next declared value
Mind blown. You then get descriptions with |
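To see the mechanism in isolation, the quoted `__new__` can be dropped into a standalone `IntEnum`. The class name `Kind` and the lookup table `BY_DESCRIPTION` are hypothetical; the member names and description strings follow the kinds shown in this thread plus the two variadic kinds from `inspect`.

```python
import enum
import inspect


class Kind(enum.IntEnum):
    POSITIONAL_ONLY = "positional-only"
    POSITIONAL_OR_KEYWORD = "positional or keyword"
    VAR_POSITIONAL = "variadic positional"
    KEYWORD_ONLY = "keyword-only"
    VAR_KEYWORD = "variadic keyword"

    def __new__(cls, description):
        # The declared string never becomes the value; the value is the
        # member's position, and the string is kept as an attribute.
        value = len(cls.__members__)
        member = int.__new__(cls, value)
        member._value_ = value
        member.description = description
        return member


# Bridge between the two spellings seen in the JSON dumps above.
BY_DESCRIPTION = {kind.description: kind for kind in Kind}

print(Kind.KEYWORD_ONLY.name, int(Kind.KEYWORD_ONLY), Kind.KEYWORD_ONLY.description)
print(BY_DESCRIPTION["positional or keyword"].name)
# The standard library exposes the same attribute on its own kinds.
print(inspect.Parameter.KEYWORD_ONLY.description)
```

So the variable-name spelling (`KEYWORD_ONLY`) and the description spelling (`keyword-only`) identify the same member, and converting between the two is a dictionary lookup.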
Work in progress here: #289. I'm not sure whether we should store the annotations in a forward-compatible way, in case we later want to store an AST like griffe does. I still think it's a good idea to store the str in the JSON even if it can technically be reconstructed later. |
It would be nice to store the signature in a more structured form so that we can operate on it after the fact.
We probably need to iterate a bit on it, and maybe some parts of the signature should be str anyway, but we might want to store:
Even if, for now, the myst rendering converts everything back to a string, that's ok; it would still allow us to dynamically modify the rendering, for example to not display default values or deprecated parameters, or to run searches like "is async" or not.
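A rough sketch of what that could look like, assuming a hypothetical `SignatureNode` / `ParameterNode` pair (the field names, including `deprecated` and `is_async`, are illustrative and not Papyri's actual schema). Variadic parameters are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParameterNode:
    name: str
    kind: str  # "POSITIONAL_ONLY", "POSITIONAL_OR_KEYWORD" or "KEYWORD_ONLY"
    annotation: Optional[str] = None
    default: Optional[str] = None
    deprecated: bool = False


@dataclass
class SignatureNode:
    name: str
    parameters: List[ParameterNode] = field(default_factory=list)
    is_async: bool = False

    def render(self, show_defaults=True, show_deprecated=True):
        """Convert back to a string, optionally hiding some details."""
        params = [p for p in self.parameters if show_deprecated or not p.deprecated]
        parts = []
        seen_keyword_only = False
        for i, p in enumerate(params):
            if p.kind == "KEYWORD_ONLY" and not seen_keyword_only:
                parts.append("*")
                seen_keyword_only = True
            text = p.name
            if p.annotation is not None:
                text += f": {p.annotation}"
            if show_defaults and p.default is not None:
                text += f" = {p.default}" if p.annotation is not None else f"={p.default}"
            parts.append(text)
            # Emit "/" after the last positional-only parameter.
            following = params[i + 1] if i + 1 < len(params) else None
            if p.kind == "POSITIONAL_ONLY" and (
                following is None or following.kind != "POSITIONAL_ONLY"
            ):
                parts.append("/")
        prefix = "async def" if self.is_async else "def"
        return f"{prefix} {self.name}({', '.join(parts)})"


sig = SignatureNode(
    "f",
    [
        ParameterNode("x1", "POSITIONAL_ONLY"),
        ParameterNode("x2", "POSITIONAL_OR_KEYWORD"),
        ParameterNode("x3", "POSITIONAL_OR_KEYWORD", default="None", deprecated=True),
        ParameterNode("x4", "KEYWORD_ONLY"),
        ParameterNode("x5", "KEYWORD_ONLY", default="None"),
    ],
)
full = sig.render()
no_defaults = sig.render(show_defaults=False)
no_deprecated = sig.render(show_deprecated=False)
async_names = [s.name for s in [sig, SignatureNode("g", is_async=True)] if s.is_async]
print(full, no_defaults, no_deprecated, async_names, sep="\n")
```

The stored data stays the same in all three renderings; only the view changes, and the "is async" search never has to parse a string.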
For the default values and annotations, I'm assuming it might be ok to store them as strings for now, but in the long run I'm hoping we can also compress/expand those and link to the relevant objects.
For example
I think in the end we should be able to make tokens like Union, str, List link to the relevant object, and maybe even expand it if the type is structured
I'm not sure how to do that yet, but I think it's something we should keep in mind.
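As a starting point for discussion, here is a very rough sketch that works on the string form only: split the annotation into tokens and attach a link target to the ones we recognize. The annotation `Union[str, List[int]]`, the `KNOWN_TARGETS` table, and the function name are all made up for illustration; a real implementation would resolve names against the ingested objects.

```python
import re

# Hypothetical mapping from a token to the qualified name of the object to link.
KNOWN_TARGETS = {
    "Union": "typing.Union",
    "List": "typing.List",
    "str": "builtins.str",
}

# Identifiers (possibly dotted), runs of whitespace, or any single other character.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*|\s+|.")


def link_annotation(source, targets=KNOWN_TARGETS):
    """Split an annotation string into (text, target-or-None) pairs."""
    return [(tok, targets.get(tok)) for tok in TOKEN.findall(source)]


pairs = link_annotation("Union[str, List[int]]")
for text, target in pairs:
    print(repr(text), "->", target)
```

Joining the first element of every pair gives back the original string, so a renderer that ignores the targets behaves exactly as it does today.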