Supporting different schemas around getting child nodes #18

Open
LeaVerou opened this issue Mar 11, 2024 · 5 comments

@LeaVerou
Member

Problem

In #16 we converted our hardcoded { node, property, index } parent pointer structure to a more flexible { node, path[] } schema. However, we did not change our internals much to really support arbitrary paths. We still assume children are found by:

  • If no getChildProperties() is specified, we follow all properties on a node and call isNode() on their values to see whether they are child nodes. This only goes 1-2 levels deep: values obtained by following one property, or the array items if that property's value is an array.
  • If getChildProperties() is specified, we follow these properties and do not call isNode(); we only filter by existence.
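
For reference, the two branches above can be sketched roughly as follows. This is an illustration only, not Treecle's actual source: the getChildren name, its options argument, and the choice to expand array values in the explicit branch are assumptions; isNode and getChildProperties mirror the settings named above.

```javascript
// Sketch of the child discovery described above (hypothetical helper, not Treecle's code)
function getChildren(node, { isNode, getChildProperties } = {}) {
	const children = [];

	if (getChildProperties) {
		// Explicit properties: follow them and keep whatever exists, without calling isNode().
		// Assumption: array values (e.g. "arguments") are expanded into their items.
		for (const property of getChildProperties(node)) {
			const value = node[property];
			for (const child of Array.isArray(value) ? value : [value]) {
				if (child !== undefined && child !== null) {
					children.push(child);
				}
			}
		}
		return children;
	}

	// No explicit properties: look at every property value (or the items of an
	// array value) and keep the ones that isNode() accepts
	for (const value of Object.values(node)) {
		for (const candidate of Array.isArray(value) ? value : [value]) {
			if (isNode(candidate)) {
				children.push(candidate);
			}
		}
	}
	return children;
}

// Example data and isNode() are made up for this sketch
const looksLikeNode = (value) => typeof value?.type === "string";
const sampleAst = {
	type: "BinaryExpression",
	operator: "+",
	left: { type: "Literal", value: 1 },
	right: { type: "Literal", value: 2 },
};

console.log(getChildren(sampleAst, { isNode: looksLikeNode }).length); // 2
console.log(getChildren(sampleAst, { getChildProperties: () => ["left", "right"] }).length); // 2
```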

Currently, this assumes a specific structure that may be overfit to Treecle’s AST beginnings.
I suspect in the wild there are two common ways to represent tree data structures:

  1. Follow a specific set of properties, often depending on the node type, each of which points to a child or an array of children (the AST case)
  2. Follow a single property that always holds all the children (children for Mavo Nodes, childNodes for DOM Nodes)

Currently, the API only supports providing a function that takes a node and returns a list of properties. As an example, this is how this setting is specified in vastly:

```js
export const properties = {
	CallExpression: ["arguments", "callee"],
	BinaryExpression: ["left", "right"],
	UnaryExpression: ["argument"],
	ArrayExpression: ["elements"],
	ConditionalExpression: ["test", "consequent", "alternate"],
	MemberExpression: ["object", "property"],
	Compound: ["body"],
};

defaults.getChildProperties = (node) => {
	return properties[node.type] ?? [];
};
```

This appears to be overfit to 1, yet I suspect 2 may be even more common.
How do we allow both to be specified without making either more complicated due to the existence of the other?

Ideas

getChildProperties() to getChildPaths(), handle both string[][] and string[]?

We don't want to complicate 1 to cater to 2, but what if we could do both? If the function returns an array of strings, they are single properties. If it returns an array of arrays, they are paths.
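
A minimal sketch of how both return shapes could be accepted, assuming a normalization step that turns everything into paths. The toPaths name is hypothetical and not part of the current API.

```javascript
// Hypothetical normalizer: a string entry is a single property,
// an array entry is already a path
function toPaths(result) {
	return result.map((entry) => (Array.isArray(entry) ? entry : [entry]));
}

console.log(toPaths(["left", "right"])); // [["left"], ["right"]]
console.log(toPaths([["body", "statements"]])); // [["body", "statements"]]
```

Normalizing per entry keeps case 1 as simple as it is today, since callers that return plain strings never have to think about paths.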

The problem is that we don’t necessarily have specific child properties in 2. Often, once you get from the node to its children, everything in that data structure is a child.

Wildcards? JSON Paths?

Basically, we want a way to say children/* for these cases. What if we handle / and * in properties specially?

But then we’re basically creating a path microsyntax, and restricting the potential syntax of actual real properties accordingly.
OTOH, that's close to JSON Path, which is quite well established (though JSON Path itself would write this as $.children[*]).

The advantage of something like this is that we can still handle properties like Vastly’s in exactly the same way.
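
To make the tradeoff concrete, here is a minimal sketch of such a microsyntax, assuming "/" separates segments and "*" is passed through as a wildcard segment. The parsePathString name is hypothetical.

```javascript
// Hypothetical parser for a "children/*" style microsyntax
function parsePathString(path) {
	return path.split("/");
}

console.log(parsePathString("left")); // ["left"], plain properties still work unchanged
console.log(parsePathString("children/*")); // ["children", "*"]

// The restriction mentioned above: a property literally named "a/b"
// can no longer be expressed, because it is read as two segments
console.log(parsePathString("a/b")); // ["a", "b"]
```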


Not a huge fan of any of these ideas, so I’ll continue brainstorming.

@adamjanicki2
Contributor

Could also be a variant of option 1 where we accept either a flat array, e.g. ["left", "right"], or a nested array, e.g. [["children", "key1"], ["children", "key2"]]. We could also allow a wildcard key to signal "check all properties in this object"; for example, [["children", "*"]] would mean all subproperties of children are children.

@LeaVerou
Member Author

> Could also be a variant of option 1 where we accept either a flat array, e.g. ["left", "right"], or a nested array, e.g. [["children", "key1"], ["children", "key2"]]. We could also allow a wildcard key to signal "check all properties in this object"; for example, [["children", "*"]] would mean all subproperties of children are children.

Yeah, the more I think about it, the more I like this idea.
For convenience, we should also support arrays as the value, not just functions.

E.g. the Mavo.Node use case would be:

childPaths: [["children", "*"]]

the DOM Node use case would be:

childPaths: [["childNodes", "*"]]

While having a nested array with a single element is a bit unwieldy, it's very explicit, and in most cases there's only one.
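
A rough sketch of how a childPaths setting like this might be resolved, assuming it can be either an array of paths or a function returning one, and that "*" expands to every value of the current object or array. The resolveChildren and followPath names are hypothetical, and the DOM case is not exercised here.

```javascript
// Hypothetical resolver: childPaths is an array of paths or a function returning one
function resolveChildren(node, childPaths) {
	const paths = typeof childPaths === "function" ? childPaths(node) : childPaths;
	return paths.flatMap((path) => followPath(node, path));
}

// Follows one path from a node; "*" expands to all values of the current object or array
function followPath(start, path) {
	let current = [start];
	for (const key of path) {
		current = current
			.filter((item) => item !== null && typeof item === "object")
			.flatMap((item) => (key === "*" ? Object.values(item) : [item[key]]))
			.filter((item) => item !== undefined);
	}
	return current;
}

// Made-up tree in the "single children property" shape
const sampleTree = {
	name: "root",
	children: [
		{ name: "a", children: [] },
		{ name: "b", children: [] },
	],
};

console.log(resolveChildren(sampleTree, [["children", "*"]]).map((n) => n.name)); // ["a", "b"]

// The function form still covers per-type properties, as in the AST case
const byType = (node) => (node.type === "BinaryExpression" ? [["left"], ["right"]] : []);
const expression = { type: "BinaryExpression", left: { type: "Literal" }, right: { type: "Literal" } };
console.log(resolveChildren(expression, byType).length); // 2
```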

@adamjanicki2
Contributor

> Yeah, the more I think about it, the more I like this idea. For convenience, we should also support arrays as the value, not just functions.

Yeah we can definitely do this. I'll start iterating on this and have a PR up today or tomorrow

@LeaVerou
Member Author

Btw I think the function that applies such a path to an object and returns the result is really useful and we should expose it as one of our helpers rather than having it as an internal util.

@adamjanicki2
Contributor

> Btw I think the function that applies such a path to an object and returns the result is really useful and we should expose it as one of our helpers rather than having it as an internal util.

Yeah I was planning on adding that along with a find path function
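
As a sketch of what these two helpers might look like (the getByPath and findPath names and signatures are hypothetical, and wildcards are deliberately left out here):

```javascript
// Hypothetical helper: applies a concrete path (no wildcards) to an object
function getByPath(object, path) {
	return path.reduce((value, key) => value?.[key], object);
}

// Hypothetical helper: depth-first search for the path from root to target.
// Returns null if target is not reachable; `seen` guards against cycles.
function findPath(root, target, path = [], seen = new Set()) {
	if (root === target) return path;
	if (root === null || typeof root !== "object" || seen.has(root)) return null;
	seen.add(root);
	for (const [key, value] of Object.entries(root)) {
		const found = findPath(value, target, [...path, key], seen);
		if (found) return found;
	}
	return null;
}

// Made-up data for the sketch
const sampleDoc = {
	children: [{ id: "a" }, { id: "b", children: [{ id: "c" }] }],
};
const targetNode = sampleDoc.children[1].children[0];

const foundPath = findPath(sampleDoc, targetNode);
console.log(foundPath); // ["children", "1", "children", "0"]
console.log(getByPath(sampleDoc, foundPath) === targetNode); // true
```

Array indices come back as strings because the search uses Object.entries(); they still work as keys when the path is applied again.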
