Capability for serializing and deserializing well-formedness definitions #130

Open: 7 commits to merge into base branch main


@fhackett fhackett commented Aug 9, 2024

Like the title says, this PR allows Trieste's well-formedness definitions to be serialized to Trieste ASTs (which have a plain-text conversion) and back.

The utility of this feature is in cases like:

  • asking a tool what its input well-formedness definition is (in a hypothetical future scenario where two tools exchange Trieste ASTs directly).
  • debugging what a well-formedness definition actually looks like when it has been defined as a long series of mutations from some other definition.
  • finding out what a well-formedness definition for well-formedness definitions looks like.

In its current state, the code can serialize its own well-formedness definition and then read it back. It's challenging to fuzz as-is because it isn't structured as a reader/writer pair: readers and writers can't take Wellformedness values as inputs or outputs.
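To make the self-check concrete, here is a minimal sketch of a serialize / re-read round trip. It does not use Trieste at all: `ToyWf`, `serialize`, `deserialize`, `round_trips`, and the `name <- children` text format are all hypothetical stand-ins, chosen only to show the shape of the test. It assumes C++17.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a well-formedness definition: node name -> child names.
// This is NOT Trieste's Wellformed type, only an illustration.
using ToyWf = std::map<std::string, std::vector<std::string>>;

// Write one "name <- child child ..." line per shape.
std::string serialize(const ToyWf& wf)
{
  std::ostringstream out;
  for (const auto& [name, children] : wf)
  {
    out << name << " <-";
    for (const auto& child : children)
      out << ' ' << child;
    out << '\n';
  }
  return out.str();
}

// Parse the text back; throw on malformed lines rather than crashing.
ToyWf deserialize(const std::string& text)
{
  ToyWf wf;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line))
  {
    if (line.empty())
      continue;
    std::istringstream fields(line);
    std::string name, arrow, child;
    if (!(fields >> name >> arrow) || arrow != "<-")
      throw std::runtime_error("malformed shape: " + line);
    std::vector<std::string> children;
    while (fields >> child)
      children.push_back(child);
    if (!wf.emplace(name, std::move(children)).second)
      throw std::runtime_error("duplicate shape: " + name);
  }
  return wf;
}

// The self-check: serialize, re-read, compare with the original.
bool round_trips(const ToyWf& wf)
{
  return deserialize(serialize(wf)) == wf;
}
```

A fuzzer could drive `deserialize` with arbitrary text and `round_trips` with generated definitions, which is roughly what a proper reader/writer wrapper would make possible for the real code.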

While I believe the code to be usable in its current state (in that you can use it with Trieste's native AST serializers and it will crash on bad inputs), I aspire to wrap this code with a reader/writer that has proper error handling.

@fhackett fhackett marked this pull request as draft August 9, 2024 14:18
@fhackett
Contributor Author

Based on the conversation yesterday, I incorporated the current serializer as part of the AST debug writing process.

This is not a way to read external WF definitions, nor is it that great of an output format (it's full of the long internal token names, for one), but it works and you can have a lot of fun diffing WFs between passes.

Note that I made some effort to ensure that WF output is canonical and ordered, so you can compare different WFs and a text diff will get quite close to a semantic diff, in terms of where nodes appear and disappear, mapping similarities, etc.
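As a rough illustration of why canonical, ordered output helps, the sketch below renders shapes as sorted lines so that definition order no longer matters, then computes which lines were added between two passes. The `Shape` type, `canonical_lines`, `added_lines`, and the line format are hypothetical and unrelated to the actual serializer in this PR. It assumes C++17.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Toy shape: a node name and its child names, in definition order.
using Shape = std::pair<std::string, std::vector<std::string>>;

// Render each shape as one line, then sort the lines so the output
// does not depend on the order in which shapes were defined.
// Child order is preserved because it is part of the shape's meaning.
std::vector<std::string> canonical_lines(const std::vector<Shape>& shapes)
{
  std::vector<std::string> lines;
  for (const auto& [name, children] : shapes)
  {
    std::string line = name + " <-";
    for (const auto& child : children)
      line += " " + child;
    lines.push_back(line);
  }
  std::sort(lines.begin(), lines.end());
  return lines;
}

// Lines present in `after` but not in `before`. Both inputs must be sorted.
std::vector<std::string> added_lines(
  const std::vector<std::string>& before,
  const std::vector<std::string>& after)
{
  std::vector<std::string> added;
  std::set_difference(
    after.begin(), after.end(),
    before.begin(), before.end(),
    std::back_inserter(added));
  return added;
}
```

Because both sides are sorted the same way, a shape that only moved in the source produces no diff at all, and a new shape shows up as exactly one added line.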

I also noticed that actually giving an output mode outside of "just serialize the AST" probably belongs in another file (and outside this changeset), because otherwise we'll have an interesting circular dependency between the pass machinery and, well, itself.

@fhackett fhackett marked this pull request as ready for review August 13, 2024 13:01
@fhackett
Contributor Author

All that to say, I think this is moderately useful, reasonably extensible in the ways I discussed, and ready for review in its own right (give or take the inevitable style questions, etc.).

Note one feature that goes beyond how Wellformed itself works: you can optionally specify a namespace, which becomes a string prefix on all token names (except core Trieste ones like "top"). It might be useful in the future, and if you ignore it, it defaults to the empty "" prefix and does nothing.
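A minimal sketch of the prefixing rule, under stated assumptions: `qualify` and `core_names` are hypothetical names, plain string concatenation is assumed for the prefix, and only "top" is taken from the description above as an example of a core name. The real set of exempt names is whatever Trieste defines.

```cpp
#include <set>
#include <string>

// Hypothetical set of core names that are never prefixed. Only "top"
// comes from the discussion; the real list lives in Trieste itself.
const std::set<std::string> core_names{"top"};

// Prepend the namespace to a token name unless the name is a core one.
// An empty namespace leaves every name unchanged.
std::string qualify(const std::string& ns, const std::string& name)
{
  if (core_names.count(name) != 0)
    return name;
  return ns + name;
}
```

For example, `qualify("calc-", "add")` gives `"calc-add"`, while `qualify("calc-", "top")` and `qualify("", "add")` return their names untouched.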
