Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional bootstrapping #175

Merged
merged 16 commits into from
Jan 3, 2021

Conversation

Ericson2314
Copy link
Collaborator

@Ericson2314 Ericson2314 commented Dec 5, 2020

This is an alternative to #170 that allow building happy with or without bootstrapping.

We discussed this with @simonmar elsewhere and concluded that a better solution (for now) would be to check in Happy-generated files.

The rational here is hidden to me, but I'm hoping that this might be of interest, in that we are both keeping the theoretical niceness of bootstrapping available, while also offering an escape hatch as a first step towards radically simplifying the build system / testing infra of everything happy related. I think the latter is extremely important, because anything at all custom very much hampers the ability of tools like hadrian and haskell.nix that want to plan and execute a complete bootstrap with as little manual intervention as possible.

This is WIP because I need to update CI.

Closes #169

CC @angerman @hsyl20

int-index and others added 5 commits August 20, 2020 23:19
Before this patch, building 'happy' required a pre-built binary of
'happy'. This was elegant in the same way a self-hosting compiler is
elegant. But it also made building purely from source more complicated
than needed.

This patch introduces a small, bespoke parsing library, and applies it
for parsing .y and .ly files.

Now 'happy' doesn't depend on itself, and can be built using just GHC.
@int-index
Copy link
Collaborator

The rational here is hidden to me

The rationale is basically eating your own dog food. #170 worked well, but it meant that happy wouldn't be making use of its own features.

Now this pull request (#175) means we have two implementations at once. I'm afraid that the cure might be worse than the disease, as it will be difficult to keep the two parsers in sync. And it increases the amount of code (thus the amount of potential bugs).

Here are the two options that I think are better:

  1. Reopen No bootstrapping #170. Then we sacrifice dogfooding, but we get a nice bootstrapping story and inspectability.

  2. Run happy manually once on its own .y-grammar, then add an automatic check on CI that the checked-in generated file is up to date. This way we get dogfooding, but we also get a huge unreadable happy-generated piece of code checked-in. The only problem I see here is that happy output is unreadable, so it's not principally different from checking-in a blob.

So if we favor inspectability, then #170 is the way to go, because I think my bespoke parser combinators turned out alright. On the other hand, if we favor dogfooding, then we should git add the happy-generated parser.

It's a judgement call. Personally, I favor inspectability more (hence #170). @simonmar told me:

I'm not sure I'm convinced that we should eliminate bootstrapping, but I don't feel all that strongly. Use your best judgment.

And I didn't execute on that because I'm not that sure that my judgement is correct here. Maybe your vote will determine the outcome here, @Ericson2314 :-)

@harpocrates
Copy link
Contributor

@int-index does #170 produce the same parsing code that happy would've produced by itself? If so, there could be a third option which I think gets the best of both worlds:

  1. reopen No bootstrapping #170, get the nice bootstrapping and inspectability
  2. in CI, after bootstrapping happy once, run it on its own .y-grammar to confirm that the result of bootstrapping is no different than what would've been obtained from running happy itself

@Ericson2314
Copy link
Collaborator Author

Ericson2314 commented Dec 21, 2020

@int-index I think I am still in favor of this approach.

Anything that involves mandatory bootstrapping, without a keeping around build using all happy/alex versions back to the beginning I consider a "coinductive bootstrap". We try to find a fixed point, and cross our fingers it's the right one. By contrast, with #175 allows easy "inductive bootstrap", where you can always boot from the parser combinator version, and therefore compare any fixed point to a "grounded" reference.

For reference, I'm a big fan of @taktoa's GHC desiderata in https://gist.github.com/taktoa/a59400fd3e1c400835b60c416ad33952. I view happy and alex also sort of a perfect microcosm of the GHC universe, where we have a chance at checking off these idealistic boxes at far lower cost.

The only problem I see here is that happy output is unreadable, so it's not principally different from checking-in a blob.

I agree completely.

Now this pull request (#175) means we have two implementations at once. I'm afraid that the cure might be worse than the disease, as it will be difficult to keep the two parsers in sync.

But think of it as tests! We have in order of clarity: happy output < parser combinators < happy input. So the most auditable spec is the happy input, but the most auditable build artifact is the parser combinator version.

And it increases the amount of code (thus the amount of potential bugs).

Ah, but we should consider parallel vs serial composition, like with electrical resistance. Bigger implementation is more bugs, but this is 2 implementations. I say this is parallel composition, and thus, like compositing resisters in parallel, means fewer bugs.


High-minded reasoning aside, let me offer some concessions:

  1. All that said, even if it is fewer bugs, it is more. This is something I really want, so I volunteer to bare the burden of any additional pain associated with this. If we go with this, and I go AWAL, feel free to delete one.

  2. The two paths don't need to be feature-set identical. We just need enough functionality to do the inductive bootstrap, so the parser combinator one can have fewer features if need be. Not sure that helps now, but it might help later.

@int-index
Copy link
Collaborator

OK, I'm convinced.

@Ericson2314
Copy link
Collaborator Author

:) Thanks for hearing me out. I suppose I'll wait longer to push the button so @simonmar can weigh in.

@Ericson2314 Ericson2314 changed the title WIP: Optional bootstrapping Optional bootstrapping Dec 22, 2020
@Ericson2314 Ericson2314 changed the title Optional bootstrapping WIP: Optional bootstrapping Dec 22, 2020
@Ericson2314 Ericson2314 changed the title WIP: Optional bootstrapping Optional bootstrapping Dec 24, 2020
Ericson2314 added a commit that referenced this pull request Dec 25, 2020
Instead of preprocessing an outer layer of CPP when building happy, just
always produce code that uses CPP. Combined with #175, this means happy
now has a perfectly bog standard build system, with Makefiles and extra
steps strictly optional.

I gather Hugs, and possibly other Haskell implementations, out of the
box doesn't support CPP, but I don't want this to stop us. Those can
just manually run CPP on the generated code first.
@int-index
Copy link
Collaborator

The two paths don't need to be feature-set identical. We just need enough functionality to do the inductive bootstrap, so the parser combinator one can have fewer features if need be. Not sure that helps now, but it might help later.

Should we drop the attribute grammar support from the parser combinators implementation, then?

@Ericson2314
Copy link
Collaborator Author

@int-index I like it!

One small caviet, it might be hard to get Hadrian and Make to do the bootstrap without introducing a stage -1 (ugh), so if GHC uses the attribute grammar stuff I might be tempted to wait to remove it, but if it doesn't then yes let's by all means get rid of it.

@int-index
Copy link
Collaborator

GHC doesn't use it.

This makes the initial boostrapping stage smaller while not affecting
the second stage.
@simonmar
Copy link
Member

Just out of interest, why not use ReadP instead of hand-rolling parser combinators?

@int-index
Copy link
Collaborator

Just out of interest, why not use ReadP instead of hand-rolling parser combinators?

Originally I expected these parser combinators to replace bootstrapping entirely, so I wanted them to be more efficient. Performance considerations influenced their design significantly.

But with the new design where they are used to facilitate bootstrapping instead of replacing it, maybe performance isn't important at all. So ReadP could be used, too.

@Ericson2314
Copy link
Collaborator Author

Ericson2314 commented Jan 3, 2021

So what say you both that we just merge this so we can also merge #179 and then worry about the ReadP simplification, having a nice from-source bootstrap to iterate on?

@int-index
Copy link
Collaborator

Yeah, doing one thing at a time sounds reasonable to me. Would you mind squashing the commits before the merge?

@Ericson2314 Ericson2314 merged commit fcaca24 into haskell:master Jan 3, 2021
Ericson2314 added a commit that referenced this pull request Jan 3, 2021
Instead of preprocessing an outer layer of CPP when building happy, just
always produce code that uses CPP. Combined with #175, this means happy
now has a perfectly bog standard build system, with Makefiles and extra
steps strictly optional.

I gather Hugs, and possibly other Haskell implementations, out of the
box doesn't support CPP, but I don't want this to stop us. Those can
just manually run CPP on the generated code first.
@andreasabel andreasabel mentioned this pull request Feb 16, 2021
@Ericson2314 Ericson2314 deleted the optional-bootstrapping branch March 19, 2021 02:10
@Ericson2314 Ericson2314 mentioned this pull request Mar 19, 2021
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

No bootstrapping
4 participants