Optional bootstrapping #175

Ericson2314 · 2020-12-05T20:49:16Z

This is an alternative to #170 that allows building happy with or without bootstrapping.

We discussed this with @simonmar elsewhere and concluded that a better solution (for now) would be to check in Happy-generated files.

The rationale here is hidden to me, but I'm hoping that this might be of interest, in that we are both keeping the theoretical niceness of bootstrapping available, while also offering an escape hatch as a first step towards radically simplifying the build system / testing infra of everything happy-related. I think the latter is extremely important, because anything at all custom very much hampers the ability of tools like hadrian and haskell.nix that want to plan and execute a complete bootstrap with as little manual intervention as possible.

~~This is WIP because I need to update CI.~~

Closes #169

CC @angerman @hsyl20

Before this patch, building 'happy' required a pre-built binary of 'happy'. This was elegant in the same way a self-hosting compiler is elegant. But it also made building purely from source more complicated than needed. This patch introduces a small, bespoke parsing library, and applies it for parsing .y and .ly files. Now 'happy' doesn't depend on itself, and can be built using just GHC.
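For readers unfamiliar with the technique, here is a minimal sketch of what a small hand-rolled parser combinator library can look like. It is not the code from the patch: the `Parser` type, `lexeme`, `symbol`, `ident`, and the toy `nameDirective` grammar are all assumptions made up for illustration, and only `base` is needed.

```haskell
import Control.Applicative (Alternative (..))
import Data.Char (isAlpha, isSpace)

-- | A parser consumes a prefix of the input and either fails (Nothing)
-- or returns a result together with the unconsumed remainder.
-- (Hypothetical type, not the one used in the patch.)
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

instance Alternative Parser where
  empty = Parser (const Nothing)
  -- Try the left parser first; fall back to the right one on failure.
  Parser p <|> Parser q = Parser $ \s -> case p s of
    Nothing -> q s
    r       -> r

-- | Consume one character satisfying the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = Parser $ \s -> case s of
  (c:cs) | ok c -> Just (c, cs)
  _             -> Nothing

-- | Run a parser, then skip trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* many (satisfy isSpace)

-- | Match an exact string, then skip trailing whitespace.
symbol :: String -> Parser String
symbol = lexeme . traverse (satisfy . (==))

-- | An identifier made of one or more letters.
ident :: Parser String
ident = lexeme (some (satisfy isAlpha))

-- | A toy directive of the form: %name <ident>
nameDirective :: Parser String
nameDirective = symbol "%name" *> ident
```

In GHCi, `runParser nameDirective "%name foo"` returns the parsed identifier paired with the remaining input. A library of this shape needs nothing beyond GHC itself, which is the property that removes the dependency on a pre-built happy.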

int-index · 2020-12-21T15:41:33Z

> The rationale here is hidden to me

The rationale is basically eating your own dog food. #170 worked well, but it meant that happy wouldn't be making use of its own features.

Now this pull request (#175) means we have two implementations at once. I'm afraid that the cure might be worse than the disease, as it will be difficult to keep the two parsers in sync. And it increases the amount of code (thus the amount of potential bugs).

Here are the two options that I think are better:

1. Reopen No bootstrapping #170. Then we sacrifice dogfooding, but we get a nice bootstrapping story and inspectability.
2. Run happy manually once on its own .y-grammar, then add an automatic check on CI that the checked-in generated file is up to date. This way we get dogfooding, but we also get a huge unreadable happy-generated piece of code checked-in. The only problem I see here is that happy output is unreadable, so it's not principally different from checking-in a blob.

So if we favor inspectability, then #170 is the way to go, because I think my bespoke parser combinators turned out alright. On the other hand, if we favor dogfooding, then we should git add the happy-generated parser.

It's a judgement call. Personally, I favor inspectability more (hence #170). @simonmar told me:

> I'm not sure I'm convinced that we should eliminate bootstrapping, but I don't feel all that strongly. Use your best judgment.

And I didn't execute on that because I'm not that sure that my judgement is correct here. Maybe your vote will determine the outcome here, @Ericson2314 :-)

harpocrates · 2020-12-21T17:50:17Z

@int-index does #170 produce the same parsing code that happy would've produced by itself? If so, there could be a third option which I think gets the best of both worlds:

- reopen No bootstrapping #170, get the nice bootstrapping and inspectability
- in CI, after bootstrapping happy once, run it on its own .y-grammar to confirm that the result of bootstrapping is no different than what would've been obtained from running happy itself
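The comparison step of such a CI check could be sketched roughly as follows. This assumes the two generated outputs have already been read into strings; the function name `firstDifference` is hypothetical and nothing here is taken from the happy repository.

```haskell
-- | Compare two generated files line by line. Returns the 1-based line
-- number and the two differing lines at the first point of divergence,
-- or Nothing when the texts agree line by line (a single trailing
-- newline is ignored because 'lines' drops it).
firstDifference :: String -> String -> Maybe (Int, Maybe String, Maybe String)
firstDifference a b = go 1 (lines a) (lines b)
  where
    go _ [] [] = Nothing
    go n (x:xs) (y:ys)
      | x == y    = go (n + 1) xs ys
      | otherwise = Just (n, Just x, Just y)
    -- One text ended early: report the extra line on the longer side.
    go n (x:_) [] = Just (n, Just x, Nothing)
    go n [] (y:_) = Just (n, Nothing, Just y)
```

A CI job could call this on the parser produced by the bootstrapped build and on the parser produced by a regular happy run, and fail whenever the result is a `Just`.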

Ericson2314 · 2020-12-21T18:40:38Z

@int-index I think I am still in favor of this approach.

Anything that involves mandatory bootstrapping, without keeping around a build using all happy/alex versions back to the beginning, I consider a "coinductive bootstrap". We try to find a fixed point, and cross our fingers it's the right one. By contrast, #175 allows an easy "inductive bootstrap", where you can always boot from the parser combinator version, and therefore compare any fixed point to a "grounded" reference.
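The distinction can be made concrete with a toy model. In the sketch below, a "build step" is just a function from one generation of a tool to the next, and every name (`fixedPointFrom`, `matchesGrounded`) is an assumption invented for illustration rather than anything in happy's build.

```haskell
-- | Apply a build step repeatedly, starting from a grounded seed, until
-- the output stops changing. Gives up after the given number of rounds.
fixedPointFrom :: Eq a => Int -> (a -> a) -> a -> Maybe a
fixedPointFrom rounds step seed
  | rounds <= 0  = Nothing
  | next == seed = Just seed
  | otherwise    = fixedPointFrom (rounds - 1) step next
  where
    next = step seed

-- | A prebuilt artifact is accepted only if it equals the fixed point
-- reached from the grounded seed. Being *a* fixed point of the build
-- step is not enough.
matchesGrounded :: Eq a => Int -> (a -> a) -> a -> a -> Bool
matchesGrounded rounds step seed prebuilt =
  fixedPointFrom rounds step seed == Just prebuilt
```

A build step can have several fixed points. With only a prebuilt binary there is no way to tell which one you are sitting on; with a seed built from the parser combinator version, `matchesGrounded` gives a reference to compare against.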

For reference, I'm a big fan of @taktoa's GHC desiderata in https://gist.github.com/taktoa/a59400fd3e1c400835b60c416ad33952. I also view happy and alex as sort of a perfect microcosm of the GHC universe, where we have a chance at checking off these idealistic boxes at far lower cost.

> The only problem I see here is that happy output is unreadable, so it's not principally different from checking-in a blob.

I agree completely.

> Now this pull request (#175) means we have two implementations at once. I'm afraid that the cure might be worse than the disease, as it will be difficult to keep the two parsers in sync.

But think of it as tests! We have in order of clarity: happy output < parser combinators < happy input. So the most auditable spec is the happy input, but the most auditable build artifact is the parser combinator version.

> And it increases the amount of code (thus the amount of potential bugs).

Ah, but we should consider parallel vs serial composition, like with electrical resistance. A bigger implementation means more bugs, but this is 2 implementations. I say this is parallel composition, and thus, like composing resistors in parallel, means fewer bugs.
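One way to read the "think of it as tests" argument is as differential testing: run both implementations on the same inputs and treat any disagreement as a bug in at least one of them. The sketch below uses two deliberately trivial token counters as stand-ins for the two parsers; all names are hypothetical.

```haskell
import Data.Char (isSpace)

-- | Inputs on which two implementations of the same function disagree.
disagreements :: Eq b => (a -> b) -> (a -> b) -> [a] -> [a]
disagreements f g = filter (\x -> f x /= g x)

-- | Stand-in for implementation A: count whitespace-separated tokens.
countTokensA :: String -> Int
countTokensA = length . words

-- | Stand-in for implementation B, written independently: count the
-- positions where a non-space character follows a space (or the start).
countTokensB :: String -> Int
countTokensB s =
  length [ () | (prev, c) <- zip (' ' : s) s, isSpace prev, not (isSpace c) ]
```

An empty result from `disagreements countTokensA countTokensB inputs` does not prove either implementation correct, but a bug has to show up identically in both to slip through.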

High-minded reasoning aside, let me offer some concessions:

- All that said, even if it is fewer bugs, it is more code. This is something I really want, so I volunteer to bear the burden of any additional pain associated with this. If we go with this and I go AWOL, feel free to delete one.
- The two paths don't need to be feature-set identical. We just need enough functionality to do the inductive bootstrap, so the parser combinator one can have fewer features if need be. Not sure that helps now, but it might help later.

int-index · 2020-12-21T19:28:40Z

OK, I'm convinced.

Ericson2314 · 2020-12-21T23:22:49Z

:) Thanks for hearing me out. I suppose I'll wait longer to push the button so @simonmar can weigh in.

Instead of preprocessing an outer layer of CPP when building happy, just always produce code that uses CPP. Combined with #175, this means happy now has a perfectly bog-standard build system, with Makefiles and extra steps strictly optional. I gather Hugs, and possibly other Haskell implementations, don't support CPP out of the box, but I don't want this to stop us. Users of those can just run CPP manually on the generated code first.

int-index · 2020-12-27T11:12:26Z

> The two paths don't need to be feature-set identical. We just need enough functionality to do the inductive bootstrap, so the parser combinator one can have fewer features if need be. Not sure that helps now, but it might help later.

Should we drop the attribute grammar support from the parser combinators implementation, then?

Ericson2314 · 2020-12-27T20:07:17Z

@int-index I like it!

One small caveat: it might be hard to get Hadrian and Make to do the bootstrap without introducing a stage -1 (ugh), so if GHC uses the attribute grammar stuff I might be tempted to wait to remove it. But if it doesn't, then yes, let's by all means get rid of it.

int-index · 2020-12-27T21:09:46Z

GHC doesn't use it.

simonmar · 2020-12-31T11:44:34Z

Just out of interest, why not use ReadP instead of hand-rolling parser combinators?

int-index · 2020-12-31T12:50:46Z

> Just out of interest, why not use ReadP instead of hand-rolling parser combinators?

Originally I expected these parser combinators to replace bootstrapping entirely, so I wanted them to be more efficient. Performance considerations influenced their design significantly.

But with the new design where they are used to facilitate bootstrapping instead of replacing it, maybe performance isn't important at all. So ReadP could be used, too.
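For comparison, here is a rough sketch of what a ReadP-based parser for a toy directive could look like. `Text.ParserCombinators.ReadP` ships with `base`, so it adds no dependency; the grammar and the names `ident`, `nameDirective`, and `parseAll` are assumptions for illustration and are not taken from happy.

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP
  (ReadP, eof, munch1, readP_to_S, skipSpaces, string)

-- | An identifier made of one or more letters, followed by optional
-- whitespace. 'munch1' is greedy, so there is only one way to parse it.
ident :: ReadP String
ident = munch1 isAlpha <* skipSpaces

-- | A toy directive of the form: %name <ident>
nameDirective :: ReadP String
nameDirective = string "%name" *> skipSpaces *> ident

-- | Run a parser and accept only a single parse that consumes the whole
-- input. ReadP returns all possible parses, so anything else is rejected.
parseAll :: ReadP a -> String -> Maybe a
parseAll p s = case readP_to_S (p <* eof) s of
  [(a, "")] -> Just a
  _         -> Nothing
```

ReadP explores alternatives in parallel and returns a list of results, which is convenient but generally slower than a committed-choice design. That trade-off matters less if the combinators only serve the first bootstrapping stage.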

Ericson2314 · 2021-01-03T03:47:39Z

So what say you both that we just merge this so we can also merge #179 and then worry about the ReadP simplification, having a nice from-source bootstrap to iterate on?

int-index · 2021-01-03T09:27:59Z

Yeah, doing one thing at a time sounds reasonable to me. Would you mind squashing the commits before the merge?

int-index and others added 5 commits August 20, 2020 23:19

Merge remote-tracking branch 'upstream/master' into no-bootstrapping

f90f2f4

WIP

d8f23e0

Get old and new versions building

51ed05d

Add Cabal flag to configure whether we do the bootstrapped version

eebb859

Ericson2314 mentioned this pull request Dec 5, 2020

Plans that use bootstrapping to break cycles haskell/cabal#7189

Open

int-index approved these changes Dec 21, 2020

Ericson2314 changed the title ~~WIP: Optional bootstrapping~~ Optional bootstrapping Dec 22, 2020

Ericson2314 changed the title ~~Optional bootstrapping~~ WIP: Optional bootstrapping Dec 22, 2020

Ericson2314 mentioned this pull request Dec 23, 2020

Use nix-tree-sitter for the parser haskell-nix/hnix#508

Open

Ericson2314 added 2 commits December 24, 2020 16:21

CI both ways

73e995a

Make tests parallel

0724a89

Ericson2314 changed the title ~~WIP: Optional bootstrapping~~ Optional bootstrapping Dec 24, 2020

Ericson2314 added 2 commits December 24, 2020 20:38

Fix -Wno-orphans for older GHCs

7cf2ef5

Mention exact cabal issue in cabal file

2f89f00

Ericson2314 mentioned this pull request Dec 25, 2020

Get rid of template preprocessing #179

Merged

Ericson2314 added 4 commits December 26, 2020 17:25

Fix bad word-wrap

8c217cc

Merge branch 'sync-build-infra' into optional-bootstrapping

e233d94

Remove pre-built happy from CI!

354a366

We bootstrap from source now. Alex is still kept, but just for tests.

Merge remote-tracking branch 'upstream/master' into optional-bootstrapping

158d826

Use stub modules to avoid CPP imports

35fc9c0

Attribute grammars are now only supported when bootstrapped

7027f42

This makes the initial bootstrapping stage smaller while not affecting the second stage.

Merge remote-tracking branch 'upstream/master' into optional-bootstrapping

668bf50

Ericson2314 merged commit fcaca24 into haskell:master Jan 3, 2021

andreasabel mentioned this pull request Feb 16, 2021

No bootstrapping #169

Closed

Ericson2314 deleted the optional-bootstrapping branch March 19, 2021 02:10

Ericson2314 mentioned this pull request Mar 19, 2021

Modernizing the Packaging #187

Closed

6 tasks

int-index mentioned this pull request Jun 13, 2024

Build instructions in contributing.rst are wrong #274

Closed
