Asking for help #2

Closed
SuperJMN opened this issue Jun 2, 2022 · 2 comments

SuperJMN commented Jun 2, 2022

Hello!

My dream is to create an ANSI C compiler (using C#), but I'm a bit lost.

Regarding the parser part, I've played with a very nice parser combinator library (https://github.com/datalust/superpower), and I really love how simple it is. But in fact, parser combinators are so simple that they can't deal with left recursion.

That's why I got interested in GLL parsers, because they are said to fix the problems with parser combinators... until I found out that the literature is really complex and there isn't any entry-level material around the Internet.

So here I am. I will investigate a bit more, but just in case you want to answer me, I would like to ask you a question:

What would you do if you were me?

I mean, would you attempt to create an ANSI C compiler back then? or maybe today you would totally reject doing that because of reasons?

What steps would you follow if you decided to go ahead?

Would you target an existing machine, like x86/x64, or invent a virtual architecture (to simplify things)?

Thanks a lot!

SuperJMN changed the title from "Maybe help?" to "Asking for help" on Jun 2, 2022
Owner

mykolav commented Jun 3, 2022

Hi @SuperJMN,
thank you for asking.

TLDR

You might find studying the source of these compilers useful. Their code bases are really small and comprehensible.

  • 8cc C Compiler -- 8cc is a compiler for the C programming language. It's intended to support all C11 language features while keeping the code as small and simple as possible. The author described their experience writing it in the blog post "How I wrote a self-hosting C compiler in 40 days".

  • chibicc: A Small C Compiler -- Even though it still probably falls into the "toy compilers" category just like other small compilers do, chibicc can compile several real-world programs, including Git, SQLite, libpng and chibicc itself.

Parsing

I was in a similar spot regarding parsing.

Parsing theory is a really interesting field of computer science, but it gets really complex really fast, just as you say.
Using a parser combinator or parser generator sounds like a good way to save time and effort vs writing a parser by hand.

But in my opinion what's more important to us, hobbyist compiler writers, is having a working example that we can look at and fully understand. And all such example compilers I managed to find have hand-crafted recursive descent parsers. So that's what I would suggest you go with. Once you get into the groove, you'll realize it isn't all that hard to write after all.

Interestingly, production-grade compilers like g++ and clang have hand-crafted recursive-descent parsers too. So if you decide to go this route, your experience will actually be closer to that of "real" compiler developers :)
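To make the hand-written approach concrete, here is a minimal sketch of a recursive-descent parser for arithmetic expressions, written in Python for brevity. The grammar, the `tokenize` helper, the `Parser` class and the tuple-based AST are all assumptions made up for illustration; none of them are taken from the compilers mentioned in this thread. It also shows the usual way around left recursion: a left-recursive rule such as `expr -> expr '-' term` is written as a loop, which keeps the operators left-associative.

```python
import re

# Hypothetical grammar (illustration only):
#   expr    -> term (('+' | '-') term)*
#   term    -> primary (('*' | '/') primary)*
#   primary -> NUMBER | '(' expr ')'

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split the input into number and operator tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        node = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return node

    def expr(self):
        # The loop replaces the left-recursive rule expr -> expr ('+'|'-') term.
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.primary()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.primary())
        return node

    def primary(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")
```

With this sketch, `Parser(tokenize("1 - 2 - 3")).parse()` builds `("-", ("-", 1, 2), 3)`, so subtraction groups to the left even though no grammar function calls itself as its first step.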

The example compilers I'm talking about are:

The last two are much bigger than 8cc and chibicc and don't have ease of study as an explicit goal.

Roadmap

What would you do if you were me?

I would surely write a compiler :)

I mean, would you attempt to create an ANSI C compiler back then? or maybe today you would totally reject doing that because of reasons?

C is considered to be a small language, but that's a bit deceptive. Writing a C compiler isn't a small project. You might want to keep that in mind depending on how much time you can dedicate to it.
But don't get discouraged. After all, Rui Ueyama wrote 8cc in 40 days and fewer than 30 files of code. Though it seems he is insanely productive :)

What steps would you follow if you decided to go ahead?

  • I would pick up a bit of theory.
    • Read the Introduction to Compilers and Language Design book and keep it handy. When going through the parsing chapters, just keep in mind that we're going to write a hand-crafted parser.
    • Read about writing a scanner (aka lexer) and a parser by hand in Crafting Interpreters. Don't get distracted by the word Interpreters in the title -- the same scanning and parsing techniques are relevant for compilers and interpreters.
  • Study the source code of 8cc (or maybe chibicc -- I didn't know about chibicc when I was going through these steps).
  • Finally, try and write your own compiler.
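As a rough idea of what the hand-written scanner step in the list above involves, here is a minimal sketch in Python. The `Token` class, the `scan` function, and the small keyword and operator sets are assumptions chosen for illustration; a real C scanner also has to handle string and character literals, floating-point numbers, the preprocessor and more.

```python
from dataclasses import dataclass

# Deliberately tiny, hypothetical subset of C.
KEYWORDS = {"int", "return", "if", "else", "while"}
TWO_CHAR_OPS = {"==", "!=", "<=", ">=", "&&", "||"}
ONE_CHAR_OPS = set("+-*/%<>=!&|(){}[];,")


@dataclass
class Token:
    kind: str  # "keyword", "ident", "number" or "punct"
    text: str
    line: int


def scan(source):
    """Turn source text into a list of tokens, tracking line numbers."""
    tokens = []
    pos, line = 0, 1
    while pos < len(source):
        ch = source[pos]
        if ch == "\n":
            line += 1
            pos += 1
        elif ch.isspace():
            pos += 1
        elif source.startswith("/*", pos):
            # Block comment: skip to the closing "*/", counting newlines.
            end = source.find("*/", pos + 2)
            if end == -1:
                raise SyntaxError(f"line {line}: unterminated comment")
            line += source.count("\n", pos, end)
            pos = end + 2
        elif ch.isdigit():
            start = pos
            while pos < len(source) and source[pos].isdigit():
                pos += 1
            tokens.append(Token("number", source[start:pos], line))
        elif ch.isalpha() or ch == "_":
            start = pos
            while pos < len(source) and (source[pos].isalnum() or source[pos] == "_"):
                pos += 1
            text = source[start:pos]
            kind = "keyword" if text in KEYWORDS else "ident"
            tokens.append(Token(kind, text, line))
        elif source[pos:pos + 2] in TWO_CHAR_OPS:
            # Try the longer operator first so "<=" is not split into "<" and "=".
            tokens.append(Token("punct", source[pos:pos + 2], line))
            pos += 2
        elif ch in ONE_CHAR_OPS:
            tokens.append(Token("punct", ch, line))
            pos += 1
        else:
            raise SyntaxError(f"line {line}: unexpected character {ch!r}")
    return tokens
```

The whole thing is one loop that looks at the current character and decides which kind of token starts there. Keeping the line number on each token pays off later, when the parser needs to report errors.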

Would you target an existing machine, like x86/x64, or invent a virtual architecture (to simplify things)?

If I were writing a C compiler, I would definitely target an existing machine, most likely x86/x64. You'll want the compiler to compile programs that invoke libc functions like printf, scanf, etc. To do that, you'll need to use an existing libc, so the compiler needs to target an existing machine. (I never researched reimplementing libc, even a minimalist version. My feeling is that it's much better to use an existing one, especially for a first compiler.)
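To illustrate what targeting a real machine and an existing libc can look like, here is a minimal sketch of a code generator that turns a tiny expression tree into x86-64 assembly text and prints the result through libc's printf. It assumes Linux, the System V AMD64 calling convention and GNU assembler (AT&T) syntax. The tuple-based tree and the `gen_expr` and `gen_program` helpers are hypothetical and not taken from any compiler mentioned here. The stack-machine style (push every intermediate value) is chosen because it is simple, not because it is efficient.

```python
def gen_expr(node, out):
    """Emit code that leaves the value of `node` on top of the stack."""
    if isinstance(node, int):
        out.append(f"    mov ${node}, %rax")
        out.append("    push %rax")
        return
    op, left, right = node
    gen_expr(left, out)
    gen_expr(right, out)
    out.append("    pop %rdi")  # right operand
    out.append("    pop %rax")  # left operand
    if op == "+":
        out.append("    add %rdi, %rax")
    elif op == "-":
        out.append("    sub %rdi, %rax")  # rax = rax - rdi
    elif op == "*":
        out.append("    imul %rdi, %rax")
    else:
        raise ValueError(f"unsupported operator {op!r}")
    out.append("    push %rax")


def gen_program(expr):
    """Wrap the expression in a main function that prints its value."""
    out = [
        "    .text",
        "    .globl main",
        "main:",
        "    push %rbp",        # also makes the stack 16-byte aligned
        "    mov %rsp, %rbp",
    ]
    gen_expr(expr, out)
    out += [
        "    pop %rsi",             # 2nd printf argument: the value
        "    lea fmt(%rip), %rdi",  # 1st printf argument: the format string
        "    xor %eax, %eax",       # variadic call: no vector registers used
        "    call printf@PLT",
        "    xor %eax, %eax",       # return 0 from main
        "    pop %rbp",
        "    ret",
        "    .section .rodata",
        "fmt:",
        '    .string "%d\\n"',
    ]
    return "\n".join(out) + "\n"
```

Under those assumptions, writing the result of `gen_program(("+", 40, 2))` to a file such as `out.s` and running `gcc out.s -o out` should assemble it and link it against the system libc, so the compiler itself never has to implement printf.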

Good luck!

Author

SuperJMN commented Jun 4, 2022

Hi Mykola.

I can sense goodness wherever it is, and I feel you are the kind of person that I'd call a mentor. I'm rarely wrong when it comes to this perception.

Thanks for taking the time to think and write this helpful piece of gold. This is such great advice!

I'll study every link, project and book you recommended. I've already had a glance at 8cc and it looks really promising. The fact that he even wrote a log with a short description of the goals and challenges he found is fantastic. I identified with some of his thoughts from when I attempted to create a C compiler (https://github.com/SuperJMN/Plotty). It became a failed attempt at some point, but it was useful in making me realize I needed to "level up", if I may use this expression.

I thank you once again, and if you ever feel the need to create a C compiler, please, let me know. I will always be interested and will follow your work. You're already an authority for me.

Best wishes from Spain,
José Manuel.

@SuperJMN SuperJMN closed this as completed Jun 4, 2022
@mykolav mykolav pinned this issue Feb 17, 2023