Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support C-Style enum in preproc #1984

Merged
merged 5 commits into from
Jul 13, 2024
Merged

Conversation

SBird1337
Copy link
Collaborator

Gives basic support for parsing C-Style enum expressions when using preproc.

Description

In order to deal with situations where one would rather want to use an enum but can't because preproc does not know how to deal with it when compiling scripts and other assembly files together with regular C TUs.

Example:
test.h:

#ifndef TEST_H_
#define TEST_H_

enum TestEnum {
    A=010,B,C,D,
    E=1, F, G=0x3, H,

    I=
    0b1100,

	J, K,

L
};

#endif

test.s:

#include "test.h"

mov r0, #A

turns into

# 1 "test.s"
# 1 "<stdin>"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "<stdin>"
# 1 "test.s"
# 1 "test.h" 1



# 5 "test.h"
.equiv A, 8
# 5 "test.h"
.equiv B, 9
# 5 "test.h"
.equiv C, 10
# 5 "test.h"
.equiv D, 11
# 6 "test.h"
.equiv E, 1
# 6 "test.h"
.equiv F, 2
# 6 "test.h"
.equiv G, 3
# 6 "test.h"
.equiv H, 4
# 8 "test.h"
.equiv I, 12
# 11 "test.h"
.equiv J, 13
# 11 "test.h"
.equiv K, 14
# 13 "test.h"
.equiv L, 15
# 2 "test.s" 2

mov r0, #A

GNU-Style line indicators are added s.t. errors can be traced back to the original source file (e.g. if an enum field is supposed to be redefined)

Discord contact info

karathan

@SBird1337 SBird1337 marked this pull request as draft April 1, 2024 08:51
@SBird1337
Copy link
Collaborator Author

Converting to draft because I kind of forget the enum assembly macro edge case

@SBird1337 SBird1337 marked this pull request as ready for review April 1, 2024 09:55
@SBird1337
Copy link
Collaborator Author

fixed the issue and rebased.

tools/preproc/io.cpp Show resolved Hide resolved
tools/preproc/asm_file.cpp Show resolved Hide resolved
@mid-kid
Copy link
Member

mid-kid commented May 16, 2024

my 2ct that pretty much applies to all of preproc is that papering over things with a custom preprocessor that parses "just enough" of the syntax but not all of it will cause a lot of confusion and frustration from developers down the line, who will expect the compiler to behave like a normal C compiler/assembler. Take for example the following code:

#define X(prefix) prefix##_A, prefix##_B, prefix##_C
enum {
    X(foo),
    X(bar)
};

I don't think things like this will work with the current PR as-is, and I can't begin to think what other syntax quirks may pop up when people try doing more out-there things.

I should mention that the snippet above primarily wouldn't work because for some reason, assembler files are first passed through preproc before going through the C preprocessor, instead of the other way around. I think the snippet above specifically could be made to work without too much fuss, but my point is mostly about it being hard to cover all corner cases, and adhere to the principle of least surprise when a user discovers it's not "real C".

@GriffinRichards
Copy link
Member

Agreed, but I'd wager the lack of any support for C-style enums in assembly files is a bigger headache than incomplete support. We're already mixing C header files into assembly, so the door for this kind of confusion is open. These sorts of incompatibilities are basically always going to be a problem with preproc like you said, and replacing it is beyond this PR.

@mid-kid
Copy link
Member

mid-kid commented May 16, 2024

The C preprocessor has its own syntax that can be clearly discerned by the # prefix, making it fairly suitable for mixing with a bunch of languages. It avoids the confusion I'm referring to by not pretending to be something it's not, which preproc is trying to do here.

Anyway, I'm not asking to remove preproc as a whole, but make you consider that this feature might make the situation worse by making certain problems harder to debug: If the issue I pointed out in the previous message is not accointed for it's rather easy to accidentally make an enum that has different values depending on whether it's compiled for ASM or for C. I'd like to try to minimize this sort of thing, at the very least...

@pkmnsnfrn
Copy link
Contributor

I can very confidently say that there are more users that will be served by this PR versus users that are expecting the C compiler to behave a certain way. The majority of our userbase doesn't even know what a compiler is!

@mid-kid
Copy link
Member

mid-kid commented Jun 22, 2024

If newbies are modifying C, often with the intention of learning C, it's especially destructive to introduce hard-to-debug footguns like this. How do you expect newbies to realize that the code they learned from google, a friend, or a book, isn't valid here? That's what I'm concerned about: This preprocessor isn't complete enough to be C, and it's not obvious that it's not C. People will expect it to be C and be confused when their game breaks.

@mrgriffin
Copy link
Collaborator

mrgriffin commented Jul 8, 2024

mid-kid is right that the preprocessor interacts weirdly with our assembler pipeline. But there's already some sharp edges in that interaction: I can't remember exactly the examples, but I think people get themselves into a mess using .if along with preprocessor defines.

So I think as long as the enum syntax matches the syntax that's available in C this won't be too confusing for newbies (I think all that's missing is trailing comma support as mentioned by GriffinR?). From a quick glance at the Makefile changes there's now two preproc passes so perhaps if enum processing was delayed until that second pass mid-kid's example that mixes enum and #define would work?

SBird, I can take a look at PRing these changes to your branch if you'd like?

@SBird1337
Copy link
Collaborator Author

SBird1337 commented Jul 9, 2024

Incorporated MGriffins changes to address the following issues:

  • Trailing comma support for enum
  • enum mixed with #define as hinted by mid-kid
  • Convert argument parsing to use getopt
  • #include <cerrno>

E: I generally agree that preproc is a bit of a monster in terms of hacky workarounds, but since we are living with it and the lack of enum support is causing troubles I think this makes the workflow better overall :)

@GriffinRichards
Copy link
Member

This likewise fails to build for me, with only the usage information as output. I suspect it's a difference with getopt. It might be worth throwing a macOS build on the workflow at some point.

@mrgriffin mrgriffin merged commit 550e668 into pret:master Jul 13, 2024
1 check passed
issueWithLife pushed a commit to issueWithLife/pokeemerald that referenced this pull request Sep 2, 2024
* [preproc] C-style enums

- asm files parseable from stdin
- 2nd preproc pass
- add parser for C-style `enum`
- positional arguments at end of command

---------

Co-authored-by: sbird <[email protected]>
Co-authored-by: Martin Griffin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants