regexps cannot be used to parse C! #438

Open
informatimago opened this issue May 3, 2023 · 4 comments
Labels
bug_parser (Bug -- specifically one that requires better parser), bug

Comments

@informatimago
Contributor

informatimago commented May 3, 2023

Hi!

Many of the issues with parsing C actually come from the fact that regexps are used to try to parse C code. This is doomed.
For example, when trying to parse function signatures followed by __attribute__, such as:
void foo() __attribute__((noreturn))
the regexps fail, because they match parentheses without balancing them: nested parentheses require counting the nesting depth, which a classical regular expression cannot do.
We could also have more complex declarators, such as a function returning a function pointer:
int (*foo(int (*fun)(int), int x))(int (*)(int)) __attribute__((glop))
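
A minimal sketch of the failure, in Python for brevity (this is not CMock's actual code; the pattern `NAIVE_ATTR` and the helper `match_paren` are hypothetical names made up for this illustration). A non-greedy regex stops at the first closing parenthesis, whereas a depth counter finds the one that really closes the attribute:

```python
import re

# Hypothetical naive pattern: it has no notion of nesting depth.
NAIVE_ATTR = re.compile(r"__attribute__\s*\((.*?)\)")

def match_paren(text, open_idx):
    """Return the index of the ')' closing the '(' at open_idx, or -1."""
    depth = 0
    for i in range(open_idx, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    return -1

decl = "void foo() __attribute__((noreturn))"

# Regex attempt: the match ends one ')' too early.
naive_text = NAIVE_ATTR.search(decl).group(0)

# Balanced attempt: count depth starting at the first '(' after the keyword.
attr_start = decl.index("__attribute__")
open_idx = decl.index("(", attr_start)
balanced_text = decl[attr_start:match_paren(decl, open_idx) + 1]

print(naive_text)     # __attribute__((noreturn)
print(balanced_text)  # __attribute__((noreturn))
```

The scanner ignores string literals and comments, which a real implementation would also have to skip.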

What is your opinion on using a real C parser such as rcc (gem), or libclang?

Note, even if we want to parse a more relaxed version of C rather than strictly adhering to a specific standard, using an actual parser would be of great benefit.

https://groups.google.com/g/ttsforums/c/Ow6acueWPdY

@mvandervoord
Member

The goal has been to move CMock to a real parser for some time. The trouble has been that many C compilers in the wild extend C/cpp in various ways, so we've yet to find a parser that supports the extensions we need. (Try to find a single parser that can handle 100% of the files meant for IAR, Green Hills, and GCC, and you'll see what I mean.)

One option is to preprocess the non-standard parts out first... but then we're back to using regexes or similar.
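
For illustration only, here is a hedged Python sketch of what such a pre-pass could look like without regexes. It removes GNU-style `__attribute__((...))` by counting parentheses. The names `strip_attributes` and `closing_paren` are hypothetical, and this is an assumption about what "or similar" might mean, not what CMock does. It handles only this one keyword, so every other vendor extension would need its own rule:

```python
def closing_paren(src, open_idx):
    """Index of the ')' closing the '(' at open_idx, or -1 if absent/unbalanced."""
    if open_idx >= len(src) or src[open_idx] != "(":
        return -1
    depth = 0
    for k in range(open_idx, len(src)):
        if src[k] == "(":
            depth += 1
        elif src[k] == ")":
            depth -= 1
            if depth == 0:
                return k
    return -1

def strip_attributes(src, keyword="__attribute__"):
    """Drop every `keyword (...)` group, leaving all other text untouched."""
    out = []
    i = 0
    while i < len(src):
        if src.startswith(keyword, i):
            j = i + len(keyword)
            while j < len(src) and src[j].isspace():
                j += 1  # allow whitespace between keyword and '('
            end = closing_paren(src, j)
            if end != -1:
                i = end + 1  # skip the whole balanced group
                continue
        out.append(src[i])
        i += 1
    return "".join(out)

print(strip_attributes("void foo() __attribute__((noreturn));"))  # void foo() ;
```

Unbalanced input is left as it is rather than truncated. The sketch does not check identifier boundaries, so a longer identifier that merely contains the keyword would also be matched.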

We haven't given up on this goal. If you dig through the older open issues, you'll see that this issue is already there and has much discussion around it.

We're definitely open to more thoughts.

@informatimago
Contributor Author

Indeed, this is a problem.

Also, do we want to generate mocks for "anonymous" functions, or for functions that are only declared through function pointer variables? E.g., cf. the direct_declarator and abstract_direct_declarator grammar rules; the current code doesn't parse all possible cases.

Anyway, for now, I'm just implementing a parser that tracks parenthesis balancing and identifies the various parts ad hoc.
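
A rough idea of what tracking the balance buys, sketched in Python (the function `split_params` is a hypothetical illustration, not the code being written for CMock): a parameter list is split only at commas that sit outside any parentheses or brackets, so function pointer parameters stay in one piece.

```python
def split_params(params):
    """Split a C parameter list at top-level commas only."""
    parts = []
    current = []
    depth = 0
    for ch in params:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "," and depth == 0:
            # A comma at depth 0 separates two parameters.
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts

print(split_params("int (*cb)(int, char), int x"))
# ['int (*cb)(int, char)', 'int x']
```

Identifying the name and type inside each piece is the ad-hoc part and is not attempted here.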

@Letme

Letme commented May 4, 2023

You can do that: mock the function (someFunc), put the pointer variables in a global structure (that's the ugly part, but it is only for the unit test), and assign the mocks to them in the test case's setUp. Then you can mock the return value and enforce the input parameters...

@jtafarrelly

I believe this is the same issue as this one from 2017. It's been a stumbling block for me over the past few weeks, as any external library header files with a #define __foo(someinfo)__bar(otherinfo) in our project can't be mocked - something that seems to be remarkably common.

If there are any workarounds for this, they'd be much appreciated ^^

mvandervoord added the bug and bug_parser (Bug -- specifically one that requires better parser) labels on Mar 20, 2024