Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft bootstrap decl #1918

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from
Draft

Conversation

duckdoom5
Copy link
Contributor

@duckdoom5 duckdoom5 commented Mar 1, 2025

Okay so I managed to generate a Declaration.h file from clang source, but I'm a bit stuck.

  • Some properties from clang are missed due to parsing/converter issues
  • Some fields should be removed since they can be mapped to simple functions
  • The original decl types add a bunch of properties that no longer exist

Since they don't really match, it's not easy to map the new data to the old one. I'm not sure how to go from here.. I kinda need this new data to properly support everything and implement the recursive AST visitor we talked about before.

I feel like the only way to get this done is to slowly convert each of the original decls to the new version, but I also don't want to waste time. Any suggestions/ideas to reduce the amount of time required for this task?

Note: This code is my first pass! Definitely not ready for review XD

@duckdoom5 duckdoom5 marked this pull request as draft March 1, 2025 13:39
@duckdoom5 duckdoom5 force-pushed the feat/bootstrap-everything branch from 4070b93 to 92e3498 Compare March 1, 2025 13:44
@tritao
Copy link
Collaborator

tritao commented Mar 1, 2025

I guess first let me ask you what do you hope to get out of this change.

Is this mainly to be able to use the Clang's Recursive AST visitor, and as such try to minimize the risk of crashes we have in complex where our walking of Clang AST might be erronous / incomplete?

Is there some other advantages you are thinking about?

@duckdoom5
Copy link
Contributor Author

duckdoom5 commented Mar 1, 2025

It's 3 things really:

  • Add properties we are missing currently
    • With those added, we'd be able to use clang's implementation of the AST dumper, which means we'll have a complete AST
  • Future proofing
    • C++23 is around the corner. If we have this generator automated, it should be a click of a button and maybe some fixes to add support for new language features. We kinda already see this happening now, where C++20 features are not really supported
  • Avoid implementation bugs.
    • Make sure the properties are actually set.
      While working on this, I found that a bunch of properties that are available on the C# side are never assigned to (eg. Class.IsStatic and Class.TemplateParameters are never assigned to, but are used throughout the codebase a lot).

@tritao
Copy link
Collaborator

tritao commented Mar 2, 2025

I'll try to think about this tomorrow and see if we can come up with a plan how best to do this.

@duckdoom5
Copy link
Contributor Author

@tritao Was there a specific reason not to include OMP and OBJC expr/stmt nodes?

@tritao
Copy link
Collaborator

tritao commented Mar 3, 2025

Were never needed for my or previous users use cases I guess. The goal of the AST layer has always been mainly for the bindings generator, and there are other tools for ObjC stuff.

@duckdoom5
Copy link
Contributor Author

Cool, was just wondering, since it would be quite easy to just support them if they are auto generated anyway

@tritao
Copy link
Collaborator

tritao commented Mar 3, 2025

Cool, was just wondering, since it would be quite easy to just support them if they are auto generated anyway

On one hand, I want to say sure, why not. On another, from a maintainer point of view, just seems potential future problems. I would say unless you have immediate need for it maybe it's not worth it for this first iteration at least.

@duckdoom5
Copy link
Contributor Author

duckdoom5 commented Mar 3, 2025

Agreed.

FYI, while this is on hold, I'm working on getting the AST visitior implementation done for stmt/expr since those are already auto generated so easy to convert

@duckdoom5 duckdoom5 force-pushed the feat/bootstrap-everything branch from 92e3498 to f81f8e6 Compare March 4, 2025 09:19
@tritao
Copy link
Collaborator

tritao commented Mar 6, 2025

So just a little feedback on this so you know what I'm thinking.

Overall from a technical perspective, generating more code versus manually writing it is nice of course. I'm just a bit worried from a future maintenance perspective, it does add some new complexity and dependencies to the overall system, as the new decls depend on a lot more of the internal details of Clang and the whole bootstrapping process is also a complex process which needs some continued babying to continue generating good code.

Previously we only surfaced the bare minimum of information we have needed, which makes it more efficient from a memory perspective I think, as well as minimizing the work when upgrading Clang. So I'm kinda torn from one I really love this work and want it, from another, I'm not sure the practical benefits outweigh the added complexity.

Also to be perfectly honest, now you are very motivated and working on this (which I am greatly grateful for) , but you might not be around in a couple of years, which is normal. And I worry I might end up being in a situation of needing to spend a ton of time to fix the system as Clang upgrades are necessary.

All in all I don't want this to discourage you, I'm not saying let's not do it. it may be I just need to understand the advantages of the new approach better.

Right now there's the new Declaration.h generated file, along with the Decl.h. So what's your idea around it?
And your thoughts in general around this?

Are the advantages worth the changes here? I think overall the main issues with the current approach is the complex crashes due to bugs when doing the manual AST walking, which I hope we could make more robust by using Clang's visitor stuff.

@duckdoom5
Copy link
Contributor Author

Yeah, I see your points.
I was a bit in the same boat, which is why I wanted your feedback.

I'm just a bit worried from a future maintenance perspective

I also agree with this. I hope if I get more familiar with the remaining parts of the codebase I can help simplify this process.
(Eg. I'm looking for ways to not have to manually specify what variables we want to keep from clang.)
There is some logic to it, like removing those properties that are simple wrappers around data.
Like here, where these really don't need to be data:
image
image
If we can read part of the expression and see that it's a simple local data wrapping function, it can just stay a function.
But there's more examples.

I also think we should go through the codebase and deduplicate some code. Make use of existing utility functions instead. For example the codebase makes use of these lines of code all over the place:

TypeMap typeMap = null;
if (TypeMapDatabase.FindTypeMap(tag, out typeMap))
{
    var typePrinterContext = new TypePrinterContext { Type = tag };
    return typeMap.SignatureType(typePrinterContext).ToString();
}

But there's also the helper function GetMappedType which does exactly the same thing.
Reducing the mental load for these functions should make it more maintainable.

Right now I'm taking a step back from this, and am focusing on the features that I need to generate source for my codebase. In the meantime, I'll try to improve/fix some of the CppSharp code.

I think overall the main issues with the current approach is the complex crashes due to bugs when doing the manual AST walking, which I hope we could make more robust by using Clang's visitor stuff.

I think so too, but it's a bit difficult to map the current decl types to Clang's visitor since it uses a lot more types.
I have frameworked a visitor implementation for Expr. I'll push it as a draft so you can see what I have prepared. I think that should give you an idea of what I'm dealing with

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants