Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chore: Generated Id are now declaration-specific #5669

Merged
merged 10 commits into from
Aug 8, 2024

Conversation

MikaelMayer
Copy link
Member

Previously, Id for the compilation of local variables were global, which caused issues to merge generated files. Fixes #5330

How has this been tested?

This is a refactoring. Existing tests should pass. Generated code is a witness that this change performs as expected.

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

Previously, Id for the compilation of local variables were global, which caused issues to merge generated files.
Fixes #5330
@keyboardDrummer
Copy link
Member

keyboardDrummer commented Aug 7, 2024

Could you add some more explanation of what the difference is? What do you mean by "Generated Id are now declaration-specific" ? I thought id generation had as a performance to prevent duplicate names, which worked using I guess a rather global counter. How does the alternative work?

It would help me review this if I understand what the goal is

get;
}
string AssignUniqueName(VerificationIdGenerator generator);
string SanitizedName(CompilationIdGenerator generator);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought SanitizedName was the name used by verification, but here it takes a compilationIdGenerator.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this refactoring demonstrates, SanitizeName is used only for code generation. Verification uses a different (and nested) ID generator.

Copy link
Member

@keyboardDrummer keyboardDrummer Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that IVariable.SanitizeName(..) is unused in your PR, and can be removed without issue. I've pushed a refactoring that does that and also inlines two instances of SanitizeName(..) that were only used by CompileName(..), and renames SanitizeNameShadowable to CompileNameShadowable, since it was only used for code generation. Feel free to revert if you don't like it.

Note that for declarations, the names produced by SanitizeName(..) are used as is by Boogie. There's two properties FullSanitizedName that call SanitizeName, and these are used extensively in the Boogie generator.

As a follow-up, I think it would be good to rename FullSanitizedName to QualifiedBoogieName, since that is used almost exclusively by the Boogie generator, and then I would also consider naming .SanitizeName to .BoogieName, while keeping the name NonglobalVariable.SanitizeName(..). Also we should rename AssignUniqueName to AssignBoogieName (or actually I would prefer GetOrCreateBoogieName, since that's only used by Boogie.

@@ -5,7 +5,7 @@
namespace Microsoft.Dafny;

public class CompilationData {
public readonly FreshIdGenerator IdGenerator = new();
public readonly FreshIdGenerator IdGenerator = new CompilationIdGenerator();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use the term CodeGen instead of compilation? I know it's two words instead of one, but I feel it's more precise.

The class CompilationData that this line is in, relates to the whole compilation pipeline, not just to code generation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactoring CompilationIdGenerator to CodeGenIdGenerator done.
Now that you say it, I should also remove this line as it indicates a global id generator, which I'm removing away.

To answer your initial question, I want to avoid generating unique ID that are unique across the whole program. With such unique ID, any small change in a dfy file can potentially change all the generated cs files. This PR removes the use of global ID generators and scopes them to declarations, so that local changes in a dfy won't impact globally all other cs files, at least for temporary and local variable identifier numbers.

Therefore, I identified that there was a global static ID generator used to give a unique name to all local variables, and got rid of it, so that the name is unique only within one declaration.
Your remark helped me identify two other sources of global ID generation, which are ModuleResolver and MatchFlattener. I similarly removed both of them by stating that every code context (typically declarations) should have their own fresh id generator.

I also found that there were two kinds of ID generators, and we are mixing them. This PR also separates code generation ID generation from verification ID generation. The latter uses nested ID generation for variables that won't happen in the generated code.
There is still the possibility that the codeGenIdGenerator be used for matches in ghost code, so that the editing of ghost code might impact the fresh IDs, but I think this is a secondary issue that does not need to be addressed.

Let me know if there is still some shaded

Copy link
Member

@keyboardDrummer keyboardDrummer Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great explanation. One thing about the Dafny codebase that confuses me is that variable such as locals or parameters are not considered declarations, even though I think outside of Dafny you'd call those declarations. I think a name like VariableScope or VariableContext would more clearly describe to me at what level we're introducing id generators.

I pushed a refactoring that changes formal.CompileName(SinglePassCodeGenerator.FormalIdGenerator) to formal.CompileName, so the FormalIdGenerator is unneeded.

@@ -2129,6 +2132,7 @@ protected class NullClassWriter : IClassWriter {
if (c is not TraitDecl || TraitRepeatsInheritedDeclarations) {
thisContext = c;
foreach (var member in inheritedMembers.Select(memberx => (memberx as Function)?.ByMethodDecl ?? memberx)) {
enclosingDeclaration = member;
Copy link
Member

@keyboardDrummer keyboardDrummer Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to have this value in a CompilationContext parameter that is passed down, instead of state on the generator class, but we can change that another time.

@@ -154,7 +154,6 @@ internal static class ExprExtensions {
private static bool SameBoundVar(IVariable arg1, IVariable arg2) {
return arg1 == arg2 ||
(arg1.Name == arg2.Name &&
arg1.CompileName == arg2.CompileName &&
Copy link
Member

@keyboardDrummer keyboardDrummer Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this safe? Really weird comparison btw.. checking all these different names. If you are comparing UniqueName, can't you just as well compare the object reference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it's a weird comparison. Is it because cloned variables have the same compile name perhaps? Anyway, comparing a unique name should be enough of a guarantee indeed.

@keyboardDrummer
Copy link
Member

Hey, this is a great. Thanks a lot. I pushed a few commits to further refactor some things, hope you agree with them, feel free to revert and discuss.

@MikaelMayer MikaelMayer merged commit c9cd3b6 into master Aug 8, 2024
21 checks passed
@MikaelMayer MikaelMayer deleted the chore-declaration-id-generator branch August 8, 2024 15:02
@MikaelMayer
Copy link
Member Author

Hey, this is a great. Thanks a lot. I pushed a few commits to further refactor some things, hope you agree with them, feel free to revert and discuss.

All your four commits look good. I like it a lot that you got rid of the dummy id generator for formals, and the new name for SanitizeName is more descriptive as well. Approved and merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Compilation to not use a global local variable counter, but a local one
2 participants