Fix comment formatting bug that mangled */ tokens #1401

Merged
merged 8 commits into from
Jul 22, 2025

Conversation

Copilot
Contributor

@Copilot Copilot AI commented Jul 14, 2025

The auto-formatter was incorrectly mangling multi-line comments, causing dangerous code corruption. When formatting code like:

class C {
    /**
     *
    */
    async x() {}
}

The formatter would produce:

class C {
    /**
     *
     /
    async x() { }
} 

And on subsequent formatting passes, it would further corrupt the code:

class C {
/**
 *
 /
 sync x() { }

This was a critical bug that:

  1. Replaced */ with just /, breaking comment syntax
  2. Corrupted adjacent tokens (async became sync)
  3. Made the code syntactically invalid

Root Cause

The issue was in the indentMultilineComment function in internal/format/span.go:

  1. Incorrect line splitting: The function was using scanner.GetLineStarts(w.sourceFile)[line] instead of scanner.GetLineStarts(w.sourceFile)[line+1] when processing comment lines
  2. Over-aggressive character replacement: When processing comment lines starting with *, the function was replacing too many characters, including the * itself
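
To illustrate the first point, here is a minimal sketch of why the end of line N has to come from the line-start entry for line N+1. It does not use the real scanner.GetLineStarts; computeLineStarts and lineSpan are hypothetical helpers written only for this example, and the actual code in internal/format/span.go may be structured differently.

```go
package main

import "fmt"

// computeLineStarts is a hypothetical stand-in for a line-start table:
// it returns the byte offset at which each line of text begins.
func computeLineStarts(text string) []int {
	starts := []int{0}
	for i := 0; i < len(text); i++ {
		if text[i] == '\n' {
			starts = append(starts, i+1)
		}
	}
	return starts
}

// lineSpan returns the half-open byte range [start, end) of the given line.
// The end of line N is the start of line N+1 (or the end of the text for
// the last line). Using starts[line] for the end would yield an empty span.
func lineSpan(text string, starts []int, line int) (int, int) {
	start := starts[line]
	end := len(text)
	if line+1 < len(starts) {
		end = starts[line+1]
	}
	return start, end
}

func main() {
	text := "/**\n *\n*/\n"
	starts := computeLineStarts(text)
	for line := range starts {
		s, e := lineSpan(text, starts, line)
		fmt.Printf("line %d: %q\n", line, text[s:e])
	}
}
```

With the end taken from starts[line+1], each comment line is sliced whole, including its final */.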

Fix

  1. Corrected line splitting logic: Fixed the off-by-one error in line start position calculation
  2. Added special handling for comment asterisks: When the first non-whitespace character on a line is *, only replace the whitespace before it, not the * itself

Testing

Added comprehensive tests that verify:

  • Multi-line comments are preserved correctly
  • The */ token is not mangled
  • Adjacent tokens like async are not corrupted
  • Multiple formatting passes don't introduce additional corruption

All existing format and language server tests continue to pass, ensuring no regressions.

Fixes #1400.


@Copilot Copilot AI changed the title [WIP] vscode auto-format (dangerously!) mangles code Fix comment formatting bug that mangled */ tokens Jul 14, 2025
@Copilot Copilot AI requested a review from jakebailey July 14, 2025 03:33
Copilot finished work on behalf of jakebailey July 14, 2025 03:33
Copilot finished work on behalf of jakebailey July 14, 2025 04:15
@Copilot Copilot AI requested a review from jakebailey July 14, 2025 04:15
@Copilot Copilot AI requested a review from DanielRosenwasser July 15, 2025 20:16
Copilot finished work on behalf of DanielRosenwasser July 15, 2025 20:16
@weswigham
Member

You can keep trying to coerce it into generating a functional test (but if we're honest, enabling fourslash formatter tests more broadly and writing one of those is probably a better idea), but let me head this off at the pass: for the implementation, this is almost definitely the wrong approach (though I'd have to investigate to know for sure). "Fixing" the position when we record the edit is throwing good money after bad, and is just a buildup of debt.

Somewhere in the formatting scanner/worker there's an off-by-one error in the position returned for the intra-comment whitespace span. Most likely we're taking the full comment end position and mistakenly adding a -1 "uncalculated" sentinel value to it, which causes us to fail to scan the comment end token. The real fix is to find that unguarded position addition, check the operand for the -1 sentinel and, if it's present, skip the addition. I tried to find all of these while I was porting, but there's a chance I missed one. It's also definitely possible that one of the sentinel value guards I added (because I did add some that were missing from strada) is the difference from strada, in which case the bug is also in strada but is hidden there by the compounded bug of a missing sentinel value check.
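
The class of bug described here can be sketched as follows. This is only an illustration under assumptions: the uncalculated constant and the unguardedEnd and guardedEnd helpers are hypothetical names, not identifiers from the formatter, and the real arithmetic in the port may look different.

```go
package main

import "fmt"

// uncalculated is a hypothetical sentinel meaning "this offset has not
// been computed yet". It is not meant to be used as a real delta.
const uncalculated = -1

// unguardedEnd adds delta to pos without checking for the sentinel,
// so a sentinel delta silently moves the position back by one.
func unguardedEnd(pos, delta int) int {
	return pos + delta
}

// guardedEnd leaves pos untouched when delta is the sentinel.
func guardedEnd(pos, delta int) int {
	if delta == uncalculated {
		return pos
	}
	return pos + delta
}

func main() {
	comment := "/**\n *\n */"
	end := len(comment)

	// The unguarded version ends one byte short of the comment end.
	fmt.Printf("%q\n", comment[:unguardedEnd(end, uncalculated)])
	// The guarded version covers the whole comment.
	fmt.Printf("%q\n", comment[:guardedEnd(end, uncalculated)])
}
```

In this toy case the unguarded range stops one byte before the end of the comment, which is the kind of off-by-one that can clip a comment end token.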

In any case, this is probably easier to find by stepping through the formatter in a debugger than by trying to just read the code.

Contributor Author

Copilot AI commented Jul 15, 2025

You can keep trying to coerce it into generating a functional test (but if we're honest, enabling fourslash formatter tests more broadly and writing one of those is probably a better idea), but let ...

You're absolutely right - my approach was just a band-aid fix. I found the real issue: the TypeScript source has special handling for asterisks in comments that reduces the column by 1 when the first non-whitespace character is *, but our Go port was missing this logic. I'm implementing the proper fix now.

Copilot finished work on behalf of weswigham July 15, 2025 22:32
The off-by-one error was fixed elsewhere (line 925), so the special
handling for asterisks in comment lines is no longer needed.

Co-authored-by: jakebailey <[email protected]>
@Copilot Copilot AI requested a review from jakebailey July 21, 2025 20:29
Copilot finished work on behalf of jakebailey July 21, 2025 20:29
@jakebailey jakebailey self-requested a review July 21, 2025 20:32
@jakebailey jakebailey marked this pull request as ready for review July 21, 2025 20:32
@Copilot Copilot AI review requested due to automatic review settings July 21, 2025 20:32
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a critical bug in the auto-formatter that was corrupting multi-line comments and adjacent code. The formatter was incorrectly replacing */ tokens with / and, on later passes, mangling nearby tokens (for example, turning async into sync).

  • Fixed off-by-one error in line start position calculation within indentMultilineComment function
  • Added comprehensive tests to verify comment preservation and prevent regression
  • Ensured formatting stability across multiple passes

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File                               Description
internal/format/span.go            Fixed line start calculation bug by changing [line] to [line+1] in comment processing
internal/format/comment_test.go    Added new test file with comprehensive formatting tests for multi-line comments
Comments suppressed due to low confidence (1)

internal/format/comment_test.go:50

  • The test assertion uses a very specific string pattern that may not catch all variations of the corruption. Consider testing for the presence of malformed comment endings more broadly or testing that the comment structure remains valid.
		assert.Check(t, !contains(firstFormatted, "*/\n   /"), "should not corrupt */ to /")

@jakebailey jakebailey enabled auto-merge July 22, 2025 16:17
Member

@weswigham weswigham left a comment

Yep, much better now. The test is still... questionable (an exact match check would be better rather than asserting all these random things, and what's with the contains helper), but it's good enough. It'll get deleted when we enable formatting fourslash tests, probably.
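
For reference, an exact-match check of the kind suggested here could be sketched as below. This is not the test in internal/format/comment_test.go: checkFormat is a hypothetical helper, toyFormat is a placeholder standing in for the real formatter, and the expected string is illustrative rather than the formatter's confirmed output.

```go
package main

import (
	"fmt"
	"strings"
)

// checkFormat verifies that format(input) equals want exactly, and that
// formatting the result a second time leaves it unchanged.
func checkFormat(format func(string) string, input, want string) error {
	first := format(input)
	if first != want {
		return fmt.Errorf("first pass mismatch:\n got: %q\nwant: %q", first, want)
	}
	if second := format(first); second != first {
		return fmt.Errorf("second pass changed output:\n got: %q\nwant: %q", second, first)
	}
	return nil
}

// toyFormat is a placeholder formatter (an assumption for this sketch):
// it only rewrites "{}" to "{ }" and leaves everything else alone.
func toyFormat(src string) string {
	return strings.ReplaceAll(src, "{}", "{ }")
}

func main() {
	input := "class C {\n    /**\n     *\n    */\n    async x() {}\n}\n"
	want := "class C {\n    /**\n     *\n    */\n    async x() { }\n}\n"
	if err := checkFormat(toyFormat, input, want); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

Comparing the whole output catches any corruption of */ or async without needing a separate substring assertion for each symptom, and the second pass covers the multi-pass case.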

@jakebailey jakebailey added this pull request to the merge queue Jul 22, 2025
Merged via the queue into main with commit 675fd75 Jul 22, 2025
22 of 23 checks passed
@jakebailey jakebailey deleted the copilot/fix-1400 branch July 22, 2025 16:46
Linked issue: vscode auto-format (dangerously!) mangles code (#1400)
5 participants