Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for complicated filenames and file copies #61

Merged
merged 3 commits into from
May 26, 2021

Conversation

kwi-dk
Copy link
Contributor

@kwi-dk kwi-dk commented Mar 22, 2021

This fixes parsing of unquoted filenames with multiple spaces in them, quoted filenames with spaces in them, and diffs containing file copies (bug #48).

When filenames contain special characters, they are quoted, but not if they only contain spaces, meaning the diff --git argument line becomes ambiguous. It is, however, possible to reconstruct the filenames using the extended headers.

When filenames contain special characters, they are quoted, but not if
they only contain spaces, meaning the 'git --diff' argument line becomes
AMBIGUOUS. It is, however, possible to reconstruct the filenames using
the extended headers.

This fixes parsing of unquoted filenames with multiple spaces in them,
quoted filenames with spaces in them, and diffs containing file copies.
@codecov
Copy link

codecov bot commented Mar 22, 2021

Codecov Report

Merging #61 (28094f6) into master (a768196) will increase coverage by 2.75%.
The diff coverage is 97.18%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
+ Coverage   74.59%   77.34%   +2.75%     
==========================================
  Files           4        4              
  Lines         429      490      +61     
==========================================
+ Hits          320      379      +59     
- Misses         62       63       +1     
- Partials       47       48       +1     
Impacted Files Coverage Δ
diff/parse.go 84.03% <97.18%> (+2.61%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a768196...28094f6. Read the comment docs.

@mrnugget
Copy link
Contributor

Hey @kwi-dk! Thanks for opening this PR and sorry for not leaving a comment earlier. I plan on reviewing this PR, but have to admit that I most likely will only get to it next week.

Copy link
Contributor

@mrnugget mrnugget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! Especially for adding the tests.

I left a bunch of comments to make the code a bit more idiomatic. Let me know if you have bandwidth to incorporate them, otherwise I'm happyt to take the PR over.

diff/diff_test.go Outdated Show resolved Hide resolved
for _, input := range tests {
_, _, err := readQuotedFilename(input)
if err == nil {
t.Errorf("readQuotedFilename(`%s`): expected error", input)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
t.Errorf("readQuotedFilename(`%s`): expected error", input)
t.Errorf("readQuotedFilename(%q): expected error", input)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern about quoting as above.

}

func TestParseDiffGitArgs_Success(t *testing.T) {
tests := []string{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above: I think structs would make this easier to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use structs.

diff/parse.go Outdated
// readQuotedFilename extracts a quoted filename from the beginning of a string,
// returning the unquoted filename and any remaining text after the filename.
func readQuotedFilename(text string) (value string, remainder string, err error) {
if text[0] != '"' {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if text[0] != '"' {
if text != "" && text[0] != '"' {

Otherwise this will blow up when passed an empty string (another test case?)

diff/parse.go Outdated
// returning the unquoted filename and any remaining text after the filename.
func readQuotedFilename(text string) (value string, remainder string, err error) {
if text[0] != '"' {
panic("caller must ensure filename is quoted! " + text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be a panic but an error, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here was that readQuotedFilename is a private function, and its callers always ensure that text starts with a quote. This test is thus just a sanity check; if it fails, it's a programming error, and a panic is fine.

But since this is evidently not clear at all, I'll just change to ordinary error handling, and silence the Codecov bot while I'm at it. :)

diff/parse.go Outdated
}

// parseDiffGitArgs extracts the two filenames from a 'diff --git' line.
func parseDiffGitArgs(diffArgs string) (bool, string, string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more idiomatic signature for a function like this in Go would be this:

Suggested change
func parseDiffGitArgs(diffArgs string) (bool, string, string) {
func parseDiffGitArgs(diffArgs string) (string, string, bool) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

diff/parse.go Show resolved Hide resolved
diff/parse.go Show resolved Hide resolved
diff/parse.go Outdated
// One or both filenames contain a space, but the names are
// unquoted. Here, the 'diff --git' syntax is ambiguous, and
// we have to obtain the filenames elsewhere (e.g. from the
// chunk headers or extended headers). HOWEVER, if the file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// chunk headers or extended headers). HOWEVER, if the file
// hunk headers or extended headers). HOWEVER, if the file

diff/parse.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@kwi-dk kwi-dk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay; turns out my GitHub notifications are broken. 😅

Feedback addressed in 28094f6.

for _, input := range tests {
_, _, err := readQuotedFilename(input)
if err == nil {
t.Errorf("readQuotedFilename(`%s`): expected error", input)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern about quoting as above.

diff/diff_test.go Outdated Show resolved Hide resolved
}

func TestParseDiffGitArgs_Success(t *testing.T) {
tests := []string{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to use structs.

diff/diff_test.go Show resolved Hide resolved
diff/parse.go Show resolved Hide resolved
diff/parse.go Outdated Show resolved Hide resolved
diff/diff_test.go Outdated Show resolved Hide resolved
diff/parse.go Outdated
// returning the unquoted filename and any remaining text after the filename.
func readQuotedFilename(text string) (value string, remainder string, err error) {
if text[0] != '"' {
panic("caller must ensure filename is quoted! " + text)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here was that readQuotedFilename is a private function, and its callers always ensure that text starts with a quote. This test is thus just a sanity check; if it fails, it's a programming error, and a panic is fine.

But since this is evidently not clear at all, I'll just change to ordinary error handling, and silence the Codecov bot while I'm at it. :)

diff/parse.go Outdated Show resolved Hide resolved
diff/parse.go Outdated Show resolved Hide resolved
@kwi-dk kwi-dk requested a review from mrnugget May 25, 2021 11:35
Copy link
Contributor

@mrnugget mrnugget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Thank you so much for this follow-up!

@mrnugget mrnugget merged commit 35b24a7 into sourcegraph:master May 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants