Fix macOS/BSD incompatibility in `general:check-filenames` task #423
The "Check Files" (Task) template includes an asset task named `general:check-filenames` that checks for the presence of non-portable filenames in the project.

Ironically, the task itself was non-portable. The problem was that it used the `--perl-regexp` flag in the `grep` command. This flag is not supported by the BSD version of grep used on macOS and BSD machines. This caused the task to fail spuriously with `grep: unrecognized option '--perl-regexp'` errors when run on a macOS or BSD machine.

The incompatibility is resolved by changing the `--perl-regexp` flag to `--extended-regexp`. This flag, which is supported by both the BSD and GNU versions of grep, allows the use of the modern and reasonably capable POSIX ERE syntax on all platforms.

Unfortunately, the regular expression used in the previous command relied on one of the additional features only present in the PCRE syntax, so the previous pattern cannot be used in combination with the `--extended-regexp` flag. The PCRE-specific syntax was used to check for the presence of a range of characters prohibited by the Windows filename specification:

https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions
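
As a rough illustration of what POSIX ERE can express without any PCRE features, the printable characters reserved by that specification can be listed directly in a bracket expression. This is only a sketch: the helper name `has_reserved_char` is hypothetical and the pattern is not the one used by the task.

```shell
# Hypothetical helper (not part of the task): succeeds if the name contains
# one of the printable characters reserved by Windows: < > : " / \ | ? *
has_reserved_char() {
  # -E is the short form of --extended-regexp; -q suppresses output.
  printf '%s\n' "$1" | grep -E -q '[<>:"/\\|?*]'
}

has_reserved_char 'notes?.txt' && echo 'notes?.txt: not portable'
has_reserved_char 'notes.txt' || echo 'notes.txt: ok'
```

Inside a bracket expression the characters `|`, `?`, and `*` lose their special meaning, so they need no escaping.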
Because the characters in this range are non-printable control characters, they must be represented by code in the regular expression. This was done using the `\x{hhh..}` syntax supported by PCRE. Neither that syntax nor any of the equivalent escape patterns is supported by POSIX ERE. A solution is offered in the GNU grep documentation:

https://www.gnu.org/software/grep/manual/grep.html#Matching-Non_002dASCII-and-Non_002dprintable-Characters
So this approach, which relies on `printf` to generate the characters, can be used to represent the character range using octal codes (`\000-\037`):

https://www.gnu.org/software/coreutils/manual/html_node/printf-invocation.html#printf-invocation:~:text=printf%20interprets%20%E2%80%98%5Cooo%E2%80%99%20in%20format%20as%20an%20octal%20number
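
A minimal sketch of the `printf` technique follows. It is not the exact command from the task: the helper name `has_control_char`, the variable name, and the use of `LC_ALL=C` are assumptions made for illustration.

```shell
# printf expands the octal escapes into the literal bytes, so grep receives
# a bracket expression with a real range from byte 001 to byte 037.
# NUL (\000) is left out because it cannot be passed in a command-line argument.
control_chars="$(printf '\001-\037')"

# Hypothetical helper: succeeds if the given name contains a control character.
has_control_char() {
  # LC_ALL=C (an assumption) makes grep compare the range by byte value.
  # -E is the short form of --extended-regexp; -q suppresses output.
  printf '%s' "$1" | LC_ALL=C grep -E -q "[${control_chars}]"
}

has_control_char "$(printf 'bad\tname.txt')" && echo 'control character found'
has_control_char 'good-name.txt' || echo 'no control character'
```

Because grep processes its input line by line, a newline inside a name acts as a line separator rather than as a character to match, so this sketch does not detect it.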
As the grep manual also mentions, a null character cannot be placed directly into a command-line pattern, so the range of characters in the pattern cannot include NUL. However, it turns out that even the previous command did not detect this character in filenames, although it was included in the regular expression pattern, so this limitation doesn't result in any regression in practice.
Alternative solution
I also considered the alternative of using perl instead of grep. This approach allows the use of the PCRE syntax even on macOS/BSD machines.
This command is equivalent to the previous `grep` command, with similar performance to the `grep --extended-regexp` command proposed in this PR:

The full command using this approach:
I am reluctant to introduce the use of perl into the assets because I feel that the maintainers of the assets, and of the projects they are used in, are less likely to have existing familiarity with perl than with grep, and are unlikely to benefit much from the effort of gaining that familiarity.