-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Diagnose invalid characters in bundle identifiers #1012
base: main
Are you sure you want to change the base?
Fix Diagnose invalid characters in bundle identifiers #1012
Conversation
- Implemented `validateBundleIdentifier()` to validate that the bundle identifier only contains characters allowed in URL hosts as per RFC 3986. - The method checks for invalid characters such as spaces and any characters not included in the alphanumeric, hyphen, period, underscore, and tilde sets. - If invalid characters are detected, a diagnostic warning is generated and appended to the `problems` array with details about the issue. - This enhancement helps prevent issues related to incorrect bundle identifiers by enforcing proper formatting standards.
- Implement tests for valid and invalid bundle identifiers - Cover scenarios with spaces and special characters - Use GIVEN-WHEN-THEN structure for improved readability - Ensure consistent setup() call across all tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening this PR. However, the NavigatorIndex
is not the right place to validate the bundle identifier.
It would be better to validate the bundle identifier earlier, before the build starts DocC begins processing the documentation inputs.
There are only two places where the developer can specify a bundle identifier:
- A value in the documentation catalog's Into.plist
- A
--fallback-bundle-identifier
value on the command line
It would be a nicer experience for the developer if the diagnostic let the developer know which of these two that the invalid identifier came from.
This code in DocumentationBundle+Info.swift could be one place to consider adding this validation since it knows if the value came from the Info.plist or from the command line fallback.
When I tested running docc convert /path/to/SomeCatalog.docc --fallback-bundle-identifier "not valid"
just now, it crashed failing to create a ResolvedTopicReference
value. Because of this, a "warning" level diagnostic wouldn't be suitable, since that lets the build progress until it crashes.
That said, since we can diagnose the issue and know which characters are not allowed, it's possible that we could correct the issue by removing the disallowed characters in the developer-provided bundle identifier (or by replacing them with an allowed character like "-"). Doing so would avoid the crash which would allow for a "warning" level diagnostic again.
It's also worth noting that the docc process-archive index
command has its own bundle identifier command line argument that possibly would require its own validation.
/// - Important: This method assumes that `bundleIdentifier` is a non-optional string. | ||
private func validateBundleIdentifier() { | ||
// URL host valid characters (RFC 3986) | ||
let validCharacters = CharacterSet.alphanumerics.union(CharacterSet(charactersIn: "-._~")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to use CharacterSet.urlHostAllowed
for this.
// GIVEN | ||
let outputURL = try createTemporaryDirectory().appendingPathComponent("valid-bundle") | ||
let builder = NavigatorIndex.Builder( | ||
outputURL: outputURL, | ||
bundleIdentifier: "com.example.valid-bundle_identifier" | ||
) | ||
|
||
// WHEN | ||
builder.setup() | ||
|
||
// THEN | ||
XCTAssertTrue(builder.problems.isEmpty, "No problems should be reported for a valid bundle identifier") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "given", "when", "then" comments is not a style that we use anywhere else.
These specific tests won't be relevant—since the navigator index isn't the right place to verify the bundle identifier—but for any other tests, please omit those specific comments to match the style of the existing tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The navigator index is just one of the places that make use of the bundle identifier but it's not a source of it. As such, it is not the right place to perform this validation.
@agisilaos Are you still working on this? |
Fixes: #290
Changes
This PR introduces bundle identifier validation in the
NavigatorIndex.Builder
class and adds corresponding unit tests to ensure its correctness.Implementation
validateBundleIdentifier()
method toNavigatorIndex.Builder
.Tests
DiagnosticBundleIdentifierTests
to thoroughly test the new validation logic.Rationale
Improved Data Integrity: Validating bundle identifiers ensures that only valid identifiers are used, preventing potential issues downstream in the documentation generation process.
Compliance with Standards: By adhering to RFC 3986 for URL hosts, we ensure that our bundle identifiers are compatible with web-based systems and avoid potential conflicts or encoding issues.
Early Error Detection: Catching invalid bundle identifiers early in the process allows developers to address issues promptly, improving the overall quality of documentation projects.
Clear Feedback: The detailed error messages provide developers with actionable information to correct invalid bundle identifiers.
Robust Testing: The comprehensive test suite ensures that the validation logic works as expected and helps prevent regressions in future updates.
Impact
Notes for Reviewers
Please pay special attention to the character set used for validation. Are there any additional characters we should consider allowing or restricting?
The error message content and severity (warning) are open for discussion. Should we consider making this a more severe error?
The test coverage aims to be comprehensive. Are there any additional edge cases you think we should test?
Added tests
Ran the
./bin/test
script and it succeededUpdated documentation if necessary