Add support to diff DocC Archives #908

Draft
wants to merge 14 commits into
base: main

Conversation

emilyychenn
Contributor

@emilyychenn emilyychenn commented Apr 30, 2024

Bug/issue #, if applicable:

Summary

Adds a command line option for processing DocC Archives, producing a diff between two different versions of DocC Archives for the same framework or technology.

This diff contains a list of symbols added/removed between the two versions, and is stored in an output markdown file called {FrameworkName}_ChangeLog.md that will be written to the same directory as the input files.

The output markdown page can then be copy-pasted into the corresponding project and built as an article alongside the other documentation, serving as a starting point for writing change logs or updates pages.
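At its core, a diff like this can be thought of as a set difference over symbol identifiers. The following is a minimal sketch of that idea, not the implementation in this PR: `SymbolDiff`, `diffSymbols(initial:newer:)`, and the `MyKit/...` identifiers are hypothetical names chosen for illustration, and symbols are assumed to be identified by plain strings.

```swift
// Hypothetical sketch; not the implementation in this PR.
struct SymbolDiff: Equatable {
    var added: [String]
    var removed: [String]
}

/// Returns the identifiers that exist only in the newer set (added)
/// and only in the initial set (removed), each sorted for stable output.
func diffSymbols(initial: Set<String>, newer: Set<String>) -> SymbolDiff {
    SymbolDiff(
        added: newer.subtracting(initial).sorted(),
        removed: initial.subtracting(newer).sorted()
    )
}

// Invented identifiers, for illustration only.
let diff = diffSymbols(
    initial: ["MyKit/Widget", "MyKit/Widget/color"],
    newer: ["MyKit/Widget", "MyKit/Widget/size"]
)
print(diff.added)   // ["MyKit/Widget/size"]
print(diff.removed) // ["MyKit/Widget/color"]
```

Sorting the results keeps the generated list deterministic, which matters if the output file is checked in and reviewed.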

The command's help output describes how to run it:

OVERVIEW: Generate a changelog with symbol diffs between documentation archives ('.doccarchive' directories).

USAGE: docc generate-changelog <initialDocCArchive> <newerDocCArchive> [--initial-archive-name <initial-archive-name>] [--newer-archive-name <newer-archive-name>] [--show-all <show-all>]

ARGUMENTS:
  <initialDocCArchive>    The path to the initial DocC Archive to be compared.
  <newerDocCArchive>      The path to the newer DocC Archive to be compared.

OPTIONS:
  --initial-archive-name <initial-archive-name>
                          The name of the initial DocC Archive version to be compared. (default: Version 1)
  --newer-archive-name <newer-archive-name>
                          The name of the newer DocC Archive version to be compared. (default: Version 2)
  --show-all <show-all>   Boolean value to indicate whether to produce a full symbol including all properties, methods, and overrides (default: false)
  -h, --help              Show help information.

Dependencies

N/A

Testing

Steps:

  1. Run this command locally to diff two releases for a given framework, by passing in two local DocC Archives.

    For example:
    swift run docc generate-changelog /path/to/local/doccarchive/ /path/to/local/doccarchive2/ --initial-archive-name "Release 1" --newer-archive-name "Release 2"

  2. Ensure that a new markdown file has been generated at the top level of the first DocC archive. This should be titled {FrameworkName}_ChangeLog.md (or NoFrameworkFound_ChangeLog.md if the framework name was not found).
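The naming rule in step 2 can be sketched as follows. This is an illustration under stated assumptions rather than the PR's code: `changeLogFileName(frameworkName:)` is a hypothetical helper, and the archive path is a placeholder.

```swift
import Foundation

/// Hypothetical helper: builds the change log file name described above,
/// falling back to "NoFrameworkFound" when no framework name is available.
func changeLogFileName(frameworkName: String?) -> String {
    "\(frameworkName ?? "NoFrameworkFound")_ChangeLog.md"
}

// Placeholder path; the file is expected at the top level of the first archive.
let firstArchive = URL(fileURLWithPath: "/path/to/local/doccarchive")
let outputURL = firstArchive.appendingPathComponent(changeLogFileName(frameworkName: "MyKit"))
print(outputURL.lastPathComponent) // MyKit_ChangeLog.md
```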

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • Updated documentation if necessary

Copy link
Contributor

@d-ronnqvist d-ronnqvist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a high level I have some concerns about this command. Mainly I don't think that operating on already built documentation archives is going to be able to provide a good user experience.

Specifically, there are a handful of key issues that I foresee:

  • The render node identifier isn't necessarily stable between versions. As such, API could incorrectly be flagged as both added and removed despite not having changed.
  • The archive doesn't contain sufficient information to reconstruct a working symbol or documentation link to each added symbol.
  • The archive doesn't contain sufficient information—at least not in a structured way that can easily be operated on—to group the added or removed symbols.
    For example, if a type adds conformance to Comparable, it would be very difficult to present that information in a succinct and human-friendly way, such as "Added conformance to Comparable", instead of listing the 5 new operators that can be called on that type (..<(_:), >(_:_:), <(_:_:), >=(_:_:), <=(_:_:)).
  • Iterating over the entire documentation archive directory structure to decode each page is a very slow way of finding the differences, and it doesn't provide any information about pages that moved from one place to another.
  • The archive doesn't contain sufficient information (at least not in a structured form that can easily be operated on) to identify other types of changes that would be relevant to list in a change log, for example API that still exists but was deprecated from one version to the next.
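To illustrate the first concern, here is a small sketch of how an unstable identifier produces a false "added and removed" pair. Everything in it is hypothetical: `Page`, `addedAndRemoved(initial:newer:by:)`, the paths, and the `stableID` values are invented, and it simply assumes that some identifier surviving between versions is available from another data source.

```swift
// Hypothetical model: `path` stands in for a render node identifier,
// `stableID` for an identifier assumed to be unchanged between versions.
struct Page: Hashable {
    let path: String
    let stableID: String
}

/// Diffs two page lists using whichever key the caller selects.
func addedAndRemoved<Key: Hashable & Comparable>(
    initial: [Page], newer: [Page], by key: (Page) -> Key
) -> (added: [Key], removed: [Key]) {
    let oldKeys = Set(initial.map(key))
    let newKeys = Set(newer.map(key))
    return (newKeys.subtracting(oldKeys).sorted(), oldKeys.subtracting(newKeys).sorted())
}

// The same symbol in both versions, but its page path changed (invented values).
let v1 = [Page(path: "/documentation/mykit/widget/draw()-1a2b", stableID: "widget-draw")]
let v2 = [Page(path: "/documentation/mykit/widget/draw()", stableID: "widget-draw")]

let byPath = addedAndRemoved(initial: v1, newer: v2) { $0.path }
let byID = addedAndRemoved(initial: v1, newer: v2) { $0.stableID }

print(byPath.added.count, byPath.removed.count) // 1 1 (false positives)
print(byID.added.count, byID.removed.count)     // 0 0
```

Keying on the path reports one addition and one removal for an unchanged symbol, while keying on the assumed stable identifier reports no change.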

Comment on lines 139 to 142
var removalLinks: String = ""
for removal in removalsExternalURLs {
    removalLinks.append("\n- <\(removal)>")
}

Assuming that the generated change log file is intended to be included in another build for the newer version of the documentation, it won't be possible to link to any symbols that have been removed because those are not in that version of the source anymore.
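One possible way around this, sketched below, is to list removed symbols as inline code rather than as links, so that the generated markup doesn't reference pages that no longer exist. This is only an illustration of the idea: `removalList(_:)` and the symbol names are hypothetical and not part of this PR.

```swift
/// Hypothetical helper: formats removed symbols as inline code (single backticks)
/// instead of links, since the newer documentation has nothing to link to.
func removalList(_ removedSymbolNames: [String]) -> String {
    removedSymbolNames.map { "\n- `\($0)`" }.joined()
}

// Invented symbol names, for illustration only.
let removals = removalList(["Widget/color", "Widget/resize(to:)"])
print(removals)
```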

return returnSymbolLinks
}

func findSymbolLink(symbolPath: URL) throws -> URL? {

This API (and also findAllSymbolLinks(initialPath:) and findExternalLink(identifierURL:)) doesn't return a value that can be used as a symbol link in DocC.

AFAIK there isn't a reliable way to go from a rendered page to a symbol page. You may need to rethink this approach and use another source of data if you're going to generate working markup files containing correct symbol links.


/// The framework name is the path component after "/documentation/".
func findFrameworkName(initialPath: URL) throws -> String? {
guard let enumerator = FileManager.default.enumerator(

We use our FileManagerProtocol abstraction, instead of FileManager.default directly, so that we can write tests that don't interact with the real file system.
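The general pattern looks roughly like the sketch below. Note that `DirectoryListing`, `InMemoryDirectories`, and `frameworkName(inArchiveAt:using:)` are hypothetical names invented for this illustration; they are not the actual `FileManagerProtocol` API in this repository, whose requirements may differ. The sketch also assumes the framework name is the first entry inside the archive's "documentation" directory.

```swift
import Foundation

// Hypothetical protocol for illustration; not the project's FileManagerProtocol.
protocol DirectoryListing {
    func contentsOfDirectory(atPath path: String) throws -> [String]
}

// FileManager already has a method with this exact signature.
extension FileManager: DirectoryListing {}

struct MissingDirectory: Error {}

/// An in-memory stand-in that tests can use instead of the real file system.
struct InMemoryDirectories: DirectoryListing {
    var directories: [String: [String]]

    func contentsOfDirectory(atPath path: String) throws -> [String] {
        guard let entries = directories[path] else { throw MissingDirectory() }
        return entries
    }
}

/// Assumed rule: the framework name is the first entry under "documentation".
func frameworkName<Files: DirectoryListing>(
    inArchiveAt archivePath: String, using files: Files
) throws -> String? {
    try files.contentsOfDirectory(atPath: archivePath + "/documentation").sorted().first
}

// In a test, no real files are touched.
let testFiles = InMemoryDirectories(
    directories: ["/tmp/MyKit.doccarchive/documentation": ["mykit"]]
)
let foundName = try? frameworkName(inArchiveAt: "/tmp/MyKit.doccarchive", using: testFiles)
print(foundName ?? "none") // mykit
```

Production code would pass `FileManager.default` for the `files` parameter, while tests pass the in-memory value.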

@emilyychenn emilyychenn force-pushed the diff-doccarchives branch 2 times, most recently from a21dba3 to bc41ccc May 21, 2024 10:28
@d-ronnqvist d-ronnqvist marked this pull request as draft September 20, 2024 17:55
3 participants