Update Swift Source Info (#108)
qyang-nj authored Mar 25, 2024
1 parent d829b5a commit 9df0d2b
Showing 7 changed files with 83 additions and 31 deletions.
81 changes: 50 additions & 31 deletions articles/SwiftSourceInfo.md
@@ -1,14 +1,35 @@
# Swift Source Info (`.swiftsourceinfo`)
The `.swiftsourceinfo` file is generated by the Swift compiler during compilation. It is emitted along with `.swiftmodule` and `.swiftdoc`, when the `-emit-module` flag is present. This file records the Swift source information of a Swift module, including source file properties and symbol (USR) declarations.
The `.swiftsourceinfo` file is generated by the Swift compiler during compilation. It is emitted alongside `.swiftmodule` and `.swiftdoc` when the `-emit-module` flag is present. As its name suggests, this file records the Swift source information of a Swift module, including file paths, timestamps, symbol (USR) declarations, and more.

The `.swiftsourceinfo` file is used to enhance diagnostics, indexing, and potentially debugging. However, it always [embeds the absolute paths](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L765-L767). If the file is downloaded from a remote cache, local diagnostics and indexing are hindered. Therefore, [a tool](https://github.com/qyang-nj/source-info-import) is needed to remap the source paths.
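The remapping idea can be sketched as a simple prefix rewrite. This is only an illustration under assumptions: `remapSourcePath` and the example paths are hypothetical, and this is not how the linked tool is implemented. It also assumes the remote prefix is given without a trailing slash.

```swift
// Hypothetical helper: rewrites an absolute path recorded on the build machine
// so that it points into the local checkout.
func remapSourcePath(_ path: String, from remotePrefix: String, to localPrefix: String) -> String {
    // Only rewrite paths under the remote prefix; matching on "prefix/" avoids
    // accidentally rewriting sibling directories such as "/ci/workspace2".
    guard path == remotePrefix || path.hasPrefix(remotePrefix + "/") else { return path }
    return localPrefix + String(path.dropFirst(remotePrefix.count))
}

// Made-up paths for illustration.
let remapped = remapSourcePath(
    "/ci/workspace/Sources/Foo.swift",
    from: "/ci/workspace",
    to: "/Users/me/project"
)
print(remapped) // /Users/me/project/Sources/Foo.swift
```

A real remapper would additionally have to rewrite the paths inside the binary file itself, which this sketch does not attempt.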

## The Usage
Because `.swiftsourceinfo` is an implementation detail to the compiler, there is almost no documentation for where it is used or how it is formatted. I only found [this proposal](https://forums.swift.org/t/proposal-emitting-source-information-file-during-compilation/28794) when it's added.
Since `.swiftsourceinfo` is considered an implementation detail of the compiler, there is limited documentation available regarding its usage and format. However, I have managed to gather some insights from online discussions and my own investigations.

### Diagnostics
Based on the initial [proposal](https://forums.swift.org/t/proposal-emitting-source-information-file-during-compilation/28794), the `.swiftsourceinfo` file was introduced to enhance diagnostics, also known as compiler error messages. Within the Swift source code, I discovered [a test case](https://github.com/apple/swift/blob/17ca88c94a34b34c3c354891e899e82ce98f46ee/test/diagnostics/multi-module-diagnostics.swift), of which a simplified version is presented [here](../building/swift_source_info/diagnostics/).
``` swift
// ModuleA.swift
open class ParentClass {
    open func foo(a: Int) {}
}

// ModuleB.swift
import ModuleA

open class SubClass: ParentClass {
    open override func foo(a: String) {}
}
```

During an investigation, I found that **swift source info is used by SourceKit to power indexing-related features**. For example, in the code below, if we jump to definition of the symbol `foo`, it cannot succeed without `.swiftsourceinfo`, even with a fully populated IndexStore. This is because it's a synthesized symbol
(`s:10FooLibrary0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ::SYNTHESIZED::s:10FooLibrary0A6StructV`). This symbol can be seen in the SourceKit logging and the emitted symbol graph json file.
Here are the error messages with and without the `.swiftsourceinfo` file.
![The difference of the error messages](./images/swiftsourceinfo_diagnostics.png)
In Xcode, having the full path and line number enables us to double-click the error message to jump directly to the corresponding file. Without `.swiftsourceinfo`, this functionality is not available.

### Indexing
During my investigation, I discovered that Swift source info is used by SourceKit to power indexing-related features. For instance, in the code example below, jumping to the definition of the symbol `foo` will not succeed without `.swiftsourceinfo`, even if the IndexStore is fully populated. This is because `foo` is a synthesized symbol (`s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ::SYNTHESIZED::s:10Foo0A6StructV`). This particular symbol can be observed in both the SourceKit logs and the generated symbol graph JSON file.

```swift
// xcrun swiftc -emit-module -module-name FooModule -emit-symbol-graph -emit-symbol-graph-dir . Foo.swift
// Foo.swift
public protocol FooProtocol {}
public struct FooStruct : FooProtocol {}
@@ -23,30 +44,29 @@ extension FooProtocol where Self == FooStruct {
print(FooStruct.foo)
```

However, `.swiftsourceinfo` [always embeds the absolute paths](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L765-L767). If the file is downloaded from a remote cache, the local indexing is also hindered. Therefore we need a tool to remap the source paths.

The `.swiftsourceinfo` file could also be used by the debugger (lldb), but I haven't found a case to confirm that.
### Debugging
The `.swiftsourceinfo` file could also be utilized by the debugger, but I have yet to find a specific case to confirm this.

## The File Format
> [!WARNING]
> The format of `.swiftsourceinfo` is not guaranteed to be stable. It may change across different compiler versions. The following content is verified with Swift 5.9.
> The format of `.swiftsourceinfo` is not guaranteed to be stable. It may change across different compiler versions. The following content is verified with Swift 5.9 and 5.10.
Like everything else in LLVM and Swift, the `.swiftsourceinfo` file uses the [LLVM Bitstream](https://llvm.org/docs/BitCodeFormat.html#bitstream-format) binary format. Using `llvm-bcanalyzer`, we can see its high-level block structure.

```
$ llvm-bcanalyzer -dump FooModule.swiftmodule/Project/arm64-apple-ios-simulator.swiftsourceinfo
$ llvm-bcanalyzer -dump Foo.swiftsourceinfo
<BLOCKINFO_BLOCK/>
<MODULE_SOURCEINFO_BLOCK NumWords=367 BlockCodeSize=2>
<CONTROL_BLOCK NumWords=41 BlockCodeSize=3>
<METADATA abbrevid=5 .../> blob data = 'Apple Swift version 5.9.2 (swiftlang-5.9.2.2.56 clang-1500.1.0.2.5)'
<MODULE_NAME abbrevid=4/> blob data = 'FooModule'
<TARGET abbrevid=6/> blob data = 'arm64-apple-ios17.2-simulator'
<MODULE_SOURCEINFO_BLOCK NumWords=281 BlockCodeSize=2>
<CONTROL_BLOCK NumWords=36 BlockCodeSize=3>
<METADATA abbrevid=5 op0=3 op1=0 op2=0 op3=0 op4=0 op5=0 op6=0 op7=0/> blob data = 'Apple Swift version 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4)'
<MODULE_NAME abbrevid=4/> blob data = 'Foo'
<TARGET abbrevid=6/> blob data = 'arm64-apple-macosx14.0'
</CONTROL_BLOCK>
<DECL_LOCS_BLOCK NumWords=321 BlockCodeSize=4>
<SOURCE_FILE_LIST abbrevid=4/> blob data = unprintable, 168 bytes.
<BASIC_DECL_LOCS abbrevid=5/> blob data = unprintable, 552 bytes.
<DECL_USRS abbrevid=6 op0=324/> blob data = unprintable, 396 bytes.
<TEXT_DATA abbrevid=7/> blob data = unprintable, 117 bytes.
<DECL_LOCS_BLOCK NumWords=240 BlockCodeSize=4>
<SOURCE_FILE_LIST abbrevid=4/> blob data = unprintable, 84 bytes.
<BASIC_DECL_LOCS abbrevid=5/> blob data = unprintable, 460 bytes.
<DECL_USRS abbrevid=6 op0=252/> blob data = unprintable, 292 bytes.
<TEXT_DATA abbrevid=7/> blob data = unprintable, 75 bytes.
<DOC_RANGES abbrevid=8/> blob data = unprintable, 1 bytes.
</DECL_LOCS_BLOCK>
</MODULE_SOURCEINFO_BLOCK>
@@ -109,22 +129,23 @@ struct DeclLocRecord {
```
* [Serializing BASIC_DECL_LOCS](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L734-L750)
* [Parsing BASIC_DECL_LOCS](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/ModuleFile.cpp#L1199-L1222)
* The `LocationDirective` is related to the usage of `#sourceLocation`

##### DECL_USRS
`DECL_USRS` is a serialized `llvm::OnDiskIterableChainedHashTable`, where the key is a USR and the value is the index of a location record in `BASIC_DECL_LOCS`. Conceptually, the deserialized `DECL_USRS` looks like the following.
```
s:10FooLibrary0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ -> 4
s:10FooLibrary0A6StructV -> 2
s:10FooLibrary0A8ProtocolPA2A0A6StructVRszrlE3fooSSvgZ (Foo.swift:10:33) -> 3
s:10FooLibrary0A8ProtocolP -> 1
s:10FooLibrary3BarC -> 0
s:e:s:10FooLibrary0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ -> 5
s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ -> 4
s:10Foo0A6StructV -> 2
s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvgZ (Foo.swift:10:33) -> 3
s:10Foo0A8ProtocolP -> 1
s:10Foo3BarC -> 0
s:e:s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ -> 5
```
* [Serializing DECL_USRS](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L550-L562)
* [Serializing DECL_USRS](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L555-L568)
* [Parsing DECL_USRS](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/ModuleFileSharedCore.cpp#L1157-L1167)
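The USR-to-index mapping can be modeled in memory as follows. This is a minimal sketch, assuming a plain `Dictionary` as a stand-in for the on-disk hash table; `locationRecordIndex(forUSR:recordCount:)` is a hypothetical helper, and the entries are copied from the listing above.

```swift
// In-memory stand-in for the deserialized DECL_USRS table (USR -> record index).
// The real data is an on-disk hash table; a Dictionary is used only for illustration.
let declUSRs: [String: Int] = [
    "s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ": 4,
    "s:10Foo0A6StructV": 2,
    "s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvgZ": 3,
    "s:10Foo0A8ProtocolP": 1,
    "s:10Foo3BarC": 0,
    "s:e:s:10Foo0A8ProtocolPA2A0A6StructVRszrlE3fooSSvpZ": 5,
]

// Hypothetical helper: returns the index of the location record for a USR,
// or nil when the USR is unknown or the index is out of range.
func locationRecordIndex(forUSR usr: String, recordCount: Int) -> Int? {
    guard let index = declUSRs[usr], (0..<recordCount).contains(index) else { return nil }
    return index
}

if let index = locationRecordIndex(forUSR: "s:10Foo0A6StructV", recordCount: declUSRs.count) {
    print("FooStruct uses location record \(index)")
}
```

The returned index would then be used to pick the matching fixed-size record out of `BASIC_DECL_LOCS`; that step is not modeled here.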

##### TEXT_DATA
`TEXT_DATA` is a list of `\0` terminated strings, which are the actual source file paths. As mentioned before, they’re [always absolute paths](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L760-L762).
`TEXT_DATA` is a list of `\0` terminated strings, which are the actual source file paths. As mentioned before, they’re [always absolute paths](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/SerializeDoc.cpp#L765-L767).
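Splitting such a blob into its strings might look like the sketch below. It assumes the blob has already been extracted from the bitstream as raw bytes; `parseTextData` and the sample paths are made up for illustration. Other records may refer to these strings by byte offset, which is not modeled here.

```swift
// Splits a TEXT_DATA-style blob into its NUL-terminated strings.
// Empty entries are dropped, which is fine for listing paths but would
// not preserve byte offsets into the blob.
func parseTextData(_ blob: [UInt8]) -> [String] {
    blob.split(separator: 0)                          // cut at each 0 terminator
        .map { String(decoding: $0, as: UTF8.self) }  // decode each chunk as UTF-8
}

// Made-up blob for illustration; a real one comes from the TEXT_DATA record.
let blob = Array("/Users/me/Foo.swift\0/Users/me/Bar.swift\0".utf8)
let paths = parseTextData(blob)
print(paths)
```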

##### DOC_RANGES
`DOC_RANGES` is a list of fixed-size items, each representing the location of a piece of documentation. Documentation here means a code comment in the [DocC format](https://www.swift.org/documentation/docc/documenting-a-swift-framework-or-package). The layout of the doc range for a USR is described below.
@@ -137,8 +158,6 @@ struct {
uint32_t Unknown; // Purpose unknown
} DocRangeRecord[N]; // N DocRangeRecord
```
* [Parsing DOC_RANGES](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/ModuleFile.cpp#L1206-L1217)
* [Parsing DOC_RANGES](https://github.com/apple/swift/blob/c2ca810126074406f03dc29a44f4ad4b12f04c79/lib/Serialization/ModuleFile.cpp#L1206-L1218)

### Conclusion
The way the source info is stored facilitates workflows such as finding the definition (jump-to-definition) and documentation (showing the help window) of a given USR.

Binary file added articles/images/swiftsourceinfo_diagnostics.png
3 changes: 3 additions & 0 deletions building/swift_source_info/diagnostics/ModuleA.swift
@@ -0,0 +1,3 @@
open class ParentClass {
    open func foo(a: Int) {}
}
5 changes: 5 additions & 0 deletions building/swift_source_info/diagnostics/ModuleB.swift
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
import ModuleA

open class SubClass: ParentClass {
    open override func foo(a: String) {}
}
12 changes: 12 additions & 0 deletions building/swift_source_info/diagnostics/build.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
#!/bin/zsh

xcrun swiftc -emit-module -o build/ModuleA.swiftmodule ModuleA.swift

echo "\033[34m>>> Compiler diagnostics with .swiftsourceinfo:\033[0m"
xcrun swiftc -typecheck -Ibuild ModuleB.swift

# We can also use `-avoid-emit-module-source-info` to avoid emitting .swiftsourceinfo files.
rm build/ModuleA.swiftsourceinfo

echo "\033[34m>>> Compiler diagnostics without .swiftsourceinfo:\033[0m"
xcrun swiftc -typecheck -Ibuild ModuleB.swift
8 changes: 8 additions & 0 deletions building/swift_source_info/indexing/Foo.swift
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
public protocol FooProtocol {}
public struct FooStruct : FooProtocol {}

extension FooProtocol where Self == FooStruct {
    public static var foo: String {
        "Hello, world!"
    }
}
5 changes: 5 additions & 0 deletions building/swift_source_info/indexing/build.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
#!/bin/zsh

# Emit the symbol graph for Foo.swift,
# so that we can see the "SYNTHESIZED" symbols
xcrun swiftc -emit-module -o build/Foo.swiftmodule -emit-symbol-graph -emit-symbol-graph-dir build Foo.swift
