Allow callers to run a subprocess and provide low and high water marks when using SequenceOutput to emit standard output and standard error as soon as it arrives. #40

Open: rdingman wants to merge 10 commits into main

@rdingman rdingman commented May 8, 2025

Resolves #39

@rdingman rdingman requested a review from iCharlesHu as a code owner May 8, 2025 17:10

Contributor

@iCharlesHu iCharlesHu left a comment

Thanks so much for the issue and PR! I agree this is a great addition to the API surface.

As an overall comment, could you add a test to make sure the new behavior works as intended?

self.buffer = []
self.currentPosition = 0
self.finished = false
self.streamIterator = Self.createDataStream(with: diskIO.dispatchIO, bufferSize: bufferSize).makeAsyncIterator()
Contributor

@iCharlesHu iCharlesHu May 9, 2025

AsyncBufferSequence is a shared type across all platforms, so we can't unconditionally refer to the platform-specific type dispatchIO here. We may see a Windows build failure as a result.
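
One common way to keep a shared type free of platform-specific names is conditional compilation. The sketch below only illustrates that idea: `PlatformReadHandle` and `SharedBufferSequence` are hypothetical stand-ins, not types from this PR, and the real handles would wrap actual I/O objects.

```swift
// Hypothetical per-platform handle types; neither is the library's real type.
#if os(Windows)
struct PlatformReadHandle {
    let description = "Windows file handle"
}
#else
struct PlatformReadHandle {
    let description = "DispatchIO channel"
}
#endif

// The shared type refers only to the platform-neutral name, so the same
// source compiles on every platform.
struct SharedBufferSequence {
    let handle: PlatformReadHandle
    let bufferSize: Int
}

let sequence = SharedBufferSequence(handle: PlatformReadHandle(), bufferSize: 4096)
print("Reading \(sequence.bufferSize)-byte buffers from a \(sequence.handle.description)")
```

The shared code never spells out the platform type, so the `#if` is confined to one place.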

Author

Thanks for the feedback. I'll look into implementing this in such a way that platforms without dispatchIO don't break.

Comment on lines 213 to 221
internal let lowWater: Int?
internal let highWater: Int?
internal let bufferSize: Int

internal init(lowWater: Int? = nil, highWater: Int? = nil, bufferSize: Int = readBufferSize) {
self.lowWater = lowWater
self.highWater = highWater
self.bufferSize = bufferSize
}
Contributor

I don’t think it’s appropriate to include these parameters here for a couple of reasons:

  • (This isn’t directly related to your change) right now, we’re in the middle of some major architectural updates: Adopt ~Copyable in Subprocess #38. This PR makes SequenceOutput internal, so you can’t use .sequence or .sequence(lowWater: …) anymore.
  • More importantly, this looks like a platform-specific feature. Setting this parameter won’t have any impact on Windows, and (also unrelated to your change) we’re planning to move away from DispatchIO on Linux soon, so it won’t work there either.

Considering all this, I suggest we move these parameters to Darwin’s specific PlatformOptions, maybe under a nested struct PlatformOptions.StreamOptions.

Author

Noted. I'll look into moving these parameters.

Author

rdingman commented May 9, 2025

> Thanks so much for the issue and PR! I agree this is a great addition to the API surface.
>
> As an overall comment, could you add a test to make sure the new behavior works as intended?

I did add a test named "testSlowDripRedirectedOutputRedirectToSequence". Does that not cover the new behavior like you intend?

@rdingman rdingman marked this pull request as draft May 9, 2025 23:59
@rdingman rdingman force-pushed the rdingman/issue-39 branch 5 times, most recently from 73a31d4 to a1abbf5 Compare May 10, 2025 00:35
Author

@iCharlesHu Any suggestions on how to build and test this on Windows? I have things building on Windows, but I'm having trouble debugging the tests.

@rdingman rdingman force-pushed the rdingman/issue-39 branch from a1abbf5 to 14584a9 Compare May 11, 2025 19:09
@rdingman rdingman force-pushed the rdingman/issue-39 branch from d9f85be to 7b6899c Compare May 11, 2025 21:16
@rdingman rdingman marked this pull request as ready for review May 11, 2025 23:15
Author

@iCharlesHu I figured out how to get the debugger working in VSCode on Windows.

However, it appears that several of the tests crash with an exception because a file descriptor is being closed more than once (this is the case on main). Is this a known issue?

I tried out your PR #38 to see if it fixed those issues, but it has not.

For what it's worth, I'm running on:

  • VSCode 1.100.0
  • Swift Extension 2.2.0
  • LLDB DAP 0.2.13
  • Windows 11 Pro 24h2
  • Swift version 6.1 (swift-6.1-RELEASE)
    Target: aarch64-unknown-windows-msvc

Contributor

> > Thanks so much for the issue and PR! I agree this is a great addition to the API surface.
> > As an overall comment, could you add a test to make sure the new behavior works as intended?
>
> I did add a test named "testSlowDripRedirectedOutputRedirectToSequence". Does that not cover the new behavior like you intend?

Ahh yes! Sorry, I totally missed testSlowDripRedirectedOutputRedirectToSequence. That should work, thanks!

Contributor

> @iCharlesHu I figured out how to get the debugger working in VSCode on Windows.
>
> However, it appears that several of the tests crash with an exception because a file descriptor is being closed more than once (this is the case on main). Is this a known issue?
>
> I tried out your PR #38 to see if it fixed those issues, but it has not.
>
> For what it's worth, I'm running on:
>
>   • VSCode 1.100.0
>   • Swift Extension 2.2.0
>   • LLDB DAP 0.2.13
>   • Windows 11 Pro 24h2
>   • Swift version 6.1 (swift-6.1-RELEASE)
>     Target: aarch64-unknown-windows-msvc

Thanks so much for looking into the Windows build. Unfortunately, we do have some known test failures on Windows currently (#22), and I'll address them separately. Right now we want to make sure all new changes at least build on Windows.

Author

> > @iCharlesHu I figured out how to get the debugger working in VSCode on Windows.
> > However, it appears that several of the tests crash with an exception because a file descriptor is being closed more than once (this is the case on main). Is this a known issue?
> > I tried out your PR #38 to see if it fixed those issues, but it has not.
> > For what it's worth, I'm running on:
> >
> >   • VSCode 1.100.0
> >   • Swift Extension 2.2.0
> >   • LLDB DAP 0.2.13
> >   • Windows 11 Pro 24h2
> >   • Swift version 6.1 (swift-6.1-RELEASE)
> >     Target: aarch64-unknown-windows-msvc

> Thanks so much for looking into the Windows build. Unfortunately, we do have some known test failures on Windows currently (#22), and I'll address them separately. Right now we want to make sure all new changes at least build on Windows.

@iCharlesHu Great, that's good to know. Thanks!

Comment on lines 52 to 53
streamIterator = diskIO.readDataStream(upToLength: readBufferSize).makeAsyncIterator()
return data
Contributor

This does not seem right. Why are we creating a new iterator when the first one ends? There will be nothing to read from this second iterator because all the data in the pipe would already have been read.

Author

Thanks for the pushback on this. I was going to explain my thinking and then realized that if I have to explain this, it should probably be written in a more straightforward manner. The first implementation used one iterator per chunk, and when we reached a chunk boundary we'd switch to a new iterator. While this did work, it was admittedly a little clunky. Now, we use one AsyncThrowingStream (and iterator) across all chunks (even if a chunk is broken up into sub-chunks for a single read, as in my original motivation for this issue). Please check the new implementation to see if it makes sense to you.
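
As a rough illustration of the "one stream, one iterator" shape described above, here is a minimal sketch. It is not the PR's AsyncBufferSequence: `ChunkedOutput`, `makeDemoStream`, and `collect` are hypothetical names, and the producer simply yields in-memory byte arrays instead of reading from a pipe.

```swift
// Hypothetical sequence that forwards chunks from a single AsyncThrowingStream.
struct ChunkedOutput: AsyncSequence {
    typealias Element = [UInt8]

    // One stream carries every chunk for the lifetime of the sequence.
    let stream: AsyncThrowingStream<[UInt8], Error>

    struct AsyncIterator: AsyncIteratorProtocol {
        // The single underlying iterator, reused for every call to next().
        var base: AsyncThrowingStream<[UInt8], Error>.Iterator

        mutating func next() async throws -> [UInt8]? {
            // nil is returned only once the producer finishes the stream.
            try await base.next()
        }
    }

    func makeAsyncIterator() -> AsyncIterator {
        AsyncIterator(base: stream.makeAsyncIterator())
    }
}

// Stands in for a read callback that may deliver one read as several sub-chunks.
func makeDemoStream(_ chunks: [[UInt8]]) -> AsyncThrowingStream<[UInt8], Error> {
    AsyncThrowingStream { continuation in
        for chunk in chunks {
            continuation.yield(chunk)
        }
        continuation.finish()
    }
}

// Drains the sequence, keeping each chunk as it arrived.
func collect(_ output: ChunkedOutput) async throws -> [[UInt8]] {
    var result: [[UInt8]] = []
    for try await chunk in output {
        result.append(chunk)
    }
    return result
}
```

Because the iterator is never replaced, sub-chunks of a single read reach the consumer in order and the sequence ends exactly once, when the stream finishes.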

@@ -665,6 +665,48 @@ extension SubprocessUnixTests {
#expect(catResult.terminationStatus.isSuccess)
#expect(catResult.standardError == expected)
}

@Test func testSlowDripRedirectedOutputRedirectToSequence() async throws {
let threshold: Double = 0.5
Contributor

Unfortunately, in tests you'll have to write

guard #available(SubprocessSpan , *) else {
    return
}

at the beginning of the test to work around the same availability issue. See other tests for examples.

Author

@rdingman rdingman May 14, 2025

@iCharlesHu When I first started working on this, I was very confused as to why some of the tests weren't running my new code and it was because of this check. Wouldn't it be better to have them skipped and noted as such in the test output rather than falsely succeeding? I'm thinking something like this:

    @Test(
        .enabled(
            if: {
                if #available(SubprocessSpan , *) {
                    true
                } else {
                    false
                }
            }(),
            "This test requires SubprocessSpan"
        )
    )
    func testSlowDripRedirectedOutputRedirectToSequence() async throws {
    }

Of course, we can have a helper function to make this less verbose.

Thoughts?

Author

@iCharlesHu I went ahead and conditionalized this one test this way as an example. Let me know if you don't like that and would like me to revert to a guard.

Comment on lines 203 to 210
public struct StreamOptions: Sendable {
let lowWater: Int?
let highWater: Int?

init(lowWater: Int? = nil, highWater: Int? = nil) {
self.lowWater = lowWater
self.highWater = highWater
}
Contributor

Sorry I know I initially suggested using a StreamOptions nested struct, but after revisiting this API, I think we should reconsider the lowWater and highWater properties. Here’s why: 1) Their names can be quite confusing outside of the DispatchIO context, and 2) we’d need to add runtime validation to ensure lowWater < highWater.

How about we try something like this instead?

struct PlatformOptions {
    
    let preferredStreamBufferSizeRange: Range<Int>? = nil
}

This approach makes it clear that we’re requesting a range (with a lower and upper bound), and it eliminates the need for validation.
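
To make the trade-off concrete, here is a small sketch contrasting the two shapes. `WaterMarks`, `InvalidWaterMarks`, and `RangedOptions` are hypothetical names used only for illustration; only `preferredStreamBufferSizeRange` comes from the suggestion above.

```swift
// Hypothetical error thrown when the two marks are inconsistent.
struct InvalidWaterMarks: Error {}

// Two independent integers: the type cannot rule out low >= high,
// so the initializer has to validate at run time.
struct WaterMarks {
    let low: Int
    let high: Int

    init(low: Int, high: Int) throws {
        guard low < high else { throw InvalidWaterMarks() }
        self.low = low
        self.high = high
    }
}

// A Range<Int> already guarantees lowerBound <= upperBound,
// so no extra check is written here.
struct RangedOptions {
    var preferredStreamBufferSizeRange: Range<Int>? = nil
}

var options = RangedOptions()
options.preferredStreamBufferSizeRange = 1024..<4096
```

Note that a Range can still be empty (equal bounds), so a caller who needs a strictly positive span would still have to check for that.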

Author

@iCharlesHu The issue I see with this suggestion is that it presumes that you will set both the lower and upper bound, or neither. There is no mechanism for setting just one or the other. As I see it, we have a few options:

  1. Adopt your suggestion and take the stance that you cannot set these independently.
  2. Recognize that these are platform-specific options and rename them to indicate that. On Linux, these options would be removed once the Linux implementation moves away from DispatchIO because they are DispatchIO-specific. We'd add the appropriate runtime check here. (FWIW, DispatchIO handles the case where lowWater > highWater by making them the same.)
  3. Attempt to use some sort of sentinel values to represent the "don't set this" or "use the default" case. For lowWater mark, this could be something like -1 (the default is "unspecified"). For highWater this is tougher because the documentation says the default value is SIZE_MAX which isn't representable by Int. IMO, this option doesn't seem very intuitive, but I thought I'd include it anyways.
  4. Change to use an enum which represents the four cases of set neither, set lower, set upper, set both. Something like:
     enum BufferSizeOptions {
         case none
         case lowerBound(Int)
         case upperBound(Int)
         case range(Range<Int>)
     }

Thoughts?

I think option 3 is too awkward and unintuitive, so I don't think we should consider it. If you feel strongly about option 1, we can go with that, but I wanted to bring up this issue.
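
For reference, option 4 could be consumed along these lines. This is only a sketch: the enum is the one proposed above, while the `marks` property is a hypothetical helper that translates each case into the optional low and high values a platform layer might need.

```swift
enum BufferSizeOptions {
    case none
    case lowerBound(Int)
    case upperBound(Int)
    case range(Range<Int>)

    // Hypothetical helper: nil means "leave the platform default in place".
    var marks: (low: Int?, high: Int?) {
        switch self {
        case .none:
            return (nil, nil)
        case .lowerBound(let low):
            return (low, nil)
        case .upperBound(let high):
            return (nil, high)
        case .range(let range):
            return (range.lowerBound, range.upperBound)
        }
    }
}

let lowOnly = BufferSizeOptions.lowerBound(512).marks
let both = BufferSizeOptions.range(1024..<4096).marks
```

The enum makes all four combinations explicit, at the cost of a second type that callers have to learn.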

Author

@rdingman rdingman May 14, 2025

@iCharlesHu I modified your proposal a bit to use a RangeExpression rather than just a concrete Range. That way we can express things like 0... to set only the lower bound or ...4096 to set only the upper bound. I didn't want to make PlatformOptions fully generic because that would be more cumbersome. Instead, I added some API on PlatformOptions to enforce the requirement that RangeExpression.Bound must be Int.

Check it out and let me know what you think.
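
A simplified way to picture how partial ranges express "only one bound" is shown below. This sketch uses overloaded initializers on a hypothetical `PreferredBufferSize` type rather than the generic RangeExpression API described above, so it should not be read as the PR's implementation.

```swift
// Hypothetical holder for the optional bounds extracted from a range.
struct PreferredBufferSize: Equatable {
    let lowerBound: Int?
    let upperBound: Int?

    // 1024..<4096 sets both bounds.
    init(_ range: Range<Int>) {
        lowerBound = range.lowerBound
        upperBound = range.upperBound
    }

    // 0... sets only the lower bound.
    init(_ range: PartialRangeFrom<Int>) {
        lowerBound = range.lowerBound
        upperBound = nil
    }

    // ...4096 sets only the upper bound. The value is stored as written;
    // a real implementation would need to decide whether it is inclusive.
    init(_ range: PartialRangeThrough<Int>) {
        lowerBound = nil
        upperBound = range.upperBound
    }
}

let lowerOnly = PreferredBufferSize(0...)
let upperOnly = PreferredBufferSize(...4096)
let bounded = PreferredBufferSize(1024..<4096)
```

Each range literal has exactly one static type, so the compiler picks the matching initializer without any generics on the containing type.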

Comment on lines 179 to 180
public var outputOptions: StreamOptions = .init()
public var errorOptions: StreamOptions = .init()
Contributor

IMO we only need one (see my comments below).

Author

I replaced these with preferredStreamBufferSizeRange above.
