[RFC] Add async/await proposal #406
I was wondering whether `HTTPHeaders` is the right type here. For http/2 connections we'll need to normalise the header names to be lowercased and repackage them into `HPACKHeaders`, which isn't cheap (in gRPC we used to use the http2-to-1 codecs and deal with http/1 types; we saw a pretty big increase in performance from using http/2 directly, most of which came from not converting the headers). It'd be a shame to have to incur that cost here as well.

Keeping `HTTPHeaders` (over `HPACKHeaders` or a new AHC-provided headers type) is probably the preferred option at the moment, since it's more or less a currency type, but it doesn't feel like an optimal solution.
@ktoso @weissi @PeterAdams-A wdyt?
In gRPC did you not just move the overhead to HTTP/1? I.e., made HTTP/2 the default and now have to convert when on HTTP/1?
Yes, exactly. It's a much clearer-cut decision for gRPC, since http/2 is much more common; so much so that the cost of doing the extra work for http/1 doesn't really matter.
Presumably at the point where this is currently created we don't know which HTTP version we'll be using? Would we need to do some sort of delayed creation? I think we need to provide an easy version to use, even if we also have a fast option.
That's right, we need the headers before we know what version we're using.

One option is AHC providing its own headers type which could wrap either an `HTTPHeaders` or `HPACKHeaders`. This has the advantage of being able to change the backing type without breaking API if we decide to change our bias from http/1 to http/2 or vice versa. Of course this potentially makes things worse by adding a third headers type...

Another would be making interop between `HTTPHeaders` and `HPACKHeaders` cheaper -- I'm not sure how possible this is though.
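The wrapping idea above could be sketched roughly like this. All names here are assumptions, not part of the proposal, and the two structs are stand-ins for NIOHTTP1's `HTTPHeaders` and NIOHPACK's `HPACKHeaders`:

```swift
// Hypothetical AHC-owned headers wrapper. The enum hides which representation
// backs it, so the http/1-vs-http/2 bias could change later without breaking API.
struct HTTP1Headers { var entries: [(name: String, value: String)] }       // stand-in for HTTPHeaders
struct HPACKStyleHeaders { var entries: [(name: String, value: String)] }  // stand-in for HPACKHeaders

enum RequestHeaders {
    case http1(HTTP1Headers)
    case hpack(HPACKStyleHeaders)

    // Lowercasing/normalisation only happens on the HTTP/2 path that needs it.
    func normalizedForHTTP2() -> HPACKStyleHeaders {
        switch self {
        case .hpack(let headers):
            return headers
        case .http1(let headers):
            return HPACKStyleHeaders(
                entries: headers.entries.map { (name: $0.name.lowercased(), value: $0.value) }
            )
        }
    }
}
```

The trade-off discussed above still applies: this avoids paying the conversion cost up front, at the price of introducing a third headers type.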
Could make the third way a protocol with default implementations for both, such that if we get the right one it's close to free, and the wrong one is just the conversion we'd have to do anyway.
Only allowing `url` to be set in the `init` might be a little too restrictive. This pattern makes a lot of sense for large configuration objects etc., where adding a new `init` for each new piece of configuration quickly becomes a burden for maintainers.

I don't know if that's justified here, since this is likely to be used much more frequently, so we should make it easy to use even if the maintenance cost is higher. Moreover I suspect there's much less scope for adding new properties to the request.
On the other hand, the "convenience" APIs might cover this just fine.
I guess I like to validate earlier rather than later, so personally I would mildly push back against this change... but since there have been enough issues about it, I guess it's fair to follow up and change... ok then 👍
If you want the URL validation to be part of request creation, you could create a wrapper which initializes a request from a URL object but creates the request using a String. That should give you a reasonable guarantee that HTTPClient will not reject the request due to it being an invalid URL.
It would mean parsing the URL twice, but that’s relatively cheap, and there are potentially ways to avoid it if we get shared Strings in the standard library at some point (they are already part of the ABI). If the shared String is owned by a URL object, the URL parser can assume it is already a valid URL and recover the parsed URL object from the owner without parsing the String again or allocating fresh storage.
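A minimal sketch of that wrapper, assuming hypothetical names (`ValidatedRequest`, `urlString`) and Foundation's `URL` as the eager validator:

```swift
import Foundation

// Validate eagerly with Foundation's URL parser, but keep only the String
// form so the client's own String-based request path is used.
struct ValidatedRequest {
    let urlString: String

    init(url: URL) {
        // The URL has already been parsed once here; the client will parse
        // the string a second time, which is the (relatively cheap)
        // double-parse discussed above.
        self.urlString = url.absoluteString
    }
}
```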
but we just said it can't? A bit weird phrasing I guess?
How do you imagine storing the sequence in the internal enum? I think we would probably need to erase it with something like `AnyAsyncSequence`, similar to what `AnySequence` does for `Sequence`. But as discussed in this thread, there is no `AnyAsyncSequence` in the standard library, and `AnySequence` is already slow. `AnyAsyncSequence` would probably be even slower. I therefore think we can't get reasonable performance in the case where the element is of type `UInt8`, because of the overhead introduced by erasing the type of the sequence. The only way I can currently think of to make this fast would require making `AsyncRequest` generic over the actual Body type, which has its own downsides.
The way we make it fast in the case of `AsyncSequence where Element == UInt8` is by having an inlinable transformer from `UInt8` to buffer. That is, a wrapper sequence that builds up buffers of, say, 1kB in size (or whatever) before it passes them on. We then wrap that sequence in the body.

So long as the transformer is inlinable we can have it specialized by the caller. We can then wrap it in our type-erasing logic (simply a callback of type `() async throws -> ByteBuffer?`) and voilà, we get pretty good straight-line performance. It's not perfect, but it should be pretty good.
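A rough sketch of that transformer idea, with `Buffer` as a stand-in for NIO's `ByteBuffer` and all names assumed. In a real implementation the wrapper would be marked `@inlinable` so the caller can specialize the per-byte loop before the type erasure happens:

```swift
// Aggregates UInt8 elements into chunks of up to `chunkSize` bytes; the
// type-erased form is just a `() async throws -> Buffer?` pull callback.
typealias Buffer = [UInt8]

struct ChunkedBody {
    let next: () async throws -> Buffer?

    init<Bytes: AsyncSequence>(bytes: Bytes, chunkSize: Int = 1024) where Bytes.Element == UInt8 {
        var iterator = bytes.makeAsyncIterator()
        var finished = false
        self.next = {
            if finished { return nil }
            var chunk = Buffer()
            chunk.reserveCapacity(chunkSize)
            // Fill the chunk until it is full or the source runs dry.
            while chunk.count < chunkSize {
                guard let byte = try await iterator.next() else {
                    finished = true
                    break
                }
                chunk.append(byte)
            }
            return chunk.isEmpty ? nil : chunk
        }
    }
}
```

After erasure, the consumer can only iterate chunk-by-chunk, which is exactly the limitation discussed below.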
A proof of concept has been implemented here... https://github.com/fabianfett/async-http-client/blob/ff-async/Sources/AsyncHTTPClient/AsyncAwait/AsyncRequest.swift#L74-L87
In the `static` factory methods we translate the `AsyncSequence` into a callback that makes use of the specialization.
Yeah, that's what any stream (IO-related at least) will have to do, and I've mentioned this repeatedly to folks working on IO and streams. The problem with our async story is that it's not clear who must do this chunking, and where. So we may double-chunk unnecessarily... Say the "async bytes" abstraction https://developer.apple.com/documentation/foundation/filehandle/3766681-bytes does exactly that. So feeding a request from a file by doing `.bytes` on it and putting it into this request body will work, but we may end up with two such buffers that aggregate where we might have wanted one... but I digress: yes, such buffers are exactly what we need at asynchronous boundaries of streams (such as here, where we get passed one).
We don't have a choice, we have to do it. The reason we have to do it is that we have to type-erase the sequence, and the moment we do that the compiler can't see through the loop. This means we can only iterate on the chunks. The better move, I think, would be for `.bytes` to be augmented by a `.chunkedBytes` API that returned some appropriate buffer type. This would allow adopters building pipelines to delay chunking until the last moment.

Of course, now we get into an argument about what that buffer type is. 😉
Yeah, agreed that we have to do it here. Interesting idea to offer a `chunkedBytes` API to give users more control there; might be worth a radar :-)
We don't lose this capability actually, we just express it the right way (well, "right" in my brain at least) now :-)

One should always accept "some" async sequence. The way this sequence is produced matters not to the consumer of it (this API).

Now, the way a "writer" is expressed is by the end-user creating a `Swift.AsyncStream` (which actually is more like an "async sequence factory", though the name stuck). That stream can then be yielded into: `stream.yield(some bytes) -> YieldResult`, and this way the end user may construct their own "writer style" API. We don't have to prepare anything for it, since we only deal with it in terms of `AsyncSequence` 👍
We do lose it: `AsyncStream.yield` is not isomorphic to `StreamWriter`. Importantly, `StreamWriter.write` returns an `EventLoopFuture`, which allows it to exert backpressure on the producer. `AsyncStream.Continuation.yield` is not an async function, and so it cannot exert backpressure in the same way. Users could retrofit it in a ghetto kinda way by using a buffered stream and observing the return values from `yield`. They can also retrofit it by using `AsyncStream(unfolding:onCancel:)`, but that forces them to construct a closure and essentially "invert" their backpressure (from push-based to pull-based), which is fairly non-obvious.

gRPC is proposing an alternative API that keeps the `StreamWriter`-like design, which tbh I prefer.
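The pull-based "inversion" mentioned above might look like the following sketch (names and the `[[UInt8]]` chunk type are illustrative). `AsyncStream(unfolding:)` invokes the closure once per consumer pull, so the producer can never run ahead of the consumer:

```swift
// Pull-based retrofit: the closure is called on demand and returns the next
// chunk, or nil to finish. Backpressure falls out of the pull model, at the
// cost of restructuring a push-style producer into this shape.
func makePulledStream(_ chunks: [[UInt8]]) -> AsyncStream<[UInt8]> {
    var remaining = chunks
    return AsyncStream(unfolding: {
        remaining.isEmpty ? nil : remaining.removeFirst()
    })
}
```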
That's exactly why we made sure to include the yielded values.

You're saying that it's dependent on the intermediate buffer being full or not, and not on the actual "end that does the write" being full or not. But that's just the reality of dealing with buffered streams, and any stream may be buffered; so that's it, you won't get the exact same semantics, but these are the semantics you work with in stream APIs.
There is a minor issue implementing `StreamWriter` using `AsyncStream`. If I am looking for the following API, I need the `StreamWriter` to be ready to accept `ByteBuffer`s immediately. The construction of `AsyncStream` takes a `build` closure where you set up your stream. The API does not make it obvious that this closure will have been called by the time `AsyncStream.init` has finished. Having looked at the code, I know this is the case, but can I assume that will always be the case? Here is the best I could do to implement the above. I had to force-unwrap the `write` closure, which makes me unhappy.
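The snippet referenced in this comment did not survive extraction; the following is only a guess at the pattern being described, with hypothetical names. The force unwrap is needed because nothing in `AsyncStream`'s documented API promises that the build closure has run by the time `init` returns:

```swift
typealias Bytes = [UInt8]

// StreamWriter-style shim over AsyncStream.
struct StreamWriterShim {
    let stream: AsyncStream<Bytes>
    private let continuation: AsyncStream<Bytes>.Continuation

    init() {
        var captured: AsyncStream<Bytes>.Continuation!
        self.stream = AsyncStream<Bytes> { captured = $0 }
        // Relies on the observed (but undocumented) fact that the build
        // closure runs synchronously inside AsyncStream.init.
        self.continuation = captured!
    }

    func write(_ chunk: Bytes) { continuation.yield(chunk) }
    func finish() { continuation.finish() }
}
```

With the default unbounded buffering policy, writes made before anyone consumes the stream are buffered, which is what makes the shim "ready to accept bytes immediately".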
I assume we'd fix the `StreamWriter` API, right? There's already #291.
problem statements in #194 and #264
Aren't we discussing removing it and replacing it with `AsyncSequence`? Which, in the end, I'm all for. I do think supplying a stream implementation with backpressure would be helpful, as `AsyncStream` isn't up to the job yet.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know about that limitation and now ticketified the specific case for it: https://bugs.swift.org/browse/SR-15076
Either way, even if we offer the `StreamWriter` API -- that's totally fine -- let's please not forget to keep trying to improve the capabilities of the new concurrency types. Some of them are rough around the edges and need more love :)
I propose rewording the above about "losing" the writer approach, since we still have it thanks to `AsyncStream`; this way it is only a simplification, without feature loss 👍
It might be worth exploring an akka-http entity-inspired design here before we commit to this API.

Especially for the response object it matters, because as we create it we know whether there will be more bytes or not, so we can immediately construct the right "entity object", e.g. Empty or Strict() or a stream. Representing the body as an abstract `HTTPEntity`, which may be "strict" or not, allows us to store the strict bytes via a reference to it directly; we don't have to use the stream at all, other than lazily creating it when someone pulls from it.

The entire topic of "there's a stream of bytes, but no one subscribed to it" is quite hellish in general. What is this implementation's approach to it? If we're doing a proper back-pressured stream, it means that not reading it is directly connected to not reading from the wire, which may or may not be what the user intended. In akka we had to invent subscription timeouts to resolve such zombie streams.

In case I'm not being clear about the case: the response has a streamed body; we're a nice streaming API and people expect back-pressure via such a stream API. If we never consume this stream, what happens? (In Akka's case we were very loud that this is user error and one has to do `.discardBytes()` (but it's annoying).)

Note though that I don't yet have good performance intuitions about AsyncSequences... perhaps it is not worth optimizing for knowing that there's a `.single(<bytes>)` at all, since setup and logic costs are comparable between this and an iterator... So I'm mostly sharing this as an "are we sure and have we considered alternatives?" note.
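The akka-style entity shape could be sketched as follows (hypothetical names, with `[UInt8]` standing in for a real byte-buffer type). A strict body stores its bytes directly; a stream is only created lazily when someone actually pulls from it:

```swift
typealias Payload = [UInt8]

// Abstract body entity: empty, strict (bytes already in memory), or streamed.
enum HTTPEntity {
    case empty
    case strict(Payload)
    case stream(() -> AsyncStream<Payload>)

    // Collecting an empty or strict entity never touches a stream, which is
    // the whole point: the zombie-stream problem only exists for .stream.
    func collect() async -> Payload {
        switch self {
        case .empty:
            return []
        case .strict(let bytes):
            return bytes
        case .stream(let makeStream):
            var collected = Payload()
            for await chunk in makeStream() { collected += chunk }
            return collected
        }
    }
}
```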
I'm afraid we will have to do the same here.
We talked offline a little bit -- I think we're in a better place here, because dropping the request can cancel (and does, but this was not documented in this document), which leaves us without the nasty "hanging stream" issue 👍

The "must always consume" rule is fine; we'll have to be explicit about it 👍
Probably throw if it is known that the body is not consumed yet?
I think we should just crash.
I think I'm on the crash side as well. If the user consumes the trailers before the body, it will never be right. Crash seems to be the right tradeoff.
Yeah, that's true: it's unlikely to be even sometimes correct, and even if it is, people really must fix it. 👍
While we can't commit to making them happen for Swift 6, it's a clear and important thing we want to support in Swift concurrency. So I agree that not doing another "our own" thing until then is the right call here.
Could there be a convenience API that makes a request a `Task`? E.g.:

This is syntactic sugar that allows the HTTP task to be handled by another function. That is, we don't need to `await` the result in the same context.
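The example snippet in this comment was lost in extraction; the following is only a guess at the shape of the suggested sugar, with all names hypothetical and `execute(_:)` standing in for the real request method:

```swift
// Wrapping the call in Task { } produces a handle that another function can
// await later, instead of awaiting the result in the calling context.
struct FakeClient {
    func execute(_ url: String) async throws -> Int {
        // Stand-in for a real HTTP call; returns a fake status code.
        return 200
    }

    // The proposed convenience: start the request and hand back a Task.
    func executeTask(_ url: String) -> Task<Int, Error> {
        Task { try await self.execute(url) }
    }
}
```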
Swift already has special syntax to create a child task through `async let`. What you propose would create a detached `Task`. I think this would probably make it too easy, and users would create detached `Task`s just because there is a method for it, even if they would actually want a child task through `async let`.
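The contrast being drawn here, sketched with hypothetical helper names: `async let` creates a structured child task whose lifetime is bounded by the enclosing scope, while `Task { }` creates an unstructured task that is not tied to the scope's lifetime unless explicitly awaited:

```swift
func fetchValue() async -> Int { 1 }

func structuredVersion() async -> Int {
    async let value = fetchValue()  // child task; awaited (or cancelled) before this scope exits
    return await value
}

func unstructuredVersion() async -> Int {
    let task = Task { await fetchValue() }  // unstructured task, independent of this scope
    return await task.value
}
```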