Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apple Network Framework Socket Changes #662

Open
wants to merge 199 commits into
base: grand_dispatch_queue
Choose a base branch
from

Conversation

sbSteveK
Copy link
Contributor

Apple Network Framework socket integration

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Contributor

@bretambrose bretambrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial initial review. Will do more tomorrow.


Upon a connection being established, the new socket (either as the result of a `connect()` or `start_accept()` call)
will not be attached to any event loops. It is your responsibility to register it with an event loop to begin receiving
notifications.


#### V-Table
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this does not belong here. Are people supposed to keep this in sync with the real definition? Anyone looking at the code can look at the real definition.

As far as I'm concerned, no vtables should be present in a library README.

Copy link
Contributor

@bretambrose bretambrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another checkpoint on the overall review. Beyond the comments, I'd love to see regular trace-level logging for every callback function if it does not already have it. We want to be able to follow the functional flow in the logs.

aws_mem_release(readable_args->allocator, readable_args);
}

static void s_schedule_on_readable(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better name would be something along the lines of handle_incoming_data. That describes the purpose of the function; the fact that it pushes the data onto a scheduled task is an interior detail.


s_lock_socket_state(nw_socket);

if (readable_args->is_complete) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't we condition the whole lock-unlock block on this check? Then we never do the locking until the socket gets closed.

aws_mem_release(task_args->allocator, task_args);
}

static void s_schedule_on_listener_success(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s_handle_on_listener_success maybe

// released when the connection state changed to nw_connection_state_cancelled
s_socket_acquire_internal_ref(new_nw_socket);

AWS_LOGF_TRACE(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like server connection establishment deserves more than Trace logging. Also, it would be nice if we logged the fact this function was invoked at all. Right now, we're only logging the success path.


args->nw_socket = nw_socket;
args->allocator = nw_socket->allocator;
args->error_code = error_code;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we convert all Apple error codes to CRT error codes before sticking them in an integer? Maybe log them first before converting? This would apply everywhere we stick an Apple code into an int currently.

If I see error_code in a struct, my expectation is that it's a CRT error code not an OS error code.

Copy link
Contributor

@bretambrose bretambrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite finished, but getting close.

Would suggest doing a pass on log levels and ensuring that they make sense.
Trace - most appropriate for data processing step and extremely common/spammy events
Debug/Info - appropriate for connection-level events, state changes, etc...

AWS_LOGF_DEBUG(
AWS_LS_IO_SOCKET, "id=%p handle=%p: beginning connect.", (void *)socket, socket->io_handle.data.handle);

if (socket->event_loop) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the previous fatal assert, this can be removed

// "connect" action after aws_socket_init() regardless it's a UDP socket or a TCP socket.
AWS_FATAL_ASSERT(on_connection_result);
s_lock_socket_state(nw_socket);
if (nw_socket->synced_state.state != INIT) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you change the state here as well, this isn't at all concurrent-safe. Maybe our usage of aws_socket keeps this from being an issue but it's really uncomfortable having "this-is-only-safe-because-this-is-an-internal-API-and-our-internal-usage-just-happens-to-not-blow-it-up" blocks throughout the implementation.

return AWS_OP_ERR;
}

struct socket_address address;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a CRT-defined union type (which I don't think is needed) and then we pass it to a system call (nw_endpoint_create_address). That's not something we want to ever do even if the types currently match.

struct socket_address address;
AWS_ZERO_STRUCT(address);
int pton_err = 1;
if (socket->options.domain == AWS_SOCKET_IPV4) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better as a switch statement maybe

address.sock_addr_types.un_addr.sun_len = sizeof(struct sockaddr_un);

} else {
AWS_FATAL_ASSERT(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it a fatal assert? This is user-controlled; should passing in an unknown domain cause things to crash?

(void *)nw_socket,
(void *)nw_socket->os_handle.nw_connection);

s_schedule_next_read(nw_socket);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary given that the read callback always tries to schedule a new one?

}

struct nw_socket *nw_socket = socket->impl;
if (!(nw_socket->synced_state.state & CONNECTED_WRITE)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this safe? Are we asserting that only the event loop ever modifies this?


AWS_FATAL_ASSERT(written_fn);

dispatch_data_t data = dispatch_data_create(cursor->ptr, cursor->len, NULL, DISPATCH_DATA_DESTRUCTOR_DEFAULT);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this fail? I don't think we should assume it can't. Unlike our stuff, the OS isn't going to intentionally crash if resource allocation fails.


if (error_code) {
nw_socket->last_error = error_code;
aws_raise_error(error_code);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not useful to raise error here in a callback

Copy link
Contributor

@bretambrose bretambrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked through channel bootstrap.

Would really like to see _new, _destroy functions for all the task-arg situations across the whole PR.

@@ -296,6 +305,9 @@ AWS_IO_API int aws_server_bootstrap_set_alpn_callback(
AWS_IO_API struct aws_socket *aws_server_bootstrap_new_socket_listener(
const struct aws_server_socket_channel_bootstrap_options *bootstrap_options);

AWS_IO_API struct aws_socket *aws_server_bootstrap_new_socket_listener_async(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just make all listener implementations use an async path?


struct socket_shutdown_setup_channel_args *close_args =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not repeat this and instead add _new and _destroy functions for socket_shutdown_setup_channel_args

@@ -607,14 +662,19 @@ static void s_on_client_connection_established(struct aws_socket *socket, int er
connection_args->channel_data.channel = aws_channel_new(connection_args->bootstrap->allocator, &args);

if (!connection_args->channel_data.channel) {
struct socket_shutdown_setup_channel_args *close_args =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use _new here too

aws_mem_release(shutdown_args->allocator, shutdown_args);
}

static void s_socket_shutdown_complete_setup_connection_args_no_release_fn(void *user_data) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it would be simpler just to add a bool controlling release in the args structure rather than duplicating a function

@@ -1402,12 +1606,20 @@ void s_on_server_connection_result(

error_cleanup:
/* no channel is created */
connection_args->incoming_callback(connection_args->bootstrap, aws_last_error(), NULL, connection_args->user_data);

(void)socket; // to avoid expression error after a label
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can just use a semicolon

struct aws_allocator *allocator = new_socket->allocator;

struct socket_shutdown_server_connection_result_args *close_args =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this pattern everywhere. Can we create some helper functions and use them instead?

size_t current_offset;
};

static void s_destroy_read_queue_node(struct read_queue_node *node) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial: naming convention on destroy functions. This should be renamed to s_read_queue_node_destroy(). Also, if we have a destroy function for read_queue_node we should probably set up its pair of s_read_queue_node_new() that is called when we are creating something that needs to be destroyed. It looks like there's only one place where this node is created but it feels right to have them both.

memcpy(socket->local_endpoint.address, hostname, to_copy);
socket->local_endpoint.port = port;
}
nw_release(local_endpoint);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local_endpoint doesn't appear to be released if there's no socket. We also appear to not use the hostname and port members if there's no socket. We could probably move hostname and port into this if block and move the nw_release of local_endpoint outside of it unless it's being taken care of elsewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants