Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crt1-command.c: always call wasi_proc_exit for _REENTRANT #367

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

yamt
Copy link
Contributor

@yamt yamt commented Dec 20, 2022

Note: a return from main() is an equivalent of exit(), which should terminate sibling threads.

yamt added 2 commits December 20, 2022 15:09
Note: a return from main() is an equivalent of exit(), which
should terminate sibling threads.
// Call atexit functions, destructors, stdio cleanup, etc.
__wasm_call_dtors();

#ifdef _REENTRANT
// `__wasi_proc_exit` terminates sibling threads.
__wasi_proc_exit(r);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the host runtime to take care of this, even we just return.. the host should terminate all threads from the outside, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't expect so because:

  • at wasi level, nothing wrong with making the main thread returns while keeping other threads running.
  • it's also consistent with what happens when other thread (wasi_thread_start) returns.
  • it's simpler to have less special-casing for main thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If "exit and keep threads running" is something that we would want at the wasi level why wouldn't we also want it at the libc level? Can we agree on what they both want and just return from main here?

Another way of putting it, assuming returning from main without __wasi_proc_exit is a useful thing to do, do we not want to allow wasi-libc users to do it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Also, in the future when we migrate off of instance-per-thread, threads running after main returns is potentially awkward, because a main function could be called by a caller that isn't expecting the instance to keep running after main returns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, threads running after main thread exits is actually quite normal as far as i know.
such applications achieve that by calling pthread_exit, though.

as far as C is our concern, maybe there is not much differences between either approaches.
however, IMO, there is little point to make the main thread special in this regard.

I agree. Also, in the future when we migrate off of instance-per-thread, threads running after main returns is potentially awkward, because a main function could be called by a caller that isn't expecting the instance to keep running after main returns.

it's normal for unix waitpid etc to wait for all threads exit, not only the main thread.
i guess embedder apis should follow the semantics.

Copy link
Member

@sbc100 sbc100 Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__wasi_proc_exit is implemented like a trap, with a stack unwind in some runtimes, and for small short-lived instances this is sometimes noticeable in performance. Letting _start return to its outside-world caller is faster.

That seems like an implementation choice isn't it? There isn't anything about proc_exit that requires a trace is there? Also this is something that happens exactly once per instance.. are you sure it makes a measurable difference to real programs? ...also if there is a cost is seems odd to make only non-zero returning programs pay that cost. If true, that kind of sounds like an argument for a __wasi_set_exit_code API ..

On Linux, there is no caller for _start to return to, so calling the _exit function is required for exiting the process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__wasi_proc_exit is implemented like a trap, with a stack unwind in some runtimes, and for small short-lived instances this is sometimes noticeable in performance. Letting _start return to its outside-world caller is faster.

That seems like an implementation choice isn't it? There isn't anything about proc_exit that requires a trace is there? Also this is something that happens exactly once per instance.. are you sure it makes a measurable difference to real programs? ...also if there is a cost is seems odd to make only non-zero returning programs pay that cost. If true, that kind of sounds like an argument for a __wasi_set_exit_code API ..

Actually a better solution might be to have _start return the exit code .. then we could simply delete this call.

On Linux, there is no caller for _start to return to, so calling the _exit function is required for exiting the process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__wasi_proc_exit is implemented like a trap, with a stack unwind in some runtimes, and for small short-lived instances this is sometimes noticeable in performance. Letting _start return to its outside-world caller is faster.

That seems like an implementation choice isn't it? There isn't anything about proc_exit that requires a trace is there? Also this is something that happens exactly once per instance.. are you sure it makes a measurable difference to real programs? ...also if there is a cost is seems odd to make only non-zero returning programs pay that cost. If true, that kind of sounds like an argument for a __wasi_set_exit_code API ..

_start returning is just a normal function return operation. But proc_exit is an unwind stack frames operation, which in some settings involves a fair amount of work. Wasm instances don't require starting up new OS processes; they can be very cheap.

Actually a better solution might be to have _start return the exit code .. then we could simply delete this call.

Yes, returning a value would have been a better way to do it. In the preview2 work we've already moved to having main have a return value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__wasi_proc_exit is implemented like a trap, with a stack unwind in some runtimes, and for small short-lived instances this is sometimes noticeable in performance. Letting _start return to its outside-world caller is faster.

That seems like an implementation choice isn't it? There isn't anything about proc_exit that requires a trace is there? Also this is something that happens exactly once per instance.. are you sure it makes a measurable difference to real programs? ...also if there is a cost is seems odd to make only non-zero returning programs pay that cost. If true, that kind of sounds like an argument for a __wasi_set_exit_code API ..

_start returning is just a normal function return operation. But proc_exit is an unwind stack frames operation, which in some settings involves a fair amount of work. Wasm instances don't require starting up new OS processes; they can be very cheap.

do you have a specific implementation in your mind?
for that implementation, rewinding a single small function (_start) is so heavy?
if so, i suppose it's a problem of that specific implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size of the function doesn't change anything. In implementations where proc_exit calls _Unwind_Resume it sometimes has to do a fair amount of work. You're right that that's a cost of such implementations. And we are working on moving to return values which will obviate this whole situation. So the question here is what we want the behavior of proc_exit and wasi_thread_exit to be; see my comment below.

@sbc100
Copy link
Member

sbc100 commented Dec 21, 2022

To summarize my thinking based on this discussion:

  1. Returning from _start is semantically the same as calling proc_exit(0)
  2. Under those circumstance its up to the runtime to tear down all threads (just like on posix).
  3. The only way (in posix) to have threads outlive the main thread is to call pthread_exit from the main thread (which we don't yet implement).

@yamt
Copy link
Contributor Author

yamt commented Dec 21, 2022

To summarize my thinking based on this discussion:

1. Returning from `_start` is semantically the same as calling `proc_exit(0)`

2. Under those circumstance its up to the runtime to tear down all threads (just like on posix).

3. The only way (in posix) to have threads outlive the main thread is to call `pthread_exit` from the main thread (which we don't yet implement).

i still think returning from _start should be the same as thread_exit.
with that approach, it might be possible to implement pthread_exit with other mechanisms (eg exception-handling) w/o wasi_thread_exit.

yamt added a commit to yamt/toywasm that referenced this pull request Dec 21, 2022
Note: for some reasons, the current logic in wasi-libc crt returns
without wasi_proc_exit when the exit code is 0.  maybe just to save
a host call?

cf. WebAssembly/wasi-libc#367
@sbc100
Copy link
Member

sbc100 commented Dec 21, 2022

i still think returning from _start should be the same as thread_exit.

But wouldn't that make it hard to port existing programs that assume posix semantics (returning from _start is the same as exiting the entire process).

If we are looking for way to achieve "exit while keeping background threads alive" why would we choose "return from start function" to implicitly mean that? Especially when existing program already assume that not to be true? Why not be explicit about that with a new API or an argument to new argument to the existing wasi_proc_exit?

I think this is another reason we need to add wasi_thread_exit: WebAssembly/wasi-threads#7. Until we do that, we don't have a good to signal "exit with other threads staying alive" from the main thread.

@yamt
Copy link
Contributor Author

yamt commented Dec 21, 2022

i still think returning from _start should be the same as thread_exit.

But wouldn't that make it hard to port existing programs that assume posix semantics (returning from _start is the same as exiting the entire process).

C programs don't care about "returning from _start".
what they care is "returning from main", which is this PR is about.

If we are looking for way to achieve "exit while keeping background threads alive" why would we choose "return from start function" to implicitly mean that? Especially when existing program already assume that not to be true? Why not be explicit about that with a new API or an argument to new argument to the existing wasi_proc_exit?

I think this is another reason we need to add wasi_thread_exit: WebAssembly/wasi-threads#7. Until we do that, we don't have a good to signal "exit with other threads staying alive" from the main thread.

i feel the opposite. actually, "terminate other threads" is a new concept which wasi-threads is going to introduce. why not be explicit about that?

@sunfishcode
Copy link
Member

If I understand correctly, one of the motivations for this is so that if we say that exiting from _start is the same as wasi_thread_exit, then in the future we can more easily replace wasi_thread_exit with an unwind, once we have unwinding.

However, elsewhere, other people have been contemplating replacing proc_exit with an unwind, once we have unwinding. And that wants exiting from _start to be the same as proc_exit.

Both of these would eliminate an API function and simplify the system. But we can't easily do both. What should we do?

@sbc100
Copy link
Member

sbc100 commented Dec 22, 2022

C programs don't care about "returning from _start".
what they care is "returning from main", which is this PR is about.

Ah, I think I see now. This PR mean that returning from main always results in proc_exit (at least in the multi-threaded case) which then means we can reserve returning from _start to potentially signify something else.

In that case, I withdraw my previous objection.

@yamt
Copy link
Contributor Author

yamt commented Dec 22, 2022

If I understand correctly, one of the motivations for this is so that if we say that exiting from _start is the same as wasi_thread_exit, then in the future we can more easily replace wasi_thread_exit with an unwind, once we have unwinding.

However, elsewhere, other people have been contemplating replacing proc_exit with an unwind, once we have unwinding. And that wants exiting from _start to be the same as proc_exit.

who is "other people" and where is it?

Both of these would eliminate an API function and simplify the system. But we can't easily do both. What should we do?

theoretically we can do the both if we want as a wasm function can return multiple return values.

@sunfishcode
Copy link
Member

If I understand correctly, one of the motivations for this is so that if we say that exiting from _start is the same as wasi_thread_exit, then in the future we can more easily replace wasi_thread_exit with an unwind, once we have unwinding.
However, elsewhere, other people have been contemplating replacing proc_exit with an unwind, once we have unwinding. And that wants exiting from _start to be the same as proc_exit.

who is "other people" and where is it?

I don't have links handy, but I think the motivation is straightforward: proc_exit is a special and non-virtualizable function. The platform would be cleaner if we didn't need it, and with unwinding support, we shouldn't need it. And it only supports i32 exit codes; as we add support for Typed Main, we'll want programs to be able to return other types of data too. And as we do more with linking multiple modules together, explicit unwinding clarifies the scope of the exit—_start would catch the exception, making it clear that the caller of _start isn't intended to be terminated.

Both of these would eliminate an API function and simplify the system. But we can't easily do both. What should we do?

theoretically we can do the both if we want as a wasm function can return multiple return values.

Are you suggesting having a return value that tells the caller of main whether to wait for all threads or kill all the threads?

@yamt
Copy link
Contributor Author

yamt commented Dec 23, 2022

If I understand correctly, one of the motivations for this is so that if we say that exiting from _start is the same as wasi_thread_exit, then in the future we can more easily replace wasi_thread_exit with an unwind, once we have unwinding.
However, elsewhere, other people have been contemplating replacing proc_exit with an unwind, once we have unwinding. And that wants exiting from _start to be the same as proc_exit.

who is "other people" and where is it?

I don't have links handy, but I think the motivation is straightforward: proc_exit is a special and non-virtualizable function. The platform would be cleaner if we didn't need it, and with unwinding support, we shouldn't need it. And it only supports i32 exit codes; as we add support for Typed Main, we'll want programs to be able to return other types of data too. And as we do more with linking multiple modules together, explicit unwinding clarifies the scope of the exit—_start would catch the exception, making it clear that the caller of _start isn't intended to be terminated.

ok.

the most of motivation (except Typed main one) seems to apply to thread_exit as well.

the "terminate other threads" part of proc_exit can't be implemented with unwind alone, right?

i'm not sure how the "Typed Main non-i32 return value" part can work when a non-main thread does proc_exit. do you have an idea?

Both of these would eliminate an API function and simplify the system. But we can't easily do both. What should we do?

theoretically we can do the both if we want as a wasm function can return multiple return values.

Are you suggesting having a return value that tells the caller of main whether to wait for all threads or kill all the threads?

sort of.
eg. it can return two values like: (is_exit, exit_value)

yamt added a commit to yamt/toywasm that referenced this pull request Jan 7, 2023
@abrown
Copy link
Collaborator

abrown commented Jul 11, 2023

Let's figure this out in the "thread exit" issue (WebAssembly/wasi-threads#7) and then come back to this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants