Skip to content
This repository has been archived by the owner on Oct 13, 2023. It is now read-only.

Command stdout being closed early #48

Closed
guybedford opened this issue Jan 5, 2023 · 5 comments
Closed

Command stdout being closed early #48

guybedford opened this issue Jan 5, 2023 · 5 comments

Comments

@guybedford
Copy link

guybedford commented Jan 5, 2023

I have a large binary (as opposed to a reduced test case) where the stdout descriptor is somehow being set to a Descriptor::Closed value. Initial fd_write calls are ok, but later fd_write calls to stdout then fail, but the failure is also being swallowed resulting in an ok completion and no indication of the failure.

In this scenario, there is no fd_close call, in fact the only other host call happening is monotonic clock calls. So I'm not sure exactly why the Descriptor::Closed value is appearing. I can see that the stdout descriptor is running through the Descriptor Drop though which might be the culprit. Perhaps due to the overwriting approach being done for commands, I'm not sure.

I'm going to test out a non-command case tomorrow and will report back if this works around the issue, but wanted to post up the bug for now at least. The second bug that the fd_write error is not being propagated also seems like an important one to fix, it would be more useful to always treat errors as errors of the command function itself I think.

@guybedford
Copy link
Author

I ran further tests on a non-command version today, and the same issues with printing applied, where stdout will just entirely stop working.

@dicej
Copy link
Collaborator

dicej commented Jan 6, 2023

This sounds similar to the memory corruption issue I encountered while working on #45 and fixed in 0732139. The culprit there was an errant ptr::copy call that was overflowing its input and output buffers by a factor of 256, clobbering a bunch of State, including State::ndescriptors and State::descriptors.

I don't think that's the issue you're running into here given that (A) the fix was merged into main before you opened this issue and (B) it only happened if you used readdir, but perhaps a similar kind of memory corruption lurks elsewhere.

@guybedford
Copy link
Author

With the help of @dicej I was finally able to narrow down the replication here (including the fixes in #51).

Here's a complete replication of the issue demonstrating the closed descriptor - https://github.com/guybedford/spidermonkey-component-test.

I agree with Joel's assessment that this is likely another type of memory corruption issue.

@alexcrichton
Copy link
Member

I helped debug this a bit last friday and the issue was tracked down to WebAssembly/wasi-libc#377 more-or-less. The problem is that just after the main module is instantiated, but before it has anything called on it, two pages are added to memory. One is for the adapter module's stack and one is for the adapter module's State structure. These two pages are then erroneously considered part of the main module's heap, meaning the main module can malloc inside of it and corrupt state.

This wasn't ever hit prior since it requires having >64k heaps and I don't think any of the test programs do that, but Spidermonkey would indeed do that.

@guybedford
Copy link
Author

I can confirm that patching the issue described in WebAssembly/wasi-libc#377 resolves the corruption here. Thanks for the debugging help @dicej @alexcrichton.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants