Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document how to write and run tests #373

Draft
wants to merge 8 commits into
base: main
Choose a base branch
from
Draft
Changes from 5 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
132 changes: 132 additions & 0 deletions docs/Testing.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,132 @@
# Testing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the sources for test should be in a separate directory, that way a test source can be in any format, include multi-file or even nested hierarchy or files. Specifically I see two advantages to this:

  1. The test binary directory (the one the test runner processes) only contains files with a fixed set of known suffixes.
  2. The test source directory structure can be language-specfic and take what ever form it likes. Perhaps some source languages like to have each program in its directory, or each program requires several files?

Copy link
Contributor Author

@caspervonb caspervonb Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably we'll gather all the tests in a webassembly/wasi-testsuite repository ala webassembly/testsuite that can be pulled in as submodule or subtree containing all the test fixtures you'll need to bring your own runner in an existing test setup.

Having a single self contained source file here makes it really easy to reference exactly what is failing; an implementation's test runner can even parse that programatically from a trace and output that.

That said; some of the webassembly proposal repositories do have a test/meta directory which contains the files used to generate tests.

Generating parametric tests this way is definitively an option but I'd still like the generated sources to be available in the main test directory next to the binary for reference without having to "ship" the entire meta directory.

Copy link
Contributor Author

@caspervonb caspervonb Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test binary directory (the one the test runner processes) only contains files with a fixed set of known suffixes.

I don't think we can promise this in general.

As proposals emerge the tests may have to be expanded upon with new meta files; additional permissions for sockets/networking comes to mind for example.

This initials set is just enough to support the filesystem and command proposals.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional comment to this: though wasi-nn is rather exotic compared to, say, wasi-filesystem, I would like to consider what it would take to write WASI tests. With the current spec, the ML model and the input tensors would likely be shipped as separate files since they are rather large. Does that fit into this model?

Additionally, the output tensor(s) will likely not be exact matches but instead need some type of fuzzy assertion. Not sure how we will handle that...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One additional comment to this: though wasi-nn is rather exotic compared to, say, wasi-filesystem, I would like to consider what it would take to write WASI tests. With the current spec, the ML model and the input tensors would likely be shipped as separate files since they are rather large. Does that fit into this model?

It's open ended and additive so it can be extended to encompass whatever we need but I'm not sure what wasi-nn needs, not that familiar with the proposal but seems it just loads models from memory?

If so then a data section seems more appropriate.

Additionally, the output tensor(s) will likely not be exact matches but instead need some type of fuzzy assertion. Not sure how we will handle that...

Depends on the previous question but probably just define some fuzzy asserters in the source language.
Internal logic should be handled internally IMO.

Really a case of you tell me tho; I'm not familiar enough with that proposal at this time to make any recommendations.

Copy link
Contributor

@abrown abrown Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on the previous question but probably just define some fuzzy asserters in the source language.
Internal logic should be handled internally IMO.

Sounds good.

I'm not sure what wasi-nn needs, not that familiar with the proposal but seems it just loads models from memory?

That's right but, practically speaking, those models/weights/tensors are large and are easier to manage as separate files than static data sections in a compiled Wasm file. It's not an exact corollary, but in an example over in Wasmtime I load this data from files before using them in wasi-nn. If we could bundle additional files with the tests and were allowed to use these files in <basename>.arg (or even statically in the test, I guess), that would seem easiest. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason to embed any data the test needs to that it avoid the dependency on the filesystem API, making the test more precise and less fragile. It should be fairly straight forward to embed data in wasm, I think there are a few different ways you can do this if you are using an llvm-based toolchain.

Copy link
Contributor Author

@caspervonb caspervonb Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we could bundle additional files with the tests and were allowed to use these files in .arg (or even statically in the test, I guess), that would seem easiest. What do you think?

You could but this also introduces a lot of extra unrelated dependencies in your tests that can be avoided.

Different toolchains have different mechanisms for it but it can be quite trivial to embed the data. In Rust for example one could use the include_bytes macro.

let model_1 = include_bytes!("model_1.tensor")

Other toolchains have their ways too.

A test case takes the form of a binary \<basename\>.wasm file, next to that
that there will always be a \<basename\>.\<ext\> source file from which the
test was originally compiled from which can be used as a reference in the event
of an error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the simplicity of this, but I expect it'll be too limiting as we add more tests in more languages. It'll be convenient to build Rust source files with cargo, and other language source files with their respective build systems, and that'll often require additional support files, so it won't always be as simple as <basename>.<ext>.

What would you think of a convention where we have a sources directory, and in it we have source code and build scripts that one can run to re-generate the wasm files?

Copy link
Contributor Author

@caspervonb caspervonb Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think of a convention where we have a sources directory, and in it we have source code and build scripts that one can run to re-generate the wasm files?

Some of the WebAssembly proposals have a meta directory that contain wast generators; might want to keep with that convention?

Would still like to see the "final" source files next to the test binaries as part of the build for reference as more often than not assertion errors don't make much sense without looking at the source.

Slight duplication but copying it back out to the parent directory next to the binary as part of the build means the implementor running the tests doesn't have to care how a particular proposal structured the build system.

In-fact, this way the meta directory could just be omitted from an implementation's test-data folder entirely if they have no need for it.


Additionally, any of the following optional auxilary files and directories may
be present:
- `<basename>.arg`
- `<basename>.env`
- `<basename>.dir`
- `<basename>.stdin`
- `<basename>.stdout`
- `<basename>.stderr`
- `<basename>.status`

## Running tests

To run a test do the following steps to prepare for execution:

- Prepare inputs
- Given an `<input>.wasm` file; take the `<basename>` of said module.
caspervonb marked this conversation as resolved.
Show resolved Hide resolved
- If `<basename>.<arg>` exists; take the program arguments from said file.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a minor detail, but I'd find this more clear with ".args" instead of ".arg".

Also, we should document the whitespacing convention in the arguments file. Can we say that arguments are separated by whitespace, and '"' quotes may be used to enclose an argument containing whitespace?

- If `<basename>.<env>` exists; take the program environment from said file.
- If `<basename>.<dir>` exists; preopen the directory from said file.
- If `<basename>.<stdin>` exists; pipe said file into the program as stdin.
- Run program
- Collect results
- If `<basename>.<stdout>` exists; assert that the programs stdout matches
said file.
- If `<basename>.<stderr>` exists; assert that the programs stdout matches
said file.
- If `<basename>.<status>` exists; assert that the programs stdout matches
said file; otherwise assert that the program exited with status 0.
- Pass

For example:

```bash
# Usage: $1 <runtime> <path_to_binary.wasm>
# $1 wasmtime proc_exit-success.wasm
# $1 wasmer proc_exit-failure.wasm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know these are only examples, but in the interests of not wanting to exclude engines, we shouldn't mention either of these examples here. Would "Usage: $1 <runtime_cmd> <path_to_binary>.wasm" be clear enough, without the examples?

#!/usr/bin/env bash

runtime=$1
input=$2

prefix=${input%.*}
if [ -e "$prefix.stdin" ]; then
stdin="$prefix.stdin"
else
stdin="/dev/null"
fi

output="$(mktemp -d)"
stdout_actual="$output/stdout"
stderr_actual="$output/stderr"
status_actual="$output/status"

if [ -e "$prefix.arg" ]; then
arg=$(cat "$prefix.env")
else
arg=""
fi

if [ -e "$prefix.env" ]; then
env=$(sed -e 's/^/--env /' < "$prefix.env")
else
env=""
fi

if [ -e "$prefix.dir" ]; then
dir="--dir $prefix.dir"
else
dir=""
fi

status=0

"$runtime" $dir $input -- $arg \
caspervonb marked this conversation as resolved.
Show resolved Hide resolved
< "$stdin" \
> "$stdout_actual" \
2> "$stderr_actual" \
|| status=$?

echo $status > "$status_actual"

stdout_expected="$prefix.stdout"
if [ -e "$stdout_expected" ]; then
diff -u "$stderr_expected" "$stderr_actual"
fi

stderr_expected="$prefix.stderr"
if [ -e "$stderr_expected" ]; then
diff -u "$stdout_expected" "$stdout_actual"
fi

status_expected="$prefix.status"
if [ -e "$prefix.status" ]; then
diff -u "$status_expected" "$status_actual"
elif [ ! "$status" -eq "0" ]; then
cat $stderr_actual
exit 1
fi
```

## Writing tests

Any source language may be used to write tests as long as the compiler supports
the wasm32-unknown-unknown target and is well-known and freely available.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasm32-unknown-unknown is an llvm-ism so we probably don't want to mention that here. Perhaps:

"Any source language may be used to write tests as long as the compiler supports
WebAssembly output targeting the WASI API"

The part about "well-known and freely available" will probably want to be tightened up too.

Copy link
Contributor Author

@caspervonb caspervonb Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasm32-unknown-unknown is an llvm-ism so we probably don't want to mention that here. Perhaps:

Targeting wasm32-wasi is probably fine actually, just don't pull in libc.

I just want to make sure we point out that using wasm32-wasi's main is generally going to make for a bad test.

For example; trying to test proc_exit in main isn't great as it'll do all the bootstrapping that libc needs but isn't relevant in the context.

The part about "well-known and freely available" will probably want to be tightened up too.

Yeah I'll need to think about what that means.

I'm thinking:

  • We can't allow proprietary compilers.
  • We can't allow non-portable compilers.
  • We can't allow toy languages that won't be or aren't maintained.

I'll double back to this later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do tend to think we do want to allow tests to pull in libc. It does call proc_exit and some of the stdio routines needlessly pull in fdstat_get and perhaps other things, however bare WASI is really inconvenient to code to, so if we have to do everythin in bare WASI, it seems like we'd end with people writing fewer tests. It seems better to have more tests, even if some of them do call more functions than they strictly need to.

Copy link
Contributor Author

@caspervonb caspervonb Jan 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do tend to think we do want to allow tests to pull in libc. It does call proc_exit and some of the stdio routines needlessly pull in fdstat_get and perhaps other things, however bare WASI is really inconvenient to code to, so if we have to do everythin in bare WASI, it seems like we'd end with people writing fewer tests. It seems better to have more tests, even if some of them do call more functions than they strictly need to.

From experience it was a bit tedious to deal with as _main also calls into the prestat, args and env functions as-well (at-least at the time).

Simple to write tests against but early days of WASI in Deno basically couldn't be tested with my libc tests until everything was bootstrapped.

I think we'll just have to revisit this once the split is done and it's a bit clearer how that's going to work; but presumably we'd want to generate headers and such based on the the local witx files and not be dependent on the current state of say the wasi-libc implementation and headers or the wasi crate (which would also be a circular dependency).

As for wasi-libc itself we can test that in wasi-sdk with the same harness, it's already fairly similar we'd just have to collect them along with the tests from this repository into something like webassembly/wasi-testsuite to let implementers make use of them as-well.


Each source file must be accompanied by a pre-compiled binary version of said
source file.

Each source file must also start with a comment containing the complete command
that was used to compile the binary.

Tests must follow the naming convention of \<system_call\>[-\<variant\>].<ext>.

Tests should be scoped to the smallest unit possible.

In addition to the source and binary files, a test can have any of the following
auxilary files and directories:

- `<basename>.arg`
- `<basename>.env`
- `<basename>.dir`
- `<basename>.stdin`
- `<basename>.stdout`
- `<basename>.stderr`
- `<basename>.status`