Enhance testing experience #136

Merged
merged 6 commits into alfert:master from saner_tests on Dec 15, 2019

Conversation

@x4lldux (Contributor) commented Nov 29, 2019

Sorry for the big dump, but I didn't want to create a bunch of smaller but codependent PRs. Each commit message has some more description in it, so please look there as well.

The overall motive behind it is to make testing of PropCheck simpler and easier. As mentioned in #130, running tests is not straightforward:

  • some tests need to be disabled by default,
  • tests are very noisy with a lot of debug information being printed out,
  • some tests are flaky or even unstable,
  • they are quite slow.

Dispatching tests

I've disabled unstable and external tests by default and added two new aliases:

  • test_ext for running external tests,
  • tests for running both external and normal tests. CI uses this with the --cover and --trace arguments, like before. (A sketch of how such aliases can be wired follows below.)
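For illustration, a minimal sketch of how such aliases could be wired in `mix.exs`; the external-test script path here is a placeholder, not necessarily what this PR uses:

```elixir
defp aliases do
  [
    # run only the external, shell-driven tests
    test_ext: ["cmd bash test/run_external_tests.sh"],
    # run the regular suite first, then the external tests
    tests: ["test", "cmd bash test/run_external_tests.sh"]
  ]
end
```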

Noisy tests

Printing of debug information is now controlled by the DEBUG and VERBOSE environment variables. To do that, I've introduced a new case template, PropCheck.TestCase, which imports a few debug-printing functions and sets default options for each module. From now on, all tests need to use PropCheck.TestCase instead of ExUnit.Case (not counting external tests).
The only things still creating noise are the doctests, but changing that would either make them less comprehensible or require copying the doctests into their own separate tests.
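A minimal sketch of a case template along these lines, assuming a `debug/1` helper gated on the `DEBUG` variable (the exact helper names and option handling in the PR may differ):

```elixir
defmodule PropCheck.TestCase do
  use ExUnit.CaseTemplate

  # Pulled into every test module that does `use PropCheck.TestCase`.
  using do
    quote do
      use PropCheck
      import PropCheck.TestCase, only: [debug: 1]
    end
  end

  # Print only when debug output was requested via the environment.
  def debug(msg) do
    if System.get_env("DEBUG") in ["1", "TRUE"], do: IO.puts(msg)
    msg
  end
end
```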

Flaky tests

I've isolated 2 flaky tests and increased their number of tests or search steps to raise the probability that the generated examples contain the "correct" path (both tests check for at least one correct example). I kept increasing numtests/search_steps and then running each test 3000 times to check for failures, until I was no longer able to make them fail.

Slow tests

I've reduced numtests to 33 and search_steps to 333, and introduced NUMTESTS and SEARCH_STEPS environment variables that control those options (via PropCheck.TestCase). Tests with a hard-coded number of tests are now scaled proportionally. Additionally, most tests are now run asynchronously. This reduced the runtime of mix test from ~28 seconds (on my computer) to ~3.7 seconds. Running external and regular tests now takes around 12 seconds.
To offset the reduced numbers, CI builds have them increased to 200 and 1000 (for numtests and search_steps). CI also disables asynchronous tests by using --trace.
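A rough sketch of how these env-driven defaults might be read (the variable names are from this PR; the exact parsing is an assumption):

```elixir
# Defaults of 33 tests / 333 search steps unless overridden by the
# NUMTESTS / SEARCH_STEPS environment variables.
numtests = String.to_integer(System.get_env("NUMTESTS") || "33")
search_steps = String.to_integer(System.get_env("SEARCH_STEPS") || "333")
```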

@alfert (Owner) commented Dec 1, 2019

Wow, this is quite a mouthful. I took a quick glance at it and I really like most of it. Thank you! 💚

Besides the conflict with .travis.yml, the handling of the environment variables should be changed to follow the implementation started in #126. So please add a PROPCHECK_ prefix and check whether you can reuse (or refactor) the existing solution for handling configurations. Verbosity is similar to exception handling, so I suggest applying the same approach.

Please rebase onto the current state of master!

@alfert (Owner) commented Dec 2, 2019

Another thought: are those debug output functions intended to be used by regular PropCheck users? If yes, then we need to document them and explain how/why to use them. For the latter, I would suggest this approach:

  • use the new functions for any kind of output
  • if the test fails, do a rerun with DEBUG enabled, then the recorded counter-examples are run together with their output shown

However, this approach fails for measure, collect and the like, since they expect a working property. Here, DEBUG means checking whether the property is defined well; it is not about finding/identifying bugs in the system under test.

@alfert (Owner) commented Dec 2, 2019

And something else: I like the various test targets. Please document them in the README.md.

@x4lldux (Contributor, Author) commented Dec 2, 2019

So please add a PROPCHECK_ prefix

Will do!

handling of the environment variables should be changed and follow along the implementation started in #126

I deliberately chose to implement that based on #114, because it was just easier to do, and I also think we need to tackle configuration in a separate issue. I'd leave it as is and change it once we unify configuration. Right now there are 5 ways to pass options to PropEr from PropCheck - three of which are somewhat "global".

Are those debug output functions intended to be used by regular PropCheck users?

No. It's just for testing PropCheck itself. PropCheck.TestCase is also just for tests meant to test PropCheck itself.

Here, DEBUG means checking whether the property is defined well; it is not about finding/identifying bugs in the system under test.

Right, DEBUG-output is used to inspect the property, so in most cases it's not useful to have it on.

And something else: I like the various test targets. Please document them in the README.md.

Good point! I should have thought about that!

@alfert (Owner) commented Dec 2, 2019

PropCheck.TestCase is also just for tests meant to test PropCheck itself.

Ahh, interesting. I did not see this. We should state that explicitly in the docs. This also goes for the debug functions. The reason behind this is that the test cases also serve as examples of how to use PropCheck. Using functions that are not supposed to be used by ordinary users requires at least an explanation. Your writing above about the noisy tests is a good rationale that should find its place in the documentation of PropCheck.TestCase.

we need to tackle configuration in a separate issue

Valid point. I'll open a feature request for that.

@evnu (Collaborator) commented Dec 2, 2019

We should state that explicitly in the docs.

Maybe it would make sense to mark the module with @moduledoc false and state its purpose in a regular comment. It would then not be part of the documentation.
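Something like this sketch (module body abridged):

```elixir
defmodule PropCheck.TestCase do
  # Internal case template for testing PropCheck itself; the rationale
  # lives in this comment instead of in the generated docs.
  @moduledoc false

  use ExUnit.CaseTemplate
end
```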

@x4lldux (Contributor, Author) commented Dec 2, 2019

It should then not be part of the documentation.

ExDoc won't generate docs from the test/support path unless MIX_ENV is set to test. It's also marked as only: :dev in deps, so that wouldn't even work.
That said, I'd leave it as docs, but improve them and add a note to the README.

@alfert (Owner) commented Dec 3, 2019

I assume the conflict with .travis.yml disappears after rebasing.

@x4lldux (Contributor, Author) commented Dec 3, 2019

Ahh, interesting. I did not see this. We should state that explicitly in the docs. This also goes for the debug functions. The reason behind this is that the test cases also serve as examples of how to use PropCheck. Using functions that are not supposed to be used by ordinary users requires at least an explanation.

That's a very good point! We have to choose between new users' experience and our convenience. Maybe it would be better to pick the former, undo PropCheck.TestCase, and explicitly replace it in each test module with something like:

```elixir
use PropCheck, default_opts: &PropCheck.TestHelpers.config/0
import PropCheck.TestHelpers, except: [config: 0]
```

Here PropCheck.TestHelpers would contain those debug_* functions and the default config that currently lives in PropCheck.DefaultOpts. Maybe the need for default_opts: &PropCheck.TestHelpers.config/0 will disappear altogether once #138 is closed.


Or we could extract those examples into a separate examples/ directory, not run them during testing, and leave our tests to just check the library rather than explain the use cases.

@alfert (Owner) commented Dec 4, 2019

Or we could extract those examples into a separate examples/ directory, not run them during testing, and leave our tests to just check the library rather than explain the use cases.

At least we must run them regularly to ensure that they are still valid code. From that perspective they are still test code, just not optimised for our internal development processes.

Removing PropCheck.TestCase does not help, since we would still use the debug functions, which are not supposed to be used by (new) users. What about setting the verbosity more aggressively to false?

@x4lldux (Contributor, Author) commented Dec 4, 2019

The simplest solution would be to just comment out any collect/3, measure/3 and aggregate/3 calls (and keep only the debug/1 and scale_*/2,3 functions from TestHelpers). I don't think they are needed for day-to-day test runs. A test would then look roughly like the sketch below.
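For illustration, a quiet test using only the proposed `debug/1` helper; the property itself is invented for this sketch:

```elixir
property "reversing twice is the identity" do
  forall l <- list(integer()) do
    # printed only when debug output is enabled via the environment
    debug("input: #{inspect(l)}")
    Enum.reverse(Enum.reverse(l)) == l
  end
end
```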

@x4lldux force-pushed the saner_tests branch 2 times, most recently from e17fb88 to 4512fcf on December 10, 2019
Running tests in PropCheck isn't exactly easy. There are a few tagged
tests that are intended to fail and some which are unstable; additionally,
there are a few external tests that need to be run from a bash shell. So
simply doing `mix test` doesn't work and will just confuse users.

This commit simplifies that by introducing two new Mix aliases:
 * `test_ext`, which runs external tests,
 * `tests`, which runs both external and regular tests (limited only to
   reliable tests).
To accomplish that, this commit replaces most printing with the `debug/1`
function and comments out unnecessary printouts (like `collect/3` or
`aggregate/3`). Printing of debug information can now be controlled by
setting the `PROPCHECK_DEBUG` environment variable. Setting
`PROPCHECK_DEBUG` to `1` or `TRUE` will enable printing of debug
statements in the tests themselves.

In the CI environment, both variables are set to preserve the current
output state.
@x4lldux (Contributor, Author) commented Dec 10, 2019

Strange. All of my Travis builds were OK, while here all but one of the builds passed; the failing one crashed with an erl_crash.dump. Probably memory consumption got too big when creating a tree in that last test. @alfert can you rerun the build?

If this keeps happening, search_steps will need to be adjusted.

@alfert (Owner) commented Dec 11, 2019

Rerunning the branch on my notebook works smoothly (mix tests). But on Travis, it fails while allocating 3.9 GB of heap, which is quite a lot. Travis provides "only" 7.5 GB RAM on their machines, so we could really be running into a wall here.

I would suggest adjusting the search_steps.

The default number of tests is reduced to 33, and tests which had a
hard-coded number of tests are scaled based on their relation to PropEr's
default of 100. That is, if a test had a hardcoded `numtests: 10`, it's
now `scale_numtests(0.1)`, because 10 is 0.1 of 100. This keeps the
proportions of individual tests while globally controlling the number of
tests via the `PROPCHECK_NUMTESTS` environment variable.

The same was done for `search_steps`, which is controlled by the
`PROPCHECK_SEARCH_STEPS` environment variable and scaled by the
`scale_search_steps` function.

This is meant to reduce the time spent waiting for tests to complete
during day-to-day development. To compensate for the potentially missed
opportunities to generate failing examples, the number of tests is
increased on CI.
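A plausible shape for the `scale_numtests` helper described above (a sketch, not necessarily the PR's actual code):

```elixir
def scale_numtests(factor) do
  # Global default of 33, overridable via PROPCHECK_NUMTESTS.
  base = String.to_integer(System.get_env("PROPCHECK_NUMTESTS") || "33")
  max(1, round(factor * base))
end
```

With this, a test that previously hard-coded `numtests: 10` (0.1 of PropEr's default 100) becomes `numtests: scale_numtests(0.1)` and still scales with the global setting.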
Tests involving the PingPong and Movie servers cannot safely be run in
parallel, so only one in each group was chosen to run asynchronously.
Increasing the number of tests increases the chances of finding the
correct path.
@x4lldux (Contributor, Author) commented Dec 11, 2019

I've modified .travis.yml and explicitly set PROPCHECK_SEARCH_STEPS to 1000, which is the default value, and PROPCHECK_NUMTESTS to 200, twice the default. If that doesn't help, then these particular tests will have to be altered.

But the VM I'm using for development has "only" 5.9 GB RAM, half of it used by Firefox and Emacs, and I didn't have any issue like that. I think the probabilistic nature of these tests will generate enormous structures on "rare" occasions. We'll see how rare.

@alfert merged commit 4725c0f into alfert:master on Dec 15, 2019