
Refactor tests #5

Closed
Technologicat opened this issue Jul 24, 2019 · 14 comments
@Technologicat

Tests need some refactoring:

  • Tease apart the integration tests, currently mixed up wantonly with the unit tests.
    • This applies to both unpythonic/test and unpythonic/syntax/test.
  • To make it easier to locate a regression, make all unit tests run first and the integration tests after them, so that the first error reported most likely points to where the bug is.
  • Run the integration tests in a specific order. This could act as further documentation on dependencies in the architecture (what builds on what).
  • Switch to a proper test framework such as pytest. The current runtests.py is a hack.
  • Measure test coverage (at least branch coverage, if not path coverage). Extend tests as required.

(When tackling this, opening sub-issues for each individual point may be a good strategy.)

@Technologicat added the enhancement label on Jul 24, 2019
@Technologicat added this to the 0.14.3 milestone on Aug 8, 2019
@Technologicat

The stdlib's unittest is lightweight and probably enough for the purposes of unpythonic. Being in the stdlib is an advantage: it introduces no new dependencies.
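
For scale, a minimal unittest-based test module looks like this (an illustrative sketch only; the test case here is made up):

```python
import unittest

class TestArithmetic(unittest.TestCase):  # hypothetical example test case
    def test_addition(self):
        self.assertEqual(2 + 2, 4)

if __name__ == "__main__":
    unittest.main()
```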

@Technologicat

unittest's test discovery both inspects the directory structure and uses getattr to retrieve submodules from the imported objects. Hence it won't work with unpythonic until we fix #44.
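
For reference, a sketch of the discovery mechanism in question (the package name is real, but the filename pattern is an assumption):

```python
import unittest

# discover() walks the directory tree, imports each matching module,
# and uses getattr on the imported objects to collect TestCase classes.
suite = unittest.defaultTestLoader.discover("unpythonic", pattern="test_*.py")
unittest.TextTestRunner(verbosity=2).run(suite)
```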

@Technologicat commented Nov 22, 2019

Hmm, maybe pytest? It has a very concise syntax, and features to support even complex setups. (Though... macros?)

While at it, we could convert the code examples to doctest format, and pytest would then automatically check that the output matches what the docstring claims. Automatic checking helps keep the docstring examples up to date.
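
For example, a docstring example in doctest format looks like this (the function is made up); pytest's --doctest-modules option would then verify it on each run:

```python
def double(x):
    """Double x.

    >>> double(21)
    42
    """
    return 2 * x

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # stdlib runner; pytest --doctest-modules also picks this up
```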

@Technologicat

pytest and pytest-cov are great, let's use them. Installing and using unpythonic won't need pytest; only developing unpythonic will.

Some thought is required in setting this up, because macro test modules must run under the macropy3 wrapper.
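
One possible arrangement (a sketch; the module and function names below are hypothetical) is a plain-Python entry point that installs MacroPy's import hook before importing any macro-using test module:

```python
# run_macro_tests.py -- hypothetical entry point
import macropy.activate  # noqa: F401  # installs MacroPy's import hook

# Modules imported from now on get their macros expanded on import.
from unpythonic.syntax.test import test_curry  # hypothetical test module
test_curry.runtests()  # hypothetical per-module test entry point
```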

Coverage measurement may also help discover more bugs (are the tests actually running the specific lines or code paths that they claim to?).

@Technologicat modified the milestones: 0.14.x, 0.14.3 on Aug 8, 2020
@Technologicat

Let's try to at least factor the tests properly in 0.14.3.

@Technologicat self-assigned this on Aug 8, 2020
@Technologicat

Hmm. This isn't simple, because:

  • unpythonic being a language extension, parts of it are written in itself. Especially dyn and box get a lot of use in various parts of the library. Very few of the test modules import only from a single unpythonic module. So most of the time, we don't really have a layer of independent units that we could neatly test before moving on to integration.
  • Language features interact. If we were to split, say, the integration tests concerning the interaction of curry with other features into a separate module, that would just chop the tests into unreadable ravioli. The important point is that curry is the secondary feature being integrated with, while the other feature (the one interacting with curry) is the primary focus of the user's interest. Hence the most logical place to host those integration tests is with the unit tests of the primary features.
  • Our macro layer needs the MacroPy macro expander, which is implemented as an import hook. So we can't admit a testing tool that brings its own import hook, because that would disable the macro expander. The otherwise excellent pytest is thus out.

So while making regressions maximally pinpointable (so that the test most likely to pinpoint the error is the first to alert) is a worthy goal, maybe our primary focus should be elsewhere for now - getting the whole testset to run even when some tests fail. The assert syntax for test cases is nice, but without something like pytest to rewire what it means, it doesn't really achieve that.

To fix that, in the upcoming 0.14.3, we have added a test[] expr macro that behaves like assert, but with the superpower of on-error-continue. This superpower is made possible by the condition system added in 0.14.2.

The reason we provide a new expr macro, instead of rewiring assert, is that MacroPy macros only come in expr, block and decorator variants; we can't rewire any type of AST node willy-nilly. But this is arguably a good thing - let the word assert mean assert, and let the word test mean test. Not all asserts are tests; there are situations where we actually want an assert failure to abort the program.

Of course, the only reason to use assert, or the test[] macro for that matter, instead of just a simple error-reporting function, is that as syntax-level constructs, they can automatically capture the offending source code snippet (or an approximation thereof) when an assertion goes south. Usability-wise, that's a very nice feature.
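
Concretely, usage looks roughly like this (a sketch; the import follows MacroPy's from-module-import-macros convention that unpythonic's other macros already use):

```python
from unpythonic.syntax import macros, test  # noqa: F401

test[2 + 2 == 4]  # passes silently
test[2 + 2 == 5]  # fails, but is reported and the run continues
```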

Where this leads to is that right now, I'm leaning toward rewriting unpythonic's automated tests in terms of the test[] macro, and requiring MacroPy for development of unpythonic. (For using unpythonic, it will remain a strictly optional dependency at least for the foreseeable future, likely indefinitely.)

@Technologicat commented Aug 12, 2020

Now we also have unpythonic.test.fixtures, providing start, with testset, and summary, leaning on test[] for defining the individual test assertions.
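
Roughly, the intended usage is as follows (a hypothetical sketch based on the names just mentioned; exact signatures may differ, and the API may still change):

```python
from unpythonic.syntax import macros, test  # noqa: F401
from unpythonic.test.fixtures import start, testset, summary

start()  # begin the test session
with testset("basic arithmetic"):  # group related test assertions
    test[2 + 2 == 4]
    test[2 * 2 == 5]  # recorded as a failure; execution continues
summary()  # print the pass/fail/error totals
```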

A language being written in itself leads to the thought that language can be a strange loop, so perhaps we shouldn't expect modularity from a project of this nature. Words are often defined in terms of other words. Similarly, in a programming language extension, there may be a few instances where construct A needs to account for construct B, and construct B needs to account for construct A, even if the user-facing sides of these constructs seem orthogonal. So what each of them means (operationally) changes whenever the other one is present.

Or from another angle, the whole library is in a state of mutual equilibrium, its constructs having co-evolved in presence of each other. As long as enough "meaning" (operationally) is in place as each definition runs, it "bottoms out" so the code can run.

Maybe this is the kind of thing that naturally comes to mind to anyone familiar with Hofstadter's classic book; or the one by Gell-Mann (I seem to recall he called this kind of situation a Hartree-Fock equilibrium).

Maybe it's just a lame excuse for not organizing tests properly. ;)

But now we at least have a rudimentary testing framework in place for macro-enabled Python code, so we can first make this thing produce proper test reports, before worrying about modularizing the tests further.

@Technologicat

We should also investigate if we can use coverage.py to measure coverage here. There's a relevant issue in the tracker concerning coverage measurement of macro-enabled Python code.
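
For reference, coverage.py can be driven from the command line or programmatically; a sketch of the latter, wrapping a hypothetical test entry point:

```python
import coverage

cov = coverage.Coverage(branch=True)  # branch coverage, as per this issue's goals
cov.start()
import runtests  # noqa: F401  # hypothetical: importing it runs the test suite
cov.stop()
cov.save()
cov.report(show_missing=True)  # per-file coverage, with missing line numbers
```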

@Technologicat

The test framework is pretty much complete for now; now just to rewrite the tests in terms of it...

@Technologicat

After much updating, as of ca964e2, the tests for regular code now use unpythonic.test.fixtures. Next, macro tests...

@Technologicat

As of a409d6f, we're finally done!

All tests of unpythonic now use unpythonic.test.fixtures as the testing framework. While at it, a couple of small bugs were found and fixed (documented in CHANGELOG.md).

Now MacroPy is required for development of unpythonic. For using unpythonic, it remains strictly optional.

The test result on CPython 3.6.9 is:

Testset 'top level' END: Pass 1588, Fail 0, Error 2, Total 1590 (99% pass)

And on PyPy3 7.3.0:

Testset 'top level' END: Pass 1587, Fail 0, Error 3, Total 1590 (99% pass)

In both cases, two of the errors are due to the macro expander not liking bytes literals, see azazel75/macropy#26. The third error in the case of PyPy3 is just the automated tests reporting that PyPy3 doesn't support async_raise (because I simply don't want that to fail silently).

Remaining:

  • Set up a coverage measurement system. Maybe coverage.py?
  • Get horrified. Coverage is likely lacking at places.
  • Start planning which things to fix and in which version.
  • Fix MacroPy crashes on complex and bytes azazel75/macropy#26 to get the tests of unpythonic.net working again. At that point, also enable the commented-out lines in the tests of unpythonic.typecheck, which are disabled for the same reason.

@Technologicat commented Aug 19, 2020

Yes, coverage.py.

It's not bad for use with macro-enabled code, but of course, with block macros, which lines get reported as covered depends on which line numbers the macro implementation injects into the resulting AST, and/or how MacroPy fills in the missing ones.
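
To illustrate the mechanism (a stdlib-only sketch; MacroPy's expander does essentially the same thing when it fills in source locations):

```python
import ast

tree = ast.parse("x = 1\ny = 2\n")
synthesized = ast.parse("z = x + y").body[0]  # a node a macro might inject
ast.copy_location(synthesized, tree.body[0])  # it now claims to be line 1
tree.body.append(synthesized)
ast.fix_missing_locations(tree)               # fill in any remaining locations
# Running this module under coverage.py would attribute the synthesized
# statement to line 1, even though no such statement exists in the source.
exec(compile(tree, filename="<example>", mode="exec"))
```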

The coverage is actually surprisingly high, at a total of 84%, and that's including the false misses due to macro expansion. A detailed investigation will take some time.

@Technologicat

Ok, now we have a CI setup that runs the test suite on Pythons 3.4 through 3.7 and PyPy3, and also runs coverage.py and uploads the result to codecov.io. This was pieced together from GitHub's own Python package workflow template, plus suggestions from this blog article, and this workflow file.

Turns out there were a couple of bugs that prevented some features from working on Pythons other than 3.6 and PyPy3; these have been fixed. The tests now pass on 3.4, 3.5, 3.6, 3.7 and on PyPy3.

The main remaining issue (besides figuring out a technically reasonable, yet highly readable organization for the tests, for which I don't see a straightforward strategy) is the final 15% of missing code coverage.

But that's already getting pretty far from "refactor tests". I think that part is done for now.

I'm closing this issue, and opening a new one to track the coverage, to be tackled later, perhaps already during the 0.14.x series. Quite a lot of changes have already accumulated since the recently released 0.14.2, so the plan is to get 0.14.3 out soon-ish - no year-long mega-release this time.

Python 3.8 support is planned for the 0.14.x series. Whether it will be included in 0.14.3 or not depends on how much regular code there is to fix once I get the macros working and can flip the switch to start CI'ing with 3.8, too.

@Technologicat commented Apr 30, 2021

Python 3.8 and 3.9 support has arrived in unpythonic version 0.15.0.

Macro expander changed to mcpyrate. It supports Python 3.6+ (so also 3.8 and later), reports correct coverage, and works with bytes literals.
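
Bootstrapping works analogously to MacroPy (a sketch; the imported module below is hypothetical):

```python
import mcpyrate.activate  # noqa: F401  # installs mcpyrate's import hook

import my_macro_using_module  # hypothetical; its macros expand at import time
```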
