Automatic testing is fundamental for keeping a collaboratively developed project from endless bugs corrupting modules that previously worked. For a deep learning library, always running the whole training or analysis process end to end consumes a lot of time and computational resources, and minor bugs may not even be triggered under a fixed training setting. Thus, it's necessary to test at different levels to ensure proper functioning as much as possible.
I propose adding the following four categories of testing (a rough pytest sketch covering them follows the list):
Unit testing: Testing whether every innermost method works correctly with mock data, e.g. a single forward pass through a minimal SAE, or a single step of activation generation. Unit tests should cover almost all parts of the library, so every single test must run fast.
Integration testing: Testing whether low-level modules work together properly, e.g. getting feature activations directly from text input (which requires the transformer and the SAE to cooperate), running a single training pass, and loading pretrained SAEs from HuggingFace. These tests should cover the common usage of the library at a fairly high level, keep an acceptable time cost (ideally no more than a few seconds each), and avoid depending on GPUs where possible.
Acceptance testing: Testing whether modules meet performance expectations (loss, memory allocated, time cost), e.g. whether a pretrained SAE achieves a reasonable loss. Some of these tests may require GPUs to run, and their failure may be acceptable in some situations.
Benchmarks: Measuring the time cost of a complete run and of known bottleneck modules.
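To make the levels concrete, here is a minimal pytest sketch written against a toy SAE rather than the library's real API; the `TinySAE` class, dimensions, loss threshold, and the use of the pytest-benchmark plugin's `benchmark` fixture are all illustrative assumptions, not the actual test suite.

```python
# Illustrative only: a toy SAE standing in for the library's real modules.
# Assumes pytest, torch, and (for the last test) pytest-benchmark are installed.
import pytest
import torch
import torch.nn as nn


class TinySAE(nn.Module):
    """Hypothetical stand-in for the library's SAE, kept small so tests run fast."""

    def __init__(self, d_model: int = 16, d_sae: int = 64):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_sae)
        self.decoder = nn.Linear(d_sae, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(torch.relu(self.encoder(x)))


# Unit test: one forward pass on mock activations, CPU only, runs in milliseconds.
def test_sae_forward_shape():
    sae = TinySAE(d_model=16, d_sae=64)
    mock_activations = torch.randn(8, 16)
    out = sae(mock_activations)
    assert out.shape == mock_activations.shape
    assert torch.isfinite(out).all()


# Acceptance test: needs a GPU (and, in reality, a pretrained checkpoint);
# it is skipped automatically on machines without CUDA.
@pytest.mark.skipif(not torch.cuda.is_available(), reason="requires a GPU")
def test_pretrained_sae_reconstruction_loss():
    sae = TinySAE().cuda()  # placeholder for loading a real pretrained SAE
    x = torch.randn(256, 16, device="cuda")
    loss = torch.nn.functional.mse_loss(sae(x), x)
    assert loss.item() < 10.0  # threshold is illustrative, not a real target


# Benchmark: relies on the pytest-benchmark plugin's `benchmark` fixture,
# which repeatedly calls the function and records timing statistics.
def test_sae_forward_benchmark(benchmark):
    sae = TinySAE(d_model=768, d_sae=768 * 8)
    x = torch.randn(64, 768)
    benchmark(sae, x)
```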
Continuous Integration (CI) with GitHub workflows should also be added to run the tests on every push/PR. PRs should not be merged unless all tests pass.
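A workflow along these lines could work; the branch names, Python version, and the `dev` extra in the install command are assumptions and would need to match the repository's actual setup.

```yaml
# .github/workflows/tests.yml -- illustrative sketch only.
name: tests
on:
  push:
    branches: [main, dev]
  pull_request:
    branches: [main, dev]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Assumes the package declares a "dev" extra containing pytest etc.
      - run: pip install -e ".[dev]"
      # GPU-only acceptance tests skip themselves on CPU-only runners.
      - run: pytest tests
```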
Tracking: @Frankstein73 has added CI workflows that automatically run tests on pushes and PRs to the main/dev branches. Feel free to complete the missing tests for all modules!