Hi there! This is a reimplementation library for computer vision models, built with PyTorch and Einops. Our mission is to bridge papers and code with consistent and clear implementations.
- Consistent Structure

  Every model shares similar building objects:

  - `xxxBase`: A model's base. It checks common input arguments and stores important variables. Sometimes it also provides specific weight initialization methods or necessary tensor operations, such as patching and flattening images in `ViT`.
  - `xxxBackbone`: A model's backbone architecture. It includes every component needed to build the model except the classifier.
  - `xxxWithLinearClassifier`: `xxxBackbone` plus a projection head as its classifier. It only accepts an `xxxConfig` as its argument, similar to Hugging Face. It might provide variants for different objectives in the future.
  - `xxxConfig`: A configuration for all possible coefficients. It also provides the model specializations mentioned in the papers.
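As a rough illustration of how these four objects might fit together, here is a minimal sketch. Every name in it (`ToyConfig`, `ToyBase`, `ToyBackbone`, `ToyWithLinearClassifier`, `toy_tiny`, `patch_and_flat`) and every argument is hypothetical and is not taken from ComVEX itself. Composing the backbone inside the classifier wrapper is also an assumption. Please refer to each model's own README for the real API.

```python
from dataclasses import dataclass

import torch
import torch.nn.functional as F
from torch import nn


@dataclass
class ToyConfig:
    """Hypothetical config: holds every coefficient in one place."""
    image_size: int
    patch_size: int
    image_channel: int
    dim: int
    num_classes: int

    @classmethod
    def toy_tiny(cls, num_classes: int) -> "ToyConfig":
        # A named specialization, in the spirit of variants listed in papers.
        return cls(image_size=32, patch_size=8, image_channel=3,
                   dim=64, num_classes=num_classes)


class ToyBase(nn.Module):
    """Checks common input arguments and stores important variables."""

    def __init__(self, image_size: int, patch_size: int, image_channel: int) -> None:
        super().__init__()
        assert image_size % patch_size == 0, "image_size must be divisible by patch_size"
        self.patch_size = patch_size
        self.num_patches = (image_size // patch_size) ** 2
        self.patch_dim = image_channel * patch_size ** 2

    def patch_and_flat(self, x: torch.Tensor) -> torch.Tensor:
        # (b, c, H, W) -> (b, c * p * p, num_patches) -> (b, num_patches, patch_dim)
        patches = F.unfold(x, kernel_size=self.patch_size, stride=self.patch_size)
        return patches.transpose(1, 2)


class ToyBackbone(ToyBase):
    """Everything needed to build the model except the classifier."""

    def __init__(self, image_size: int, patch_size: int, image_channel: int, dim: int) -> None:
        super().__init__(image_size, patch_size, image_channel)
        self.proj = nn.Linear(self.patch_dim, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.patch_and_flat(x)
        return self.norm(self.proj(x))  # (b, num_patches, dim)


class ToyWithLinearClassifier(nn.Module):
    """Backbone plus a linear head; only accepts a config object."""

    def __init__(self, config: ToyConfig) -> None:
        super().__init__()
        self.backbone = ToyBackbone(config.image_size, config.patch_size,
                                    config.image_channel, config.dim)
        self.head = nn.Linear(config.dim, config.num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.backbone(x).mean(dim=1)  # average over patches
        return self.head(x)


config = ToyConfig.toy_tiny(num_classes=10)
model = ToyWithLinearClassifier(config)
logits = model(torch.randn(2, 3, 32, 32))
print(logits.shape)  # torch.Size([2, 10])
```

The point of the split is that argument checking lives in one place (the base), the paper's architecture lives in the backbone, and the user-facing entry point only needs a single config object.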
- Consistent Namings for papers and across papers

  To help researchers and developers understand the implementations as quickly as possible, we closely follow the names of model components from the official papers and stay consistent on common names across papers.

- Clear Tensor Operations

  We use Einops for almost all tensor operations to reveal the dimensions of tensors, which are usually hidden in the code, and to make our implementations self-explanatory.
- Clear Arguments

  To expose all possible arguments to users while remaining convenient, we categorize building objects into a hierarchy with 3 levels, listed from bottom to top:

  - Basic: The paper-proposed and essential objects that mostly inherit directly from `nn.Module` or other Basic objects, like `ViTBase`, `MultiheadAttention`, `SpatialGatingUnit`, etc.
  - Wrapper: Intermediate objects or wrappers that organize Basic ones, like `TransformerEncoderLayer`, `PerceiverBlock`, `MLPMixerLayer`, `xxxWithLinearClassifier`, etc.
  - Model: `xxxBackbone` and `xxxConfig`.

  Basic and Model objects are the ones crucial for paper-to-code mapping and model usage, so we require their arguments to be fully explicit to users (all arguments are listed in their `__init__` methods). For the sake of convenience, Wrapper objects can use `args` or `**kwargs` to pass down the necessary arguments. In terms of the number of required arguments, the overall model structure looks like an hourglass.
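The three levels can be sketched roughly as follows. This is only an illustration of the argument flow: plain Python classes stand in for `nn.Module` subclasses so the snippet runs without PyTorch, and all class names (`ToyAttention`, `ToyFeedForward`, `ToyEncoderLayer`, `ToyEncoderBackbone`) and arguments other than `ff_dropout` are hypothetical.

```python
class ToyAttention:
    """Basic level: every argument is spelled out in __init__."""

    def __init__(self, dim: int, heads: int, attention_dropout: float = 0.0) -> None:
        self.dim = dim
        self.heads = heads
        self.attention_dropout = attention_dropout


class ToyFeedForward:
    """Basic level: every argument is spelled out in __init__."""

    def __init__(self, dim: int, hidden_dim: int, ff_dropout: float = 0.0) -> None:
        self.dim = dim
        self.hidden_dim = hidden_dim
        self.ff_dropout = ff_dropout


class ToyEncoderLayer:
    """Wrapper level: accepts **kwargs and hands each Basic object what it needs."""

    def __init__(self, **kwargs) -> None:
        self.attention = ToyAttention(
            dim=kwargs["dim"],
            heads=kwargs["heads"],
            attention_dropout=kwargs.get("attention_dropout", 0.0),
        )
        self.ff = ToyFeedForward(
            dim=kwargs["dim"],
            hidden_dim=kwargs["hidden_dim"],
            ff_dropout=kwargs.get("ff_dropout", 0.0),
        )


class ToyEncoderBackbone:
    """Model level: explicit again, so users see every knob in one signature."""

    def __init__(
        self,
        dim: int,
        heads: int,
        hidden_dim: int,
        depth: int,
        attention_dropout: float = 0.0,
        ff_dropout: float = 0.0,
    ) -> None:
        self.layers = [
            ToyEncoderLayer(
                dim=dim,
                heads=heads,
                hidden_dim=hidden_dim,
                attention_dropout=attention_dropout,
                ff_dropout=ff_dropout,
            )
            for _ in range(depth)
        ]


backbone = ToyEncoderBackbone(dim=64, heads=4, hidden_dim=128, depth=2, ff_dropout=0.1)
print(len(backbone.layers))               # 2
print(backbone.layers[0].ff.ff_dropout)   # 0.1
```

The explicit signatures sit at the bottom and the top, and the narrow `**kwargs` waist sits in the middle, which is where the hourglass picture comes from.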
- Semantic Naming

  Except for some common names, like `x` for input tensors, `ff_dropout` for the dropout rate of feed-forward networks, and `act_func_name` for the name string of an activation function supported by PyTorch, all variables, helper functions, and objects should be named meaningfully.

- Detailed Model Information

  Every model has its own `README.md` that provides usage, argument-by-argument explanations, and all usable objects and specializations. Official implementations are listed as well when the official paper mentions any.
pip3 install comvex
Please check out the Usage section in each model's own `README.md`.
Please check out `CONTRIBUTING.md` for details.
- We are continuously implementing models. Please check them out under the `comvex` folder for more details and the `examples` folder for some demos.
- Pull requests are welcome!
- According to this issue, inheritance isn't supported in `torchscript`. Therefore, most of our implementations aren't scriptable. `trace`, however, doesn't seem to have this kind of issue, so we will use `trace` as our default and gradually update our code to make ComVEX a trace-supported library.
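For readers unfamiliar with tracing, the snippet below is a minimal sketch of tracing a module that uses inheritance with `torch.jit.trace`. The classes `ToyTraceBase` and `ToyTraceBlock` are hypothetical stand-ins and are not ComVEX models.

```python
import torch
from torch import nn


class ToyTraceBase(nn.Module):
    """Hypothetical base class that only stores a shared variable."""

    def __init__(self, dim: int) -> None:
        super().__init__()
        self.dim = dim


class ToyTraceBlock(ToyTraceBase):
    """Hypothetical subclass: a small residual feed-forward block."""

    def __init__(self, dim: int) -> None:
        super().__init__(dim)
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)


model = ToyTraceBlock(dim=16).eval()
example_input = torch.randn(4, 16)

# Tracing runs the module once on the example input and records the operations.
traced = torch.jit.trace(model, example_input)

with torch.no_grad():
    new_input = torch.randn(8, 16)
    print(torch.allclose(traced(new_input), model(new_input)))  # True
```

Keep in mind that tracing records only the operations executed for the example input, so data-dependent Python control flow inside `forward` is not captured.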