# GenerativeMIL

**This repository is still in development!**

This repository is being developed as a complement to GroupAD.jl, where most of the experimental procedures are located. GenerativeMIL mainly provides advanced models for generative modeling of Multiple Instance Learning (MIL) data and set-structured data. The models are implemented in the most efficient way we could think of; there may well be better approaches.

## Model Zoo

| Implemented model | CPU training | GPU training | variable cardinality[^1] (in/out)[^2] | Note |
| --- | --- | --- | --- | --- |
| SetVAE | yes | yes | yes/yes | Implementation is a 1:1 port of the Python code from the original repository. |
| FoldingNet VAE | yes | yes[^3] | yes/no | batched training on CPU via broadcasting; GPU training only in a special case[^3] |
| PoolModel (ours) | yes | yes[^4] | yes/yes | TODO: masked forward pass for variable cardinality on GPU |
| SetTransformer | yes | yes | yes/no | classifier version only |
| Masked Autoencoder for Distribution Estimation (MADE) | yes | yes | possible[^5]/no | TODO: add support for multiple masks[^6] |
| Masked Autoregressive Flow (MAF) | ? | ? | | not finished |
| Inverse Autoregressive Flow (IAF) | ? | ? | | not finished |
| SoftPointFlow | ? | ? | yes/yes | not finished |
| SetVAEformer (ours) | yes | yes | yes/yes | not finished; similar to vanilla SetVAE but better ;) |

## DrWatson

This code base uses the Julia Language and DrWatson to make a reproducible scientific project named
> GenerativeMIL

To (locally) reproduce this project, do the following:

1. Download this code base. Notice that raw data are typically not included in the git history and may need to be downloaded independently.
2. Open a Julia console and do:
   ```julia
   julia> using Pkg
   julia> Pkg.add("DrWatson") # install globally, for using `quickactivate`
   julia> Pkg.activate("path/to/this/project")
   julia> Pkg.instantiate()
   ```

This will install all necessary packages for you to be able to run the scripts and everything should work out of the box, including correctly finding local paths.
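Once the project is instantiated, scripts inside it can activate the right environment with DrWatson's `@quickactivate`. A minimal sketch follows; the script path in the comment is illustrative, not a file shipped with this repository:

```julia
# scripts/example.jl -- illustrative; run from anywhere inside the project
using DrWatson
@quickactivate "GenerativeMIL"  # finds and activates the project environment

# DrWatson's path helpers now resolve relative to the project root:
@show projectdir()        # <path to the project>
@show datadir("exp_raw")  # <path to the project>/data/exp_raw
```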

[^1]: By cardinality we mean the number of elements in a single bag/set. For real-world data this number can vary from set to set, which makes naive batched training impossible. A model that provides a way around this problem is considered capable of handling "variable cardinality". Most models require modifications to achieve this, such as masking the inputs as well as the intermediate outputs; a minimal sketch of this idea follows after these footnotes.

[^2]: "In" variable cardinality means that the sets within an input batch may have different cardinalities; "out" variable cardinality means that the model can output a batch whose cardinalities differ from those of the input batch, i.e. it can sample an arbitrary number of elements for each set.

[^3]: FoldingNet VAE is trainable on GPU via the function `fit_gpu_ready!`. This is a special case with fixed cardinality and without the KLD of the reconstructed encoding.

[^4]: At this point the PoolModel works only for constant cardinality.

[^5]: Since there is no cardinality reduction or expansion.

[^6]: This model is essentially a building block for MAF, IAF and SoftPointFlow.
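To make the masking idea from footnote 1 concrete, here is a minimal, self-contained sketch in plain Julia. It is not part of the GenerativeMIL API, and the names `pad_sets` and `masked_mean` are illustrative only: sets with different cardinalities are zero-padded into one dense batch together with a boolean mask, so that padded positions can be excluded from intermediate computations such as mean pooling.

```julia
# Illustrative sketch -- not part of the GenerativeMIL API.
# Zero-pad a vector of (d × nᵢ) set matrices into one dense (d × n_max × batch)
# array, plus a boolean mask marking the valid (non-padded) positions.
function pad_sets(sets::Vector{Matrix{Float32}})
    d = size(first(sets), 1)
    n_max = maximum(size.(sets, 2))
    x = zeros(Float32, d, n_max, length(sets))
    mask = falses(1, n_max, length(sets))
    for (i, s) in enumerate(sets)
        n = size(s, 2)
        x[:, 1:n, i] .= s
        mask[1, 1:n, i] .= true
    end
    return x, mask
end

# Mean over the set dimension that ignores padded positions.
masked_mean(x, mask) = sum(x .* mask; dims=2) ./ sum(mask; dims=2)

# Usage: a batch of two sets with cardinalities 3 and 5.
sets = [rand(Float32, 4, 3), rand(Float32, 4, 5)]
x, mask = pad_sets(sets)   # x: 4×5×2, mask: 1×5×2
μ = masked_mean(x, mask)   # 4×1×2; each set is pooled over its own elements only
```

In a full model, the same mask is threaded through every intermediate layer so that padded elements never contribute to attention weights, pooling statistics, or the loss.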