To train a model already defined in `torchdeepretina/models.py`, you can simply run the following line at the terminal from the `training_scripts` folder:

```
$ python3 main.py hyperparams.json hyperranges.json
```

The `hyperparams.json` should contain all of the desired user settings for the training. The `hyperranges.json` should contain the values to be searched over for each desired user setting key.
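For concreteness, a minimal `hyperparams.json` might look like the sketch below; every value here is a placeholder for your own setup, and the recognized keys are documented in the list that follows:

```json
{
    "exp_name": "myexperiment",
    "model_type": "VaryModel",
    "dataset": "your_dataset_folder",
    "stim_type": "your_stim_file",
    "batch_size": 512,
    "n_epochs": 50,
    "lr": 0.001,
    "lossfxn": "PoissonNLLLoss"
}
```

A `hyperranges.json` that searches over learning rates could then be as simple as:

```json
{
    "lr": [0.01, 0.001, 0.0001]
}
```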
- `exp_name`: str
    - The name of the main experiment. Model folders will be saved within a folder of this name. WARNING: do not use names that include an integer directly after an underscore, i.e. do not use names like `myexp_1`, `my_exp_1`, or `my_1_exp`. The first integer following an underscore is assumed to be the `exp_num` in some portions of the code.
- `save_root`: str (optional)
    - Path to the folder in which you would like to save your experiment data.
- `exp_num_offset`: int
    - This value is added to the determined experiment number, which allows trainings to be performed on different machines without `exp_num` conflicts.
- `n_repeats`: int
    - The number of times to repeat any given hyperparameter set.
- `cross_val`: bool
    - If true, the model will be cross validated with `n_cv_folds` train-val splits.
- `cross_val_idx`: int or null
    - If `cross_val` is true, this parameter has no effect and will be cycled through during training. If `cross_val` is false, this specifies the train-val split to use during training.
- `n_cv_folds`: int
    - The number of cross validation folds to use.
- `save_every_epoch`: bool
    - Determines whether the model `state_dict` should be saved for every epoch or only the most recent epoch.
- `seed`: int, None, or "exp_num"
    - The random seed for both pytorch and numpy. If None is argued, the current value of `time.time()` is used. If the string "exp_num" is argued, the seed takes on the value of the assigned experiment number, which makes comparative searches a little bit easier. See the seeding sketch after this list.
- `startpt`: str or None
    - The file path to the desired starting checkpoint, if any. The model state dict will be loaded from the argued checkpoint, along with the optimizer state dict and the zero dict with the appropriate value for `zero_bias`.
- `model_type`: str
    - The string name of the main model class to be used for training. Options are each of the classes defined in `models.py`.
- `n_layers`: int
    - The number of layers to be used. Only applies if using the "VaryModel" model type. Must be greater than or equal to one.
- `bnorm`: bool
    - If true, the model uses batch normalization where possible.
- `bnorm_d`: int
    - Determines whether the model uses 2d or 1d batchnorm. The only options are 1 and 2.
- `bias`: bool
    - If true, the model uses trainable bias parameters where possible.
- `softplus`: bool
    - If true, a softplus activation function is applied to the outputs of the model.
- `chans`: list of ints
    - The number of channels to be used in the intermediary layers.
- `ksizes`: list of ints
    - The kernel sizes of the convolutions corresponding to each layer.
- `img_shape`: list of ints
    - The shape of the stimulus coming into the model (do not include the batch size, but do include the depth of the images).
- `stackconvs`: bool
    - If true, convolutions are trained using linear convolution stacking.
- `finalstack`: bool
    - If true, the final layer is trained using linear convolution stacking. Only applies if the model is fully convolutional.
- `convgc`: bool
    - If true, the ganglion cell layer is convolutional.
- `rand_onehot`: bool
    - If true, the onehot layer in retinotopic models is randomly initialized to values between 0 and 1. If false, the values are initialized as `1/(height*width)`, in which the height and width are the last dims of the incoming activations. See the initialization sketch after this list.
- `dataset`: str
    - The name of the dataset to be used for training. The code assumes the datasets are located in `~/experiments/data/`. The dataset should be a folder that contains h5 files.
- `cells`: str or list of ints
    - If a string, "all" is the only option, which collects all cell recordings from the dataset. Otherwise, you can argue only the specific cells you would like to train with. See `datas.py` for more details.
- `stim_type`: str
    - The name of the h5 file (without the `.h5` extension) contained within the dataset folder.
- `shift_labels`: bool
    - If true, the training labels are grouped and shifted by a random amount within each group. This acts as a random control.
- `lossfxn`: str
    - The name of the loss function that should be used for training the model. The current options are "PoissonNLLLoss" and "MSELoss". See the loss construction sketch after this list.
- `log_poisson`: bool
    - Only relevant if using the "PoissonNLLLoss" function. If true, inputs are exponentiated before the poisson loss is calculated.
- `shuffle`: bool
    - Determines whether the order of samples within a batch should be shuffled. This does not shuffle the sequence itself.
- `rand_sample`: bool or None
    - Determines whether the sampled batches should have a randomized order. If None, defaults to the value of `shuffle`. When used in conjunction with `shuffle`, this allows you to train with a set randomized order of data, or to split the data into training and validation sets with preserved order while still randomizing the data during training. Generally, however, the value should be true or null.
- `batch_size`: int
    - The number of samples to use in a single step of SGD.
- `n_epochs`: int
    - The number of complete training loops through the data.
- `lr`: float
    - The learning rate.
- `l2`: float
    - The l2 weight penalty.
- `l1`: float
    - The l1 activation penalty applied to the final outputs of the model.
- `noise`: float
    - The standard deviation of the gaussian noise layers.
- `drop_p`: float
    - The dropout probability used between linearly stacked convolutions. Only applies if `stackconvs` is set to true.
- `gc_bias`: bool
    - If true, the final layer has a bias parameter.
- `bn_moment`: float
    - The momentum of the batchnorm layers.
- `prune`: bool
    - If true, layers are pruned.
- `prune_layers`: list of str
    - Enumerates the layers that should be pruned. If an empty list or the string "all" is argued, all intermediary convolutional layers are pruned.
- `prune_tolerance`: float
    - The maximum drop in accuracy that will be tolerated for a channel removal.
- `prune_intvl`: int
    - The number of epochs to train for when trying a new dropped channel.
- `alpha_steps`: int
    - The number of integration steps used when calculating the integrated gradient.
- `intg_bsize`: int
    - The batch size of the integrated gradient calculations.
- `zero_bias`: bool
    - Determines whether the bias should be zeroed in addition to the pruned filter.
- `reset_lr`: bool
    - Determines whether the learning rate should be reset to its initial value after every pruning attempt.
- `abssum`: bool
    - If true, the absolute value of the integrated gradient is taken prior to summing over the channel. If false, the sum over the channel is taken first, followed by the absolute value. In either case, the mean is taken over time.
- `reset_sd`: bool
    - If true, the model will be reset to the original state dict after each channel is pruned.
- `altn_layers`: bool
    - If true, the pruning will alternate layers for each channel pruned. If false, the earliest layer is exhaustively pruned first, then subsequent layers are pruned in order.
- `retinotopic`: bool
    - Determines whether the training is retinotopic. Overwritten by `prune`.
- `semantic_scale`: float
    - The weighting of the semantic loss in the total loss.
- `semantic_l1`: float
    - The scaling of an l1 penalty applied to the weight matrix of the onehot layer.