I found Jsonnet through AllenNLP, hence a few words on that first. We use AllenNLP for writing the Deep Learning experiments in our team. It is primarily built for doing NLP research on top of PyTorch. However, the abstractions in the library are so well designed and easily extensible that it can be used for building fairly straightforward neural network experiments of any kind. Perhaps I will write a separate post on how to adopt AllenNLP for non-NLP experiments, but here is an example of how it has been used for Computer Vision.
Jsonnet
Jsonnet is a DSL for creating data templates and comes in handy for generating JSON-based configuration data. It ships with a standard library, std, that includes features like list comprehension, string manipulation, etc. It is primarily meant for generating configuration files: std has a bunch of manifestation utilities that can render a template into target formats such as .ini and .yaml. For a robust templating language with more complex needs, however, I'd suggest the awesome StringTemplate.
AllenNLP uses Jsonnet for writing experiment configurations. In other words, the dependencies for running an AllenNLP experiment are specified in a .jsonnet file and the objects are constructed using built-in factories. Below is a section from a configuration that defines a simple feedforward MNIST classifier network:
Sample Configuration
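Here is a minimal sketch of such a section; the mnist_classifier type name is a placeholder, but the layer names and sizes match the full files shown later in this post.
mnist_feedforward.jsonnet
{
  "model": {
    "type": "mnist_classifier",  // hypothetical registered model name
    "mnist_encoder": {
      "num_layers": 2,
      "activations": ["relu", "relu"],
      "input_dim": 784,
      "hidden_dims": 512,
      "dropout": 0.25
    },
    "projection_layer": {
      "input_dim": 512,  // must match the encoder's hidden_dims
      "num_layers": 1,
      "hidden_dims": 64,
      "activations": "relu",
      "dropout": 0.25
    },
    "final_layer": {
      "input_dim": 64,
      "num_layers": 1,
      "hidden_dims": 2,
      "activations": "linear"
    }
  }
}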
Variables
One minor quibble with the above config: when running multiple experiments, the most obvious thing one would do is change those numbers in every layer. Say you increase the output dimension of mnist_encoder; the input_dim of projection_layer needs to be adjusted too. For a more complex architecture, there is a good chance this leads to a chain of changes that have to be made manually.
The obvious thing to do now is to use variables. In the example below, notice how the same variable is used to configure both the output size of mnist_encoder and the input size of projection_layer.
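A sketch of the same section with the shared dimension lifted into a variable:
mnist_feedforward.jsonnet
local INPUT_ENCODER_OUTPUT_SIZE = 512;
local PROJECTION_SIZE = 64;
{
  "model": {
    "mnist_encoder": {
      "num_layers": 2,
      "activations": ["relu", "relu"],
      "input_dim": 784,
      // Single source of truth for the encoder's output width.
      "hidden_dims": INPUT_ENCODER_OUTPUT_SIZE,
      "dropout": 0.25
    },
    "projection_layer": {
      // Stays in sync with the encoder automatically.
      "input_dim": INPUT_ENCODER_OUTPUT_SIZE,
      "num_layers": 1,
      "hidden_dims": PROJECTION_SIZE,
      "activations": "relu",
      "dropout": 0.25
    }
    // final_layer as before, elided here.
  }
}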
Objects
The next natural step of the experiment is to try different types of layers. In the above example mnist_encoder is a feedforward block. It could instead be a convolution-based encoder, as below:
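mnist_conv.jsonnet
local ConvSpec = {
  "num_layers": 2,
  "input_dim": 784,
  "kernels": [[3, 3], [3, 3]],
  "stride": [1, 1],
  "activations": ["relu", "relu"],
  "output_channels": [32, 64]
};
local ConvEncoder = {
  "spec": {
    "type": "conv2d",
    "num_layers": ConvSpec.num_layers,
    "input_dim": ConvSpec.input_dim,
    "output_channels": ConvSpec.output_channels,
    "kernels": ConvSpec.kernels,
    "stride": ConvSpec.stride,
    "activations": ConvSpec.activations
  }
};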
Here ConvSpec is a Jsonnet object with a bunch of member attributes that define how the convolution block is used in the classifier. Jsonnet also supports inheritance of objects, as shown below.
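A minimal illustration of inheritance (not tied to the MNIST example): the parent's fields are merged into the child, and the child's fields win on conflict.
local BaseConvSpec = {
  "stride": [1, 1],
  "activations": ["relu", "relu"]
};
// Inherits activations, overrides stride, adds output_channels.
local StridedConvSpec = BaseConvSpec + {
  "stride": [2, 2],
  "output_channels": [64, 128]
};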
Functions
The subsequent layers such as projection_layer and final_layer are going to be present in the conv example too, and projection_layer needs an input size to be defined. This will be the output size of the conv layer, and hardcoding this number is going to cause the same set of problems that we saw in the first example. Let's define a simple function that computes the output sizes of each layer in a conv block.
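Something like the following would do; the exact parameter layout is a sketch. It assumes square kernels, and falls back to zero padding and unit dilation when the spec doesn't define them.
local compute_conv_output_sizes(spec, input_size, layer, sizes=[]) =
  if layer >= spec.num_layers then
    sizes
  else
    local kernel = spec.kernels[layer][0];
    local stride = spec.stride[layer];
    local padding = if std.objectHas(spec, "padding") then spec.padding[layer] else 0;
    local dilation = if std.objectHas(spec, "dilation") then spec.dilation[layer] else 1;
    // Standard conv arithmetic: floor((n + 2p - d*(k - 1) - 1) / s) + 1.
    local out = std.floor((input_size + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1;
    // Tail call: accumulate this layer's size and recurse into the next.
    compute_conv_output_sizes(spec, out, layer + 1, sizes + [out]);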
compute_conv_output_sizes is a tail recursive function that calculates the output size of each conv layer based on the defined kernel size, stride, padding and dilation.
Let's incorporate this definition in the mnist_encoder example:
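mnist_conv.jsonnet
With compute_conv_output_sizes in scope, ConvSpec can derive its own output sizes:
local ConvSpec = {
  "num_layers": 2,
  "input_dim": 784,
  "kernels": [[3, 3], [3, 3]],
  "stride": [1, 1],
  "activations": ["relu", "relu"],
  "output_channels": [32, 64],
  // Derived fields: computed once, read by whatever layer comes next.
  "output_sizes": compute_conv_output_sizes(self, self.input_dim, 0),
  "output_size": self.output_sizes[std.length(self.output_sizes) - 1]
};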
Now a layer that follows mnist_encoder can make use of ConvSpec.output_size to configure its input size.
Imports
So we have defined two variants of encoders here for a classifier. Except for the encoder all the other parts of the model configuration and training configuration are going to be the same. Let's put the base scaffolding that defines the classifier in a .libsonnet file. This will serve as an importable lib that two different experiments can use.
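The file could look something like this (dataset reader, iterator, and trainer sections elided); it also exports compute_conv_output_sizes so that experiment files can reuse it:
lib-mnist.libsonnet
{
  // Exported helper, so experiment files can call lib.compute_conv_output_sizes.
  compute_conv_output_sizes(spec, input_size, layer, sizes=[])::
    if layer >= spec.num_layers then
      sizes
    else
      local kernel = spec.kernels[layer][0];
      local stride = spec.stride[layer];
      local padding = if std.objectHas(spec, "padding") then spec.padding[layer] else 0;
      local dilation = if std.objectHas(spec, "dilation") then spec.dilation[layer] else 1;
      local out = std.floor((input_size + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1;
      self.compute_conv_output_sizes(spec, out, layer + 1, sizes + [out]),

  // The parameterized classifier: any encoder spec can be plugged in.
  MNIST(encoder_spec, projection_size, final_output_size, dropout, num_projection_layers):: {
    "model": {
      "type": "mnist_classifier",  // hypothetical registered model name
      "mnist_encoder": encoder_spec,
      "projection_layer": {
        "num_layers": num_projection_layers,
        "hidden_dims": projection_size,
        "activations": "relu",
        "dropout": dropout
      },
      "final_layer": {
        "input_dim": projection_size,
        "num_layers": 1,
        "hidden_dims": final_output_size,
        "activations": "linear"
      }
    }
    // Dataset reader, iterator, and trainer sections would sit here too.
  }
}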
Notice above how the MNIST classifier has been parameterized so that it can be reused by different variants of the architecture. Now, let's redefine the feedforward variant to do the import.
mnist_feedforward.jsonnet
local lib = import"lib-mnist.libsonnet";
local INPUT_SIZE = 784;
local INPUT_ENCODER_OUTPUT_SIZE = 512;
local PROJECTION_SIZE = 64;
local FINAL_OUTPUT_SIZE = 2;
local DROPOUT = 0.25;
local NUM_PROJECTION_LAYERS = 1;
local ENCODER = {
"spec": {
"num_layers": 2,
"activations": ["relu", "relu"],
"input_dim": INPUT_SIZE,
"hidden_dims": INPUT_ENCODER_OUTPUT_SIZE,
"dropout": DROPOUT
}
};
lib.MNIST(ENCODER.spec, PROJECTION_SIZE, FINAL_OUTPUT_SIZE, DROPOUT, NUM_PROJECTION_LAYERS);
mnist_conv.jsonnet
local lib = import"lib-mnist.libsonnet";
local INPUT_SIZE = 784;
local PROJECTION_SIZE = 64;
local FINAL_OUTPUT_SIZE = 2;
local DROPOUT = 0.25;
local NUM_PROJECTION_LAYERS = 1;
local ConvSpec = {
"num_layers": 2,
"input_dim": INPUT_SIZE,
"kernels": [[3, 3], [3, 3]],
"stride": [1, 1],
"activations": ["relu", "relu"],
"output_channels": [32, 64],
"output_sizes": compute_conv_output_sizes(self, self.input_dim, 0),
"output_size": self.output_sizes[-1]
};
local ConvEncoder = {
"spec": {
"type": "conv2d""num_layers": ConvSpec.num_layers,
"input_dim": ConvSpec.input_dim,
"output_channels": ConvSpec.output_channels,
"kernels": ConvSpec.kernels,
"stride": ConvSpec.stride,
"activations": ConvSpec.activations
}
};
lib.MNIST(ConvEncoder.spec, PROJECTION_SIZE, FINAL_OUTPUT_SIZE, DROPOUT, NUM_PROJECTION_LAYERS)
Now this kind of a setup makes rapid prototyping and experimentation easy: just replace the encoder with an RNN or self-attention based spec and pass it to lib.MNIST.
Jsonnet brings a lot of flexibility to defining configurations, as we have seen above. There are a lot more interesting things that could be done with respect to configuring neural net experiments with Jsonnet. Imagine writing a grid search procedure in Jsonnet that would generate all possible configurations for the different hyperparameter combinations. That wouldn't be too difficult, I guess. A lot of interesting example experiment configurations can be found in AllenNLP's source.
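As a quick sketch building on the lib-mnist.libsonnet above: an object comprehension over the cross product of hyperparameter values yields one config per combination, and running jsonnet -m <dir> on the file writes each top-level field out as its own JSON file.
grid.jsonnet
local lib = import "lib-mnist.libsonnet";

local encoder = {
  "num_layers": 2,
  "activations": ["relu", "relu"],
  "input_dim": 784,
  "hidden_dims": 512,
  "dropout": 0.25
};

// Grid axes; add more `for` clauses below to extend the search space.
local projection_sizes = [32, 64, 128];
local dropouts = [0.1, 0.25, 0.5];

{
  // One config per (projection_size, dropout) pair, keyed by output filename.
  ["mnist_p%d_d%s.json" % [p, d]]: lib.MNIST(encoder, p, 2, d, 1)
  for p in projection_sizes
  for d in dropouts
}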