# Releases: desh2608/pytorch-tdnn

## v1.1.0
The following changes have been made:

- The semi-orthogonal loss is now computed as the Frobenius norm of P (where `P = torch.mm(M, M.T)`), instead of the Frobenius norm of (P - α²I). This makes it consistent with the loss reporting in Kaldi (see the sketch after this list).
- The `forward()` function in the `TDNNF` class now takes `semi_ortho_step` as an argument instead of `training`. This lets the calling function decide whether or not to take the step towards semi-orthogonality.
- The initialization of the `TDNN` layer now takes a `bias` argument, which specifies whether or not to use a bias in the `Conv1d` layer. When using the TDNN in the `SemiOrthogonalConv` class for `TDNNF`, we set `bias=False`, so that the matrix factorization checks out correctly.
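
For reference, here is a minimal sketch of how this reported quantity could be computed. It is an illustration only, assuming `M` is the factorized weight reshaped to a 2-D matrix; it is not the repository's exact code:

```python
import torch

def semi_orthogonal_loss(M: torch.Tensor) -> torch.Tensor:
    """Reported semi-orthogonality measure for a 2-D weight matrix M.

    Sketch only: assumes M has shape (rows, cols) with rows <= cols.
    """
    # P = M M^T; P approaches alpha^2 * I as M becomes (scaled) semi-orthogonal.
    P = torch.mm(M, M.T)
    # v1.1.0 reports ||P||_F (Frobenius norm of P), matching Kaldi's loss
    # reporting, rather than ||P - alpha^2 I||_F.
    return torch.norm(P, p="fro")
```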
## First release

This release contains basic versions of TDNN and TDNN-F layers, with some constraints on the contexts.

### Using the TDNN layer
```python
from pytorch_tdnn.tdnn import TDNN as TDNNLayer

tdnn = TDNNLayer(
    512,        # input dim
    512,        # output dim
    [-3, 0, 3], # context
)
```
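
A hypothetical forward pass might then look like the following. The `(batch, input_dim, time)` layout and the amount of temporal shrinkage are assumptions based on a typical `Conv1d`-style implementation, so check the class documentation for the exact convention:

```python
import torch

x = torch.randn(4, 512, 100)  # assumed (batch, input_dim, time) layout
y = tdnn(x)
# With context [-3, 0, 3], the time axis is expected to shrink by 6 frames
# (3 on each side) if no padding is applied.
print(y.shape)
```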
**Note:** The `context` list should follow these constraints (a small checker illustrating the rules follows the list):

- The length of the list should be 2 or an odd number.
- If the length is 2, it should be of the form `[-1,1]` or `[-3,3]`, but not `[-1,3]`, for example.
- If the length is an odd number, the values should be evenly spaced with a 0 in the middle. For example, `[-3,0,3]` is allowed, but `[-3,-1,0,1,3]` is not.
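
As an illustration of these rules, here is a hypothetical helper; it is not part of the library, just a restatement of the constraints above in code:

```python
def is_valid_context(context: list) -> bool:
    """Hypothetical checker mirroring the context constraints above."""
    if len(context) == 2:
        # Symmetric pair like [-1, 1] or [-3, 3]; [-1, 3] is rejected.
        return context[0] == -context[1] and context[1] > 0
    if len(context) % 2 == 1:
        # Odd length: evenly spaced values with 0 in the middle.
        mid = len(context) // 2
        if context[mid] != 0:
            return False
        steps = [b - a for a, b in zip(context, context[1:])]
        return len(set(steps)) == 1
    return False

assert is_valid_context([-3, 0, 3])
assert not is_valid_context([-3, -1, 0, 1, 3])  # unevenly spaced
assert is_valid_context([-1, 1])
assert not is_valid_context([-1, 3])            # not symmetric
```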
### Using the TDNNF layer
```python
from pytorch_tdnn.tdnnf import TDNNF as TDNNFLayer

tdnnf = TDNNFLayer(
    512, # input dim
    512, # output dim
    256, # bottleneck dim
    1,   # time stride
)
```
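
Per the v1.1.0 notes above, `forward()` takes a `semi_ortho_step` argument so the caller controls when the semi-orthogonality step happens. A hypothetical training-loop call might look like this; the keyword-argument style, input layout, and schedule are assumptions:

```python
import torch

x = torch.randn(4, 512, 100)  # assumed (batch, input_dim, time) layout
for step in range(100):
    # The caller decides when to take a step towards semi-orthogonality;
    # taking it periodically (e.g. every fourth step) is one possible
    # schedule. The exact policy is up to the training loop.
    y = tdnnf(x, semi_ortho_step=(step % 4 == 0))
```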
**Note:** The time stride should be greater than or equal to 0. For example, if the time stride is 1, a context of `[-1,1]` is used for each stage of splicing.
### Credits
- The TDNN implementation is based on: https://github.com/jonasvdd/TDNN.
- Semi-orthogonal convolutions used in TDNN-F are based on: https://github.com/cvqluu/Factorized-TDNN.