The np.diag(sigma2) call in __ndlp uses O(n^2) memory, where n, the length of sigma2, grows linearly with the length of the signal (audio). I believe this can be fixed by replacing a matrix multiply with an element-wise multiply. Note two things:
(1) np.dot on two 2D arrays is interpreted as matrix multiply.
(2) For a 2D array A and a 1D array B, np.dot(A, np.diag(B)) = np.matmul(A, np.diag(B)) = A * B[np.newaxis, :]
(2.1) To elaborate on why the above holds: column j of np.diag(B) is zero everywhere except in row j, where it equals B[j]. Entry (i, j) of the product is therefore just A[i, j] * B[j], so every column of A is scaled by the matching entry of B. The matrix multiply thus reduces to an element-wise multiply with broadcasting.
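Written out entry by entry, the identity in (2) looks as follows. Here b stands for the 1D array B and \delta_{kj} is the Kronecker delta; both symbols are notation introduced only for this illustration:

$$
\bigl(A\,\operatorname{diag}(b)\bigr)_{ij} = \sum_{k} A_{ik}\, b_k\, \delta_{kj} = A_{ij}\, b_j
$$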
Numerical check (run it as often as you want to verify):
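import numpy as np

tmp1 = np.random.rand(2, 8)
tmp2 = np.random.rand(8)
res1 = tmp1 @ np.diag(tmp2)     # matrix product with the dense diagonal matrix
res2 = tmp1 * tmp2[None, :]     # broadcast element-wise product
print(np.allclose(res1, res2))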
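To put the memory claim in numbers, here is a rough sketch comparing the dense diagonal matrix with the broadcast row. The value of n is an arbitrary stand-in for the length of sigma2, not a figure taken from the repository:

```python
import numpy as np

# Hypothetical length; in __ndlp this would be len(sigma2),
# which grows with the number of frames in the recording.
n = 2000
sigma2 = np.random.rand(n)

dense = np.diag(sigma2)   # (n, n) array, almost entirely zeros
row = sigma2[None, :]     # (1, n) view onto sigma2, no new data buffer

print(dense.nbytes)       # n * n * 8 bytes for float64
print(row.nbytes)         # n * 8 bytes
```

The ratio between the two is n, so the gap widens as the recording gets longer.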
Code that is to be modified:
def __ndlp(self, xk):
    """Variance-normalized delayed linear prediction
    Here is the specific WPE algorithm implementation. The input should be
    the reverberant time-frequency signal in a single frequency bin and
    the output will be the dereverberated signal in the corresponding
    frequency bin.
    Args:
        xk: A 2-dimension numpy array with shape=(frames, input_channels)
    Returns:
        A 2-dimension numpy array with shape=(frames, output_channels)
    """
    cols = xk.shape[0] - self.d
    xk_buf = xk[:,0:self.out_num]
    xk = np.concatenate(
        (np.zeros((self.p - 1, self.channels)), xk),
        axis=0)
    xk_tmp = xk[:,::-1].copy()
    frames = stride_tricks.as_strided(
        xk_tmp,
        shape=(self.channels * self.p, cols),
        strides=(xk_tmp.strides[-1], xk_tmp.strides[-1]*self.channels))
    frames = frames[::-1]
    sigma2 = np.mean(1 / (np.abs(xk_buf[self.d:]) ** 2), axis=1)
    for _ in range(self.iterations):
        x_cor_m = np.dot(
            #np.dot(frames, np.diag(sigma2)), # REPLACE THIS LINE WITH THE FOLLOWING
            frames * sigma2[None, :],
            np.conj(frames.T))
        x_cor_v = np.dot(
            frames,
            np.conj(xk_buf[self.d:] * sigma2.reshape(-1, 1)))
        coeffs = np.dot(np.linalg.inv(x_cor_m), x_cor_v)
        dk = xk_buf[self.d:] - np.dot(frames.T, np.conj(coeffs))
        sigma2 = np.mean(1 / (np.abs(dk) ** 2), axis=1)
    return np.concatenate((xk_buf[0:self.d], dk))
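Since frames is complex-valued in practice, it may be worth repeating the check on complex data with the same shapes as in the loop. The following is only a sketch on synthetic input. The sizes rows and cols are made-up stand-ins for self.channels * self.p and the number of columns of frames, and the random arrays replace the real STFT data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes standing in for channels * p and the frame count.
rows, cols = 6, 50
frames = rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))
sigma2 = rng.random(cols) + 0.1   # positive weights, one per column of frames

# Original expression: builds a dense (cols, cols) diagonal matrix.
old = np.dot(np.dot(frames, np.diag(sigma2)), np.conj(frames.T))

# Proposed expression: scales each column of frames by broadcasting.
new = np.dot(frames * sigma2[None, :], np.conj(frames.T))

same = np.allclose(old, new)
print(same)
```

Both expressions produce a (rows, rows) matrix, so the rest of the loop is unaffected by the change.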