Some questions about training auxiliary classifier F #2

Open
kingofprank opened this issue Sep 3, 2021 · 2 comments

Comments

@kingofprank

kingofprank commented Sep 3, 2021

Thanks for your implementation. I am having trouble training the auxiliary classifier F. I use landmarks (48 dims) as control attributes, and the latent is 512-dimensional (sampled from z). I have tried several training configurations (e.g., bs=8 or 128, lr=2e-4, 2e-5, etc.), but the MSE loss stays around 0.1 to 0.5 after 500k iterations, in both the training and validation phases, so the predictions are not accurate. Is this loss range expected? Could you share the configs you used in your experiments? Thanks for your work~

@kingofprank kingofprank changed the title Some Some questions about training auxiliary classifier F Sep 3, 2021
@hyungkwonko
Owner

Hi @kingofprank,
To answer your question: the loss range you mention does not matter for the final result.
I ran into exactly the same issue, and below is what Minjun, the first author of the original paper, said when I asked.

What Minjun said:
As for training the Auxiliary Mapping, in my experience the loss value cannot reflect everything. The main thing I use the loss value for is to check that the model is not overfitting, because an overfit model degenerates to a trivial solution in which F only maps z to z regardless of c, resulting in $\frac{\partial{F}}{\partial{z}}=1$ and $\frac{\partial{F}}{\partial{c}}=0$.
So as long as it is not overfitting, I think you can give it a try using Algo. 1, which is essentially the Euler method for ODEs.
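
One rough way to look for the trivial solution described above is to nudge z and c separately and see how much the output of F moves. The following is only a minimal sketch in plain Python, not code from the paper or this repository: `sensitivity`, `f_degenerate` and `f_uses_c` are hypothetical names, the two toy maps stand in for a trained F, and the finite-difference step `eps` is an assumed value.

```python
import math


def sensitivity(F, z, c, eps=1e-4):
    """Finite-difference estimate of how strongly F(z, c) reacts to z and to c.

    Every coordinate of z (then of c) is nudged by eps; the returned pair
    (s_z, s_c) is the norm of the output change divided by the norm of the
    input change. A map that ignores c gives s_c == 0.
    """
    base = F(z, c)

    def change_per_unit(out, n_inputs):
        diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(out, base)))
        return diff / (eps * math.sqrt(n_inputs))

    s_z = change_per_unit(F([v + eps for v in z], c), len(z))
    s_c = change_per_unit(F(z, [v + eps for v in c]), len(c))
    return s_z, s_c


def f_degenerate(z, c):
    # Hypothetical degenerate map: returns z unchanged and ignores c.
    return list(z)


def f_uses_c(z, c):
    # Hypothetical non-degenerate map: shifts each z coordinate by part of c.
    return [v + 0.1 * c[i % len(c)] for i, v in enumerate(z)]


z = [0.1 * i for i in range(8)]   # toy latent (the issue uses 512 dims)
c = [0.5, -0.2, 0.3, 0.0]         # toy attributes (the issue uses 48 dims)

print("degenerate:", sensitivity(f_degenerate, z, c))
print("uses c:    ", sensitivity(f_uses_c, z, c))
```

For the degenerate toy map the second number is exactly zero, which is the $\frac{\partial{F}}{\partial{c}}=0$ case; a trained F whose output barely moves when c changes would be a warning sign of the same kind. With a real PyTorch model the same idea could be expressed with autograd instead of finite differences.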


By the way, I am sorry for answering your question so late.
Hope this helps, and please let me know if you have any more questions.

@huangshenneng

Thanks for your implementation. Algorithm 1 in the paper shows how to use an Euler ODE solver to get the new z, but I cannot find this operation in your code.

In your code, why are there two delta_z terms, one that comes from delta_c and another that comes from the previous delta_z?

    z_out0, _ = AUX(z0, c0)
    z_out1, _ = AUX(z0, c0 + delta_c)
    delta_z = z_out1 - z_out0  # Algo1: line9

    z_out1, _ = AUX(z0 + delta_z, c0)
    delta_z += (z_out1 - z_out0)  # Algo1: line12

    z0 += delta_z  # Algo1: line14
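
To experiment with the quoted update in isolation, here is a minimal sketch in plain Python. It is not the repository's `AUX` and not the paper's Algorithm 1: `aux` is a hypothetical stand-in that just adds a linear function of c to z, `step` is a hypothetical wrapper name, and lists replace tensors so that it runs without PyTorch.

```python
def aux(z, c):
    """Hypothetical stand-in for AUX: shifts every z coordinate by 0.5 * sum(c).

    Returns a pair so the call sites look like the quoted `z_out, _ = AUX(...)`.
    """
    shift = 0.5 * sum(c)
    return [v + shift for v in z], None


def step(z0, c0, delta_c):
    """One update following the three quoted steps, on plain lists."""
    z_out0, _ = aux(z0, c0)

    # First term: output change caused by moving c while z stays fixed.
    z_out1, _ = aux(z0, [a + b for a, b in zip(c0, delta_c)])
    delta_z = [a - b for a, b in zip(z_out1, z_out0)]

    # Second term: output change caused by moving z by that delta_z
    # while c stays fixed.
    z_out2, _ = aux([a + d for a, d in zip(z0, delta_z)], c0)
    delta_z = [d + (a - b) for d, a, b in zip(delta_z, z_out2, z_out0)]

    # Apply the accumulated change to z.
    z_new = [a + d for a, d in zip(z0, delta_z)]
    return z_new, delta_z


z_new, delta_z = step([0.0, 1.0], [0.0, 0.0], [0.2, 0.0])
print("delta_z:", delta_z)
print("z_new:  ", z_new)
```

Because this stand-in is the identity in z, the second term equals the first one here, so the total delta_z is twice the c-driven change. With a trained F the two terms would generally differ, so this toy only shows the mechanics of the update, not its intended effect.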

Thanks for your work again
