
Unable to reproduce the training results of autoencoder #8

Open
bluestyle97 opened this issue Jul 7, 2022 · 7 comments

@bluestyle97

Hi, I'm trying to reproduce the results of Clip-Forge by training from scratch. I trained the autoencoder on the ShapeNet data downloaded from the occupancy-networks repository, but got unsatisfactory results compared to the pretrained model. I did not change any hyperparameters except the batch_size, which I raised from 32 to 256 to better utilize GPU memory (I think this should not hurt performance, and may even improve it). So I'm wondering whether you trained the autoencoder with the same default hyperparameters, or whether you used any special training tricks? And do you have any ideas for improving the autoencoder's performance, since it is crucial to the final shape generation quality?

Here are some visualizations to show the differences in reconstruction results on the training set.

Pretrained autoencoder:
[four reconstruction screenshots]

Training from scratch:
[four reconstruction screenshots]

@sanghiad
Collaborator

sanghiad commented Jul 8, 2022

@bluestyle97 Can you please mention the value you are using for the threshold parameter? I find that saving based on a low threshold value is ideal. Also, are these the results from best.pt or last.pt?

@bluestyle97
Author

@sanghiad The threshold is 0.1, and the results are from best_iou.pt.

@sanghiad
Collaborator

sanghiad commented Jul 8, 2022

I would recommend using a threshold value of 0.05; that will improve the autoencoder results. I believe that's what I used during training to select best_iou.pt.

@sanghiad
Collaborator

sanghiad commented Jul 8, 2022

The threshold makes a huge difference during text-based generation too, so I would recommend trying different threshold values in the range 0.03-0.2 during inference in Stage 2.
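To illustrate why the threshold matters so much, here is a minimal sketch (not Clip-Forge's actual code) of how the occupancy threshold affects the IoU used to select checkpoints like best_iou.pt. The `iou_at_threshold` helper and the synthetic sphere data are hypothetical, assuming only that the decoder outputs soft occupancy probabilities on a voxel grid:

```python
import numpy as np

def iou_at_threshold(occ_probs, occ_gt, threshold):
    """IoU between a thresholded occupancy prediction and a boolean ground truth."""
    pred = occ_probs >= threshold
    inter = np.logical_and(pred, occ_gt).sum()
    union = np.logical_or(pred, occ_gt).sum()
    return inter / union if union > 0 else 1.0

# Synthetic example: a ground-truth sphere and a "soft" prediction of it.
grid = np.stack(np.meshgrid(*[np.linspace(-1, 1, 32)] * 3), axis=-1)
dist = np.linalg.norm(grid, axis=-1)
gt = dist < 0.5                                  # occupied inside the sphere
probs = 1.0 / (1.0 + np.exp(8 * (dist - 0.5)))   # sigmoid-like soft occupancy

# Lower thresholds mark more voxels as occupied, so the same prediction
# can score very differently depending on where the cut is made.
for t in (0.03, 0.05, 0.1, 0.2):
    print(f"threshold {t:.2f}: IoU {iou_at_threshold(probs, gt, t):.3f}")
```

The same cut-off is what gets passed to marching cubes when extracting the mesh, which is why sweeping it in Stage 2 changes the generated shapes so visibly.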

@bluestyle97
Author

@sanghiad Hi, I have now trained an autoencoder whose reconstruction performance matches the pretrained one. However, I still cannot reproduce the final generation results, since I have failed to train a good flow model. Could you share the hyperparameters you used to train the flow model, e.g., batch_size, epochs, num_views, and so on?

@sanghiad
Collaborator

@bluestyle97 Sorry for the delay. Can you please share the hyperparameters you are using?

@happysxpp

@sanghiad I am also unable to train a good flow model; I used the default hyperparameters provided in the code. May I ask what the specific hyperparameter settings should be?
