Unable to reproduce the training results of the autoencoder #8
Hi, I'm trying to reproduce the results of Clip-Forge by training it from scratch. I trained the autoencoder on the ShapeNet data downloaded from the repository of occupancy-networks, but got unsatisfactory results compared to the pretrained model. I did not change any hyperparameters except the batch_size, which I increased from 32 to 256 to better fit GPU memory; I assumed this would not hurt performance and might even improve it (though see the note after this post). So I'm wondering whether you used the same default hyperparameters to train the autoencoder, or applied some special training tricks. And do you have any ideas for improving the autoencoder, since its quality is crucial to the final shape generation ability?

Here are some visualizations showing the differences in reconstruction results on the training set.

Pretrained autoencoder:

Training from scratch:
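A note on the batch_size change above, which the thread never settles: raising the batch size from 32 to 256 with an unchanged learning rate means 8x fewer optimizer steps per epoch, which by itself can hurt convergence. A common mitigation is the linear scaling rule, sketched below; the base learning rate and the `model` placeholder are assumptions for illustration, not values from the Clip-Forge code.

```python
import torch

# Stand-in for the autoencoder under training; purely illustrative.
model = torch.nn.Linear(8, 8)

base_lr = 1e-4      # assumed learning rate tuned for the default batch size
base_batch = 32     # batch size the defaults were presumably tuned for
new_batch = 256     # the enlarged batch size from the thread

# Linear scaling rule: grow the learning rate in proportion to the batch
# size so the total update applied per epoch stays roughly comparable.
scaled_lr = base_lr * new_batch / base_batch   # 1e-4 * 8 = 8e-4
optimizer = torch.optim.Adam(model.parameters(), lr=scaled_lr)
```

Whether this fully explains the gap is unclear, but it is worth ruling out before suspecting special training tricks.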
@bluestyle97 Can you please mention the value you are using for the threshold parameter? I find that saving based on a low threshold value is ideal. Also, are these the results from best.pt or last.pt?
@sanghiad The threshold is 0.1, and the results are from best_iou.pt.
I would recommend using a threshold value of 0.05; that will improve the autoencoder results. I believe that's what I used during training to select best_iou.pt.
The threshold makes a huge difference during text-based generation too, so I would recommend trying different threshold values from 0.03 to 0.2 during inference in Stage 2.
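To make the threshold suggestion concrete, here is a minimal sketch of sweeping the occupancy threshold at mesh-extraction time. The `decoder(points, latent)` signature, the sigmoid on its output, and the grid bounds are assumptions standing in for the corresponding Clip-Forge components, not the repo's actual API.

```python
import numpy as np
import torch
from skimage import measure  # provides marching_cubes

@torch.no_grad()
def extract_mesh(decoder, latent, resolution=64, threshold=0.05):
    # `decoder` and `latent` are hypothetical stand-ins for the trained
    # occupancy decoder and one shape embedding.
    # Build a dense grid of query points in [-0.5, 0.5]^3 (assumed bounds).
    axis = np.linspace(-0.5, 0.5, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    points = torch.from_numpy(grid.reshape(-1, 3))

    # Occupancy probability for every grid point.
    occ = torch.sigmoid(decoder(points, latent))
    occ = occ.reshape(resolution, resolution, resolution).cpu().numpy()

    # Marching cubes at the chosen iso-level: lower thresholds yield fuller
    # shapes, higher thresholds thinner ones.
    verts, faces, _, _ = measure.marching_cubes(occ, level=threshold)
    return verts, faces

# Sweep the 0.03-0.2 range suggested above and inspect each mesh, e.g.:
# for t in (0.03, 0.05, 0.1, 0.2):
#     verts, faces = extract_mesh(decoder, latent, threshold=t)
```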
@sanghiad Hi, I have trained an autoencoder which shows reconstruction performance similar to the pretrained one. However, I still cannot reproduce the final generation results, since I have failed to train a good flow model. Could you tell me what hyperparameters you used to train the flow model, e.g., batch_size, epochs, num_views, and so on?
@bluestyle97 Sorry for the delay. Can you please give your hyperparameters?
@sanghiad I am also unable to train a good flow model; I used the default hyperparameters from the code. May I ask what your specific hyperparameter settings were?
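For context on what Stage 2 trains: Clip-Forge fits a normalizing flow over the autoencoder's latent codes, conditioned on CLIP embeddings, and the hyperparameters asked about above are the knobs of that model. A minimal sketch of one RealNVP-style conditional affine coupling layer follows; all dimensions and names are illustrative assumptions, not the repo's code.

```python
import torch
import torch.nn as nn

class ConditionalCoupling(nn.Module):
    """One RealNVP-style affine coupling layer conditioned on a CLIP
    embedding. Dimensions are illustrative, not the repo's defaults."""
    def __init__(self, latent_dim=128, cond_dim=512, hidden=256):
        super().__init__()
        self.half = latent_dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (latent_dim - self.half)),
        )

    def forward(self, z, cond):
        # Split the latent, transform one half as a function of the other
        # half plus the condition, and track the log-det for the NLL loss.
        z1, z2 = z[:, :self.half], z[:, self.half:]
        scale, shift = self.net(torch.cat([z1, cond], dim=1)).chunk(2, dim=1)
        scale = torch.tanh(scale)              # bounded scales for stability
        z2 = z2 * torch.exp(scale) + shift
        log_det = scale.sum(dim=1)
        return torch.cat([z1, z2], dim=1), log_det
```

Stacking several such layers (swapping which half is transformed between layers) and maximizing the log-likelihood of the training latents is the usual recipe; how well it trains is quite sensitive to exactly the hyperparameters being asked about.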