question of a detail #9

Open
Interesting6 opened this issue Apr 2, 2019 · 5 comments

Interesting6 commented Apr 2, 2019

Thanks for your work, sir. I have a question about the following instruction:

Re-run in trainval mode python scripts/train/few_shot/run_trainval.py. This will save your model into results/trainval by default.

What does this mean? Does it restart training from scratch, or does it continue from the first run, using its parameters as the starting point to find a better solution?

Also, why are the results (the embedding parameters) identical when I run your code twice? Is that due to the random seed? Without it, could you still guarantee that the embeddings come out equal?


schatty commented Apr 2, 2019

Hi, as I'm reading this repo right now, maybe I can help a bit. run_trainval.py is for continuing training from an existing model; see the --model.model_path parameter in the run_trainval.py script. That is, a model with an already trained embedding is loaded and optimised further from that point.
The equality of the results is likely due to the torch random seed being fixed in train.py.
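
A minimal sketch of both points, assuming a stand-in encoder and an illustrative checkpoint path (not the repo's exact API):

```python
import torch
import torch.nn as nn

# Fixing the seed makes weight initialisation (and any torch-based
# sampling) repeatable, which is why two runs can give identical embeddings.
torch.manual_seed(1234)

# A stand-in embedding network; the repo's actual encoder differs.
model = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 64))

# Resuming: load previously trained weights and keep optimising them.
# "results/best_model.pt" is an assumed path, playing the role of
# the --model.model_path argument mentioned above.
try:
    model.load_state_dict(torch.load("results/best_model.pt"))
except FileNotFoundError:
    pass  # no earlier run: start from the freshly initialised weights

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# ...the training loop then continues from the loaded parameters...
```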

Interesting6 (Author) commented

Thank you schatty, I understand that. Meanwhile, some questions arise: why does the model need to be re-run, and how does that ensure better performance? Is it because of a different training dataset?
Besides, if the random seed were removed, I think the accuracy would hardly reach this level.


schatty commented Apr 2, 2019

I think the logic in run_trainval.py can be used to train the model in several steps, for example when there is no continuous access to computational resources. It could also be used to perform additional training on some other dataset, but I haven't seen that mentioned so far. Why do you think removing the random seed would hurt the accuracy so badly?
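
For instance, a rough sketch of splitting training across sessions (the file name and network here are illustrative assumptions):

```python
import torch
import torch.nn as nn

# A stand-in network and optimizer; the real repo would use its
# prototypical-network encoder here.
model = nn.Linear(64, 64)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Session 1: train for as long as compute is available, then persist
# both model and optimizer state so training can resume exactly.
torch.save({"model": model.state_dict(),
            "optim": optimizer.state_dict()}, "checkpoint.pt")

# Session 2 (later, possibly on another machine): restore and continue.
state = torch.load("checkpoint.pt")
model.load_state_dict(state["model"])
optimizer.load_state_dict(state["optim"])
# ...run more training epochs from here...
```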


Interesting6 commented Apr 4, 2019

Sorry schatty, I've been busy recently, so I haven't had time to look at this.

I'm confused about what you mean by "the lack of continuous time access to computation resources". Can you explain it in more detail?

Just now, I reviewed the data-loading source code. I see that the file split/vinyals/train.txt is used to load all the training classes:

Angelic/character01/rot000
Angelic/character01/rot090

As shown above, for Angelic/character01, the 0-degree and 90-degree rotations are two different training classes? So I'm asking to make sure that this is used to increase the number of classes, rather than as data augmentation (i.e., increasing the number of samples)?
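
To check my understanding, a minimal sketch of how I read the split file (the parsing here is mine, not the repo's actual loader):

```python
# Each line of the split file names an (alphabet, character, rotation)
# triple; the whole triple is the class identity.
lines = [
    "Angelic/character01/rot000",
    "Angelic/character01/rot090",
]

classes = set()
for line in lines:
    alphabet, character, rot = line.split("/")
    # rot000 and rot090 of the same character become separate classes,
    # so the rotations grow the class count rather than adding extra
    # samples to one class.
    classes.add((alphabet, character, rot))

print(len(classes))  # 2 distinct classes, not 1 augmented class
```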

Lastly, I have a question about the training & test strategies. Many papers that cite this paper say that "the consistency between training and test environments alleviates the distribution gap and improves generalization". I take the "environment" here to be the n-way, k-shot episode strategy. Why does that alleviate the distribution gap and improve generalization?


schatty commented Apr 7, 2019

I just mean that it is often necessary to continue working with a model trained earlier, when the full training procedure cannot be performed in one go.
Yes, those are two different training classes, i.e. the rotations are not used for augmentation purposes.

It is hard for me to answer the last one, but it seems that having the same environment for training and evaluation is always good for maintaining the distribution of some parameters. I don't remember related statements about distribution gaps and generalisation in the original paper, though.
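
For concreteness, a toy sketch of the n-way k-shot episode construction the question refers to (the data and function names are made up):

```python
import random

def sample_episode(class_to_samples, n_way=5, k_shot=1, n_query=5):
    """Build one n-way k-shot episode. The same episode format is used
    during both training and evaluation; only the classes differ."""
    chosen = random.sample(list(class_to_samples), n_way)
    support, query = [], []
    for label, cls in enumerate(chosen):
        samples = random.sample(class_to_samples[cls], k_shot + n_query)
        support += [(s, label) for s in samples[:k_shot]]
        query += [(s, label) for s in samples[k_shot:]]
    return support, query

# Dummy data: 20 classes with 20 placeholder samples each.
data = {f"class_{c}": [f"img_{c}_{i}" for i in range(20)] for c in range(20)}
support, query = sample_episode(data)
print(len(support), len(query))  # 5 support and 25 query examples
```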
