other python 2.7 #2

Open
phobrain opened this issue Aug 29, 2017 · 9 comments

Comments

@phobrain

phobrain commented Aug 29, 2017

Lots of changes similar to those in download issue #1 (which I closed with a fix), plus this pattern:

-- train.py

< def get_pct_accuracy(pred: Variable, target) :

> def get_pct_accuracy(pred, target) :
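
For other 2.7 users: instead of deleting the annotation, the hint can be kept as a PEP 484 type comment, which the 2.7 parser ignores. A minimal sketch with a stand-in function (`pct_matching` and its list arguments are hypothetical and are not the repo's `get_pct_accuracy`):

```python
def pct_matching(pred, target):
    # type: (list, list) -> int
    """Percentage of positions where pred and target agree (toy stand-in)."""
    if not pred:
        return 0
    # Count positions where the prediction equals the target.
    hits = sum(1 for p, t in zip(pred, target) if p == t)
    return int(100.0 * hits / len(pred))
```

The type comment has to sit on the line directly after the `def`, before any docstring, for type checkers to pick it up.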

@phobrain
Author

train.py, around line 54 (I'm doing this for the first time in Python):

# make directory for storing models.
models_path = os.path.join("saved_models", opt.name)
try:
    os.stat(models_path)
except OSError:
    os.makedirs(models_path)
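
A variant of the same idea that should run on both 2.7 and 3: call `os.makedirs` directly and swallow only the "already exists" error. This is a sketch, and `ensure_dir` is a hypothetical helper name, not something in the repo:

```python
import errno
import os


def ensure_dir(path):
    """Create path (and parents) if missing; works on Python 2.7 and 3."""
    try:
        os.makedirs(path)
    except OSError as exc:
        # Ignore "already exists" only when the existing entry is a directory.
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    return path
```

Usage would be `models_path = ensure_dir(os.path.join("saved_models", opt.name))`. Compared with the `os.stat` check, this avoids the window between testing for the directory and creating it.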

@phobrain
Author

In models.py, the calls to super() need arguments on 2.7 that I haven't figured out yet. Current attempt:

super(type(ArcBinaryClassifier), self).__init__()

TypeError: super(type, obj): obj must be an instance or subtype of type
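
The TypeError comes from `type(ArcBinaryClassifier)`: that expression is the metaclass `type`, and `self` is not an instance of it. On 2.7 the first argument to `super()` is the class itself. A minimal sketch with stand-in classes (`Base`, `Classifier`, and `num_glimpses` are hypothetical names for illustration only):

```python
class Base(object):  # must be a new-style class for super() on 2.7
    def __init__(self):
        self.initialized = True


class Classifier(Base):
    def __init__(self, num_glimpses=8):
        # Pass the class itself, not type(Classifier), which is just `type`.
        super(Classifier, self).__init__()
        self.num_glimpses = num_glimpses
```

Applied to the line above, that would presumably read `super(ArcBinaryClassifier, self).__init__()`, with the same pattern for the other classes in models.py.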

@phobrain
Author

phobrain commented Aug 30, 2017

Bit the bullet and installed the self-reviling

https://pypi.python.org/pypi/magicsuper/

and am chugging away nicely on an old Macbook:

Iteration: 170 Train: Acc=46%, Loss=0.693698883057 Validation: Acc=54%, Loss=0.692266881466
Iteration: 5390 Train: Acc=72%, Loss=0.537120938301 Validation: Acc=79%, Loss=0.423486799002
Significantly improved validation loss from 0.435477942228 --> 0.423486799002. Saving...
Iteration: 15550 Train: Acc=81%, Loss=0.502185702324 Validation: Acc=93%, Loss=0.185158133507
Significantly improved validation loss from 0.210910066962 --> 0.185158133507. Saving...

(I reinstalled pytorch after installing torch, to handle some problem.)
(I don't suppose there's a way to multithread it? Tensorflow on keras/inceptionv3 makes the fan run and goes over 300% of 2 cores, while this is getting 110% with no fan, so I wonder if there might be some flag that could be added. Don't force me to buy a heater! ;-] Maybe multithreading could be done for a ConvARC [hint ;-], if ARCs are inherently sequential?)

@sanyam5
Owner

sanyam5 commented Aug 30, 2017

Hey @phobrain, sorry again. The syntax errors you are getting are all artifacts of Python3 features.

Good to know that it started training! You reached a pretty good accuracy. I am curious as to what hyperparameters you used.

While it is true that parts of ARC are inherently sequential (you see the next glimpse based on the information gathered from the previous glimpse), there is definitely some parallelization possible (in matrix multiplication, etc.). And I may be wrong, but I thought that PyTorch did that automatically under the hood. Not sure why it is stuck at 110%.

@phobrain
Author

phobrain commented Aug 30, 2017

With these bugs filed, 2.7'ers have hacks at least. I don't know what speed I'm sacrificing with that super(). I just ran the default params, don't see offhand where to set them. Will investigate thread count some more now you've given me hope; matrix ops was my hope for parallel.

Iteration: 44660 Train: Acc=85%, Loss=0.349155157804 Validation: Acc=89%, Loss=0.245932474732

Algorithmically, it would be interesting to experiment with multiple training threads asynchronously updating shared weights, ideally one per GPU.

@phobrain
Author

I'm waiting to see if it ever stops, or until I have adapted a version to try on 299x299.

Iteration: 59290 Train: Acc=81%, Loss=0.414705693722 Validation: Acc=89%, Loss=0.253573656082
Iteration: 59300 Train: Acc=87%, Loss=0.329242378473 Validation: Acc=94%, Loss=0.160938769579

@phobrain
Author

phobrain commented Aug 30, 2017

It looks like only explicit multithreading is supported in pytorch/torch. At least, I couldn't find any setting or flag to turn it on for low-level ops, and instead found examples of how to code parallelism by hand. Maybe at some point I'll try to translate it to Keras if TensorFlow is better optimized, though for now I'm stuck with an adapted Keras siamese/InceptionV3 net that has a problem I don't understand.
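
For anyone else looking for a knob: PyTorch does expose `torch.set_num_threads` and `torch.get_num_threads` for intra-op CPU threads, and OpenMP-based builds also read the `OMP_NUM_THREADS` environment variable. Whether either raises utilization on this setup is an assumption, since it depends on how the installed torch was built. A hedged sketch (`configure_cpu_threads` is a hypothetical helper, not part of the repo):

```python
import os


def configure_cpu_threads(n):
    """Request n CPU threads; returns the count torch reports, or None."""
    # The env var is read when the native libraries load, so it only
    # matters if this runs before torch is first imported.
    os.environ["OMP_NUM_THREADS"] = str(n)
    try:
        import torch
    except ImportError:
        return None  # torch not installed; only the env var was set
    torch.set_num_threads(n)
    return torch.get_num_threads()
```

Calling `configure_cpu_threads(4)` at the top of train.py, before any other torch import, would be one way to try it.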

Iteration: 78620 Train: Acc=86%, Loss=0.37217849493 Validation: Acc=90%, Loss=0.218237221241

Iteration: 79430 Train: Acc=84%, Loss=0.348631471395 Validation: Acc=90%, Loss=0.240455701947

+11 hours:

Iteration: 207930 Train: Acc=92%, Loss=0.22611771524 Validation: Acc=92%, Loss=0.208926916122
Iteration: 207940 Train: Acc=87%, Loss=0.318686276674 Validation: Acc=96%, Loss=0.113623209298

Later..

Iteration: 340620 Train: Acc=92%, Loss=0.187863066792 Validation: Acc=92%, Loss=0.185851037502
Iteration: 340630 Train: Acc=88%, Loss=0.23383910954 Validation: Acc=92%, Loss=0.183825999498

@sanyam5
Owner

sanyam5 commented Aug 31, 2017

I have not implemented early stopping yet.
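
Until then, a patience-based check could be bolted onto the validation step. This is only a sketch of one common scheme, not the repo's code; `EarlyStopper`, `patience`, and `min_delta` are hypothetical names:

```python
class EarlyStopper(object):
    """Signal a stop after `patience` validation checks without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # checks to tolerate without improvement
        self.min_delta = min_delta    # minimum drop that counts as improvement
        self.best = float("inf")
        self.bad_checks = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Meaningful improvement: remember it and reset the counter.
            self.best = val_loss
            self.bad_checks = 0
        else:
            self.bad_checks += 1
        return self.bad_checks >= self.patience
```

In the training loop this would be called wherever the validation loss is printed, breaking out when it returns True. With validation printed every 10 iterations, as in the logs above, `patience` would need to be fairly large to avoid stopping on noise.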

@phobrain
Author

phobrain commented Sep 1, 2017

Realizing it hadn't moved much in a day, I killed it after the above, since it finally started some continuous low-level fan action.

Could you provide a simple script that would load weights and tell if two images were 'the same' or to what degree?
