Cannot call load_model on network trained using multi_gpu #3
Comments
This looks like an issue with how Keras serializes/deserializes models; unless you really need to de/serialize the multi-GPU version, I would recommend keeping a copy of the original single-GPU model around, and saving/loading that model rather than the parallelized model. The weights are shared between the original model and the new model.
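A minimal sketch of that pattern, assuming the make_parallel helper from this repo's multi_gpu.py; build_model, the training data, gpu_count=4, and the file names are placeholders:

```python
from keras.models import load_model
from multi_gpu import make_parallel  # assumed import path for this repo's helper

original_model = build_model()                       # your single-GPU Keras model
parallel_model = make_parallel(original_model, gpu_count=4)

parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')
parallel_model.fit(x_train, y_train, epochs=10, batch_size=256)

# The weights are shared, so saving the original model captures the trained weights.
original_model.save('single_gpu_model.hdf5')

# Later: load the single-GPU model and, if you need multi-GPU again, re-parallelize it.
restored = load_model('single_gpu_model.hdf5')
restored_parallel = make_parallel(restored, gpu_count=4)
```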
Thanks, I'll give this a try.
I have an ugly but functional workaround: change make_parallel's return statement so that the original model's save is monkey-patched onto the new model's save (calling the parallel model's save will then call the simple model's save); see the sketch below. When loading, you must load the simple model before creating the parallel model.
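A minimal sketch of such a monkey-patch, written here as a hypothetical wrapper around make_parallel (the wrapper name and the gpu_count handling are assumptions, not code from the thread):

```python
import types

def make_parallel_with_patched_save(model, gpu_count):
    # Hypothetical wrapper: build the parallel model as usual, then replace
    # its save so that saving the parallel model saves the simple model instead.
    from multi_gpu import make_parallel  # assumed import path for this repo's helper
    new_model = make_parallel(model, gpu_count)

    def save_original(self, filepath, overwrite=True):
        model.save(filepath, overwrite=overwrite)  # weights are shared with the parallel model

    new_model.save = types.MethodType(save_original, new_model)
    return new_model
```

When restoring, load the saved single-GPU model with load_model and call make_parallel on it again before resuming multi-GPU training.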
Just one simple question: is the first line running on the GPU or the CPU? If it runs on the CPU, shouldn't we expect the GPUs to run in order, i.e. the second GPU only starts after the first GPU finishes? In that case I don't see the parallelism. Could you explain?
@pswpswpsw The actual Python code just runs once (in serial, on the CPU, taking only seconds); it just sets up a graph and tells TensorFlow which parts of the graph should be computed with which GPUs. This is true of most of the TensorFlow/Keras code you write: it only runs once, and it isn't very important to optimize it for speed. The code that actually does the training runs later, when the graph is executed (e.g. when you call fit), and that is where the work is dispatched to the GPUs in parallel.
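A standalone illustration of that construction-versus-execution split (not code from multi_gpu.py; the shapes and session config are arbitrary):

```python
import numpy as np
import tensorflow as tf

# Graph construction: these lines run once, on the CPU, in well under a second.
# They only record which device each op should eventually run on.
w = tf.Variable(tf.ones([4, 4]))
x0 = tf.placeholder(tf.float32, [None, 4])
x1 = tf.placeholder(tf.float32, [None, 4])
with tf.device('/gpu:0'):
    a = tf.matmul(x0, w)   # placed on GPU 0
with tf.device('/gpu:1'):
    b = tf.matmul(x1, w)   # placed on GPU 1
out = a + b

# Execution: only here does TensorFlow actually run the ops, and the two
# matmuls can run on their GPUs concurrently.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(out, feed_dict={x0: np.ones((2, 4)), x1: np.ones((2, 4))}))
```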
Is there any way to recover a saved multi-GPU model? I see that the above fix has to be applied before the model is saved, but is there a way to load an already-saved one?
@aman-tiwari and anyone else who may stumble across this: you can recover an already saved multi-GPU model simply, by temporarily editing your virtualenv's Keras install so the saved model can be deserialized, and then running:

```python
from keras.models import load_model

multi_model = load_model('your_multi_gpu_model.hdf5')
old_model = multi_model.layers[-2]  # the last layer is the merge layer from make_parallel
old_model.save('single_gpu_model.hdf5')
```
In reply to the original comment, I found I was able to get the original model by going to layers[-2]; that was also how I modified which layers I wanted to train. But after editing this original model, I would rerun make_parallel on it, as I was uncertain whether I was working on a copy of what is in layers[-2] or on the original. It is not a perfect solution, as making a model parallel can take some time, but it works.
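A minimal sketch of that workflow, assuming a parallel_model previously built with make_parallel and an illustrative choice of which layers to freeze (gpu_count=4 and the optimizer/loss are placeholders):

```python
from multi_gpu import make_parallel  # assumed import path for this repo's helper

# parallel_model was previously built with make_parallel(original_model, gpu_count).
old_model = parallel_model.layers[-2]   # the original model sits just before the merge layer

# Modify which layers should train, e.g. freeze everything except the last two.
for layer in old_model.layers[:-2]:
    layer.trainable = False

# Re-run make_parallel on the edited model; this can take some time,
# but it avoids any doubt about whether layers[-2] was a copy.
parallel_model = make_parallel(old_model, gpu_count=4)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')
```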
I came across your post on Medium and was instantly hooked. Nice job!
I've been developing a series of deep learning experiments that use only a single GPU and decided to switch them over to a multi-GPU setting. After training, the models are serialized to disk via model.save. However, when I try to call load_model to load the pre-trained network from disk, I get an error. Looking at multi_gpu.py it's clear that TensorFlow is imported, so I'm not sure why the error is being generated.
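A minimal sketch of the save/load flow described above, with placeholder names (build_model, the training data, gpu_count=4, and the file name are assumptions):

```python
from keras.models import load_model
from multi_gpu import make_parallel  # this repo's helper

model = build_model()                               # placeholder single-GPU model
parallel_model = make_parallel(model, gpu_count=4)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')
parallel_model.fit(x_train, y_train, epochs=10, batch_size=256)

parallel_model.save('multi_gpu_model.hdf5')         # serializes the parallel wrapper

# Later, in a fresh process: this load is where the error appears, since the
# saved graph includes the slicing/merging layers added by make_parallel.
restored = load_model('multi_gpu_model.hdf5')
```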