Incompatible shapes #7
I get the exact same error. Would appreciate some help on this.
I get a similar error.
Did you hardcode the batch size in your first layer input (batch_input_shape), or give input_dim?
@Caduceus96 just gave input_dim. Batch size is hardcoded when I call
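For reference, a minimal sketch of the distinction being asked about (the layer and batch sizes here are illustrative, not taken from the thread): hardcoding the batch size via batch_input_shape bakes it into the graph, which conflicts with multi-GPU wrappers that split each batch across devices, whereas input_dim leaves the batch dimension flexible.

```python
from keras.models import Sequential
from keras.layers import Dense

# Batch size baked into the graph: every batch must have exactly 64 rows.
fixed = Sequential()
fixed.add(Dense(32, activation='relu', batch_input_shape=(64, 100)))

# Batch dimension left as None: any batch size (e.g. 64 / num_gpus) is accepted.
flexible = Sequential()
flexible.add(Dense(32, activation='relu', input_dim=100))
```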
The same error here! Running Keras 2.0.2 with TensorFlow 0.12.1.
It might be related to the function
OK, I'm probably wrong. The error seems to come from my callback function. If I don't use callbacks, everything is fine no matter how many rows of input data.
I actually see this error when I try to run the example on the website.
Same as @Eric2333: don't use callbacks, or change them to lambda functions, and it works fine.
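A hedged sketch of the workaround mentioned above, assuming the custom callback can be reduced to a plain function: instead of a Callback subclass, pass a LambdaCallback so only simple functions run per epoch. The logging function here is illustrative.

```python
from keras.callbacks import LambdaCallback

# Print the loss at the end of each epoch instead of using a custom Callback subclass.
log_epoch = LambdaCallback(
    on_epoch_end=lambda epoch, logs: print('epoch %d loss %.4f' % (epoch, logs['loss']))
)

# model.fit(x_train, y_train, batch_size=64, epochs=10, callbacks=[log_epoch])
```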
Also ran into this error with Keras 2.0.3 and TensorFlow 1.1.0:
73997312/73997516 [============================>.] - ETA: 0s - loss: 12.1832
/home/ubuntu/devhome/tensorwords2/multi_gpu.py:45: UserWarning: The
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Caused by op 'mul', defined at:
InvalidArgumentError (see above for traceback): Incompatible shapes: [204,34] vs. [200,34]
The number of samples just needs to be a multiple of the total number of GPUs.
@jwilt1 Thanks!! Your example is nice work.
If you have a large training set it's not an issue, and you can always cut it like:
train_cut = len(train_index) % GPUs
train_index = train_index[:-train_cut]
And it works fine. But after training I have an issue with predictions: they have to be a multiple of the GPU count as well. Any ideas?
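A minimal sketch of the trimming idea above, with one edge case handled: if the length is already divisible by the GPU count, slicing with [:-0] would empty the array, so only trim when there is a remainder. The array here is illustrative.

```python
import numpy as np

GPUs = 4
train_index = np.arange(1003)          # illustrative dataset of 1003 samples

train_cut = len(train_index) % GPUs    # samples left over after an even split
if train_cut:                          # avoid [:-0], which would return an empty array
    train_index = train_index[:-train_cut]

assert len(train_index) % GPUs == 0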
You can use the same kind of trick as for training, but instead of removing the last remainder elements you pad the end of your dataset to make it divisible by # of gpus, then select the unpadded indices as your actual prediction.
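A rough sketch of that padding trick, assuming a NumPy input array (the helper name is made up, not from this repo): pad the inputs up to a multiple of the GPU count, predict, then keep only the rows that correspond to real samples.

```python
import numpy as np

def predict_padded(model, x, gpus):
    n = len(x)
    remainder = n % gpus
    if remainder:
        # Repeat the last row to reach a length divisible by the GPU count.
        pad = np.repeat(x[-1:], gpus - remainder, axis=0)
        x = np.concatenate([x, pad], axis=0)
    preds = model.predict(x, batch_size=gpus)
    return preds[:n]                   # drop predictions for the padding rows
```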
@Caduceus96 I sliced my training data into multiples of the GPU count; the first epoch runs well, but when it comes to the second epoch an error is raised:
InvalidArgumentError (see above for traceback): Incompatible shapes: [12,3] vs. [14,3]
[[Node: sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](concatenate_2/concat/_851, _recv_concatenate_2_target_0/_853)]]
[[Node: add_3/_857 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_3571_add_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
train_shape=[(3800, None, 1)] * 10
Is your training set size evenly divisible by gpu #?
@Caduceus96 I guess so, 3800/4 = 950.
@JiangLing-han it is evident you are using small batch sizes during your training (as the progress bar output from your Keras run suggests). You need to make sure your batches are of equal size and divisible by the number of GPUs.
@ktamiola @Caduceus96 I solved this problem by setting the size of the validation set to a multiple of 4.
If you want to predict just one at a time, instead of a multiple of the GPUs used during training, you can create a 2nd model that is identical and load the weights of your parallelized model.
model1.predict(val[0:10,:,:]) -> success
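A minimal sketch of that suggestion, assuming build_model() rebuilds the same architecture used before the multi-GPU wrapping (a hypothetical helper, not from this thread): copy the trained weights into a plain single-GPU copy and predict with any batch size.

```python
# Rebuild the identical architecture without GPU splitting, then reuse the weights.
single_model = build_model()
single_model.set_weights(parallel_model.get_weights())

# Predicting a single sample now works, with no divisibility constraint.
pred = single_model.predict(val[0:1])
```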
Many thanks for your code!
Has anyone else faced an error using regularizers? Using layers like this:
I get the error:
batch size: 64
@DNXie, I am having the same error; shape[0] gets halved. Did you find a solution? A related issue: keras-team/keras#9449
Same issue here with the latest Keras version.
Hi, was a fix issued for this error? I am facing the same issue. model.fit works with batch size 64 when not using multiple GPUs, but when I put the same model through multi_gpu_model and call fit on it, it raises an error that 16 and 64 are incompatible shapes.
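For context, a hedged sketch of the multi_gpu_model setup being described (model and data names assumed): with 4 GPUs, each replica sees batch_size/4 samples, so the global batch size, and ideally the dataset size, should be divisible by the GPU count.

```python
from keras.utils import multi_gpu_model

# Wrap an existing single-GPU `model`; each of the 4 replicas gets 64/4 = 16 samples.
parallel_model = multi_gpu_model(model, gpus=4)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy')
parallel_model.fit(x_train, y_train, batch_size=64, epochs=5)
```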
I am getting the error.
The full error is as follows:
here is the full code:
import numpy as np
num_decoder_tokens = 40
train_labels_vecs = np.random.randint(num_decoder_tokens, size=(100, len_label_vector))
decoder_input_data = train_labels_vecs[:, :-1]
decoder_inputs = Input(shape=(None,), name='Decoder-Input')  # for teacher forcing
seq2seq_Model = Model([decoder_inputs], decoder_outputs)
print(seq2seq_Model.summary())
seq2seq_Model.compile(optimizer=optimizers.Nadam(lr=0.001),
history = seq2seq_Model.fit([decoder_input_data],
Same error, and the above holds completely when I run a seq2seq architecture on a local PC.
BUT there is no error when I run the code on a Kaggle kernel with the same TensorFlow version (1.12.0) and Keras version (2.2.4).
I also have a very similar error, and changing the batch size and sample size to be a multiple of the GPU count doesn't solve the problem. My error is as follows:
This problem only happens when the model has a ConvLSTM2D layer; without it the code runs just fine. As for other properties:
Same here:
Keras 2.2.4
Getting the same error at the end of the first epoch with only 1 GPU. I am using a generator (Sequence), and when I set shuffle = True, the error gets thrown somewhere in the middle of the first epoch instead of at the end. Keras 2.1.6. Update:
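One way this kind of end-of-epoch failure is often avoided is to have the Sequence drop its final partial batch, so every batch fed to the model has exactly batch_size samples. A minimal sketch under that assumption (not code from this thread):

```python
import numpy as np
from keras.utils import Sequence

class FixedBatchSequence(Sequence):
    def __init__(self, x, y, batch_size):
        self.x, self.y, self.batch_size = x, y, batch_size

    def __len__(self):
        # Floor division: the trailing incomplete batch is simply skipped.
        return len(self.x) // self.batch_size

    def __getitem__(self, idx):
        s = idx * self.batch_size
        return self.x[s:s + self.batch_size], self.y[s:s + self.batch_size]
```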
I get the same error again and can reproduce it with the code I pasted earlier. With batch_size=1 there is no problem. I have:
- tensorflow==1.14.0
- machine: Intel(R) Xeon(R) Platinum 8153 CPU @ 2.00GHz
Traceback (most recent call last):
I am very new to deep learning and getting familiar with its theory slowly. I am also getting a similar kind of error. Can anyone explain what this error is and why it could have occurred? Does it have something to do with the weights' size?
InvalidArgumentError: Incompatible shapes: [786432] vs. [131072]
It would be great if someone could help me out here.
Have you solved it? I met a similar kind of error about "BroadcastGradientArgs". It would be great if you could reply to me here. Thanks. @bhavyakariwal9
It does not work. The error still occurs.
I think this answer is so simple that everybody can find it. But it still does not work.
This worked fine for me, thanks a looooot!
I got a similar problem, but I have no GPU in my system. How can I solve this error?
I also faced the same error. It was resolved by making two changes:
But I don't know how or why it works; can someone explain?
I am running make_parallel with 2 GPUs; the error occurred in gradients/sub_grad/BroadcastGradientArgs:
"InvalidArgumentError (see above for traceback): Incompatible shapes: [483,1] vs. [482,1]
[[Node: gradients/sub_grad/BroadcastGradientArgs = BroadcastGradientArgs[T=DT_INT32, _class=["loc:@sub"], _device="/job:localhost/replica:0/task:0/gpu:0"](gradients/sub_grad/Shape, gradients/sub_grad/Shape_1/_79)]]"