Question: Running generation with batches #3484
The following script produces the exact same output for me, regardless of batch size:

```python
#!/usr/bin/env python3
from parlai.core.agents import create_agent_from_model_file

BS = 2

agent = create_agent_from_model_file(
    "zoo:blender/blender_3B/model", {'model_parallel': False}
)
clones = [agent.clone() for _ in range(BS)]
acts = []
for index in range(BS):
    acts.append(clones[index].observe({'text': 'hello', 'episode_done': True}))
responses = agent.batch_act(acts)
for i in range(BS):
    print(responses[i]['text'])
```

In all instances, my output is always "Hi! How are you? I just got home from work. I work at a grocery store."

A few possible confounders you may be experiencing:
Great, thank you for the quick reply! BS=2: BS=1: The log probs are also identical (-9.483329772949219 vs. -9.591983795166016 for the 'best' decoded texts, as above), so I think my script just did the same thing. I've no idea what could possibly be the problem here; any ideas? My setup:
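The two quoted scores can be compared with an explicit tolerance rather than by eyeballing digits. A minimal sketch using only the numbers above:

```python
import math

lp_bs2 = -9.483329772949219  # 'best' log prob at BS=2, as quoted above
lp_bs1 = -9.591983795166016  # 'best' log prob at BS=1, as quoted above

# The scores differ by ~0.11, far above float32 rounding noise,
# so the two batch sizes really scored their best hypotheses differently.
print(abs(lp_bs2 - lp_bs1))
print(math.isclose(lp_bs2, lp_bs1, rel_tol=1e-4))
```

This makes explicit that the gap between the two runs is orders of magnitude larger than what mere floating-point jitter would produce.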
Let me add:
We must be using different models. Mine never said it has a dog. I expect small floating point errors, but BS=50 is wildly off, jeez. BS=2 is too. What GPU are you using? Are you doing anything other than out-of-the-box ParlAI? Can you replicate this on master?
Are those the outputs you got from the script I pasted above?
First, to your questions, sorry if it wasn't 100% clear.
What I did today:
Results:
Now the new thing:
P.S. The text corresponding to the best score of -8.57 has always been the above-mentioned "Hi! How are you? I just got home from work. I work at a grocery store."
Hm, we haven't tried pytorch 1.8 in open source land yet, so I can't vouch for that. We've used it plenty in internal use cases and not had problems, but I can't rule out that pytorch 1.8 has issues separate from yours. What if you turn off model parallelism? What if you use CUDA_VISIBLE_DEVICES to limit yourself to 4 GPUs? To 2? Can you try another power of 2? BS 4, 8, etc.? It's interesting that BS 2 and 100 get the same off value. Makes me suspicious.
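For the GPU-limiting suggestion, the environment variable can also be set from inside the script, as long as it happens before torch or ParlAI initializes CUDA (a sketch; setting it on the shell command line before `python` is the usual, safer route):

```python
import os

# Restrict this process to the first two GPUs. This must run before
# any torch/ParlAI code touches CUDA, so put it at the very top of
# the script, ahead of the other imports.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Repeating the repro with "0,1,2,3", "0,1", and "0" isolates whether the divergence tracks the number of devices the model is sharded across.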
So there is definitely something wrong with model parallelism... I reverted #3326 and things look consistent, so I must have gotten something wrong in that change.
With the reversion and BS=50 I still observe a few weird things.
I was printing tensors to find where the differences occur, and the first one I found is here: I was looking only at the very first occurrence (so the first layer of the encoder). The input text was always 'hello'. Tensor before lin2: BS=1 BS=2. Tensor after lin2: BS=1 BS=2. Note that the scores have been -9.57... for BS=1 and -8.58... for BS=2 (according to the second table). On CPU all tensors seem exactly identical; not a single differing digit found. However, the CPU tensors are also all different from the GPU ones. So apparently this is just normal floating point precision: under the hood, pytorch (or cuBLAS, whatever) seems to handle this linear layer differently for different batch sizes(?)
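The batch-size-dependent behavior described above can be probed with a bare `nn.Linear`, independent of ParlAI. A minimal sketch (on CPU the duplicated row matches the single-row result to within float tolerance; on a GPU, cuBLAS may dispatch a different GEMM kernel per batch size and introduce tiny per-element discrepancies):

```python
import torch

torch.manual_seed(0)
lin = torch.nn.Linear(512, 512)
x = torch.randn(1, 512)

out_bs1 = lin(x)               # batch of 1
out_bs2 = lin(x.repeat(2, 1))  # the same row duplicated into a batch of 2

# Compare the shared row across the two batch sizes. Move `lin` and the
# inputs to .cuda() to repeat the experiment on GPU.
print(torch.allclose(out_bs1[0], out_bs2[0], atol=1e-6))
```

A per-element difference on the order of 1e-6 is ordinary float32 reduction-order noise; it only becomes visible in the final text when beam search amplifies it into a different hypothesis ranking.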
If you have time (I don't immediately), can you trace through with model parallel and non-model parallel and see where things diverge?
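One way to run that trace (a sketch, not ParlAI API: register a forward hook on every submodule, record each output's norm, run the same input through both configurations, and diff the two logs to find the first layer that diverges):

```python
import torch

def trace_forward(model, x):
    """Run x through model, recording (module name, output norm) in order."""
    log = []
    hooks = []
    for name, mod in model.named_modules():
        def hook(m, inp, out, name=name):
            if torch.is_tensor(out):
                log.append((name, out.detach().float().norm().item()))
        hooks.append(mod.register_forward_hook(hook))
    model(x)
    for h in hooks:
        h.remove()
    return log

# Toy usage with a stand-in model: two identical runs produce identical
# traces; with model parallel on vs. off, the first mismatching entry
# points at the layer where the computations diverge.
model = torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.ReLU())
log_a = trace_forward(model, torch.ones(2, 4))
log_b = trace_forward(model, torch.ones(2, 4))
print(all(abs(a[1] - b[1]) < 1e-6 for a, b in zip(log_a, log_b)))
```

Swapping the output norm for a full tensor dump on the first mismatching layer would then show whether the divergence is a rounding-level difference or something structural.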
This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.
Bump to keep this open
This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.
Hello!
I'm generating texts with blender_3B like this (all options are default, except "model_parallel=False"):
I get the following results for batch_size=2 (both predictions are exactly the same; I just cut the rest off for readability):
However, when I remove the second item in the batch, I get:
Now the question is, of course: shouldn't the predictions be the same in all cases, given that the inputs are the same? Or is this a numerical issue? I couldn't find an example of how to run generation with batches, so I wasn't sure whether I'm actually doing this the correct way.
The ParlAI code is from today.
Python 3.7.5
Ubuntu 18.04 LTS