

ONNX optimization is taking a long time to generate tokens #1

Open
dynamikapps opened this issue Jan 24, 2021 · 1 comment


@dynamikapps

Hi,

Thanks for creating the everything gpt tutorial. It's been very helpful. I'm noticing that the ONNX-optimized model takes a long time to generate more than 200 tokens. By a long time, I mean anywhere from 10 to 15 minutes, or it even times out. This happens even on Colab using the V100. When trying to generate tokens on AWS using the configuration you mentioned, it usually times out. I would love to get your thoughts on a resolution for this.
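One possible contributor (an assumption, not confirmed from the tutorial): if the exported ONNX graph does not reuse cached past key/value states, every generation step re-runs attention over the entire sequence so far, so total work grows quadratically with the number of tokens. The toy sketch below only models that cost scaling with hypothetical `generate_no_cache` / `generate_with_cache` helpers; it does not run a real model.

```python
# Toy model of autoregressive generation cost.
# Assumption: attention work per step is proportional to the sequence length processed.

def generate_no_cache(n_tokens):
    # Without past-state caching, step i re-processes all i tokens:
    # total work is 1 + 2 + ... + n, i.e. O(n^2).
    return sum(seq_len for seq_len in range(1, n_tokens + 1))

def generate_with_cache(n_tokens):
    # With cached key/value states, each step processes only the new token:
    # total work is O(n).
    return sum(1 for _ in range(n_tokens))

print(generate_no_cache(200))    # 20100 units of work
print(generate_with_cache(200))  # 200 units of work
```

For 200 tokens, the uncached loop does roughly 100x the attention work, which would be consistent with short prompts finishing quickly while 200+ token generations crawl or time out.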

@dynamikapps
Author

Also, I notice that GPT-2-Simple generates text much faster than the PyTorch or ONNX versions. Why is that?
