Added Colab TPU support with Colab Notebook and modified repo #47

Open: wants to merge 54 commits into base: master

aditya-malte

I have made a Colab notebook to allow for easy use of Google's Colab TPU.
It has been tried and tested successfully on a Colab TPU.

(This includes additional steps, such as downloading the weights to GCS buckets while keeping the spiece file local.)
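
A minimal sketch of the path split described above. It assumes (this is not stated in the PR) that the checkpoint has to sit in a GCS bucket because the TPU workers cannot read the Colab VM's local disk, while the spiece file is only read by the host-side Python process. The function name and the example paths are hypothetical.

```python
def check_colab_tpu_paths(init_checkpoint, spiece_model_file):
    """Return a list of problems with the given paths (empty if none).

    Hypothetical helper: it only inspects the path prefixes and does not
    touch the filesystem or any bucket.
    """
    problems = []
    # Assumption: weights read by the TPU must live in a gs:// bucket.
    if not init_checkpoint.startswith("gs://"):
        problems.append("init_checkpoint should be a gs:// path")
    # Assumption: the spiece file stays on the local Colab disk.
    if spiece_model_file.startswith("gs://"):
        problems.append("spiece_model_file is expected to be local in this setup")
    return problems


if __name__ == "__main__":
    # Example paths are placeholders, not the ones used in the notebook.
    print(check_colab_tpu_paths("gs://my-bucket/xlnet/xlnet_model.ckpt",
                                "/content/spiece.model"))
```

A check like this could run before training starts, so that a misplaced checkpoint is reported immediately instead of surfacing later as a TPU-side file error.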

Thank you

@kimiyoung kimiyoung self-requested a review June 24, 2019 18:05
@kimiyoung kimiyoung self-assigned this Jun 24, 2019
@aditya-malte
Author

Hello,
I have uploaded the final working version, with MODEL/OUTPUT_DIR now kept separate. Please check.
Thank you

@kimiyoung kimiyoung removed their assignment Jun 27, 2019
@kimiyoung
Collaborator

@aditya-malte Thanks for your contribution. It would be nice if you could do the following:

  • merge your changes with the original configure_tpu function to support all the cases;
  • remove other unnecessary changes, such as those to the README and the other code changes in run_classifier.py;
  • I noticed you added a trac dataset but without a processor. Could you change that to IMDB or STS-B so that it's consistent with the main examples in the README?
  • move the contents added to the README into the notebook.

@aditya-malte
Author

Hello @kimiyoung,
Yes, I'll make the changes shortly and update you on them.

@aditya-malte
Author

Working perfectly on the IMDB dataset with max_seq=128 and batch_size=64. I am currently testing how far I can push the Colab TPU by increasing max_seq and/or batch_size.

@aditya-malte
Author

Gives near-SOTA results on IMDB: eval_accuracy of 0.9512 with batch size 32 and max_seq=256 (increasing max_seq further throws an OOM error). Training and evaluation combined complete within an hour or so.
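
As a rough illustration of why raising max_seq runs into OOM faster than raising the batch size, the sketch below counts attention-score entries for the two configurations reported in this thread. It assumes memory pressure is dominated by the per-example max_seq x max_seq score matrix and ignores heads, layers, and everything that scales linearly with sequence length; the function name is hypothetical.

```python
def attention_score_elements(batch_size, max_seq):
    """Count score-matrix entries for one attention head in one layer.

    Simplifying assumption: one (max_seq x max_seq) matrix per example;
    the number of heads and layers is a constant factor and is omitted.
    """
    return batch_size * max_seq * max_seq


if __name__ == "__main__":
    first = attention_score_elements(batch_size=64, max_seq=128)
    second = attention_score_elements(batch_size=32, max_seq=256)
    # Halving the batch while doubling max_seq still doubles the count.
    print(first, second, second / first)
```

Under this simplification, the second configuration already holds twice as many score entries as the first despite the smaller batch, which is consistent with a further increase in max_seq exhausting memory.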
Thank you

@aditya-malte
Author

Hello @kimiyoung ,
I have synchronized the fork to reflect the changes that have happened to the source repository.
Any updates from your side?
Thank you

Best Regards,
Aditya Malte

@aditya-malte
Author

aditya-malte commented Jul 20, 2019

Hello @kimiyoung ,
I have updated my repo again to reflect the changes in your latest xlnet repo.
To summarise:

  1. run_classifier remains exactly the same except for the added flags: use_colab (default=False) and tpu_address (default=None).
  2. The modified model_utils makes the corresponding changes if use_colab is set (a simple if condition); the rest remains exactly the same.
  3. The README is identical (and updated to your current/latest repo), except for two lines saying that a Colab TPU version is also available.
  4. The Colab TPU notebook is in the notebooks folder.
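
The "simple if condition" in point 2 might look roughly like the sketch below. This is not the code in the PR: the function name resolve_tpu_master is hypothetical, and the fallback assumes the Colab runtime exposes the TPU address through the COLAB_TPU_ADDR environment variable. Only the flag names use_colab and tpu_address and their defaults come from the summary above.

```python
import os


def resolve_tpu_master(use_colab=False, tpu_address=None, environ=None):
    """Return a grpc:// TPU address for Colab, or None to keep the default path.

    Hypothetical helper mirroring the use_colab / tpu_address flags.
    """
    environ = os.environ if environ is None else environ
    if not use_colab:
        # Flag unset: the original configuration logic would run unchanged.
        return None
    # Assumption: Colab publishes the TPU address as host:port in this variable.
    address = tpu_address or environ.get("COLAB_TPU_ADDR")
    if not address:
        raise ValueError("use_colab is set but no TPU address was given or found")
    if not address.startswith("grpc://"):
        address = "grpc://" + address
    return address


if __name__ == "__main__":
    print(resolve_tpu_master(use_colab=True, tpu_address="10.0.0.2:8470"))
```

Returning None when use_colab is unset keeps the non-Colab cases on the existing code path, which is the behaviour the summary describes.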

Thanks

Sync(update) with source repo changes
@aditya-malte
Author

The above merge PR is meant to update my repo with all the changes made in yours.
