The NgramTokenizer uses torchtext. We want to remove torchtext as a dependency, so this tokenizer has to be refactored to work without it.
If you can provide an example, I can help with the rest.
If we decide to replace the dependency, this would be about 5 lines of code: https://pytorch.org/text/stable/_modules/torchtext/data/utils.html#ngrams_iterator
torchtext is used here:
https://github.com/ludwig-ai/ludwig/blob/00c51e0a286c3fa399a07a550e48d0f3deadc57d/ludwig/utils/tokenizers.py#L142C1-L145C60
Can we just copy the code over?
Yeah, that would probably be the solution for this tokenizer.
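For reference, copying the code over could look roughly like the sketch below. It is modeled on the torchtext `ngrams_iterator` helper linked above and is not the final Ludwig implementation. The `ngram_tokenize` wrapper, its name, and the plain whitespace split are assumptions for illustration only; the real tokenizer may split text differently.

```python
from typing import Iterator, List


def ngrams_iterator(token_list: List[str], ngrams: int) -> Iterator[str]:
    """Yield every token, then every n-gram for n = 2..ngrams, space-joined."""
    # Unigrams first, in their original order.
    for token in token_list:
        yield token
    # Then each higher order, built by zipping shifted views of the list.
    for n in range(2, ngrams + 1):
        for gram in zip(*(token_list[i:] for i in range(n))):
            yield " ".join(gram)


def ngram_tokenize(text: str, ngram_size: int = 2) -> List[str]:
    # Hypothetical wrapper: splitting on whitespace is an assumption,
    # not necessarily how the existing tokenizer prepares its tokens.
    return list(ngrams_iterator(text.split(), ngram_size))
```

With this in place, the tokenizer would only need to call the local function instead of importing it from torchtext.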