Span embeddings with HuggingFace #8
Comments
Hi @ogarciasierra! Just to make sure: I haven't really looked at doing this directly with HuggingFace Transformers, so I assume that you would like to extract contextual word embeddings for spans using spaCy?
Hi @thiippal, I would like to extract contextual embeddings for spans using Hugging Face Transformers or PyTorch. The main thing is to use a Hugging Face model to generate those embeddings. I don't care which library we use for extracting them :) Thanks!
Okay @ogarciasierra, one way to do this is to follow the process here.
A demo, which assumes that you've created the custom component and added it to the Transformer-powered spaCy pipeline:
This outputs
TL;DR: Just slice spaCy Docs and access the representation using the
Yes, I checked your code with spaCy before! But my doubt is about how to do it with a Hugging Face model and its own embeddings. Those trf_data attributes are only available for spaCy models, I am afraid. The process is amazing with your spaCy tutorial, so I tried to do it with a pre-trained Hugging Face model. It's easy with just one token, but I wasn't able to do it with a span. Sorry to bother you again.
Hi everyone. I was wondering if it is possible to do the same "span contextual embeddings" with a HuggingFace model. I've been able to generate token contextual embeddings (https://discuss.huggingface.co/t/generate-raw-word-embeddings-using-transformer-models-like-bert-for-downstream-process/2958), but cannot do it with spans. For example, in "three days ago I ate meat", I would like to get contextual embeddings for "three days ago" in a similar way to how Tuomo does it with spaCy in the ALT blog.
Thanks everyone.
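
For the Hugging Face case asked about above, one possible approach (a minimal sketch, not something confirmed in this thread) is to ask a fast tokenizer for character offsets, pick the tokens that overlap the span, and average their last-layer hidden states. Mean pooling, the model name `bert-base-cased`, and the helper names `span_token_indices`, `mean_pool` and `embed_span_hf` are all assumptions made for illustration. The two pure-Python helpers run on their own; `embed_span_hf` additionally needs `torch` and `transformers` installed.

```python
def span_token_indices(offsets, start_char, end_char):
    """Indices of tokens whose character range overlaps [start_char, end_char)."""
    return [
        i
        for i, (tok_start, tok_end) in enumerate(offsets)
        # Special tokens such as [CLS]/[SEP] get the empty offset (0, 0); skip them.
        if tok_end > tok_start and tok_start < end_char and tok_end > start_char
    ]


def mean_pool(vectors, indices):
    """Average the selected token vectors (assumed pooling strategy)."""
    if not indices:
        raise ValueError("no tokens overlap the requested span")
    dim = len(vectors[0])
    return [sum(vectors[i][d] for i in indices) / len(indices) for d in range(dim)]


def embed_span_hf(text, span_text, model_name="bert-base-cased"):
    """Hypothetical helper: contextual embedding for the first match of span_text."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Offsets are only available from "fast" tokenizers.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    start_char = text.index(span_text)
    end_char = start_char + len(span_text)

    enc = tokenizer(text, return_offsets_mapping=True, return_tensors="pt")
    # The model does not accept offset_mapping, so remove it before the forward pass.
    offsets = enc.pop("offset_mapping")[0].tolist()

    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # shape: (num_tokens, hidden_size)

    indices = span_token_indices(offsets, start_char, end_char)
    if not indices:
        raise ValueError("no tokens overlap the requested span")
    return hidden[indices].mean(dim=0)
```

Calling `embed_span_hf("three days ago I ate meat", "three days ago")` should then return a single vector for the span, computed from the full sentence context. Other pooling choices (max pooling, or taking only the first token) may suit some tasks better.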