Simple script to access Google's pre-trained sentence embeddings from TensorFlow Hub.

saranya-venkatraman/pretrained_sent_embeddings

Get Pre-trained Sentence Embeddings from TensorFlow Hub

A simple script for getting sentence embeddings from Google's pre-trained models on TensorFlow Hub. It includes 6 models that differ in embedding dimension (20 to 512) and/or architecture.

Example Usage

Returns an embedding vector per sentence of the input.

EMBEDDING_SIZE = n

Input = ["Colorless green ideas sleep furiously.",
	"Noam Chomsky offered this as an example of a grammatically valid, "
	"semantically nonsensical sentence."]

Output = array of shape (m, n)  # m = number of sentences (= 2), n = EMBEDDING_SIZE
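
To make the shape contract above concrete, here is a minimal sketch of embedding a list of sentences with a TensorFlow Hub model. It assumes TensorFlow 2 and `hub.load` from `tensorflow_hub`, which may differ from how get_embeddings.py calls the models. The function name `embed_sentences` and its `loader` parameter are hypothetical, introduced only for illustration.

```python
def embed_sentences(sentences, model_url, loader=None):
    """Return one embedding vector per sentence, as an (m, n) array.

    `loader` is a hypothetical hook: any callable that takes a URL and
    returns a model. It defaults to tensorflow_hub.load.
    """
    if loader is None:
        # Imported lazily so this module can be read without TensorFlow installed.
        import tensorflow_hub as hub

        loader = hub.load  # downloads and caches the model on first use

    model = loader(model_url)
    # TF2 text-embedding models on TF Hub accept a batch of strings and
    # return a tensor with one row per input sentence.
    return model(sentences).numpy()
```

For example, calling `embed_sentences(Input, "https://tfhub.dev/google/nnlm-en-dim128/2")` on the two sentences above should give an array of shape (2, 128), since that model produces 128-dimensional vectors. The URL is an assumed example and is not necessarily one of the six models bundled with the script.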

Installation and Use

Run pip3 install -r requirements.txt

Run python3 get_embeddings.py

Can't find the embedding dimension/model you need?

Find the required model's URL on TensorFlow Hub and add it to the URL dictionary in the script.

Note

  1. Give each new dictionary entry (URL) a key in the format "<embed_size or modelName>_model_url".
  2. Pass the same size or modelName as EMBEDDING_SIZE.
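
The key convention in the note can be sketched as follows. This is an illustration under assumptions, not the script's actual code: the names `MODEL_URLS` and `resolve_model_url` are hypothetical, and the two URLs are example TensorFlow Hub models that may not match the script's own entries.

```python
# Hypothetical registry following the "<embed_size or modelName>_model_url" key format.
MODEL_URLS = {
    "128_model_url": "https://tfhub.dev/google/nnlm-en-dim128/2",
    "USE_model_url": "https://tfhub.dev/google/universal-sentence-encoder/4",
}


def resolve_model_url(embedding_size, registry=MODEL_URLS):
    """Map an EMBEDDING_SIZE value (a size or a model name) to its URL."""
    key = f"{embedding_size}_model_url"
    try:
        return registry[key]
    except KeyError:
        # List the values that can be passed as EMBEDDING_SIZE.
        known = sorted(k[: -len("_model_url")] for k in registry)
        raise KeyError(
            f"No URL registered for {embedding_size!r}; known values: {known}"
        ) from None
```

With this layout, adding a model means adding one dictionary entry, and both `resolve_model_url(128)` and `resolve_model_url("USE")` look up their URL through the same key pattern.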

Credits

All the models used (and more) are available on TensorFlow Hub.
