A micro-service for analyzing words based on word-embedding models (a.k.a. "word2vec").
- Install Python.
- Run `pip install -r requirements.txt` to install the dependencies.
- Run `python download-google-news-model.py` to download the Google News model file.
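Put together, first-time setup from a shell looks like this (assuming Python and pip are already on your path):

```shell
pip install -r requirements.txt        # install the Python dependencies
python download-google-news-model.py   # download the (large) Google News model file
```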
We develop with PyCharm.
Configuration happens via environment variables. We use python-dotenv to manage this on local dev machines. Make a `.env` file and define the following in it:
- `SENTRY_DSN` - optional URL for Sentry logging
- `SECRET_KEY` - for Flask
- `LOG_LEVEL` - `DEBUG`, `INFO`, etc.
- `MEDIA_CLOUD_API_KEY` - your Media Cloud API key
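For example, a `.env` for local development might look like this (all values here are placeholders):

```
SECRET_KEY=some-random-string
LOG_LEVEL=DEBUG
MEDIA_CLOUD_API_KEY=your-media-cloud-api-key
# SENTRY_DSN is optional (see above)
```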
Two options:

- Development: run `python run.py` to test it out.
- Production-like: run `./run.sh` to run it with gunicorn.
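For reference, `run.sh` presumably wraps a gunicorn invocation along these lines (a sketch only; the `server:app` module path is an assumption, so check `run.sh` itself for the real command):

```shell
# hypothetical equivalent of ./run.sh: bind gunicorn to port 8000
gunicorn --bind 0.0.0.0:8000 server:app
```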
You can then hit the local homepage at http://localhost:8000 to try it out via a simple web-testing harness.
Or you can test it with something like this (the first request takes a while because it loads the giant model into memory):
```python
import requests

response = requests.post("http://localhost:8000/api/v2/google-news2d.json",
                         data={'words[]': ['apples', 'bananas', 'three']})
print(response.json())
```
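If you'd rather test from the command line, an equivalent curl call looks like this (same endpoint and form-encoded parameters as above):

```shell
# repeated -d flags become repeated words[] form fields in the POST body
curl http://localhost:8000/api/v2/google-news2d.json \
  -d 'words[]=apples' -d 'words[]=bananas' -d 'words[]=three'
```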
This is configured to deploy to dokku via a Heroku buildpack.
You'll need to do something like this to set the required environment variables:
```shell
dokku config:set word-embeddings SECRET_KEY=oiwajj243josadjoi SENTRY_DSN=https://THING1:[email protected]/THING3 MEDIA_CLOUD_API_KEY=MY_AWESOME_KEY
```
- Update the semantic version number in `server/__init__.py`.
- Tag the repository with that number, like `v4.5.2`.
- Push it to the server, like `git push dokku v4.5.2:master` (see the consolidated commands below).
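Taken together, cutting a release looks something like this (with `v4.5.2` standing in for whatever version you just set):

```shell
# tag the current commit and deploy that tag to dokku
git tag v4.5.2
git push dokku v4.5.2:master
```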