Adding code to get sentence representations #5

Open
wants to merge 3 commits into
base: master

@ducdauge ducdauge commented Mar 4, 2017

I modified 2 files and added a new one.

  1. desent.py has a new function, embedding, which returns sentence representations given a trained model and an input file with one sentence per line. I also fixed a minor problem with numpy.round(), which yielded a float rather than an integer.
  2. build_dictionary.py now uses codecs to handle input and output files. Without it, only ANSI-encoded characters were handled.
  3. sentence_representation.py is a wrapper for convenience's sake. It invokes the relevant functions in desent.py and makes it easy to modify the configuration.
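
The two fixes in items 1 and 2 can be illustrated with a minimal sketch. This is not the code from the PR: `count_words`, the file name, and the sample sentences are hypothetical, and UTF-8 is assumed as the input encoding.

```python
import codecs
import os
import tempfile
from collections import Counter

import numpy as np

# numpy.round() keeps the floating-point type, so its result cannot be
# used directly where an integer is required; cast it explicitly.
n_batches = np.round(7.6)        # still a float (numpy.float64)
n_batches_int = int(n_batches)   # plain Python int


def count_words(path, encoding="utf-8"):
    """Count whitespace-separated tokens in a text file (hypothetical helper)."""
    counts = Counter()
    # codecs.open decodes each line with the given encoding while reading.
    with codecs.open(path, "r", encoding=encoding) as f:
        for line in f:
            counts.update(line.split())
    return counts


# Small self-contained demo with non-ASCII input, one sentence per line.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "sentences.txt")
with codecs.open(demo_path, "w", encoding="utf-8") as f:
    f.write(u"il caffè è caldo\nil tè è freddo\n")

word_counts = count_words(demo_path)
```

Passing the encoding explicitly means accented tokens such as "caffè" are counted as whole words, regardless of the platform's default encoding.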

The function embedding (and its subroutines) returns sentence representations given a trained model and an input file with one sentence per line.
A minor fix concerns numpy.round(), whose output is now cast to an integer.
A wrapper file to invoke the embedding function in desent.py.