daandouwe/thesis

Neural language models with latent syntax

Code for my thesis, written in Python 3.6 using dynet.

The thesis can be found here.

Setup

Use make to obtain the data and install EVALB:

make data    # download ptb and unlabeled data
make evalb   # install EVALB

Usage

Use make to train a number of standard models:

make disc             # train discriminative rnng
make gen              # train generative rnng
make crf              # train crf
make fully-unsup-crf  # train rnng + crf (vi) fully unsupervised

You can list all the options with:

make list

Alternatively, you can use command line arguments:

python src/main.py train --model-type=disc-rnng --model-path-base=models/disc-rnng

For all available options use:

python src/main.py --help

To set the environment variables used when evaluating trained models (e.g. CRF_PATH=models/crf_dev=90.01), run:

source scripts/best-models.sh

Models

Models are saved to the models folder, named after the model and its development score. Our best models by development score are included as zip archives. To use one, run unzip zipped/file.zip from the models directory.
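
Because the development score is part of each saved model's name, the best checkpoint for a given model can be picked out by parsing those names. The following is a minimal sketch, not code from this repository: it assumes names follow the pattern `<name>_dev=<score>`, inferred only from the example `crf_dev=90.01`, and the helpers `parse_model_name` and `best_model` are hypothetical.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the single example `crf_dev=90.01`.
DEV_SCORE = re.compile(r"^(?P<name>.+)_dev=(?P<score>\d+(?:\.\d+)?)$")


def parse_model_name(entry_name):
    """Split a name like 'crf_dev=90.01' into ('crf', 90.01); return None if it does not match."""
    match = DEV_SCORE.match(entry_name)
    if match is None:
        return None
    return match.group("name"), float(match.group("score"))


def best_model(models_dir, name):
    """Return the path of the highest-scoring entry for `name`, or None if there is none."""
    candidates = []
    for path in Path(models_dir).iterdir():
        parsed = parse_model_name(path.name)
        if parsed is not None and parsed[0] == name:
            candidates.append((parsed[1], path))
    if not candidates:
        return None
    # Compare on the score only, so ties never fall back to comparing paths.
    return max(candidates, key=lambda pair: pair[0])[1]
```

For example, `best_model("models", "crf")` would return the path of the `crf` entry with the highest development score, if one exists.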

Acknowledgements

I have relied on some excellent implementations for inspiration and help with my own implementation:

Make sure to check them out!