nkw1200/EmojiComplete

EmojiComplete

Alexander Gedeon, Aniruddh Rao, Eli Urban, Ben Steinig, Nathan Wong

Project structure

data contains the data used by our code. The zip files hold the training and testing data, each containing numerous csv files. The npy files are NumPy files that store pickled objects, which can be loaded as follows:

import numpy as np

# allow_pickle=True is required because the files store pickled Python objects
mapping = np.load("data/some_file.npy", allow_pickle=True)
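
When a npy file stores a pickled Python object such as a dict, np.load returns it wrapped in a zero-dimensional object array, and calling .item() recovers the object. Whether the project's files hold dicts is an assumption here. The sketch below uses a throwaway file and a made-up mapping, and the helper name load_pickled_object is hypothetical:

```python
import os
import tempfile

import numpy as np


def load_pickled_object(path):
    """Load a .npy file and unwrap a pickled Python object if one is stored."""
    loaded = np.load(path, allow_pickle=True)
    # A pickled dict comes back as a 0-d object array; .item() unwraps it.
    if loaded.dtype == object and loaded.ndim == 0:
        return loaded.item()
    return loaded


# Hypothetical example data, not the project's real mapping.
example_mapping = {"smile": 0, "heart": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "example_mapping.npy")
    np.save(example_path, example_mapping, allow_pickle=True)
    restored = load_pickled_object(example_path)

print(restored)
```

Only load pickled files from sources you trust, since unpickling can execute arbitrary code.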

figures contains the graphs generated by our code and shown in the report.

jobs contains baseline Greatlakes job scripts to run.

models is where our trained models are saved locally.

results contains the json files that record the state of the trained models at the end of execution, for later analysis.

src contains the source code for our scripts.

src/notebooks contains the notebooks used for data generation and DeepMoji experiments.

Running the code

All scripts must be run from the root of the project.

Training

python src/bert.py <dataset name> <debug> <mapping filename>

Dataset name: dev | train | test

Debug: true | false

Mapping filename: The filename of the mapping file to use, e.g. mapping.npy or foo_bar.p. This file must exist in the data folder.

Example:

python src/bert.py train false mapping.npy
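
The three positional arguments above could be validated along the following lines. This is a minimal sketch, not the actual parsing in src/bert.py. The function name parse_train_args and the error messages are hypothetical; only the allowed values and the data folder requirement come from the description above.

```python
import os

VALID_DATASETS = ("dev", "train", "test")


def parse_train_args(argv):
    """Validate `<dataset name> <debug> <mapping filename>` and return them."""
    if len(argv) != 3:
        raise ValueError("expected: <dataset name> <debug> <mapping filename>")
    dataset, debug_text, mapping_filename = argv
    if dataset not in VALID_DATASETS:
        raise ValueError(f"dataset name must be one of {VALID_DATASETS}")
    if debug_text not in ("true", "false"):
        raise ValueError("debug must be 'true' or 'false'")
    # The mapping file is expected inside the data folder.
    mapping_path = os.path.join("data", mapping_filename)
    return dataset, debug_text == "true", mapping_path


# Same arguments as the example command above.
print(parse_train_args(["train", "false", "mapping.npy"]))
```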

Collecting distribution information

python src/dist.py

Generating clustered map

python src/gen_cluster_mapping.py

Generating inference samples from trained model

python src/infer.py <model name> <mapping filename> <dataset name*> <number of samples*> <number of scores*>

Arguments marked with * are optional.

Model name: Name of the folder containing the model. The model must be located in the models folder.

Mapping filename: Filename of the mapping file. The mapping file must be located in the data folder

Dataset name: Name of the dataset file to use. Defaults to "test"

Number of samples: Number of samples to print. Defaults to 1

Number of scores: number of predictions to display. Defaults to 5

Examples:

python src/infer.py train-trained-bert mapping.npy

python src/infer.py some-model-name cluster_mapping.npy dev 3 10
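
To illustrate how the defaults apply when trailing arguments are omitted, here is a small sketch. It is not the code in src/infer.py. The function name parse_infer_args and the returned dict keys are hypothetical, while the default values are the ones listed above.

```python
def parse_infer_args(argv):
    """Fill in the documented defaults for any omitted optional arguments."""
    if not 2 <= len(argv) <= 5:
        raise ValueError(
            "expected: <model name> <mapping filename> "
            "[dataset name] [number of samples] [number of scores]"
        )
    return {
        "model_name": argv[0],
        "mapping_filename": argv[1],
        # Optional arguments fall back to the documented defaults.
        "dataset": argv[2] if len(argv) > 2 else "test",
        "num_samples": int(argv[3]) if len(argv) > 3 else 1,
        "num_scores": int(argv[4]) if len(argv) > 4 else 5,
    }


# The two example commands above, expressed as argument lists.
print(parse_infer_args(["train-trained-bert", "mapping.npy"]))
print(parse_infer_args(["some-model-name", "cluster_mapping.npy", "dev", "3", "10"]))
```

With this reading, the first example command uses the test dataset, prints 1 sample, and shows 5 predictions.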

Processing results from json state

python src/process_results.py <path to result file> <name of results>

Path to result file: Full path to json file containing final training state

Name of results: Name used for the output files and titles

Example:

python src/process_results.py results/some_file.json "Some results"
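
Before processing a result file, it can help to check what it contains. The sketch below lists the top-level keys of a json file and the type of each value. The layout of the project's result files is not documented here, so the keys in the example (epochs, train_loss) are made up, and summarize_json is a hypothetical helper rather than part of src/process_results.py.

```python
import json
import os
import tempfile


def summarize_json(path):
    """Return a dict mapping each top-level key to the type name of its value."""
    with open(path, "r", encoding="utf-8") as handle:
        state = json.load(handle)
    if not isinstance(state, dict):
        return {"<root>": type(state).__name__}
    return {key: type(value).__name__ for key, value in state.items()}


# Hypothetical training state, written to a throwaway file for the demo.
example_state = {"epochs": 3, "train_loss": [0.9, 0.5, 0.4]}

with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "example_state.json")
    with open(example_path, "w", encoding="utf-8") as handle:
        json.dump(example_state, handle)
    summary = summarize_json(example_path)

print(summary)
```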
