EmojiComplete

Alexander Gedeon, Aniruddh Rao, Eli Urban, Ben Steinig, Nathan Wong

Project structure

data contains the data used by our code. The zip files hold the training and testing data, each as a collection of csv files. The npy files are NumPy files that store pickled Python objects, so they must be loaded with allow_pickle=True:

import numpy as np

mapping = np.load("data/some_file.npy", allow_pickle=True)
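When a Python dict is saved with np.save, np.load hands it back wrapped in a zero-dimensional object array rather than as the dict itself. The sketch below assumes that case; the actual contents of the npy files in data are not described in this README, and the mapping and temporary file used here are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Hypothetical mapping; the real files in data/ may hold something different.
example_mapping = {"grinning_face": 0, "crying_face": 1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_mapping.npy")
    np.save(path, example_mapping)  # the dict is pickled inside the .npy file

    # allow_pickle=True is required because the file stores a Python object.
    loaded = np.load(path, allow_pickle=True)

# A pickled dict comes back as a 0-d object array; .item() unwraps it.
restored = loaded.item() if loaded.shape == () else loaded
```

If the loaded value already has a non-empty shape, it is an ordinary array and can be used directly.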

figures contains the graphs generated by our code and shown in the report.

jobs contains baseline job scripts for running on the Great Lakes cluster.

model is where our trained models are saved locally.

results contains the JSON files that record the state of each trained model at the end of execution, kept for later analysis.

src contains the source code for our scripts.

src/notebooks contains the notebooks used for data generation and DeepMoji experiments.

Running the code

All scripts must be run from the root of the project.
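The scripts resolve folders such as data relative to the working directory, which is why they need to be started from the project root. A minimal sketch of a guard that a script could call before doing any work is shown below; is_project_root and require_project_root are hypothetical helpers, not functions from this repository.

```python
from pathlib import Path


def is_project_root(path: Path) -> bool:
    """Return True if `path` contains the data and src folders of the project."""
    return (path / "data").is_dir() and (path / "src").is_dir()


def require_project_root() -> None:
    """Stop early with a clear message when launched from the wrong directory."""
    if not is_project_root(Path.cwd()):
        raise SystemExit("Run this script from the root of the project.")
```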

Training

python src/bert.py <dataset name> <debug> <mapping filename>

Dataset name: dev | train | test

Debug: true | false

Mapping filename: The filename of the mapping file to use, without a directory. Ex: mapping.npy or foo_bar.p. This file must exist within the data folder.

Example:

python src/bert.py train false mapping.npy
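The three positional arguments can be validated up front so that a typo fails before any training starts. The following is only a sketch of how that might look; parse_train_args and TrainArgs are hypothetical names and do not describe how src/bert.py actually reads its arguments. It does not check that the mapping file exists.

```python
from pathlib import Path
from typing import NamedTuple, Sequence

VALID_DATASETS = ("dev", "train", "test")


class TrainArgs(NamedTuple):
    dataset: str
    debug: bool
    mapping_path: Path


def parse_train_args(argv: Sequence[str], data_dir: Path = Path("data")) -> TrainArgs:
    """Validate <dataset name> <debug> <mapping filename> (hypothetical helper)."""
    if len(argv) != 3:
        raise ValueError("expected: <dataset name> <debug> <mapping filename>")
    dataset, debug, mapping_filename = argv
    if dataset not in VALID_DATASETS:
        raise ValueError(f"dataset name must be one of {VALID_DATASETS}")
    if debug not in ("true", "false"):
        raise ValueError("debug must be 'true' or 'false'")
    # The mapping file is looked up inside the data folder, not by full path.
    return TrainArgs(dataset, debug == "true", data_dir / mapping_filename)
```

A script would typically call this as parse_train_args(sys.argv[1:]).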

Collecting distribution information

python src/dist.py

Generating clustered map

python src/gen_cluster_mapping.py

Generating inference samples from trained model

python src/infer.py <model name> <mapping filename> <dataset name*> <number of samples*> <number of scores*>

Arguments marked with * are optional.

Model name: Name of the folder containing the model. The model must be located in the model folder.

Mapping filename: Filename of the mapping file. The mapping file must be located in the data folder.

Dataset name: Name of the dataset file to use. Defaults to "test".

Number of samples: Number of samples to print. Defaults to 1.

Number of scores: Number of predictions to display for each sample. Defaults to 5.

Examples:

python src/infer.py train-trained-bert mapping.npy

python src/infer.py some-model-name cluster_mapping.npy dev 3 10
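The two examples above differ only in how many of the trailing arguments are supplied. A minimal sketch of filling in the documented defaults is given below; parse_infer_args and InferArgs are hypothetical names, and src/infer.py may handle its arguments differently.

```python
from typing import NamedTuple, Sequence


class InferArgs(NamedTuple):
    model_name: str
    mapping_filename: str
    dataset: str = "test"   # documented default
    num_samples: int = 1    # documented default
    num_scores: int = 5     # documented default


def parse_infer_args(argv: Sequence[str]) -> InferArgs:
    """Fill in defaults for the trailing arguments (hypothetical helper)."""
    if not 2 <= len(argv) <= 5:
        raise ValueError("expected 2 to 5 arguments")
    model_name, mapping_filename, *optional = argv
    dataset = optional[0] if len(optional) > 0 else "test"
    num_samples = int(optional[1]) if len(optional) > 1 else 1
    num_scores = int(optional[2]) if len(optional) > 2 else 5
    return InferArgs(model_name, mapping_filename, dataset, num_samples, num_scores)
```

Because the arguments are positional, a later one can only be given when all earlier ones are present, so setting the number of scores also requires the dataset name and the number of samples.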

Processing results from json state

python src/process_results.py <path to result file> <name of results>

Path to result file: Path to the JSON file containing the final training state.

Name of results: Name used for the output files and titles.

Example:

python src/process_results.py results/some_file.json "Some results"
