ckittask/SimLex-999-est-eng

EstSimLex-999

All the results from the models and the EstSimLex-999 data set can be accessed here.

Description of used models

Three families of computational models were evaluated in this thesis. Almost all of the code is available in this GitHub repository; the exception is the code for the convolutional autoencoder. The sections below show where the resources used can be found and how the similarity between concept pairs from EstSimLex-999 was calculated.

Distributional models

All of the distributional models used can be downloaded online.

  1. Eleri Aedmaa's word and sense vectors can be downloaded here.
  2. Estnltk models can be downloaded here.
  3. Facebook research models can be downloaded here.

To use the sense vectors, SenseGram must be installed.

The similarity metric used is cosine similarity between word and sense vectors.
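As a minimal sketch of the metric, the function below computes cosine similarity for two plain Python vectors. The function name and the toy vectors are illustrative assumptions; they are not taken from this repository or from any of the models listed above.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v: dot(u, v) / (|u| * |v|)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional vectors standing in for two word embeddings.
vec_a = [0.2, 0.7, 0.1]
vec_b = [0.25, 0.6, 0.2]
print(round(cosine_similarity(vec_a, vec_b), 3))
```

The score ranges from -1 (opposite directions) to 1 (same direction) and does not depend on vector length, which is why it is a common choice for comparing embeddings.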

Semantic networks

Two semantic networks were used: Estonian Wordnet and the Estonian Wikipedia bitaxonomy. Wordnet version 2.2 can be downloaded here. The Wikipedia bitaxonomy is not publicly available, but the taxonomy can be accessed online through MultiWiBi.

Three path-based similarity measures were implemented: path similarity, Leacock & Chodorow and Wu & Palmer.
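The three measures can be sketched on a toy tree-shaped taxonomy as follows. This uses the commonly cited formulations (the same conventions as NLTK's WordNet interface: depth counted in nodes, path length counted in edges); the implementation in this repository may use different conventions. The taxonomy and all names below are hypothetical, not data from Estonian Wordnet or MultiWiBi.

```python
import math

# Hypothetical toy taxonomy (child -> parent); "entity" is the root.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
}


def ancestors(node):
    """Return the chain from node up to the root, starting with node itself."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of (or equal to) both a and b."""
    ancestors_of_b = set(ancestors(b))
    for node in ancestors(a):
        if node in ancestors_of_b:
            return node
    raise ValueError("no common ancestor")


def path_length(a, b):
    """Number of edges on the shortest path between a and b."""
    lcs = lowest_common_subsumer(a, b)
    return (depth(a) - depth(lcs)) + (depth(b) - depth(lcs))


def path_similarity(a, b):
    """1 / (shortest path length + 1)."""
    return 1.0 / (path_length(a, b) + 1)


def lch_similarity(a, b, max_depth):
    """Leacock & Chodorow: -log((path length + 1) / (2 * taxonomy depth))."""
    return -math.log((path_length(a, b) + 1) / (2.0 * max_depth))


def wup_similarity(a, b):
    """Wu & Palmer: 2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))


# Maximum depth over all nodes in the toy taxonomy.
MAX_DEPTH = max(depth(n) for n in set(PARENT) | set(PARENT.values()))

for pair in [("dog", "cat"), ("dog", "bird")]:
    print(pair,
          round(path_similarity(*pair), 3),
          round(lch_similarity(*pair, MAX_DEPTH), 3),
          round(wup_similarity(*pair), 3))
```

In a real wordnet a concept can have several senses and several hypernym paths, so a full implementation would typically take the best score over all sense pairs; the tree above sidesteps that for brevity.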

Computer vision models

The code used to implement the convolutional autoencoder can be accessed here. It was modified slightly to meet the needs of this work.

The pre-trained ResNet-18 was accessed through the PyTorch subpackage torchvision.models. Its documentation can be found here.

Similarity between embeddings was calculated as cosine similarity.
