Stanislaw Jastrzebski edited this page Dec 28, 2015 · 19 revisions

Results of different publicly available embeddings calculated using this script

  • Rows are sorted by summed rank: each embedding is ranked on every benchmark, and the ranks are added up.

  • Please keep in mind that the embeddings were trained on different corpora (though most used some version of a Wikipedia dump with various preprocessing), so this page doesn't claim to be any sort of serious benchmark of word embeddings. For a thorough exploratory analysis, see for instance this paper by O. Levy et al.

  • There are no good skip-gram or CBOW embeddings available online. I will include results once I train them myself.
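
The row ordering described in the first bullet can be sketched roughly as follows. This is only an illustration, not the script used for this page: the function name `sort_by_summed_rank` is hypothetical, it assumes that a higher score is better on every benchmark, and it breaks ties by input order, which the original script may handle differently. The sample scores are the MEN, MTurk and RG65 values of three rows from the table on this page.

```python
def sort_by_summed_rank(scores):
    """Order models by the sum of their per-benchmark ranks.

    scores: dict mapping model name -> list of scores, one per benchmark.
    Assumption: higher is better on every benchmark, rank 1 is the best.
    """
    models = list(scores)
    n_benchmarks = len(next(iter(scores.values())))
    rank_sum = {m: 0 for m in models}
    for b in range(n_benchmarks):
        # Best score first; sorted() is stable, so ties keep input order.
        ordered = sorted(models, key=lambda m: scores[m][b], reverse=True)
        for rank, m in enumerate(ordered, start=1):
            rank_sum[m] += rank
    # A smaller summed rank means a better overall position.
    return sorted(models, key=lambda m: rank_sum[m]), rank_sum


# Three rows, three benchmarks (MEN, MTurk, RG65), copied from the table.
scores = {
    "GloVe dim=50 corpus=wiki-6B": [0.652, 0.619, 0.595],
    "HDC dim=300": [0.760, 0.655, 0.806],
    "PDC dim=300": [0.773, 0.672, 0.790],
}

order, rank_sum = sort_by_summed_rank(scores)
for name in order:
    print(rank_sum[name], name)
```

On this subset PDC dim=300 comes first (summed rank 4), then HDC dim=300 (5), then GloVe dim=50 (9). Summing ranks rather than raw scores keeps a benchmark with a wide score range from dominating the ordering.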

Sources of embeddings:

| Embedding | MEN | MTurk | RG65 | RW | SimLex999 | WS353 | WS353R | WS353S | Google | MSR | SemEval2012_2 | AP | BLESS | Battig | ESSLI_1a | ESSLI_2b | ESSLI_2c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PDC dim=300 | 0.773 | 0.672 | 0.790 | 0.455 | 0.427 | 0.721 | 0.641 | 0.789 | 0.748 | 0.596 | 0.290 | 0.639 | 0.805 | 0.431 | 0.773 | 0.725 | 0.644 |
| HDC dim=300 | 0.760 | 0.655 | 0.806 | 0.438 | 0.407 | 0.677 | 0.581 | 0.787 | 0.731 | 0.564 | 0.293 | 0.632 | 0.815 | 0.432 | 0.773 | 0.750 | 0.644 |
| SG GoogleNews (word2vec) | 0.741 | 0.670 | 0.761 | 0.471 | 0.442 | 0.700 | 0.635 | 0.772 | 0.402 | 0.712 | 0.335 | 0.649 | 0.795 | 0.406 | 0.750 | 0.800 | 0.644 |
| PDC dim=100 | 0.755 | 0.710 | 0.774 | 0.421 | 0.361 | 0.690 | 0.606 | 0.779 | 0.704 | 0.543 | 0.280 | 0.632 | 0.760 | 0.431 | 0.727 | 0.750 | 0.622 |
| GloVe dim=300 corpus=common-crawl-42B | 0.736 | 0.645 | 0.817 | 0.376 | 0.374 | 0.553 | 0.473 | 0.669 | 0.750 | 0.702 | 0.306 | 0.622 | 0.785 | 0.451 | 0.795 | 0.750 | 0.578 |
| GloVe dim=300 corpus=wiki-6B | 0.737 | 0.633 | 0.770 | 0.359 | 0.371 | 0.522 | 0.446 | 0.653 | 0.718 | 0.616 | 0.280 | 0.637 | 0.820 | 0.410 | 0.773 | 0.825 | 0.644 |
| HDC dim=100 | 0.738 | 0.648 | 0.804 | 0.388 | 0.324 | 0.617 | 0.523 | 0.753 | 0.667 | 0.497 | 0.260 | 0.619 | 0.825 | 0.432 | 0.773 | 0.750 | 0.622 |
| GloVe dim=200 corpus=wiki-6B | 0.710 | 0.620 | 0.713 | 0.331 | 0.340 | 0.489 | 0.418 | 0.615 | 0.698 | 0.596 | 0.274 | 0.634 | 0.810 | 0.423 | 0.773 | 0.725 | 0.622 |
| PDC dim=50 | 0.720 | 0.700 | 0.763 | 0.390 | 0.309 | 0.637 | 0.543 | 0.741 | 0.579 | 0.369 | 0.241 | 0.617 | 0.760 | 0.426 | 0.682 | 0.750 | 0.556 |
| GloVe dim=100 corpus=wiki-6B | 0.681 | 0.619 | 0.676 | 0.310 | 0.298 | 0.451 | 0.380 | 0.587 | 0.632 | 0.551 | 0.279 | 0.644 | 0.780 | 0.435 | 0.705 | 0.750 | 0.644 |
| HDC dim=50 | 0.708 | 0.649 | 0.723 | 0.361 | 0.281 | 0.575 | 0.472 | 0.713 | 0.534 | 0.347 | 0.243 | 0.555 | 0.730 | 0.429 | 0.705 | 0.775 | 0.578 |
| GloVe dim=50 corpus=wiki-6B | 0.652 | 0.619 | 0.595 | 0.285 | 0.265 | 0.419 | 0.348 | 0.554 | 0.462 | 0.356 | 0.251 | 0.634 | 0.725 | 0.391 | 0.773 | 0.750 | 0.600 |
| GloVe dim=200 corpus=twitter-27B | 0.594 | 0.555 | 0.698 | 0.197 | 0.130 | 0.451 | 0.373 | 0.590 | 0.534 | 0.503 | 0.246 | 0.515 | 0.690 | 0.326 | 0.773 | 0.700 | 0.578 |
| NMT which=FR | 0.492 | 0.464 | 0.590 | 0.301 | 0.460 | 0.488 | 0.444 | 0.572 | 0.212 | 0.434 | 0.251 | 0.420 | 0.445 | 0.165 | 0.568 | 0.700 | 0.644 |
| GloVe dim=100 corpus=twitter-27B | 0.577 | 0.559 | 0.677 | 0.210 | 0.122 | 0.442 | 0.364 | 0.592 | 0.429 | 0.428 | 0.250 | 0.500 | 0.675 | 0.315 | 0.727 | 0.675 | 0.600 |
| NMT which=DE | 0.492 | 0.464 | 0.590 | 0.301 | 0.460 | 0.488 | 0.444 | 0.572 | 0.212 | 0.434 | 0.251 | 0.415 | 0.445 | 0.165 | 0.568 | 0.700 | 0.622 |
| GloVe dim=50 corpus=twitter-27B | 0.531 | 0.515 | 0.574 | 0.196 | 0.098 | 0.392 | 0.325 | 0.540 | 0.260 | 0.271 | 0.223 | 0.458 | 0.665 | 0.308 | 0.705 | 0.675 | 0.511 |
| GloVe dim=25 corpus=twitter-27B | 0.444 | 0.481 | 0.503 | 0.173 | 0.073 | 0.307 | 0.235 | 0.458 | 0.111 | 0.116 | 0.209 | 0.453 | 0.545 | 0.267 | 0.659 | 0.700 | 0.489 |
| GloVe dim=300 corpus=common-crawl-840B | 0.017 | 0.129 | -0.105 | 0.078 | -0.067 | -0.022 | -0.064 | 0.043 | 0.001 | 0.008 | 0.042 | 0.192 | 0.225 | 0.103 | 0.409 | 0.525 | 0.400 |