I need to calculate similarities between article titles, and I intend to use Relaxed Word Mover's Distance (RWMD).
I am using RelaxedWordMoversDistance from the text2vec R package.
After some trials, I see that the RWMD values in my output matrix, which shows the similarities between titles, are not symmetric.
As I was skeptical of the result I got using my own data, I also tested the example in the package documentation.
I checked the RelaxedWordMoversDistance example at the address below.
https://search.r-project.org/CRAN/refmans/text2vec/html/00Index.html
Then I modified the example code to create a larger rwms matrix, as follows.
rwms = rwmd_model$sim2(dtm)
The diagonal elements of the matrix are 1.
But the elements that mirror each other across the diagonal are not equal.
Say that i and j are titles.
rwms[i, j] is not equal to rwms[j, i].
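The asymmetry described above can be quantified with a short base R check. This is only a sketch: the matrix m below is hypothetical, with made-up values standing in for rwms, and it assumes the similarity output is (or has been converted with as.matrix() to) an ordinary square numeric matrix.

```r
# Hypothetical 3 x 3 similarity matrix standing in for rwms (values are made up).
m = matrix(c(1.00, 0.62, 0.40,
             0.58, 1.00, 0.35,
             0.44, 0.31, 1.00), nrow = 3, byrow = TRUE)

# TRUE only if m[i, j] equals m[j, i] for every pair, within numerical tolerance.
is_sym = isSymmetric(m)

# Largest absolute gap between any entry and its mirror across the diagonal.
max_gap = max(abs(m - t(m)))

# Row/column indices (upper triangle only) of the pairs whose mirrors differ.
asym_pairs = which(abs(m - t(m)) > 1e-8 & upper.tri(m), arr.ind = TRUE)

print(is_sym)
print(max_gap)
print(asym_pairs)
```

Replacing m with the real similarity matrix would show whether the gaps are at the level of floating-point noise or are substantial differences.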
Is this difference normal or am I doing something wrong?
If you can help I would be grateful.
Below is copied from the package documentation: "Package ‘text2vec’ November 30, 2022"
Example
## Not run:
library(text2vec)
library(rsparse)
data("movie_review")
tokens = word_tokenizer(tolower(movie_review$review))
v = create_vocabulary(itoken(tokens))
v = prune_vocabulary(v, term_count_min = 5, doc_proportion_max = 0.5)
it = itoken(tokens)
vectorizer = vocab_vectorizer(v)
dtm = create_dtm(it, vectorizer)
tcm = create_tcm(it, vectorizer, skip_grams_window = 5)
glove_model = GloVe$new(rank = 50, x_max = 10)
wv = glove_model$fit_transform(tcm, n_iter = 5)
wv = wv + t(glove_model$components)
rwmd_model = RelaxedWordMoversDistance$new(dtm, wv)
rwms = rwmd_model$sim2(dtm[1:10, ])
head(sort(rwms[1, ], decreasing = T))
## End(Not run)