I am running into what I believe are multiple issues adapting the GloVe word embeddings tutorial to my project. I start with a tokens object created in Quanteda (TOK.Debates.2020.Full.Clean) and use it to create the iterator. The early steps run, but when I fit the model I get this error:
_Error in if (cost/n_nnz > 1) stop("Cost is too big, probably something goes wrong... try smaller learning rate") :
missing value where TRUE/FALSE needed_
To troubleshoot this error, I have tried the following:
Lowered the learning rate in the GloVe model object to 0.001; I still receive the same cost error.
Converted the initial tokens object to a text file to match the example more closely; I still receive the same error.
Replaced the TCM with a Quanteda FCM, which produces a different error:
_Error in glove$fit_transform(Debates2020.FCM, n_iter = 10, convergence_tol = 0.01, :
all(x@x > 0) is not TRUE_
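The second message says that some stored value in the matrix is not strictly positive. One possible cause, though not confirmed for this data, is explicitly stored zeros in the sparse matrix. The sketch below uses a small made-up matrix (`toy`, not the real `Debates2020.FCM`) and a hypothetical helper `inspect_values()` to count such entries, then removes stored zeros with `Matrix::drop0()`. It assumes the FCM is a sparse matrix from the Matrix package with an `x` slot, which the error text suggests.

```r
library(Matrix)

# Toy stand-in for a co-occurrence matrix; the middle entry is a stored zero.
toy <- sparseMatrix(i = c(1, 2, 3), j = c(2, 3, 1),
                    x = c(2, 0, 1.5), dims = c(3, 3))

# Hypothetical helper: summarise the values actually stored in the x slot.
inspect_values <- function(m) {
  v <- m@x
  c(stored = length(v),
    na = sum(is.na(v)),
    nonpositive = sum(v <= 0, na.rm = TRUE))
}

before <- inspect_values(toy)

# drop0() removes explicitly stored zeros; it does not touch NA or negatives.
cleaned <- drop0(toy)
after <- inspect_values(cleaned)

print(before)
print(after)
```

If the real matrix shows NA or negative values instead of zeros, `drop0()` will not help and the problem lies further upstream.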
The full code is below. I first pass the tokens object to the tokenizer:
Tokenizer_Debates_2020 = space_tokenizer(TOK.Debates.2020.Full.Clean)
The tokenizer is created, and I continue the example with no errors:
Iterator_Debates_2020 = itoken(Tokenizer_Debates_2020)
Vocab_Debates_2020 = create_vocabulary(Iterator_Debates_2020)
Vocab_Debates_2020 = prune_vocabulary(Vocab_Debates_2020, term_count_min = 10L)
Vectorizer_Debates_2020 = vocab_vectorizer(Vocab_Debates_2020)
TCM_Debates_2020 = create_tcm(Iterator_Debates_2020, Vectorizer_Debates_2020, skip_grams_window = 5L)
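One thing that may be worth checking in the steps above: `space_tokenizer()` is meant for raw character strings, while a Quanteda tokens object is already tokenized, so passing it through the tokenizer could produce unexpected tokens. The sketch below is an assumption about a possible alternative, not a confirmed fix. It uses a made-up list (`toy_tokens`) standing in for the result of calling `as.list()` on the tokens object and feeds it to `itoken()` directly, which accepts a list of character vectors. The text2vec part only runs if the package is installed.

```r
# Hypothetical stand-in for as.list(TOK.Debates.2020.Full.Clean):
# one character vector of tokens per document.
toy_tokens <- list(
  doc1 = c("we", "will", "rebuild", "the", "economy"),
  doc2 = c("the", "economy", "will", "recover")
)

# Sanity checks worth running on the real list before building the iterator.
stopifnot(is.list(toy_tokens))
stopifnot(all(vapply(toy_tokens, is.character, logical(1))))
n_tokens <- sum(lengths(toy_tokens))
print(n_tokens)

# Guarded so the sketch still runs without text2vec installed.
if (requireNamespace("text2vec", quietly = TRUE)) {
  # No tokenizer function is applied: the input is already tokenized.
  it <- text2vec::itoken(toy_tokens, progressbar = FALSE)
  vocab <- text2vec::create_vocabulary(it)
  print(head(vocab))
}
```

Inspecting the first few rows of the vocabulary on the real data would show whether the terms look like ordinary words or like something mangled by the tokenizer step.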
I check the dimensions of the TCM and see that I have rows and columns:
dim(TCM_Debates_2020)
I start to fit the model, creating the glove environment with no issue, but when I try to do the actual fitting I obtain the following error:
glove = GlobalVectors$new(rank = 50, x_max = 10)
WV_Debates_2020 = glove$fit_transform(TCM_Debates_2020, n_iter = 10, convergence_tol = 0.01, n_threads = 8)
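The "missing value where TRUE/FALSE needed" message is what R raises when the condition of an `if` evaluates to `NA`. In the check quoted in the error, `cost / n_nnz > 1`, that would happen if the cost became `NaN` or if both values were zero. Which of these applies here is not known. The sketch below reproduces the behaviour with made-up numbers and a hypothetical helper `check_cost()`; it does not call text2vec.

```r
# Hypothetical helper mirroring the condition quoted in the error message.
check_cost <- function(cost, n_nnz) {
  tryCatch({
    if (cost / n_nnz > 1) stop("Cost is too big")
    "ok"
  }, error = function(e) conditionMessage(e))
}

normal   <- check_cost(5, 100)    # ratio 0.05, condition is FALSE
too_big  <- check_cost(500, 100)  # ratio 5, the intended stop() fires
nan_cost <- check_cost(NaN, 100)  # NaN > 1 is NA, so if() itself fails
empty    <- check_cost(0, 0)      # 0 / 0 is NaN, same failure

print(c(normal = normal, too_big = too_big,
        nan_cost = nan_cost, empty = empty))
```

On the real data, looking at `length(TCM_Debates_2020@x)` and `summary(TCM_Debates_2020@x)` (assuming the TCM is a Matrix sparse matrix with an `x` slot) would show whether the matrix is empty or contains extreme or missing values.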
For the FCM attempt described above, the call was:
WV_Debates_2020 = glove$fit_transform(Debates2020.FCM, n_iter = 10, convergence_tol = 0.01, n_threads = 8)
I have been unable to proceed further and obviously one or more of these errors must be the culprit, but I have been unable to find documentation on these errors elsewhere, including past issues catalogued here.
Thank you in advance for any help in taking out this gremlin.
-Sello