TypeError: expected np.ndarray (got Tensor) #1431

Closed
@topl0305

Describe the bug
I was trying to use the pretrained model https://huggingface.co/stanfordnlp/stanza-lt
After many issues, such as stanza.download("lt") constantly crashing, I was forced to download the models manually. With everything installed and downloaded, the following code reproduces the bug:

import stanza
config = {
    'processors': 'tokenize,pos',
    'lang': 'lt',
    'tokenize_model_path': './stanza_resources/lt/tokenize/alksnis.pt',
    'pos_model_path': './stanza_resources/lt/pos/alksnis_nocharlm.pt',
    'pos_pretrain_path': './stanza_resources/lt/pretrain/fasttextwiki.pt',
    'tokenize_pretokenized': True,
    'download_method': None
}

nlp = stanza.Pipeline(**config) # initialize neural pipeline
doc = nlp("Kur einam mes su Knysliuku, didžiulė paslaptis") # run annotation over a sentence
print(doc)

Expected behavior
The result should look like this:

[
  [
    {
      "id": 1,
      "text": "Kur",
      "upos": "ADV",
      "xpos": "prm.l.lrgin.",
      "feats": "Degree=Pos|PronType=Int,Rel",
      "misc": "",
      "start_char": 0,
      "end_char": 3
    },
    ...
]

Environment (please complete the following information):

  • OS: Windows 10
  • Python 3.10.5
  • stanza 1.9.2
  • numpy 2.1.2

Additional context
As a workaround, it works after patching the file stanza/models/pos/model.py. Around line 90, I changed

self.add_unsaved_module('pretrained_emb', nn.Embedding.from_pretrained(torch.from_numpy(emb_matrix), freeze=True))

to

if isinstance(emb_matrix, torch.Tensor):
    self.add_unsaved_module('pretrained_emb', nn.Embedding.from_pretrained(emb_matrix, freeze=True))
else:
    self.add_unsaved_module('pretrained_emb', nn.Embedding.from_pretrained(torch.from_numpy(emb_matrix), freeze=True))
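The same type check can be sketched as a standalone helper outside stanza. This is only an illustration under assumptions: `to_embedding_tensor` and `build_frozen_embedding` are hypothetical names, not part of stanza, and the sketch assumes `emb_matrix` is either a NumPy array or a torch Tensor holding a 2-D float matrix.

```python
import numpy as np
import torch
from torch import nn


def to_embedding_tensor(emb_matrix):
    """Return emb_matrix as a torch.Tensor (hypothetical helper, not stanza API).

    torch.from_numpy only accepts np.ndarray, so a Tensor is passed through
    unchanged instead of being converted a second time.
    """
    if isinstance(emb_matrix, torch.Tensor):
        return emb_matrix
    return torch.from_numpy(np.asarray(emb_matrix))


def build_frozen_embedding(emb_matrix):
    """Build a non-trainable embedding layer from either input type."""
    return nn.Embedding.from_pretrained(to_embedding_tensor(emb_matrix), freeze=True)


if __name__ == "__main__":
    # Both input types now produce an embedding layer of the same shape.
    from_array = build_frozen_embedding(np.zeros((3, 4), dtype=np.float32))
    from_tensor = build_frozen_embedding(torch.zeros(3, 4))
    print(tuple(from_array.weight.shape), tuple(from_tensor.weight.shape))
```

Passing a Tensor straight to `torch.from_numpy` is what raises the TypeError in the title, so the helper only converts when the input is not already a Tensor.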

I am not sure which is the culprit: the library or the model.
