parsing German #221
Comments
Does this look any better?
library("spacyr")
spacy_initialize(model = "de_core_news_lg")
#> Found 'spacy_condaenv'. spacyr will use this environment
#> successfully initialized (spaCy Version: 3.4.1, language model: de_core_news_lg)
#> (python options: type = "condaenv", value = "spacy_condaenv")
text_german <- c("R ist eine freie Programmiersprache für statistische Berechnungen und Grafiken. Sie wurde von Statistikern für Anwender mit statistischen Aufgaben entwickelt.")
results_german <- spacy_parse(text_german, dependency = FALSE, lemma = FALSE, tag = TRUE)
results_german
#>    doc_id sentence_id token_id              token   pos   tag entity
#> 1   text1           1        1                  R  NOUN    NN MISC_B
#> 2   text1           1        2                ist   AUX VAFIN
#> 3   text1           1        3               eine   DET   ART
#> 4   text1           1        4              freie   ADJ  ADJA
#> 5   text1           1        5 Programmiersprache  NOUN    NN
#> 6   text1           1        6                für   ADP  APPR
#> 7   text1           1        7       statistische   ADJ  ADJA
#> 8   text1           1        8       Berechnungen  NOUN    NN
#> 9   text1           1        9                und CCONJ   KON
#> 10  text1           1       10           Grafiken  NOUN    NN
#> 11  text1           1       11                  . PUNCT    $.
#> 12  text1           2        1                Sie  PRON  PPER
#> 13  text1           2        2              wurde   AUX VAFIN
#> 14  text1           2        3                von   ADP  APPR
#> 15  text1           2        4       Statistikern  NOUN    NN
#> 16  text1           2        5                für   ADP  APPR
#> 17  text1           2        6           Anwender  NOUN    NN
#> 18  text1           2        7                mit   ADP  APPR
#> 19  text1           2        8      statistischen   ADJ  ADJA
#> 20  text1           2        9           Aufgaben  NOUN    NN
#> 21  text1           2       10         entwickelt  VERB  VVPP
#> 22  text1           2       11                  . PUNCT    $.
Created on 2022-09-01 with reprex v2.0.2
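One way to check the specific complaints from the original report (auxiliaries and articles tagged as nouns) is to look up the `pos` value for just those tokens. Below is a minimal base R sketch, assuming a data frame shaped like the output above. The names `parsed` and `pos_of` are illustrative and not part of spacyr, and the rows are copied by hand from the printed result rather than produced by a spaCy call.

```r
# Hypothetical stand-in for `results_german`: a few rows copied by hand
# from the printed output above. No spaCy call is made here.
parsed <- data.frame(
  token = c("R", "ist", "eine", "freie", "Programmiersprache"),
  pos   = c("NOUN", "AUX", "DET", "ADJ", "NOUN"),
  tag   = c("NN", "VAFIN", "ART", "ADJA", "NN"),
  stringsAsFactors = FALSE
)

# Illustrative helper: return the coarse POS tag for each requested token
# (first match only), named by token.
pos_of <- function(df, tokens) {
  setNames(df$pos[match(tokens, df$token)], tokens)
}

# The auxiliary "ist" and the article "eine" are the kinds of tokens
# the original report said were mis-tagged as nouns.
checked <- pos_of(parsed, c("ist", "eine"))
print(checked)
```

With a real parse, the same lookup could be run on `results_german` directly instead of the hand-copied rows.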
I seem to have installed the German language model for spacyr/spaCy properly, but the POS tagging is chaotic. See below: auxiliaries are tagged as NOUN, articles are tagged as NOUN, etc.
Can somebody tell me what I am doing wrong?
Best,
Manfred