Enable downloading Czech model #3

Merged: 4 commits, Oct 18, 2023
Conversation

skvrnami
Contributor

I modified the nametagger_download_model function so that it can download both English and Czech NER models available from ÚFAL.

@jwijffels
Contributor

Thanks for the proposed changes. Some remarks.

  • Can you make sure the example works with english / english-conll-140408 / czech-cnec-140304?
  • Can you remove the use of paste0 and replace it with an appropriate paste command (there is no need to depend on a later version of R)
  • Can you remove the changes you made to src/RcppExports.cpp (there is no need to depend on a later version of Rcpp)
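
For reference, a minimal sketch of the paste0 replacement requested above. paste0 was added in R 2.15.0, and paste with sep = "" gives the same result on older versions. The base URL and the way the file name is assembled here are hypothetical placeholders for illustration, not the code in this pull request.

```r
# Hypothetical values, for illustration only (not the package's real download URL).
base_url <- "https://example.org/models"
model    <- "czech-cnec-140304"

# Version using paste0 (requires R >= 2.15.0).
url_paste0 <- paste0(base_url, "/", model, ".zip")

# Equivalent version using paste with an empty separator (works on older R too).
url_paste <- paste(base_url, "/", model, ".zip", sep = "")

# Both calls build the same string.
same <- identical(url_paste0, url_paste)
```

The only difference between the two calls is the default separator: paste uses a single space unless sep is set, so sep = "" has to be passed explicitly.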

@skvrnami
Contributor Author

The last commit should put the code in line with your suggestions - I removed the changes to src/RcppExports.cpp and replaced paste0 with paste.

I checked that the example works by running this code:

library(nametagger)

x <- data.frame(doc_id      = c(1, 1, 2),
                sentence_id = c(1, 2, 1),
                text        = c("I\nlive\nin\nNew\nYork\nand\nI\nwork\nfor\nApple\nInc.", 
                                "Why\ndon't\nyou\ncome\nvisit\nme", 
                                "Good\nnews\nfrom\nAmazon\nas\nJohn\nworks\nthere\n."))

en_m1 <- nametagger_download_model("english-conll-140408")
predict(en_m1, x)                          

en_m2 <- nametagger_download_model("english")
predict(en_m2, x)

x_cz <- data.frame(doc_id = 1, 
                   sentence_id = 1, 
                   text = "Policajti\nkradou\nkočkám\nmléko.")

cz_m1 <- nametagger_download_model("czech-cnec-140304")
predict(cz_m1, x_cz)

It worked fine.

…nguage model, be consistent on paths within zip file, bump version + write change in news file
@jwijffels
Contributor

I've made the changes to your repo which I think were required.
Looks ok now, I'll incorporate it. Thanks for the changes.

@jwijffels merged commit a1c7541 into bnosac:master on Oct 18, 2023
5 checks passed