In this work, we publish ICON (Indonesian CONstituency treebank), a manually annotated benchmark constituency treebank for Indonesian. It contains 10,000 sentences, approximately 124,000 constituents, and approximately 182,000 tokens, which is large enough to support the training of state-of-the-art transformer-based models. The annotation uses 15 phrase-level tags and 24 POS tags. The sentences were taken from Wikipedia (3,000) and news articles (7,000).
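As a quick sanity check on the figures quoted above, the following minimal sketch derives the average sentence length and the source split. The constant and function names are hypothetical and chosen only for illustration; the constituent and token counts are the approximate values from the description, so the derived averages are approximate too.

```python
# Figures quoted in the description (constituent and token counts are approximate).
SENTENCES = 10_000
CONSTITUENTS = 124_000
TOKENS = 182_000
SOURCES = {"wikipedia": 3_000, "news": 7_000}


def per_sentence(total: int, sentences: int = SENTENCES) -> float:
    """Average count per sentence for a corpus-level total."""
    return total / sentences


def source_shares(sources: dict) -> dict:
    """Fraction of sentences contributed by each source."""
    total = sum(sources.values())
    return {name: count / total for name, count in sources.items()}


if __name__ == "__main__":
    print(f"tokens per sentence:       {per_sentence(TOKENS):.1f}")
    print(f"constituents per sentence: {per_sentence(CONSTITUENTS):.1f}")
    for name, share in source_shares(SOURCES).items():
        print(f"{name}: {share:.0%} of sentences")
```

Under these figures, the two sources add up to the stated 10,000 sentences, with news articles making up the larger share.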
License
CC-BY-SA 4.0
NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?icon