Does sinoparserd support character segmentation ? #7

Open
edouard-lopez opened this issue Jun 15, 2014 · 2 comments

Comments

@edouard-lopez

Your application was recommended as a good Chinese segmenter, yet the only segmentation that seems to be available is in the <romanization> element (space-separated words):

<romanization>ren2ren2 ke3 bian1ji2 de5 zi4you2 bai3ke1quan2shu1</romanization>
<alternateScript>人人可编辑的自由百科全书</alternateScript>

Since you already seem to be able to segment, why not provide an API or an option to do it on the Chinese script as well?

@allan-simon
Owner

For a very simple reason: it was not required for my use case when I coded it. It was built as a "weekend project" and was later put on GitHub "in case it may interest somebody".

But sure, providing such an API shouldn't be difficult at all. I will try to add that tonight when I get home.
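
In the meantime, a possible client-side workaround: the word boundaries are already present in the <romanization> output, so they can be projected back onto the Chinese text. The sketch below is not part of sinoparserd; `segment_by_romanization` is a hypothetical helper, and it assumes that every pinyin syllable ends with a tone digit (1-5) and maps to exactly one character, which holds for the example above but may fail on input containing punctuation, Latin text, or erhua.

```python
import re
from typing import List

# Hypothetical helper, not a sinoparserd function.
# Assumption: each pinyin syllable ends in a tone digit 1-5 and
# corresponds to exactly one character of the Chinese script.
def segment_by_romanization(romanization: str, script: str) -> List[str]:
    words = []
    pos = 0
    for pinyin_word in romanization.split():
        # The number of tone digits equals the number of syllables,
        # and therefore (by assumption) the number of characters.
        syllables = len(re.findall(r"[1-5]", pinyin_word))
        words.append(script[pos:pos + syllables])
        pos += syllables
    if pos != len(script):
        # The one-syllable-per-character assumption did not hold.
        raise ValueError("romanization and script do not align")
    return words

segments = segment_by_romanization(
    "ren2ren2 ke3 bian1ji2 de5 zi4you2 bai3ke1quan2shu1",
    "人人可编辑的自由百科全书",
)
print(" ".join(segments))  # 人人 可 编辑 的 自由 百科全书
```

The length check at the end is there so that misaligned input fails loudly instead of returning a silently wrong segmentation.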

@edouard-lopez
Author

Hi @allan-simon, did you make any progress on this issue?
