java.lang.ArrayIndexOutOfBoundsException when initializing VnCoreNLP with "wseg" annotator #51

Open
colossalpen12 opened this issue Dec 18, 2024 · 0 comments

I encountered an issue when using the VnCoreNLP wrapper in the py_vncorenlp package. The error occurs specifically when the annotators list contains "wseg". The initialization fails with a java.lang.ArrayIndexOutOfBoundsException.

Steps to Reproduce:

  1. Initialize a VnCoreNLP object with the annotators list containing "wseg":
    annotators = ["wseg"]
    model = VnCoreNLP(annotators=annotators)
  2. Instantiation fails at the following line:
    self.model = javaclass_vncorenlp(annotators)
    which raises:
    jnius.JavaException: JVM exception occurred: 1 java.lang.ArrayIndexOutOfBoundsException

Expected Behavior:

The VnCoreNLP object should initialize without errors, regardless of whether "wseg" is in the annotators list.

Actual Behavior:

When "wseg" is included in the annotators list, the following exception is raised:

jnius.JavaException: JVM exception occurred: 1 java.lang.ArrayIndexOutOfBoundsException
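
The message above does not show where in the Java code the exception originated. As a minimal sketch, the helper below prints whatever extra detail the exception object carries. It assumes the exception is a pyjnius JavaException exposing the classname, innermessage, and stacktrace attributes; the helper name describe_java_exception is hypothetical, and getattr is used so it degrades gracefully if those attributes are absent.

```python
def describe_java_exception(exc):
    """Return a readable report for an exception raised through pyjnius."""
    lines = [f"{type(exc).__name__}: {exc}"]
    # Assumption: pyjnius's JavaException carries these attributes.
    # getattr with a default keeps the helper safe when they are missing.
    classname = getattr(exc, "classname", None)
    innermessage = getattr(exc, "innermessage", None)
    stacktrace = getattr(exc, "stacktrace", None) or []
    if classname:
        lines.append(f"Java class: {classname}")
    if innermessage:
        lines.append(f"Java message: {innermessage}")
    # One indented line per Java stack frame, if any were captured.
    lines.extend(f"    {frame}" for frame in stacktrace)
    return "\n".join(lines)


# Possible usage (SAVE_DIR is a placeholder for the model directory):
# try:
#     model = py_vncorenlp.VnCoreNLP(annotators=["wseg"], save_dir=SAVE_DIR)
# except Exception as exc:
#     print(describe_java_exception(exc))
```

Attaching the resulting stack trace to this issue should make it easier to see which Java method indexes past the end of the array.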

Environment:

  • OS: macOS Sequoia 15.1.1
  • JDK Version: 1.8.0

Additional Context:

  • The issue only occurs when "wseg" is included in the annotators list.
  • Other annotators like "pos", "ner", and "parse" work as expected without throwing an error.
  • I’ve tried initializing the class with different configurations, and the error only happens with "wseg".
  • The main jar file and models folder size are the same as described in README.md
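
A matching total folder size does not by itself show that each word-segmenter file is present and non-empty, so one thing that may be worth ruling out is a missing or truncated model file. The sketch below checks this with the standard library only. The relative paths in WSEG_FILES are an assumption about the layout of the downloaded models and may need adjusting; check_wseg_files is a hypothetical helper, not part of py_vncorenlp.

```python
import os

# Assumed layout of the word-segmenter files, relative to the directory
# passed as save_dir. Adjust these paths if your download differs.
WSEG_FILES = (
    os.path.join("models", "wordsegmenter", "vi-vocab"),
    os.path.join("models", "wordsegmenter", "wordsegmenter.rdr"),
)


def check_wseg_files(save_dir, expected=WSEG_FILES):
    """Return (path, problem) pairs for files that are missing or empty."""
    problems = []
    for rel in expected:
        path = os.path.join(save_dir, rel)
        if not os.path.isfile(path):
            problems.append((path, "missing"))
        elif os.path.getsize(path) == 0:
            problems.append((path, "empty"))
    return problems
```

An empty list means both files exist and contain data; it does not prove they are uncorrupted, so this only narrows down the possible causes.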

This suggests the problem lies in how the Java code handles the wseg annotator internally.

Possible Solutions:

  • Investigate how the Java class vn.pipeline.VnCoreNLP handles the "wseg" annotator, and verify its indexing and initialization logic.
  • Check if there are any known issues related to this annotator in the library.

Related Issues/PRs:

(None at the moment)