CPA not consistently lemmatized #571
Thanks! Our English validation script flags a few other potential lemma inconsistencies. Do any of these look like they should be fixed? The abbreviations are OK, I guess.
! rare lemma America for A/NNP in weblog-blogspot.com_rigorousintuition_20050518101500_ENG_20050518_101500-0070 (majority: A)
actually yes, President is sus a couple times
also
might i suggest having the script print out line numbers and filenames as well?
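For illustration, here is a minimal sketch of a rare-lemma check that also reports the filename and line number. It is not the code in `neaten.py`: the function name `find_rare_lemmas`, the `filename:line:` prefix, and the case-sensitive grouping by (form, XPOS) are all assumptions made for this example. It assumes standard 10-column CoNLL-U input with `# sent_id = ...` comment lines.

```python
from collections import Counter, defaultdict


def find_rare_lemmas(lines, filename="<input>"):
    """Return warnings for tokens whose lemma is strictly rarer than the
    majority lemma for the same (form, XPOS) pair.

    Hypothetical helper for illustration; not taken from neaten.py.
    """
    counts = defaultdict(Counter)  # (form, xpos) -> Counter of lemmas
    seen = []                      # (key, lemma, line_no, sent_id)
    sent_id = None

    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("# sent_id = "):
            sent_id = line[len("# sent_id = "):]
            continue
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Skip malformed rows, multiword token ranges (1-2), and empty nodes (1.1).
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, lemma, xpos = cols[1], cols[2], cols[4]
        key = (form, xpos)
        counts[key][lemma] += 1
        seen.append((key, lemma, line_no, sent_id))

    warnings = []
    for key, lemma, line_no, sent_id in seen:
        majority, majority_count = counts[key].most_common(1)[0]
        # Only flag lemmas that are strictly less frequent than the majority,
        # so ties are not reported.
        if lemma != majority and counts[key][lemma] < majority_count:
            warnings.append(
                f"{filename}:{line_no}: rare lemma {lemma} for "
                f"{key[0]}/{key[1]} in {sent_id} (majority: {majority})"
            )
    return warnings
```

It could be run over a file with something like `find_rare_lemmas(open(path, encoding="utf-8"), path)`, printing each returned warning. Prefixing with `filename:line:` is one possible design choice, since many editors can jump to a location in that form.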
You're welcome to edit https://github.com/UniversalDependencies/UD_English-EWT/blob/dev/not-to-release/tools/neaten.py and look for
Examples from the dev set (a couple others in train):
There's also the Iraq CPA, but that seems orthogonal to the accounting job title