Make BratReader more forgiving #1443

Open
alaindesilets opened this issue Dec 13, 2019 · 2 comments

@alaindesilets
Contributor

BratReader is currently very unforgiving. I propose to make the following improvements to it.

  1. Create a BratReader without having to provide mappings.
  • BratReader would already know mappings for the Annotations defined in the standard dkpro-core type system (e.g. Person)
  • If you provide a PARAM_MAPPING, those mappings would be ADDED to the default ones
  • If the .ann file contains a label that is defined in neither the default mappings nor the PARAM_MAPPING mappings, the Reader would use a "catch-all" Annotation type (NamedEntity for the moment, but it could be something else)
  2. Pass a directory or file to PARAM_SOURCE_LOCATION without having to worry about things like:
  • Adding *.ann at the end of the directory path (the Reader would add it automatically)
  • Making sure to pass the .ann file as opposed to the .txt file (the Reader would automatically convert it to the .ann path)
  • Making sure that the single file, or all the .txt files in the directory, have a corresponding .ann file (the Reader would automatically create empty .ann files for orphan .txt files)
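
To make item 1 more concrete, here is a rough sketch of the lookup order I have in mind. It is not the actual BratReader code: the class name, the `org.example.type.*` type names and the two default entries are placeholders made up for illustration, and letting a user mapping win over a default with the same label is my assumption, since the proposal only says the mappings are added.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of DKPro Core: resolves a brat label to a type name.
class BratLabelResolverSketch {

    // Placeholder for the catch-all type (the proposal suggests NamedEntity for now).
    static final String CATCH_ALL = "org.example.type.NamedEntity";

    private final Map<String, String> mappings = new LinkedHashMap<>();

    BratLabelResolverSketch(Map<String, String> userMappings) {
        // Built-in defaults come first. These two entries are illustrative only.
        mappings.put("Person", "org.example.type.Person");
        mappings.put("Location", "org.example.type.Location");
        // User-supplied mappings are added on top of the defaults.
        // Assumption: on a label clash, the user mapping replaces the default.
        mappings.putAll(userMappings);
    }

    // Returns the mapped type, or the catch-all type for unknown labels.
    String resolve(String label) {
        return mappings.getOrDefault(label, CATCH_ALL);
    }

    public static void main(String[] args) {
        Map<String, String> user = new LinkedHashMap<>();
        user.put("Gene", "org.example.type.Gene");
        BratLabelResolverSketch resolver = new BratLabelResolverSketch(user);
        for (String label : new String[] {"Person", "Gene", "SomethingElse"}) {
            System.out.println(label + " -> " + resolver.resolve(label));
        }
    }
}
```

With this order, a reader created with no mappings at all still resolves every label, which is the main point of item 1.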

If this seems appropriate, I will create a feature request followed by a Pull Request.
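
For item 2, the source-location handling could look roughly like the sketch below. Again, this is only an illustration under my own assumptions: the class and method names are hypothetical, it uses plain `java.nio.file` rather than whatever resource resolution the Reader really uses, and it assumes the location is a local directory or file (not a pattern or a classpath location).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of DKPro Core: normalizes a user-supplied
// source location before the reader looks for .ann files.
class BratSourceLocationSketch {

    // Returns "<dir>/*.ann" for a directory, or the .ann path for a single file.
    static String normalize(Path source) throws IOException {
        if (Files.isDirectory(source)) {
            createMissingAnnFiles(source);
            // Append the pattern so the caller does not have to.
            return source.toString() + "/*.ann";
        }
        // A .txt path is converted to its .ann sibling; an .ann path is kept.
        Path ann = withExtension(source, ".ann");
        Path txt = withExtension(source, ".txt");
        if (Files.exists(txt) && Files.notExists(ann)) {
            Files.createFile(ann); // orphan .txt: create an empty .ann next to it
        }
        return ann.toString();
    }

    // Creates an empty .ann file for every .txt file in dir that lacks one.
    static void createMissingAnnFiles(Path dir) throws IOException {
        try (DirectoryStream<Path> texts = Files.newDirectoryStream(dir, "*.txt")) {
            for (Path txt : texts) {
                Path ann = withExtension(txt, ".ann");
                if (Files.notExists(ann)) {
                    Files.createFile(ann);
                }
            }
        }
    }

    // Replaces a trailing .txt or .ann with the given extension.
    static Path withExtension(Path file, String extension) {
        String name = file.getFileName().toString();
        if (name.endsWith(".txt") || name.endsWith(".ann")) {
            name = name.substring(0, name.length() - 4);
        }
        return file.resolveSibling(name + extension);
    }
}
```

Note that this version writes empty .ann files into the user's directory. Treating a missing .ann file as empty without touching the disk would be an alternative if that side effect is not wanted.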

@reckart reckart changed the title Improvement / Make BratReader more forgiving Make BratReader more forgiving Dec 22, 2019
@reckart reckart added the Module-io.brat and ⭐️ Enhancement (New feature or request) labels Dec 22, 2019
@reckart reckart modified the milestones: 2.2.0, 1.13.0 Dec 22, 2019
@reckart
Member

reckart commented Dec 22, 2019

Since this is an enhancement and not a bug-fix, I'll target it to the 1.13.0 feature milestone and not to the 1.12.1 bug-fix milestone.

reckart added a commit to alaindesilets/dkpro-core that referenced this issue Dec 26, 2019
reckart added a commit to alaindesilets/dkpro-core that referenced this issue Jan 6, 2020
- Convert `test__SingleDirWithoutAnnFiles__AssumesEmptyAnnFiles` to use `ReaderAssert`

Merge branch '1.12.x' into Improvement/Make_BratReader_more_forgiving__Take2

* 1.12.x:
  dkpro#1453 - Better I/O testing facilities

% Conflicts:
%	dkpro-core-io-brat-asl/src/test/java/org/dkpro/core/io/brat/BratReaderWriterTest.java
@alaindesilets
Contributor Author

I just realized that the latest changes I pushed break some tests in io-conll. I'll fix those and let you know.

@reckart reckart modified the milestones: 1.13.0, Feature backlog Jan 16, 2021