[Benchmarking Framework] Create a standardized form testing dataset #254

Closed
zdeveloper opened this issue Sep 26, 2024 · 2 comments · Fixed by #287

@zdeveloper
Collaborator

zdeveloper commented Sep 26, 2024

Create a standardized form testing dataset based on a readily available dataset. It will be used to build a standard benchmark for ReportVision.

Acceptance Criteria

  • The dataset should include at least 100 different variations of each form
  • The dataset should include around 10 different forms (e.g., Quest form, TB form, syphilis form)
  • The dataset should include the ground truth of the data (label/value) present in each form
  • Add the dataset to the git repo, along with whatever code was used to generate it
  • Name the dataset: reportvision-dataset-1
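
The criteria above could be checked automatically once the dataset exists. The following is a minimal sketch, not the project's actual tooling: the `FormVariation` type, the `check_dataset` function, and the in-memory layout (form name mapped to a list of variations) are all hypothetical, chosen only to illustrate the counts and the label/value ground truth the criteria ask for.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MIN_VARIATIONS_PER_FORM = 100  # "at least 100 different variations"
TARGET_FORM_COUNT = 10         # "around 10 different forms" (treated as a minimum here)


@dataclass
class FormVariation:
    """One filled-in variation of a form (hypothetical structure)."""
    image_path: str
    # Ground truth as label -> value pairs, as the criteria describe.
    ground_truth: dict[str, str] = field(default_factory=dict)


def check_dataset(dataset: dict[str, list[FormVariation]]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(dataset) < TARGET_FORM_COUNT:
        problems.append(
            f"expected about {TARGET_FORM_COUNT} forms, found {len(dataset)}"
        )
    for form_name, variations in dataset.items():
        if len(variations) < MIN_VARIATIONS_PER_FORM:
            problems.append(
                f"{form_name}: only {len(variations)} variations "
                f"(need {MIN_VARIATIONS_PER_FORM})"
            )
        for variation in variations:
            if not variation.ground_truth:
                problems.append(
                    f"{form_name}: {variation.image_path} has no ground truth"
                )
    return problems
```

Because "around 10" is a soft target, a real check might report the form count as a warning instead of a failure.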

Additional context
Example datasets could come from NIST or Hugging Face.
Please be very communicative; the AC can be adjusted.

@zdeveloper zdeveloper added the OCR label Sep 26, 2024
@bora-skylight bora-skylight changed the title Create a standardized form testing dataset [Metrics Framework] Create a standardized form testing dataset Sep 26, 2024
@bora-skylight bora-skylight changed the title [Metrics Framework] Create a standardized form testing dataset [Benchmarking Framework] Create a standardized form testing dataset Sep 26, 2024
@zdeveloper
Collaborator Author

zdeveloper commented Sep 27, 2024

@schreiaj
Collaborator

Per discussion: the goal of this issue is to build a dataset to test the OCR, not alignment.
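
To illustrate what testing the OCR in isolation against label/value ground truth might look like, here is a rough sketch that scores each field by character error rate (edit distance divided by the length of the expected value). This metric is an assumption for illustration; the thread does not say which metric the benchmark uses, and the function names below are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(predicted: str, expected: str) -> float:
    """Edit distance normalized by the length of the expected value."""
    if not expected:
        return 0.0 if not predicted else 1.0
    return edit_distance(predicted, expected) / len(expected)


def score_form(ocr_output: dict[str, str],
               ground_truth: dict[str, str]) -> dict[str, float]:
    """Per-label error rate; a label missing from the OCR output counts as empty."""
    return {
        label: character_error_rate(ocr_output.get(label, ""), value)
        for label, value in ground_truth.items()
    }
```

Scoring by label rather than by position on the page keeps alignment out of the measurement, which matches the stated goal.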
