Paper Revision: {2025.coling-main.690}, closes #4496.
anthology-assist committed Feb 7, 2025
1 parent 89ddfb8 commit 1e1b63b
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion data/xml/2025.coling.xml
@@ -8200,8 +8200,10 @@
<author><first>Veronique</first><last>Hoste</last></author>
<pages>10367–10374</pages>
<abstract>This paper explores the effectiveness of two types of transformer models — large generative models and sequence-to-sequence models — for automatically post-correcting Optical Character Recognition (OCR) output in early modern Dutch plays. To address the need for optimally aligned data, we create a parallel dataset based on the OCRed and ground truth versions from the EmDComF corpus using state-of-the-art alignment techniques. By combining character-based and semantic methods, we design and release a qualitative OCR-to-gold parallel dataset, selecting the alignment with the lowest Character Error Rate (CER) for all alignment pairs. We then fine-tune and evaluate five generative models and four sequence-to-sequence models on the OCR post-correction dataset. Results show that sequence-to-sequence models generally outperform generative models in this task, correcting more OCR errors and overgenerating and undergenerating less, with mBART as the best performing system.</abstract>
-<url hash="c785a9d6">2025.coling-main.690</url>
+<url hash="dab40eb0">2025.coling-main.690</url>
<bibkey>debaene-etal-2025-evaluating</bibkey>
+<revision id="1" href="2025.coling-main.690v1" hash="c785a9d6"/>
+<revision id="2" href="2025.coling-main.690v2" hash="dab40eb0" date="2025-02-06">Added citation to Dutch generative LLM Fietje.</revision>
</paper>
<paper id="691">
<title><fixed-case>BANER</fixed-case>: Boundary-Aware <fixed-case>LLM</fixed-case>s for Few-Shot Named Entity Recognition</title>
