Paper Title:
Quality Estimation via Backtranslation at the WMT 2022 Quality Estimation Task.
Introduction:
This paper describes the authors' submission to the WMT 2022 shared task on sentence-level quality estimation (QE). The authors compare supervised and unsupervised methods for estimating the quality of machine translations, focusing in particular on backtranslation as an unsupervised method. They highlight the strengths and weaknesses of backtranslation for QE and explore how it can be combined with supervised QE methods to improve performance.
Main Problem:
Supervised quality estimation methods depend on human quality judgments for training data, which are time-consuming and labor-intensive to collect. This research investigates whether unsupervised methods such as backtranslation can offer a viable alternative. The results show that backtranslation on its own performs worse than supervised QE systems, but its performance improves when combined with off-the-shelf QE scorers, making it a complementary technique for supervised QE systems.
Illustrative Example:
Source sentence: "Arif Lohar briefly went into acting in Punjabi movies before returning to his music career at the age of 22."
Machine Translation (MT) Output (Czech): "Arif Lohar krátce začal hrát v Punjabi filmech, než se v roce 22 vrátil ke své hudební kariéře."
Back-Translated Source: "Arif Lohar briefly began acting in Punjabi films before returning to his musical career in the year 22." Comparing this backtranslation with the original source exposes the MT error: "at the age of 22" became "in the year 22," a discrepancy that survives the round trip and can be detected automatically, as the sketch below shows.
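To make the comparison step concrete, here is a minimal sketch (not from the paper) that scores the backtranslation against the original source with sentence-level chrF, assuming the sacrebleu package; the paper's exact scoring setup may differ.

```python
# Score the backtranslation against the original source with chrF.
from sacrebleu.metrics import CHRF

source = ("Arif Lohar briefly went into acting in Punjabi movies "
          "before returning to his music career at the age of 22.")
back_translation = ("Arif Lohar briefly began acting in Punjabi films "
                    "before returning to his musical career in the year 22.")

# sentence_score takes a hypothesis string and a list of reference strings.
score = CHRF().sentence_score(back_translation, [source])
print(f"chrF: {score.score:.2f}")  # a low score signals a likely MT error
```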
Input:
A sentence in the source language (English) and its MT output (Czech).
Output:
An estimate of the translation's quality, quantified by metrics such as the BT-score (backtranslation score) and FT-score (forward translation score).
Motivation:
The motivation is to enhance unsupervised quality estimation methods, thereby reducing the reliance on human evaluators for machine translation quality assessment.
Related Works and Their Gaps:
Moon et al. (2020) revisited round-trip translation for quality estimation with neural systems but reported low correlation on segment-level tasks.
The YiSi-2 metric (Lo, 2019) performed worse than the approach used in this paper, but the two proved complementary, addressing different quality estimation needs.
Contribution of This Paper:
The paper proposes enhancing unsupervised machine translation quality estimation by combining backtranslation-based scores with off-the-shelf QE metrics, improving correlation with human judgments.
Proposed Methods:
The method uses Facebook's mBART-50 model to backtranslate the machine translation output into the source language and then compares the result to the original source sentence using metrics such as BLEU, chrF, and BERTScore. These backtranslation-based scores are combined with existing QE metrics to improve overall performance.
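A minimal sketch of this pipeline, assuming the many-to-many mBART-50 checkpoint on the Hugging Face hub (facebook/mbart-large-50-many-to-many-mmt) and chrF as the comparison metric; the paper's exact checkpoint, decoding settings, and score combination are not specified in this summary and may differ.

```python
# Backtranslate a Czech MT output into English with mBART-50, then
# compare the backtranslation to the original English source.
from sacrebleu.metrics import CHRF
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

CKPT = "facebook/mbart-large-50-many-to-many-mmt"  # assumed checkpoint
tokenizer = MBart50TokenizerFast.from_pretrained(CKPT)
model = MBartForConditionalGeneration.from_pretrained(CKPT)

def backtranslate(mt_output: str, src_lang: str = "cs_CZ",
                  tgt_lang: str = "en_XX") -> str:
    """Translate the MT output back into the source language."""
    tokenizer.src_lang = src_lang
    encoded = tokenizer(mt_output, return_tensors="pt")
    generated = model.generate(
        **encoded,
        # Force the decoder to start in the target language.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        num_beams=5,  # assumed decoding setting, not from the paper
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

source = ("Arif Lohar briefly went into acting in Punjabi movies "
          "before returning to his music career at the age of 22.")
mt_output = ("Arif Lohar krátce začal hrát v Punjabi filmech, "
             "než se v roce 22 vrátil ke své hudební kariéře.")

bt = backtranslate(mt_output)
# BT-score: similarity between the backtranslation and the original source.
bt_score = CHRF().sentence_score(bt, [source]).score
print(f"backtranslation: {bt}")
print(f"BT-score (chrF): {bt_score:.2f}")
```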
Experiments:
Model: The authors used Facebook's mBART-50 for backtranslation and evaluated their approach with BLEU, chrF, and BERTScore metrics.
Evaluation Metrics: They used Pearson and Spearman correlation coefficients between the backtranslation-based metrics and human judgments for sentence-level quality estimation.
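A minimal sketch of the correlation computation with SciPy; the score values below are placeholders, not results from the paper.

```python
# Correlate sentence-level metric scores with human judgments.
from scipy.stats import pearsonr, spearmanr

bt_scores = [62.1, 48.3, 71.0, 55.4, 39.8]     # placeholder chrF BT-scores
human_scores = [78.0, 55.0, 85.0, 60.0, 42.0]  # placeholder human judgments

pearson, _ = pearsonr(bt_scores, human_scores)
spearman, _ = spearmanr(bt_scores, human_scores)
print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```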
Gaps in This Work:
The authors note that their approach lacks comparative evaluation against other existing methods for machine translation quality estimation, which would provide a broader context for the effectiveness of their method.