diff --git a/data/xml/2021.acl.xml b/data/xml/2021.acl.xml
index 6fc5cf57b9..cbf1f5df9e 100644
--- a/data/xml/2021.acl.xml
+++ b/data/xml/2021.acl.xml
@@ -12689,6 +12689,15 @@
2021.acl-tutorials.1
10.18653/v1/2021.acl-tutorials.1
bar-haim-etal-2021-advances
+
+
+
+
+
+
+
+
+
Event-Centric Natural Language Processing
@@ -12704,6 +12713,7 @@
2021.acl-tutorials.2
10.18653/v1/2021.acl-tutorials.2
chen-etal-2021-event
+
Meta Learning and Its Applications to Natural Language Processing
@@ -12725,6 +12735,9 @@
2021.acl-tutorials.4
10.18653/v1/2021.acl-tutorials.4
wang-li-2021-pre
+
+
+
Prosody: Models, Methods, and Applications
@@ -12765,6 +12778,166 @@
+
+ 59th Annual Meeting of the Association for Computational Linguistics
+ Online
+ August 1–6, 2021
+
+
+ https://2021.aclweb.org
+
+
+ Keynote 1: Advancing Technological Equity in Speech and Language Processing
+ Helen Meng
+ 2021.acl.keynote1.mp4
+
+
+ Keynote 2: Learning and Processing Language from Wearables: Opportunities and Challenges
+ Alejandrina Cristia
+ 2021.acl.keynote2.mp4
+
+
+ Keynote 3: Reliable Characterizations of NLP Systems as a Social Responsibility
+ Christopher Potts
+ 2021.acl.keynote3.mp4
+
+
+ Opening Session
+ Chengqing Zong
+ Roberto Navigli
+ Fei Xia
+ Wenjie Li
+ 2021.acl.opening-session.mp4
+
+
+ Presidential Address
+ Rada Mihalcea
+ 2021.acl.presidential-address.mp4
+
+
+ Business Meeting: Business Manager Report
+ Priscilla Rasmussen
+ 2021.acl.business-meeting1.mp4
+
+
+ Business Meeting: Conference Officer Report
+ Yusuke Miyao
+ 2021.acl.business-meeting2.mp4
+
+
+ Business Meeting: EACL Report
+ Shuly Wintner
+ 2021.acl.business-meeting3.mp4
+
+
+ Business Meeting: NAACL Report
+ Colin Cherry
+ 2021.acl.business-meeting4.mp4
+
+
+ Business Meeting: CL Report
+ Hwee Tou Ng
+ 2021.acl.business-meeting5.mp4
+
+
+ Business Meeting: TACL Report
+ Brian Roark
+ 2021.acl.business-meeting6.mp4
+
+
+ Business Meeting: ACL Rolling Review Report
+ Amanda Stent
+ 2021.acl.business-meeting7.mp4
+
+
+ Business Meeting: ACL Anthology Report
+ Matt Post
+ 2021.acl.business-meeting8.mp4
+
+
+ Business Meeting: Professional Conduct Committee Report
+ Graeme Hirst
+ Emily M. Bender
+ 2021.acl.business-meeting9.mp4
+
+
+ Business Meeting: Publicity Director Report
+ Barbara Plank
+ Sarvnaz Karimi
+ Caroline Lawrence
+ 2021.acl.business-meeting10.mp4
+
+
+ Business Meeting: Secretary's Report
+ Shiqi Zhao
+ 2021.acl.business-meeting11.mp4
+
+
+ Business Meeting: Treasurer's Report
+ David Yarowsky
+ 2021.acl.business-meeting12.mp4
+
+
+ Business Meeting: Sponsorship Director Report
+ Chris Callison-Burch
+ 2021.acl.business-meeting13.mp4
+
+
+ Business Meeting: ACL 2023 Call for Bids
+ Timothy Baldwin
+ 2021.acl.business-meeting111.mp4
+
+
+ Business Meeting: ACL 2024 Call for Bids
+ Iryna Gurevych
+ 2021.acl.business-meeting121.mp4
+
+
+ Panel: Green NLP
+ Roy Schwartz
+ Colin Raffel
+ Phil Blunsom
+ Jesse Dodge
+ Yuki Arase
+ Noah A. Smith
+ Emma Strubell
+ Percy Liang
+ Mona Diab
+ Iryna Gurevych
+ Yue Zhang
+ 2021.acl.panel.mp4
+
+
+ Distinguished Service Awards and Test of Time Awards Session
+ Rada Mihalcea
+ 2021.acl.test-of-time-award.mp4
+
+
+ Lifetime Achievement Award Session
+ Rada Mihalcea
+ Junichi Tsujii
+ 2021.acl.lifetime-achievement-award.mp4
+
+
+ Best Papers Awards Session
+ Timothy Baldwin
+ 2021.acl.best-papers.mp4
+
+
+ Closing Session
+ Chengqing Zong
+ Bonnie Weber
+ Anna Korhonen
+ Vivek Gupta
+ David Trye
+ Marie-Francine Moens
+ Dan Roth
+ Chu-Ren Huang
+ Roberto Navigli
+ Fei Xia
+ Maggie Li
+ 2021.acl.closing-session.mp4
+
2021.findings-acl
2021.bppf-1
diff --git a/data/xml/2021.emnlp.xml b/data/xml/2021.emnlp.xml
index 4600208ef9..227d1d963c 100644
--- a/data/xml/2021.emnlp.xml
+++ b/data/xml/2021.emnlp.xml
@@ -14140,6 +14140,49 @@
+
+ The 2021 Conference on Empirical Methods in Natural Language Processing
+ Punta Cana, Dominican Republic
+ November 7-11, 2021
+
+
+ https://2021.emnlp.org
+ 2021.emnlp.handbook.pdf
+
+
+ Keynote 1: Where next? Towards multi-text consumption via three inspired research lines
+ Ido Dagan
+ 2021.emnlp.keynote1.mp4
+
+
+ Keynote 2: The language system in the human brain
+ Evelina Fedorenko
+ 2021.emnlp.keynote2.mp4
+
+
+ Keynote 3: LT4All!? Rethinking the Agenda
+ Steven Bird
+ 2021.emnlp.keynote3.mp4
+
+
+ Opening Session
+ Sien Moens
+ Xuanjing Huang
+ Lucia Specia
+ Scott Yih
+ 2021.emnlp.opening-session.mp4
+
+
+ Best Papers Awards Session and Closing Session
+ Sien Moens
+ Lucia Specia
+ Preslav Nakov
+ Dan Roth
+ Key-Sun Choi
+ Noah A. Smith
+ Chia-Hui Chang
+ 2021.emnlp.closing-session.mp4
+
2021.findings-emnlp
2021.codi-main
diff --git a/data/xml/2021.naacl.xml b/data/xml/2021.naacl.xml
index b35ac7caab..bcb6a82f88 100644
--- a/data/xml/2021.naacl.xml
+++ b/data/xml/2021.naacl.xml
@@ -8093,6 +8093,11 @@
10.18653/v1/2021.naacl-tutorials.5
beltagy-etal-2021-beyond
allenai/naacl2021-longdoc-tutorial
+
+
+
+
+
Crowdsourcing Natural Language Data at Scale: A Hands-On Tutorial
@@ -8718,6 +8723,98 @@
+
+ 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics
+ Online
+ June 6-11, 2021
+
+
+ https://2021.naacl.org
+
+
+ Keynote 1: Humans Learn From Task Descriptions and So Should Our Models
+ Hinrich Schütze
+ 2021.naacl.keynote1.mp4
+
+
+ Keynote 2: From Disembodied to Embodied Multimodal Learning
+ Dhruv Batra
+ 2021.naacl.keynote2.mp4
+
+
+ Keynote 3: Generating Reality: Technical and Social Explorations in Generative Machine Learning Research
+ Shakir Mohamed
+ 2021.naacl.keynote3.mp4
+
+
+ Keynote 4: Moving the Needle in NLP Technology for the Processing of Code-switched Language
+ Thamar Solorio
+ 2021.naacl.keynote4.mp4
+
+
+ Industry Track Keynote 1: Project Debater - from grand challenge to business applications, behind the scenes and lessons learned
+ Aya Soffer
+ 2021.naacl.keynote_industry1.mp4
+
+
+ Industry Track Keynote 2: Semantic Scholar - Advanced NLP to Accelerate Scientific Research
+ Dan Weld
+ 2021.naacl.keynote_industry2.mp4
+
+
+ Opening Session
+ Kristina Toutanova
+ Anna Rumshisky
+ Luke Zettlemoyer
+ Dilek Hakkani-Tur
+ 2021.naacl.opening-session.mp4
+
+
+ Business Meeting
+ Colin Cherry
+ Jon May
+ Amittai Axelrod
+ Luciana Benotti
+ Owen Rambow
+ Vivek Gupta
+ Anna Rumshisky
+ Marie-Francine Moens
+ Bernardo Magnini
+ 2021.naacl.business-meeting.mp4
+
+
+ Panel: Careers in NLP
+ Isabelle Augenstein
+ Mona Diab
+ Jimmy Lin
+ Sebastian Ruder
+ Philip Resnik
+ 2021.naacl.panel1.mp4
+
+
+ Panel: Startups in NLP
+ Apoorv Agarwal
+ Spence Green
+ Nasrin Mostafazadeh
+ Kieran Snyder
+ Alon Lavie
+ 2021.naacl.panel2.mp4
+
+
+ Best Papers Awards Session
+ Anna Rumshisky
+ 2021.naacl.best-papers.mp4
+
+
+ Closing Session
+ Kristina Toutanova
+ Avi Sil
+ Victoria Lin
+ Marie-Francine Moens
+ Dan Roth
+ Bernardo Magnini
+ 2021.naacl.closing-session.mp4
+
2021.alvr-1
2021.americasnlp-1
diff --git a/data/xml/2022.acl.xml b/data/xml/2022.acl.xml
index 16b60c47b5..4272f3e549 100644
--- a/data/xml/2022.acl.xml
+++ b/data/xml/2022.acl.xml
@@ -3537,6 +3537,7 @@
10.18653/v1/2022.acl-long.220
thomaslu2000/incremental-parsing-representations
Penn Treebank
+
Knowledge Enhanced Reflection Generation for Counseling Dialogues
@@ -4797,6 +4798,7 @@ in the Case of Unambiguous Gender
Various fixes throughout the paper.
10.18653/v1/2022.acl-long.298
+
Improving Word Translation via Two-Stage Contrastive Learning
@@ -8223,6 +8225,7 @@ in the Case of Unambiguous Gender
pine-etal-2022-requirements
10.18653/v1/2022.acl-long.507
roedoejet/fastspeech2
+
Sharpness-Aware Minimization Improves Language Model Generalization
@@ -12269,6 +12272,8 @@ in the Case of Unambiguous Gender
church-etal-2022-gentle
10.18653/v1/2022.acl-tutorials.1
GLUE
+
+
Towards Reproducible Machine Learning Research in Natural Language Processing
@@ -12285,6 +12290,8 @@ in the Case of Unambiguous Gender
2022.acl-tutorials.2
lucic-etal-2022-towards
10.18653/v1/2022.acl-tutorials.2
+
+
Knowledge-Augmented Methods for Natural Language Processing
@@ -12303,6 +12310,10 @@ in the Case of Unambiguous Gender
CommonsenseQA
ConceptNet
RiddleSense
+
+
+
+
Non-Autoregressive Sequence Generation
@@ -12313,6 +12324,8 @@ in the Case of Unambiguous Gender
2022.acl-tutorials.4
gu-tan-2022-non
10.18653/v1/2022.acl-tutorials.4
+
+
Learning with Limited Text Data
@@ -12324,6 +12337,7 @@ in the Case of Unambiguous Gender
2022.acl-tutorials.5
yang-etal-2022-learning
10.18653/v1/2022.acl-tutorials.5
+
Zero- and Few-Shot NLP with Pretrained Language Models
@@ -12337,6 +12351,8 @@ in the Case of Unambiguous Gender
2022.acl-tutorials.6
beltagy-etal-2022-zero
10.18653/v1/2022.acl-tutorials.6
+
+
Vision-Language Pretraining: Current Trends and the Future
@@ -12349,6 +12365,9 @@ in the Case of Unambiguous Gender
agrawal-etal-2022-vision
10.18653/v1/2022.acl-tutorials.7
Visual Question Answering
+
+
+
Natural Language Processing for Multilingual Task-Oriented Dialogue
@@ -12362,6 +12381,8 @@ in the Case of Unambiguous Gender
2022.acl-tutorials.8
razumovskaia-etal-2022-natural
10.18653/v1/2022.acl-tutorials.8
+
+
@@ -12386,13 +12407,119 @@ in the Case of Unambiguous Gender
2022.acl.keynote2.mp4
- ACL business meeting
+ The Next Big Ideas Talks
+ Marco Baroni
+ Eduard Hovy
+ Heng Ji
+ Mirella Lapata
+ Hang Li
+ Dan Roth
+ Thamar Solorio
+ 2022.acl.next-big-ideas.mp4
+
+
+ Spotlight Talks for Young Rising Stars
+ Eunsol Choi
+ Ryan Cotterell
+ Sebastian Ruder
+ Swabha Swayamdipta
+ Diyi Yang
+ 2022.acl.spotlight-talks.mp4
+
+
+ Opening Session and Presidential Address
+ Bernardo Magnini
+ Andy Way
+ John Kelleher
+ Preslav Nakov
+ Aline Villavicencio
+ Smaranda Muresan
+ Mona Diab
Timothy Baldwin
- David Yarowsky
- Rada Mihalcea
- Emily M. Bender
+ 2022.acl.opening-session.mp4
+
+
+ Business Meeting: TACL Report
+ Brian Roark
+ 2022.acl.business-meeting1.mp4
+
+
+ Business Meeting: Secretary's Report
+ Shiqi Zhao
+ 2022.acl.business-meeting2.mp4
+
+
+ Business Meeting: EACL Report
+ Shuly Wintner
+ 2022.acl.business-meeting3.mp4
+
+
+ Business Meeting: ACL 2024 Coordinating Committee Report
Iryna Gurevych
- 2022.acl.business-meeting.mp4
+ 2022.acl.business-meeting4.mp4
+
+
+ Business Meeting: NAACL Chair Report
+ Luciana Benotti
+ 2022.acl.business-meeting5.mp4
+
+
+ Business Meeting: Sponsorship Director Report
+ Chris Callison-Burch
+ 2022.acl.business-meeting6.mp4
+
+
+ Business Meeting: Professional Conduct Committee Report
+ Graeme Hirst
+ Emily M. Bender
+ 2022.acl.business-meeting7.mp4
+
+
+ Business Meeting: Ethics Committee Report
+ Min-Yen Kan
+ 2022.acl.business-meeting8.mp4
+
+
+ Business Meeting: Conference Officer Report
+ Yusuke Miyao
+ 2022.acl.business-meeting9.mp4
+
+
+ Business Meeting: Publicity Director Report
+ Barbara Plank
+ 2022.acl.business-meeting10.mp4
+
+
+ Business Meeting: ACL 2025 Call for Bids
+ Emily M. Bender
+ 2022.acl.business-meeting11.mp4
+
+
+ Awards Ceremony
+ Timothy Baldwin
+ Yusuke Miyao
+ 2022.acl.awards-ceremony.mp4
+
+
+ EMNLP 2022 Information Session
+ Nizar Habash
+ 2022.acl.emnlp-2022.mp4
+
+
+ Best Papers Awards Session
+ Bernardo Magnini
+ 2022.acl.best-papers.mp4
+
+
+ Closing Session
+ Bernardo Magnini
+ Dan Roth
+ Key-Sun Choi
+ Nizar Habash
+ Alessandro Moschitti
+ Yulan He
+ Mona Diab
+ 2022.acl.closing-session.mp4
2022.findings-acl
diff --git a/data/xml/2022.bea.xml b/data/xml/2022.bea.xml
index cd7e534504..d2e01871e8 100644
--- a/data/xml/2022.bea.xml
+++ b/data/xml/2022.bea.xml
@@ -74,6 +74,7 @@
2022.bea-1.4
chen-etal-2022-automatically
10.18653/v1/2022.bea-1.4
+
A Baseline Readability Model for Cebuano
diff --git a/data/xml/2022.bigscience.xml b/data/xml/2022.bigscience.xml
index 903aae6694..54462c559e 100644
--- a/data/xml/2022.bigscience.xml
+++ b/data/xml/2022.bigscience.xml
@@ -33,6 +33,7 @@
2022.bigscience-1.1
jin-etal-2022-lifelong
10.18653/v1/2022.bigscience-1.1
+
Using ASR-Generated Text for Spoken Language Modeling
diff --git a/data/xml/2022.cl.xml b/data/xml/2022.cl.xml
index 7f96c1b19f..adfb853ca7 100644
--- a/data/xml/2022.cl.xml
+++ b/data/xml/2022.cl.xml
@@ -397,6 +397,7 @@
849–886
2022.cl-4.13
nivre-etal-2022-nucleus
+
Effective Approaches to Neural Query Language Identification
@@ -412,6 +413,7 @@
887–906
2022.cl-4.14
ren-etal-2022-effective
+
Information Theory–based Compositional Distributional Semantics
@@ -424,6 +426,7 @@
907–948
2022.cl-4.15
amigo-etal-2022-information
+
Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review
@@ -458,6 +461,7 @@
1021–1052
2022.cl-4.18
keya-etal-2022-neural
+
The Text Anonymization Benchmark (TAB): A Dedicated Corpus and Evaluation Framework for Text Anonymization
@@ -472,6 +476,7 @@
1053–1101
2022.cl-4.19
pilan-etal-2022-text
+
How Much Does Lookahead Matter for Disambiguation? Partial Arabic Diacritization Case Study
@@ -483,6 +488,7 @@
1103–1123
2022.cl-4.20
esmail-etal-2022-much
+
A Metrological Perspective on Reproducibility in NLP*
diff --git a/data/xml/2022.emnlp.xml b/data/xml/2022.emnlp.xml
index 7cbc1045ed..b0381e0022 100644
--- a/data/xml/2022.emnlp.xml
+++ b/data/xml/2022.emnlp.xml
@@ -28,6 +28,7 @@
2022.emnlp-main.1
ye-etal-2022-generative
10.18653/v1/2022.emnlp-main.1
+
CDConv: A Benchmark for Contradiction Detection in Chinese Conversations
@@ -45,6 +46,7 @@
2022.emnlp-main.2
zheng-etal-2022-cdconv
10.18653/v1/2022.emnlp-main.2
+
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
@@ -57,6 +59,7 @@
2022.emnlp-main.3
geva-etal-2022-transformer
10.18653/v1/2022.emnlp-main.3
+
Learning to Generate Question by Asking Question: A Primal-Dual Approach with Uncommon Word Generation
@@ -73,6 +76,7 @@
2022.emnlp-main.4
wang-etal-2022-learning-generate
10.18653/v1/2022.emnlp-main.4
+
Graph-based Model Generation for Few-Shot Relation Extraction
@@ -83,6 +87,7 @@
2022.emnlp-main.5
li-qian-2022-graph
10.18653/v1/2022.emnlp-main.5
+
Backdoor Attacks in Federated Learning by Rare Embeddings and Gradient Ensembling
@@ -93,6 +98,7 @@
2022.emnlp-main.6
yoo-kwak-2022-backdoor
10.18653/v1/2022.emnlp-main.6
+
Generating Natural Language Proofs with Verifier-Guided Search
@@ -104,6 +110,7 @@
2022.emnlp-main.7
yang-etal-2022-generating
10.18653/v1/2022.emnlp-main.7
+
Toward Unifying Text Segmentation and Long Document Summarization
@@ -117,6 +124,7 @@
2022.emnlp-main.8
cho-etal-2022-toward
10.18653/v1/2022.emnlp-main.8
+
The Geometry of Multilingual Language Model Representations
@@ -128,6 +136,7 @@
2022.emnlp-main.9
chang-etal-2022-geometry
10.18653/v1/2022.emnlp-main.9
+
Improving Complex Knowledge Base Question Answering via Question-to-Action and Question-to-Question Alignment
@@ -140,6 +149,7 @@
2022.emnlp-main.10.software.zip
tang-etal-2022-improving
10.18653/v1/2022.emnlp-main.10
+
PAIR: Prompt-Aware margIn Ranking for Counselor Reflection Scoring in Motivational Interviewing
@@ -152,6 +162,7 @@
2022.emnlp-main.11
min-etal-2022-pair
10.18653/v1/2022.emnlp-main.11
+
Co-guiding Net: Achieving Mutual Guidances between Multiple Intent Detection and Slot Filling via Heterogeneous Semantics-Label Graphs
@@ -162,6 +173,7 @@
2022.emnlp-main.12
xing-tsang-2022-co
10.18653/v1/2022.emnlp-main.12
+
The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains
@@ -173,6 +185,7 @@
2022.emnlp-main.13
xu-etal-2022-importance
10.18653/v1/2022.emnlp-main.13
+
Interpreting Language Models with Contrastive Explanations
@@ -183,6 +196,7 @@
2022.emnlp-main.14
yin-neubig-2022-interpreting
10.18653/v1/2022.emnlp-main.14
+
RankGen: Improving Text Generation with Large Ranking Models
@@ -195,6 +209,7 @@
2022.emnlp-main.15
krishna-etal-2022-rankgen
10.18653/v1/2022.emnlp-main.15
+
Learning a Grammar Inducer from Massive Uncurated Instructional Videos
@@ -210,6 +225,7 @@
2022.emnlp-main.16
zhang-etal-2022-learning-grammar
10.18653/v1/2022.emnlp-main.16
+
Normalized Contrastive Learning for Text-Video Retrieval
@@ -225,6 +241,7 @@
2022.emnlp-main.17
park-etal-2022-normalized
10.18653/v1/2022.emnlp-main.17
+
Estimating Soft Labels for Out-of-Domain Intent Detection
@@ -239,6 +256,7 @@
2022.emnlp-main.18
lang-etal-2022-estimating
10.18653/v1/2022.emnlp-main.18
+
Multi-VQG: Generating Engaging Questions for Multiple Images
@@ -251,6 +269,7 @@
2022.emnlp-main.19
yeh-etal-2022-multi
10.18653/v1/2022.emnlp-main.19
+
Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation
@@ -264,6 +283,7 @@
2022.emnlp-main.20
bulian-etal-2022-tomayto
10.18653/v1/2022.emnlp-main.20
+
Non-Parametric Domain Adaptation for End-to-End Speech Translation
@@ -291,6 +311,7 @@
2022.emnlp-main.22
cao-etal-2022-prompting
10.18653/v1/2022.emnlp-main.22
+
Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking
@@ -304,6 +325,7 @@
2022.emnlp-main.23
li-etal-2022-certified
10.18653/v1/2022.emnlp-main.23
+
Linearizing Transformer with Key-Value Memory
@@ -314,6 +336,7 @@
2022.emnlp-main.24
zhang-cai-2022-linearizing
10.18653/v1/2022.emnlp-main.24
+
Robustness of Fusion-based Multimodal Classifiers to Cross-Modal Content Dilutions
@@ -326,6 +349,7 @@
2022.emnlp-main.25
verma-etal-2022-robustness
10.18653/v1/2022.emnlp-main.25
+
Translation between Molecules and Natural Language
@@ -340,6 +364,7 @@
2022.emnlp-main.26
edwards-etal-2022-translation
10.18653/v1/2022.emnlp-main.26
+
What Makes Instruction Learning Hard? An Investigation and a New Challenge in a Synthetic Environment
@@ -352,6 +377,7 @@
2022.emnlp-main.27
finlayson-etal-2022-makes
10.18653/v1/2022.emnlp-main.27
+
Sentence-Incremental Neural Coreference Resolution
@@ -364,6 +390,7 @@
2022.emnlp-main.28.software.zip
grenander-etal-2022-sentence
10.18653/v1/2022.emnlp-main.28
+
SNaC: Coherence Error Detection for Narrative Summarization
@@ -375,6 +402,7 @@
2022.emnlp-main.29
goyal-etal-2022-snac
10.18653/v1/2022.emnlp-main.29
+
HydraSum: Disentangling Style Features in Text Summarization with Multi-Decoder Models
@@ -387,6 +415,7 @@
2022.emnlp-main.30
goyal-etal-2022-hydrasum
10.18653/v1/2022.emnlp-main.30
+
A Good Neighbor, A Found Treasure: Mining Treasured Neighbors for Knowledge Graph Entity Typing
@@ -400,6 +429,7 @@
2022.emnlp-main.31
jin-etal-2022-good-neighbor
10.18653/v1/2022.emnlp-main.31
+
Guiding Neural Entity Alignment with Compatibility
@@ -414,6 +444,7 @@
2022.emnlp-main.32
liu-etal-2022-guiding
10.18653/v1/2022.emnlp-main.32
+
InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning
@@ -428,6 +459,7 @@
2022.emnlp-main.33
gupta-etal-2022-instructdial
10.18653/v1/2022.emnlp-main.33
+
Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling
@@ -443,6 +475,7 @@
2022.emnlp-main.34.software.zip
jiang-etal-2022-unsupervised
10.18653/v1/2022.emnlp-main.34
+
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
@@ -455,6 +488,7 @@
2022.emnlp-main.35
xiao-etal-2022-retromae
10.18653/v1/2022.emnlp-main.35
+
Aligning Recommendation and Conversation via Dual Imitation
@@ -470,6 +504,7 @@
2022.emnlp-main.36
zhou-etal-2022-aligning
10.18653/v1/2022.emnlp-main.36
+
QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance
@@ -483,6 +518,7 @@
2022.emnlp-main.37.software.zip
wang-etal-2022-qrelscore
10.18653/v1/2022.emnlp-main.37
+
Abstract Visual Reasoning with Tangram Shapes
@@ -498,6 +534,7 @@
2022.emnlp-main.38
ji-etal-2022-abstract
10.18653/v1/2022.emnlp-main.38
+
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
@@ -529,6 +566,7 @@
2022.emnlp-main.39
xie-etal-2022-unifiedskg
10.18653/v1/2022.emnlp-main.39
+
Balanced Adversarial Training: Balancing Tradeoffs between Fickleness and Obstinacy in NLP Models
@@ -540,6 +578,7 @@
2022.emnlp-main.40
chen-etal-2022-balanced
10.18653/v1/2022.emnlp-main.40
+
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
@@ -552,6 +591,7 @@
2022.emnlp-main.41.software.zip
sikarwar-etal-2022-transformers
10.18653/v1/2022.emnlp-main.41
+
Generative Language Models for Paragraph-Level Question Generation
@@ -563,6 +603,7 @@
2022.emnlp-main.42
ushio-etal-2022-generative
10.18653/v1/2022.emnlp-main.42
+
A Unified Encoder-Decoder Framework with Entity Memory
@@ -575,6 +616,7 @@
2022.emnlp-main.43
zhang-etal-2022-unified
10.18653/v1/2022.emnlp-main.43
+
Segmenting Numerical Substitution Ciphers
@@ -585,6 +627,7 @@
2022.emnlp-main.44
aldarrab-may-2022-segmenting
10.18653/v1/2022.emnlp-main.44
+
Crossmodal-3600: A Massively Multilingual Multimodal Evaluation Dataset
@@ -597,6 +640,7 @@
2022.emnlp-main.45
thapliyal-etal-2022-crossmodal
10.18653/v1/2022.emnlp-main.45
+
ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select
@@ -613,6 +657,7 @@
2022.emnlp-main.46
zhuang-etal-2022-resel
10.18653/v1/2022.emnlp-main.46
+
GammaE: Gamma Embeddings for Logical Queries on Knowledge Graphs
@@ -627,6 +672,7 @@
2022.emnlp-main.47.software.zip
yang-etal-2022-gammae
10.18653/v1/2022.emnlp-main.47
+
Reasoning Like Program Executors
@@ -644,6 +690,7 @@
2022.emnlp-main.48
pi-etal-2022-reasoning
10.18653/v1/2022.emnlp-main.48
+
SEM-F1: an Automatic Way for Semantic Evaluation of Multi-Narrative Overlap Summaries at Scale
@@ -655,6 +702,7 @@
2022.emnlp-main.49
bansal-etal-2022-sem
10.18653/v1/2022.emnlp-main.49
+
Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning
@@ -669,6 +717,7 @@
2022.emnlp-main.50
chen-etal-2022-inducer
10.18653/v1/2022.emnlp-main.50
+
DocInfer: Document-level Natural Language Inference using Optimal Evidence Selection
@@ -683,6 +732,7 @@
2022.emnlp-main.51
mathur-etal-2022-docinfer
10.18653/v1/2022.emnlp-main.51
+
LightEA: A Scalable, Robust, and Interpretable Entity Alignment Framework via Three-view Label Propagation
@@ -695,6 +745,7 @@
2022.emnlp-main.52
mao-etal-2022-lightea
10.18653/v1/2022.emnlp-main.52
+
Metric-guided Distillation: Distilling Knowledge from the Metric to Ranker and Retriever for Generative Commonsense Reasoning
@@ -713,6 +764,7 @@
2022.emnlp-main.53
he-etal-2022-metric
10.18653/v1/2022.emnlp-main.53
+
Efficient Document Retrieval by End-to-End Refining and Quantizing BERT Embedding with Contrastive Product Quantization
@@ -726,6 +778,7 @@
qiu-etal-2022-efficient
2022.emnlp-main.54.software.zip
10.18653/v1/2022.emnlp-main.54
+
Curriculum Knowledge Distillation for Emoji-supervised Cross-lingual Sentiment Analysis
@@ -739,6 +792,7 @@
2022.emnlp-main.55
zhang-etal-2022-curriculum
10.18653/v1/2022.emnlp-main.55
+
Correctable-DST: Mitigating Historical Context Mismatch between Training and Inference for Improved Dialogue State Tracking
@@ -756,6 +810,7 @@
2022.emnlp-main.56
xie-etal-2022-correctable
10.18653/v1/2022.emnlp-main.56
+
DropMix: A Textual Data Augmentation Combining Dropout with Mixup
@@ -769,6 +824,7 @@
2022.emnlp-main.57
kong-etal-2022-dropmix
10.18653/v1/2022.emnlp-main.57
+
Cross-document Event Coreference Search: Task, Dataset and Modeling
@@ -780,6 +836,7 @@
2022.emnlp-main.58
eirew-etal-2022-cross
10.18653/v1/2022.emnlp-main.58
+
VIRT: Improving Representation-based Text Matching via Virtual Interaction
@@ -797,6 +854,7 @@
2022.emnlp-main.59
li-etal-2022-virt
10.18653/v1/2022.emnlp-main.59
+
MAVEN-ERE: A Unified Large-scale Dataset for Event Coreference, Temporal, Causal, and Subevent Relation Extraction
@@ -817,6 +875,7 @@
2022.emnlp-main.60
wang-etal-2022-maven
10.18653/v1/2022.emnlp-main.60
+
Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models
@@ -829,6 +888,7 @@
2022.emnlp-main.61
mahapatra-etal-2022-entity
10.18653/v1/2022.emnlp-main.61
+
How Large Language Models are Transforming Machine-Paraphrase Plagiarism
@@ -841,6 +901,7 @@
2022.emnlp-main.62
wahle-etal-2022-large
10.18653/v1/2022.emnlp-main.62
+
M2D2: A Massively Multi-Domain Language Modeling Dataset
@@ -853,6 +914,7 @@
2022.emnlp-main.63
reid-etal-2022-m2d2
10.18653/v1/2022.emnlp-main.63
+
“Will You Find These Shortcuts?” A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
@@ -866,6 +928,7 @@
2022.emnlp-main.64
bastings-etal-2022-will
10.18653/v1/2022.emnlp-main.64
+
Information-Transport-based Policy for Simultaneous Translation
@@ -876,6 +939,7 @@
2022.emnlp-main.65
zhang-feng-2022-information
10.18653/v1/2022.emnlp-main.65
+
Learning to Adapt to Low-Resource Paraphrase Generation
@@ -890,6 +954,7 @@
2022.emnlp-main.66
li-etal-2022-learning-adapt
10.18653/v1/2022.emnlp-main.66
+
A Distributional Lens for Multi-Aspect Controllable Text Generation
@@ -904,6 +969,7 @@
2022.emnlp-main.67
gu-etal-2022-distributional
10.18653/v1/2022.emnlp-main.67
+
ELMER: A Non-Autoregressive Pre-trained Language Model for Efficient and Effective Text Generation
@@ -917,6 +983,7 @@
2022.emnlp-main.68
li-etal-2022-elmer
10.18653/v1/2022.emnlp-main.68
+
Multilingual Relation Classification via Efficient and Effective Prompting
@@ -928,6 +995,7 @@
2022.emnlp-main.69
chen-etal-2022-multilingual
10.18653/v1/2022.emnlp-main.69
+
Topic-Regularized Authorship Representation Learning
@@ -940,6 +1008,7 @@
2022.emnlp-main.70
sawatphol-etal-2022-topic
10.18653/v1/2022.emnlp-main.70
+
Fine-grained Contrastive Learning for Relation Extraction
@@ -951,6 +1020,7 @@
2022.emnlp-main.71
hogan-etal-2022-fine
10.18653/v1/2022.emnlp-main.71
+
Curriculum Prompt Learning with Self-Training for Abstractive Dialogue Summarization
@@ -964,6 +1034,7 @@
2022.emnlp-main.72
li-etal-2022-curriculum
10.18653/v1/2022.emnlp-main.72
+
Zero-Shot Text Classification with Self-Training
@@ -978,6 +1049,7 @@
2022.emnlp-main.73
gera-etal-2022-zero
10.18653/v1/2022.emnlp-main.73
+
Deconfounding Legal Judgment Prediction for European Court of Human Rights Cases Towards Better Alignment with Experts
@@ -990,6 +1062,7 @@
2022.emnlp-main.74
santosh-etal-2022-deconfounding
10.18653/v1/2022.emnlp-main.74
+
SQuALITY: Building a Long-Document Summarization Dataset the Hard Way
@@ -1004,6 +1077,7 @@
2022.emnlp-main.75.dataset.zip
wang-etal-2022-squality
10.18653/v1/2022.emnlp-main.75
+
MetaASSIST: Robust Dialogue State Tracking with Meta Learning
@@ -1018,6 +1092,7 @@
2022.emnlp-main.76
ye-etal-2022-metaassist
10.18653/v1/2022.emnlp-main.76
+
Multilingual Machine Translation with Hyper-Adapters
@@ -1031,6 +1106,7 @@
baziotis-etal-2022-multilingual
2022.emnlp-main.77.data.zip
10.18653/v1/2022.emnlp-main.77
+
Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination
@@ -1045,6 +1121,7 @@
2022.emnlp-main.78
yang-etal-2022-z
10.18653/v1/2022.emnlp-main.78
+
Using Commonsense Knowledge to Answer Why-Questions
@@ -1060,6 +1137,7 @@
2022.emnlp-main.79
lal-etal-2022-using
10.18653/v1/2022.emnlp-main.79
+
Affective Idiosyncratic Responses to Music
@@ -1073,6 +1151,7 @@
2022.emnlp-main.80
ch-wang-etal-2022-affective
10.18653/v1/2022.emnlp-main.80
+
Successive Prompting for Decomposing Complex Questions
@@ -1085,6 +1164,7 @@
2022.emnlp-main.81
dua-etal-2022-successive
10.18653/v1/2022.emnlp-main.81
+
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations
@@ -1101,6 +1181,7 @@
2022.emnlp-main.82.software.zip
jung-etal-2022-maieutic
10.18653/v1/2022.emnlp-main.82
+
DANLI: Deliberative Agent for Following Natural Language Instructions
@@ -1118,6 +1199,7 @@
2022.emnlp-main.83
zhang-etal-2022-danli
10.18653/v1/2022.emnlp-main.83
+
Tracing Semantic Variation in Slang
@@ -1130,6 +1212,7 @@
2022.emnlp-main.84.dataset.zip
sun-xu-2022-tracing
10.18653/v1/2022.emnlp-main.84
+
Fine-grained Category Discovery under Coarse-grained supervision with Hierarchical Weighted Self-contrastive Learning
@@ -1144,6 +1227,7 @@
2022.emnlp-main.85
an-etal-2022-fine
10.18653/v1/2022.emnlp-main.85
+
PLM-based World Models for Text-based Games
@@ -1158,6 +1242,7 @@
2022.emnlp-main.86.software.zip
kim-etal-2022-plm
10.18653/v1/2022.emnlp-main.86
+
Prompt-Based Meta-Learning For Few-shot Text Classification
@@ -1171,6 +1256,7 @@
zhang-etal-2022-prompt-based
2022.emnlp-main.87.data.zip
10.18653/v1/2022.emnlp-main.87
+
How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?
@@ -1185,6 +1271,7 @@
bansal-etal-2022-well
2022.emnlp-main.88.software.zip
10.18653/v1/2022.emnlp-main.88
+
Geographic Citation Gaps in NLP Research
@@ -1197,6 +1284,7 @@
2022.emnlp-main.89
rungta-etal-2022-geographic
10.18653/v1/2022.emnlp-main.89
+
Language Models of Code are Few-Shot Commonsense Learners
@@ -1210,6 +1298,7 @@
2022.emnlp-main.90
madaan-etal-2022-language
10.18653/v1/2022.emnlp-main.90
+
Numerical Optimizations for Weighted Low-rank Estimation on Language Models
@@ -1225,6 +1314,7 @@
2022.emnlp-main.91.note.pdf
hua-etal-2022-numerical
10.18653/v1/2022.emnlp-main.91
+
Generative Multi-hop Retrieval
@@ -1237,6 +1327,7 @@
2022.emnlp-main.92
lee-etal-2022-generative
10.18653/v1/2022.emnlp-main.92
+
Visual Spatial Description: Controlled Spatial-Oriented Image-to-Text Generation
@@ -1251,6 +1342,7 @@
2022.emnlp-main.93
zhao-etal-2022-visual
10.18653/v1/2022.emnlp-main.93
+
M3: A Multi-View Fusion and Multi-Decoding Network for Multi-Document Reading Comprehension
@@ -1263,6 +1355,7 @@
2022.emnlp-main.94
wen-etal-2022-m3
10.18653/v1/2022.emnlp-main.94
+
COCO-DR: Combating the Distribution Shift in Zero-Shot Dense Retrieval with Contrastive and Distributionally Robust Learning
@@ -1276,6 +1369,7 @@
2022.emnlp-main.95
yu-etal-2022-coco
10.18653/v1/2022.emnlp-main.95
+
Language Model Pre-Training with Sparse Latent Typing
@@ -1290,6 +1384,7 @@
2022.emnlp-main.96
ren-etal-2022-language
10.18653/v1/2022.emnlp-main.96
+
On the Transformation of Latent Space in Fine-Tuned NLP Models
@@ -1302,6 +1397,7 @@
2022.emnlp-main.97
durrani-etal-2022-transformation
10.18653/v1/2022.emnlp-main.97
+
Watch the Neighbors: A Unified K-Nearest Neighbor Contrastive Learning Framework for OOD Intent Discovery
@@ -1317,6 +1413,7 @@
2022.emnlp-main.98
mou-etal-2022-watch
10.18653/v1/2022.emnlp-main.98
+
Extracted BERT Model Leaks More Information than You Think!
@@ -1329,6 +1426,7 @@
2022.emnlp-main.99
he-etal-2022-extracted
10.18653/v1/2022.emnlp-main.99
+
Do Vision-and-Language Transformers Learn Grounded Predicate-Noun Dependencies?
@@ -1342,6 +1440,7 @@
2022.emnlp-main.100
nikolaus-etal-2022-vision
10.18653/v1/2022.emnlp-main.100
+
A Multilingual Perspective Towards the Evaluation of Attribution Methods in Natural Language Inference
@@ -1352,6 +1451,7 @@
2022.emnlp-main.101
zaman-belinkov-2022-multilingual
10.18653/v1/2022.emnlp-main.101
+
Graph-Based Multilingual Label Propagation for Low-Resource Part-of-Speech Tagging
@@ -1365,6 +1465,7 @@
2022.emnlp-main.102
imanigooghari-etal-2022-graph
10.18653/v1/2022.emnlp-main.102
+
SubeventWriter: Iterative Sub-event Sequence Generation with Coherence Controller
@@ -1379,6 +1480,7 @@
2022.emnlp-main.103
wang-etal-2022-subeventwriter
10.18653/v1/2022.emnlp-main.103
+
Infinite SCAN: An Infinite Model of Diachronic Semantic Change
@@ -1392,6 +1494,7 @@
2022.emnlp-main.104
inoue-etal-2022-infinite
10.18653/v1/2022.emnlp-main.104
+
Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization
@@ -1405,6 +1508,7 @@
2022.emnlp-main.105.software.zip
gu-etal-2022-learning
10.18653/v1/2022.emnlp-main.105
+
Counterfactual Data Augmentation via Perspective Transition for Open-Domain Dialogues
@@ -1417,6 +1521,7 @@
2022.emnlp-main.106
ou-etal-2022-counterfactual
10.18653/v1/2022.emnlp-main.106
+
SQUIRE: A Sequence-to-sequence Framework for Multi-hop Knowledge Graph Reasoning
@@ -1432,6 +1537,7 @@
2022.emnlp-main.107
bai-etal-2022-squire
10.18653/v1/2022.emnlp-main.107
+
SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training
@@ -1447,6 +1553,7 @@
2022.emnlp-main.108
zhang-etal-2022-speechut
10.18653/v1/2022.emnlp-main.108
+
Learning Label Modular Prompts for Text Classification in the Wild
@@ -1459,6 +1566,7 @@
2022.emnlp-main.109
chen-etal-2022-learning-label
10.18653/v1/2022.emnlp-main.109
+
Unbiased and Efficient Sampling of Dependency Trees
@@ -1468,6 +1576,7 @@
2022.emnlp-main.110
stanojevic-2022-unbiased
10.18653/v1/2022.emnlp-main.110
+
Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions
@@ -1480,6 +1589,7 @@
2022.emnlp-main.111.software.zip
gu-etal-2022-continual
10.18653/v1/2022.emnlp-main.111
+
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
@@ -1495,6 +1605,7 @@
2022.emnlp-main.112.software.zip
shen-etal-2022-cost
10.18653/v1/2022.emnlp-main.112
+
Rescue Implicit and Long-tail Cases: Nearest Neighbor Relation Extraction
@@ -1509,6 +1620,7 @@
2022.emnlp-main.113
wan-etal-2022-rescue
10.18653/v1/2022.emnlp-main.113
+
StoryER: Automatic Story Evaluation via Ranking, Rating and Reasoning
@@ -1522,6 +1634,7 @@
2022.emnlp-main.114
chen-etal-2022-storyer
10.18653/v1/2022.emnlp-main.114
+
Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference
@@ -1538,6 +1651,7 @@
2022.emnlp-main.115
mitchell-etal-2022-enhancing
10.18653/v1/2022.emnlp-main.115
+
Robustness of Demonstration-based Learning Under Limited Data Scenario
@@ -1551,6 +1665,7 @@
2022.emnlp-main.116.software.zip
zhang-etal-2022-robustness
10.18653/v1/2022.emnlp-main.116
+
Modeling Information Change in Science Communication with Semantically Matched Paraphrases
@@ -1563,6 +1678,7 @@
2022.emnlp-main.117
wright-etal-2022-modeling
10.18653/v1/2022.emnlp-main.117
+
Word Order Matters When You Increase Masking
@@ -1574,6 +1690,7 @@
2022.emnlp-main.118
lasri-etal-2022-word
10.18653/v1/2022.emnlp-main.118
+
An Empirical Analysis of Memorization in Fine-tuned Autoregressive Language Models
@@ -1587,6 +1704,7 @@
2022.emnlp-main.119
mireshghallah-etal-2022-empirical
10.18653/v1/2022.emnlp-main.119
+
Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition
@@ -1598,6 +1716,7 @@
2022.emnlp-main.120
chen-etal-2022-style
10.18653/v1/2022.emnlp-main.120
+
Linguistic Corpus Annotation for Automatic Text Simplification Evaluation
@@ -1614,6 +1733,7 @@
2022.emnlp-main.121
cardon-etal-2022-linguistic
10.18653/v1/2022.emnlp-main.121
+
Semantic Framework based Query Generation for Temporal Question Answering over Knowledge Graphs
@@ -1627,6 +1747,7 @@
2022.emnlp-main.122.dataset.zip
ding-etal-2022-semantic
10.18653/v1/2022.emnlp-main.122
+
There Is No Standard Answer: Knowledge-Grounded Dialogue Generation with Adversarial Activated Multi-Reference Learning
@@ -1639,6 +1760,7 @@
2022.emnlp-main.123
zhao-etal-2022-standard
10.18653/v1/2022.emnlp-main.123
+
Stop Measuring Calibration When Humans Disagree
@@ -1651,6 +1773,7 @@
2022.emnlp-main.124
baan-etal-2022-stop
10.18653/v1/2022.emnlp-main.124
+
Improving compositional generalization for multi-step quantitative reasoning in question answering
@@ -1663,6 +1786,7 @@
2022.emnlp-main.125
nourbakhsh-etal-2022-improving
10.18653/v1/2022.emnlp-main.125
+
A Comprehensive Comparison of Neural Networks as Cognitive Models of Inflection
@@ -1674,6 +1798,7 @@
2022.emnlp-main.126
wiemerslage-etal-2022-comprehensive
10.18653/v1/2022.emnlp-main.126
+
Can Visual Context Improve Automatic Speech Recognition for an Embodied Agent?
@@ -1684,6 +1809,7 @@
2022.emnlp-main.127
pramanick-sarkar-2022-visual
10.18653/v1/2022.emnlp-main.127
+
AfroLID: A Neural Language Identification Tool for African Languages
@@ -1696,6 +1822,7 @@
2022.emnlp-main.128
adebara-etal-2022-afrolid
10.18653/v1/2022.emnlp-main.128
+
EvEntS ReaLM: Event Reasoning of Entity States via Language Models
@@ -1708,6 +1835,7 @@
2022.emnlp-main.129
spiliopoulou-etal-2022-events
10.18653/v1/2022.emnlp-main.129
+
Large language models are few-shot clinical information extractors
@@ -1721,6 +1849,7 @@
2022.emnlp-main.130
agrawal-etal-2022-large
10.18653/v1/2022.emnlp-main.130
+
Towards a Unified Multi-Dimensional Evaluator for Text Generation
@@ -1738,6 +1867,7 @@
2022.emnlp-main.131
zhong-etal-2022-towards
10.18653/v1/2022.emnlp-main.131
+
GeoMLAMA: Geo-Diverse Commonsense Probing on Multilingual Pre-Trained Language Models
@@ -1751,6 +1881,7 @@
2022.emnlp-main.132
yin-etal-2022-geomlama
10.18653/v1/2022.emnlp-main.132
+
The (Undesired) Attenuation of Human Biases by Multilinguality
@@ -1761,6 +1892,7 @@
2022.emnlp-main.133
espana-bonet-barron-cedeno-2022-undesired
10.18653/v1/2022.emnlp-main.133
+
Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning
@@ -1772,6 +1904,7 @@
2022.emnlp-main.134
tafjord-etal-2022-entailer
10.18653/v1/2022.emnlp-main.134
+
Near-Negative Distinction: Giving a Second Life to Human Evaluation Datasets
@@ -1784,6 +1917,7 @@
2022.emnlp-main.135
laban-etal-2022-near
10.18653/v1/2022.emnlp-main.135
+
ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection
@@ -1802,6 +1936,7 @@
2022.emnlp-main.136
alkhamissi-etal-2022-token
10.18653/v1/2022.emnlp-main.136
+
Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations
@@ -1814,6 +1949,7 @@
2022.emnlp-main.137
saha-etal-2022-hard
10.18653/v1/2022.emnlp-main.137
+
Stanceosaurus: Classifying Stance Towards Multicultural Misinformation
@@ -1828,6 +1964,7 @@
2022.emnlp-main.138.software.zip
zheng-etal-2022-stanceosaurus
10.18653/v1/2022.emnlp-main.138
+
Gendered Mental Health Stigma in Masked Language Models
@@ -1843,6 +1980,7 @@
2022.emnlp-main.139
lin-etal-2022-gendered
10.18653/v1/2022.emnlp-main.139
+
Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization
@@ -1856,6 +1994,7 @@
2022.emnlp-main.140
yadav-etal-2022-efficient
10.18653/v1/2022.emnlp-main.140
+
Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models
@@ -1867,6 +2006,7 @@
2022.emnlp-main.141
suzgun-etal-2022-prompt
10.18653/v1/2022.emnlp-main.141
+
Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts
@@ -1879,6 +2019,7 @@
2022.emnlp-main.142
zhou-etal-2022-learning-decompose
10.18653/v1/2022.emnlp-main.142
+
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
@@ -1892,6 +2033,7 @@
2022.emnlp-main.143
diwan-etal-2022-winoground
10.18653/v1/2022.emnlp-main.143
+
Gradient-based Constrained Sampling from Language Models
@@ -1903,6 +2045,7 @@
2022.emnlp-main.144
kumar-etal-2022-gradient
10.18653/v1/2022.emnlp-main.144
+
TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data
@@ -1918,6 +2061,7 @@
2022.emnlp-main.145
zhou-etal-2022-tacube
10.18653/v1/2022.emnlp-main.145
+
Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
@@ -1931,6 +2075,7 @@
2022.emnlp-main.146.dataset.zip
chen-etal-2022-rich
10.18653/v1/2022.emnlp-main.146
+
QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation
@@ -1944,6 +2089,7 @@
2022.emnlp-main.147
yue-etal-2022-qa
10.18653/v1/2022.emnlp-main.147
+
When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain
@@ -1962,6 +2108,7 @@
2022.emnlp-main.148
shah-etal-2022-flue
10.18653/v1/2022.emnlp-main.148
+
Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer
@@ -1977,6 +2124,7 @@
2022.emnlp-main.149
jiang-etal-2022-retrieval
10.18653/v1/2022.emnlp-main.149
+
Reproducibility in Computational Linguistics: Is Source Code Enough?
@@ -1989,6 +2137,7 @@
2022.emnlp-main.150.software.zip
arvan-etal-2022-reproducibility-computational
10.18653/v1/2022.emnlp-main.150
+
Generating Information-Seeking Conversations from Unlabeled Documents
@@ -2001,6 +2150,7 @@
2022.emnlp-main.151
kim-etal-2022-generating
10.18653/v1/2022.emnlp-main.151
+
Distill The Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation
@@ -2012,6 +2162,7 @@
2022.emnlp-main.152
peng-etal-2022-distill
10.18653/v1/2022.emnlp-main.152
+
A Multifaceted Framework to Evaluate Evasion, Content Preservation, and Misattribution in Authorship Obfuscation Techniques
@@ -2024,6 +2175,7 @@
2022.emnlp-main.153
altakrori-etal-2022-multifaceted
10.18653/v1/2022.emnlp-main.153
+
SafeText: A Benchmark for Exploring Physical Safety in Language Models
@@ -2039,6 +2191,7 @@
2022.emnlp-main.154
levy-etal-2022-safetext
10.18653/v1/2022.emnlp-main.154
+
Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations
@@ -2055,6 +2208,7 @@
2022.emnlp-main.155
yoo-etal-2022-ground
10.18653/v1/2022.emnlp-main.155
+
D4: a Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat
@@ -2072,6 +2226,7 @@
2022.emnlp-main.156.note.pdf
yao-etal-2022-d4
10.18653/v1/2022.emnlp-main.156
+
Exploiting domain-slot related keywords description for Few-Shot Cross-Domain Dialogue State Tracking
@@ -2088,6 +2243,7 @@
2022.emnlp-main.157
qixiang-etal-2022-exploiting
10.18653/v1/2022.emnlp-main.157
+
CoCoa: An Encoder-Decoder Model for Controllable Code-switched Generation
@@ -2101,6 +2257,7 @@
2022.emnlp-main.158
mondal-etal-2022-cocoa
10.18653/v1/2022.emnlp-main.158
+
Towards Climate Awareness in NLP Research
@@ -2114,6 +2271,7 @@
2022.emnlp-main.159
hershcovich-etal-2022-towards
10.18653/v1/2022.emnlp-main.159
+
Navigating Connected Memories with a Task-oriented Dialog System
@@ -2126,6 +2284,7 @@
2022.emnlp-main.160
kottur-etal-2022-navigating
10.18653/v1/2022.emnlp-main.160
+
Language Model Decomposition: Quantifying the Dependency and Correlation of Language Models
@@ -2135,6 +2294,7 @@
2022.emnlp-main.161
zhang-2022-language
10.18653/v1/2022.emnlp-main.161
+
SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser
@@ -2149,6 +2309,7 @@
2022.emnlp-main.162
zhang-etal-2022-syngec
10.18653/v1/2022.emnlp-main.162
+
Varifocal Question Generation for Fact-checking
@@ -2160,6 +2321,7 @@
2022.emnlp-main.163
ousidhoum-etal-2022-varifocal
10.18653/v1/2022.emnlp-main.163
+
Bilingual Lexicon Induction for Low-Resource Languages using Graph Matching via Optimal Transport
@@ -2173,6 +2335,7 @@
2022.emnlp-main.164
marchisio-etal-2022-bilingual
10.18653/v1/2022.emnlp-main.164
+
Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection
@@ -2189,6 +2352,7 @@
2022.emnlp-main.165
gururangan-etal-2022-whose
10.18653/v1/2022.emnlp-main.165
+
ConReader: Exploring Implicit Relations in Contracts for Contract Clause Extraction
@@ -2203,6 +2367,7 @@
2022.emnlp-main.166
xu-etal-2022-conreader
10.18653/v1/2022.emnlp-main.166
+
Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU
@@ -2214,6 +2379,7 @@
2022.emnlp-main.167
christopoulou-etal-2022-training
10.18653/v1/2022.emnlp-main.167
+
Revisiting Parameter-Efficient Tuning: Are We Really There Yet?
@@ -2226,6 +2392,7 @@
2022.emnlp-main.168
chen-etal-2022-revisiting
10.18653/v1/2022.emnlp-main.168
+
Transfer Learning from Semantic Role Labeling to Event Argument Extraction with Template-based Slot Querying
@@ -2237,6 +2404,7 @@
2022.emnlp-main.169
zhang-etal-2022-transfer
10.18653/v1/2022.emnlp-main.169
+
Calibrating Zero-shot Cross-lingual (Un-)structured Predictions
@@ -2248,6 +2416,7 @@
2022.emnlp-main.170
jiang-etal-2022-calibrating
10.18653/v1/2022.emnlp-main.170
+
PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training
@@ -2262,6 +2431,7 @@
2022.emnlp-main.171.software.zip
xu-etal-2022-prince
10.18653/v1/2022.emnlp-main.171
+
How Far are We from Robust Long Abstractive Summarization?
@@ -2276,6 +2446,7 @@
2022.emnlp-main.172.dataset.zip
koh-etal-2022-far
10.18653/v1/2022.emnlp-main.172
+
Measuring Context-Word Biases in Lexical Semantic Datasets
@@ -2288,6 +2459,7 @@
liu-etal-2022-measuring
2022.emnlp-main.173.software.zip
10.18653/v1/2022.emnlp-main.173
+
Iteratively Prompt Pre-trained Language Models for Chain of Thought
@@ -2299,6 +2471,7 @@
2022.emnlp-main.174
wang-etal-2022-iteratively
10.18653/v1/2022.emnlp-main.174
+
Unobserved Local Structures Make Compositional Generalization Hard
@@ -2310,6 +2483,7 @@
2022.emnlp-main.175
bogin-etal-2022-unobserved
10.18653/v1/2022.emnlp-main.175
+
Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning
@@ -2403,6 +2577,7 @@
2022.emnlp-main.183
madaan-etal-2022-memory
10.18653/v1/2022.emnlp-main.183
+
LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation
@@ -2511,6 +2686,7 @@
2022.emnlp-main.191
xi-etal-2022-musied
10.18653/v1/2022.emnlp-main.191
+
Reproducibility Issues for BERT-based Evaluation Metrics
@@ -2685,6 +2861,7 @@
2022.emnlp-main.204
wang-etal-2022-r2f
10.18653/v1/2022.emnlp-main.204
+
Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Processing
@@ -2737,6 +2914,7 @@
2022.emnlp-main.207
wang-etal-2022-knowledge
10.18653/v1/2022.emnlp-main.207
+
On the Evaluation Metrics for Paraphrase Generation
@@ -2789,6 +2967,7 @@
2022.emnlp-main.211
qi-etal-2022-rasat
10.18653/v1/2022.emnlp-main.211
+
COM-MRC: A COntext-Masked Machine Reading Comprehension Framework for Aspect Sentiment Triplet Extraction
@@ -2892,6 +3071,7 @@
2022.emnlp-main.219
yang-etal-2022-face
10.18653/v1/2022.emnlp-main.219
+
FineD-Eval: Fine-grained Automatic Dialogue-Level Evaluation
@@ -2934,6 +3114,7 @@
2022.emnlp-main.222.software.zip
deng-etal-2022-rlprompt
10.18653/v1/2022.emnlp-main.222
+
DisCup: Discriminator Cooperative Unlikelihood Prompt-tuning for Controllable Text Generation
@@ -2963,6 +3144,7 @@
2022.emnlp-main.224.software.zip
he-etal-2022-cpl
10.18653/v1/2022.emnlp-main.224
+
Red Teaming Language Models with Language Models
@@ -3009,6 +3191,7 @@
2022.emnlp-main.227
wang-etal-2022-spanproto
10.18653/v1/2022.emnlp-main.227
+
Discovering Differences in the Representation of People using Contextualized Semantic Axes
@@ -3020,6 +3203,7 @@
2022.emnlp-main.228
lucy-etal-2022-discovering
10.18653/v1/2022.emnlp-main.228
+
Generating Literal and Implied Subquestions to Fact-check Complex Claims
@@ -3032,6 +3216,7 @@
2022.emnlp-main.229
chen-etal-2022-generating
10.18653/v1/2022.emnlp-main.229
+
Machine Translation Robustness to Natural Asemantic Variation
@@ -3066,6 +3251,7 @@
2022.emnlp-main.232
sultan-shahaf-2022-life
10.18653/v1/2022.emnlp-main.232
+
Language Contamination Helps Explains the Cross-lingual Capabilities of English Pretrained Models
@@ -3112,6 +3298,7 @@
2022.emnlp-main.236
zheng-etal-2022-distilling
10.18653/v1/2022.emnlp-main.236
+
Exploring the Secrets Behind the Learning Difficulty of Meaning Representations for Semantic Parsing
@@ -3137,6 +3324,7 @@
2022.emnlp-main.238
mcinerney-etal-2022-thats
10.18653/v1/2022.emnlp-main.238
+
Unsupervised Tokenization Learning
@@ -3147,6 +3335,7 @@
2022.emnlp-main.239
kolonin-ramesh-2022-unsupervised
10.18653/v1/2022.emnlp-main.239
+
A Template-based Method for Constrained Neural Machine Translation
@@ -3303,6 +3492,7 @@
2022.emnlp-main.251
stacey-etal-2022-logical
10.18653/v1/2022.emnlp-main.251
+
How to disagree well: Investigating the dispute tactics used on Wikipedia
@@ -3421,6 +3611,7 @@
2022.emnlp-main.261
white-etal-2022-mixed
10.18653/v1/2022.emnlp-main.261
+
On Measuring the Intrinsic Few-Shot Hardness of Datasets
@@ -3599,6 +3790,7 @@
2022.emnlp-main.274
dai-etal-2022-cgodial
10.18653/v1/2022.emnlp-main.274
+
Kernel-Whitening: Overcome Dataset Bias with Isotropic Sentence Embedding
@@ -3638,6 +3830,7 @@
2022.emnlp-main.277
shridhar-etal-2022-automatic
10.18653/v1/2022.emnlp-main.277
+
Mixture of Attention Heads: Selecting Attention Heads Per Token
@@ -3681,6 +3874,7 @@
2022.emnlp-main.280
yoon-etal-2022-information
10.18653/v1/2022.emnlp-main.280
+
DSM: Question Generation over Knowledge Base via Modeling Diverse Subgraphs with Meta-learner
@@ -3695,6 +3889,7 @@
2022.emnlp-main.281
guo-etal-2022-dsm
10.18653/v1/2022.emnlp-main.281
+
RelU-Net: Syntax-aware Graph U-Net for Relational Triple Extraction
@@ -3718,6 +3913,7 @@
2022.emnlp-main.283
bassignana-etal-2022-evidence
10.18653/v1/2022.emnlp-main.283
+
Chunk-based Nearest Neighbor Machine Translation
@@ -3757,6 +3953,7 @@
2022.emnlp-main.286.software.zip
pan-etal-2022-inductive
10.18653/v1/2022.emnlp-main.286
+
Improving Chinese Spelling Check by Character Pronunciation Prediction: The Effects of Adaptivity and Granularity
@@ -3852,6 +4049,7 @@
2022.emnlp-main.293
friedman-etal-2022-finding
10.18653/v1/2022.emnlp-main.293
+
Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
@@ -3896,6 +4094,7 @@
2022.emnlp-main.296
yang-etal-2022-re3
10.18653/v1/2022.emnlp-main.296
+
Does Joint Training Really Help Cascaded Speech Translation?
@@ -3909,6 +4108,7 @@
2022.emnlp-main.297
tran-etal-2022-joint
10.18653/v1/2022.emnlp-main.297
+
MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition
@@ -3962,6 +4162,7 @@
2022.emnlp-main.299
benotti-blackburn-2022-ethics
10.18653/v1/2022.emnlp-main.299
+
Continued Pretraining for Better Zero- and Few-Shot Promptability
@@ -3977,6 +4178,7 @@
2022.emnlp-main.300
wu-etal-2022-continued
10.18653/v1/2022.emnlp-main.300
+
Less is More: Summary of Long Instructions is Better for Program Synthesis
@@ -3989,6 +4191,7 @@
2022.emnlp-main.301
kuznia-etal-2022-less
10.18653/v1/2022.emnlp-main.301
+
Is a Question Decomposition Unit All We Need?
@@ -4108,6 +4311,7 @@
huang-etal-2022-metalogic
2022.emnlp-main.310.software.zip
10.18653/v1/2022.emnlp-main.310
+
Explicit Query Rewriting for Conversational Dense Retrieval
@@ -4216,6 +4420,7 @@
2022.emnlp-main.318
zheng-etal-2022-candidate
10.18653/v1/2022.emnlp-main.318
+
Evaluating Parameter Efficient Learning for Generation
@@ -4268,6 +4473,7 @@
2022.emnlp-main.322
yang-ma-2022-improving
10.18653/v1/2022.emnlp-main.322
+
Differentially Private Language Models for Secure Data Sharing
@@ -4454,6 +4660,7 @@
2022.emnlp-main.335
peng-etal-2022-copen
10.18653/v1/2022.emnlp-main.335
+
Capturing Global Structural Information in Long Document Question Answering with Compressive Graph Selector Network
@@ -4557,6 +4764,7 @@
2022.emnlp-main.341
liu-etal-2022-metafill
10.18653/v1/2022.emnlp-main.341
+
DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering
@@ -4569,6 +4777,7 @@
2022.emnlp-main.342
zhang-etal-2022-drlk
10.18653/v1/2022.emnlp-main.342
+
AEG: Argumentative Essay Generation via A Dual-Decoder Model with Content Planning
@@ -4613,6 +4822,7 @@
2022.emnlp-main.345
ma-etal-2022-wider
10.18653/v1/2022.emnlp-main.345
+
An Efficient Memory-Augmented Transformer for Knowledge-Intensive NLP Tasks
@@ -4653,6 +4863,7 @@
2022.emnlp-main.348
mikhailov-etal-2022-rucola
10.18653/v1/2022.emnlp-main.348
+
Complex Hyperbolic Knowledge Graph Embeddings with Fast Fourier Transform
@@ -4815,6 +5026,7 @@
2022.emnlp-main.360
kumar-etal-2022-indicnlg
10.18653/v1/2022.emnlp-main.360
+
Improving Machine Translation with Phrase Pair Injection and Corpus Filtering
@@ -4825,6 +5037,7 @@
2022.emnlp-main.361
batheja-bhattacharyya-2022-improving
10.18653/v1/2022.emnlp-main.361
+
An Anchor-based Relative Position Embedding Method for Cross-Modal Tasks
@@ -4867,6 +5080,7 @@
2022.emnlp-main.364
ju-etal-2022-telemelody
10.18653/v1/2022.emnlp-main.364
+
SEEN: Structured Event Enhancement Network for Explainable Need Detection of Information Recall Assistance
@@ -4879,6 +5093,7 @@
2022.emnlp-main.365
lin-etal-2022-seen
10.18653/v1/2022.emnlp-main.365
+
Rethinking Style Transformer with Energy-based Interpretation: Adversarial Unsupervised Style Transfer using a Pretrained Model
@@ -4960,6 +5175,7 @@
2022.emnlp-main.371
wang-etal-2022-helping
10.18653/v1/2022.emnlp-main.371
+
RACE: Retrieval-augmented Commit Message Generation
@@ -5013,6 +5229,7 @@
2022.emnlp-main.375
chen-etal-2022-murag
10.18653/v1/2022.emnlp-main.375
+
PHEE: A Dataset for Pharmacovigilance Event Extraction from Text
@@ -5050,6 +5267,7 @@
2022.emnlp-main.378
han-etal-2022-simqa
10.18653/v1/2022.emnlp-main.378
+
Discovering Low-rank Subspaces for Language-agnostic Multilingual Representations
@@ -5097,6 +5315,7 @@
2022.emnlp-main.382
zhong-etal-2022-training
10.18653/v1/2022.emnlp-main.382
+
Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages
@@ -5136,6 +5355,7 @@
2022.emnlp-main.385
slobodkin-etal-2022-controlled
10.18653/v1/2022.emnlp-main.385
+
Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency
@@ -5192,6 +5412,7 @@
2022.emnlp-main.389
wolhandler-etal-2022-multi
10.18653/v1/2022.emnlp-main.389
+
BioReader: a Retrieval-Enhanced Text-to-Text Transformer for Biomedical Literature
@@ -5204,6 +5425,7 @@
2022.emnlp-main.390
frisoni-etal-2022-bioreader
10.18653/v1/2022.emnlp-main.390
+
T-Modules: Translation Modules for Zero-Shot Cross-Modal Machine Translation
@@ -5245,6 +5467,7 @@
2022.emnlp-main.393
hossain-blanco-2022-leveraging
10.18653/v1/2022.emnlp-main.393
+
GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation
@@ -5296,6 +5519,7 @@
2022.emnlp-main.397
glockner-etal-2022-missing
10.18653/v1/2022.emnlp-main.397
+
A Framework for Adapting Pre-Trained Language Models to Knowledge Graph Completion
@@ -5382,6 +5606,7 @@
2022.emnlp-main.404
marchisio-etal-2022-isovec
10.18653/v1/2022.emnlp-main.404
+
Adversarial Concept Erasure in Kernel Space
@@ -5470,6 +5695,7 @@
2022.emnlp-main.411
aly-vlachos-2022-natural
10.18653/v1/2022.emnlp-main.411
+
AX-MABSA: A Framework for Extremely Weakly Supervised Multi-label Aspect Based Sentiment Analysis
@@ -5492,6 +5718,7 @@
2022.emnlp-main.413
mirzaee-kordjamshidi-2022-transfer
10.18653/v1/2022.emnlp-main.413
+
A Survey of Active Learning for Natural Language Processing
@@ -5651,6 +5878,7 @@
2022.emnlp-main.425
shi-etal-2022-just
10.18653/v1/2022.emnlp-main.425
+
Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents
@@ -5701,6 +5929,7 @@
2022.emnlp-main.429
feng-etal-2022-uln
10.18653/v1/2022.emnlp-main.429
+
Federated Model Decomposition with Private Vocabulary for Text Classification
@@ -5768,6 +5997,7 @@
2022.emnlp-main.434
wang-etal-2022-breaking
10.18653/v1/2022.emnlp-main.434
+
Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction
@@ -5822,6 +6052,7 @@
2022.emnlp-main.437
deng-etal-2022-title2event
10.18653/v1/2022.emnlp-main.437
+
Cascading Biases: Investigating the Effect of Heuristic Annotation Strategies on Data and Models
@@ -5833,6 +6064,7 @@
2022.emnlp-main.438
malaviya-etal-2022-cascading
10.18653/v1/2022.emnlp-main.438
+
Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts
@@ -5845,6 +6077,7 @@
2022.emnlp-main.439
trivedi-etal-2022-teaching
10.18653/v1/2022.emnlp-main.439
+
ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation
@@ -5857,6 +6090,7 @@
2022.emnlp-main.440
yin-etal-2022-addmu
10.18653/v1/2022.emnlp-main.440
+
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
@@ -5922,6 +6156,7 @@
2022.emnlp-main.445
sun-etal-2022-reduce
10.18653/v1/2022.emnlp-main.445
+
ATTEMPT: Parameter-Efficient Multi-task Tuning via Attentional Mixtures of Soft Prompts
@@ -5971,6 +6206,7 @@
2022.emnlp-main.449
sun-etal-2022-meta
10.18653/v1/2022.emnlp-main.449
+
Understanding and Improving Knowledge Distillation for Quantization Aware Training of Large Transformer Encoders
@@ -5985,6 +6221,7 @@
2022.emnlp-main.450.software.zip
kim-etal-2022-understanding
10.18653/v1/2022.emnlp-main.450
+
Exploring Mode Connectivity for Pre-trained Language Models
@@ -6037,6 +6274,7 @@
2022.emnlp-main.454
xu-etal-2022-improving
10.18653/v1/2022.emnlp-main.454
+
Vector-Quantized Input-Contextualized Soft Prompts for Natural Language Understanding
@@ -6084,6 +6322,7 @@
2022.emnlp-main.458
huang-etal-2022-unifying
10.18653/v1/2022.emnlp-main.458
+
Modeling Label Correlations for Ultra-Fine Entity Typing with Neural Pairwise Conditional Random Field
@@ -6108,6 +6347,7 @@
2022.emnlp-main.460
chakrabarty-etal-2022-help
10.18653/v1/2022.emnlp-main.460
+
Open Relation and Event Type Discovery with Type Abstraction
@@ -6147,6 +6387,7 @@
2022.emnlp-main.463.software.zip
gong-etal-2022-revisiting
10.18653/v1/2022.emnlp-main.463
+
R2D2: Robust Data-to-Text with Replacement Detection
@@ -6173,6 +6414,7 @@
2022.emnlp-main.465.dataset.zip
putri-oh-2022-idk
10.18653/v1/2022.emnlp-main.465
+
XLM-D: Decorate Cross-lingual Pre-training Model as Non-Autoregressive Neural Machine Translation
@@ -6264,6 +6506,7 @@
2022.emnlp-main.472
kim-etal-2022-break
10.18653/v1/2022.emnlp-main.472
+
The Devil in Linear Transformer
@@ -6279,6 +6522,7 @@
2022.emnlp-main.473
qin-etal-2022-devil
10.18653/v1/2022.emnlp-main.473
+
Zero-Shot Learners for Natural Language Understanding via a Unified Multiple Choice Perspective
@@ -6448,6 +6692,7 @@
2022.emnlp-main.485
kongyoung-etal-2022-monoqa
10.18653/v1/2022.emnlp-main.485
+
Composing Ci with Reinforced Non-autoregressive Text Generation
@@ -6538,6 +6783,7 @@
2022.emnlp-main.491
xie-etal-2022-wr
10.18653/v1/2022.emnlp-main.491
+
Eeny, meeny, miny, moe. How to choose data for morphological inflection.
@@ -6602,6 +6848,7 @@
2022.emnlp-main.496.software.zip
senge-etal-2022-one
10.18653/v1/2022.emnlp-main.496
+
Counterfactual Recipe Generation: Exploring Compositional Generalization in a Realistic Scenario
@@ -6629,6 +6876,7 @@
2022.emnlp-main.498
kim-etal-2022-tutoring
10.18653/v1/2022.emnlp-main.498
+
Does Corpus Quality Really Matter for Low-Resource Languages?
@@ -6654,6 +6902,7 @@
2022.emnlp-main.500
plepi-etal-2022-unifying
10.18653/v1/2022.emnlp-main.500
+
Does Self-Rationalization Improve Robustness to Spurious Correlations?
@@ -6665,6 +6914,7 @@
2022.emnlp-main.501
ross-etal-2022-self
10.18653/v1/2022.emnlp-main.501
+
Efficient Pre-training of Masked Language Model via Concept-based Curriculum Masking
@@ -6678,6 +6928,7 @@
2022.emnlp-main.502
lee-etal-2022-efficient-pre
10.18653/v1/2022.emnlp-main.502
+
Subword Evenness (SuE) as a Predictor of Cross-lingual Transfer to Low-resource Languages
@@ -6819,6 +7070,7 @@
2022.emnlp-main.513
foroutan-etal-2022-discovering
10.18653/v1/2022.emnlp-main.513
+
Parameter-Efficient Tuning Makes a Good Classification Head
@@ -6895,6 +7147,7 @@
2022.emnlp-main.519
aumiller-etal-2022-eur
10.18653/v1/2022.emnlp-main.519
+
Differentiable Data Augmentation for Contrastive Sentence Representation Learning
@@ -7002,6 +7255,7 @@
2022.emnlp-main.527
muller-eberstein-etal-2022-spectral
10.18653/v1/2022.emnlp-main.527
+
QASem Parsing: Text-to-text Modeling of QA-based Semantics
@@ -7016,6 +7270,7 @@
2022.emnlp-main.528
klein-etal-2022-qasem
10.18653/v1/2022.emnlp-main.528
+
Keyphrase Generation via Soft and Hard Semantic Corrections
@@ -7028,6 +7283,7 @@
2022.emnlp-main.529
zhao-etal-2022-keyphrase
10.18653/v1/2022.emnlp-main.529
+
Modal-specific Pseudo Query Generation for Video Corpus Moment Retrieval
@@ -7041,6 +7297,7 @@
2022.emnlp-main.530
jung-etal-2022-modal
10.18653/v1/2022.emnlp-main.530
+
DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models
@@ -7070,6 +7327,7 @@
2022.emnlp-main.532
sarti-etal-2022-divemt
10.18653/v1/2022.emnlp-main.532
+
Bridging Fairness and Environmental Sustainability in Natural Language Processing
@@ -7118,6 +7376,7 @@
2022.emnlp-main.536
xue-aletras-2022-hashformers
10.18653/v1/2022.emnlp-main.536
+
MatchPrompt: Prompt-based Open Relation Extraction with Semantic Consistency Guided Clustering
@@ -7193,6 +7452,7 @@
2022.emnlp-main.542
xu-etal-2022-towards-robust
10.18653/v1/2022.emnlp-main.542
+
Enhancing Joint Multiple Intent Detection and Slot Filling with Global Intent-Slot Co-occurrence
@@ -7218,6 +7478,7 @@
2022.emnlp-main.544
giulianelli-2022-towards
10.18653/v1/2022.emnlp-main.544
+
LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling
@@ -7303,6 +7564,7 @@
2022.emnlp-main.551
wu-mooney-2022-entity
10.18653/v1/2022.emnlp-main.551
+
Cross-Linguistic Syntactic Difference in Multilingual BERT: How Good is It and How Does It Affect Transfer?
@@ -7318,6 +7580,7 @@
2022.emnlp-main.552
xu-etal-2022-cross
10.18653/v1/2022.emnlp-main.552
+
“It’s Not Just Hate”: A Multi-Dimensional Perspective on Detecting Harmful Speech Online
@@ -7332,6 +7595,7 @@
2022.emnlp-main.553
bianchi-etal-2022-just
10.18653/v1/2022.emnlp-main.553
+
Long Text Generation with Topic-aware Discrete Latent Variable Model
@@ -7361,6 +7625,7 @@
2022.emnlp-main.555
shu-etal-2022-tiara
10.18653/v1/2022.emnlp-main.555
+
Structure-Unified M-Tree Coding Solver for Math Word Problem
@@ -7391,6 +7656,7 @@
2022.emnlp-main.557
shao-etal-2022-formlm
10.18653/v1/2022.emnlp-main.557
+
Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework
@@ -7461,6 +7727,7 @@
2022.emnlp-main.562
chen-etal-2022-towards-table
10.18653/v1/2022.emnlp-main.562
+
Hierarchical Phrase-Based Sequence-to-Sequence Learning
@@ -7485,6 +7752,7 @@
2022.emnlp-main.564
sprague-etal-2022-natural
10.18653/v1/2022.emnlp-main.564
+
Character-centric Story Visualization via Visual Planning and Token Alignment
@@ -7582,6 +7850,7 @@
2022.emnlp-main.571
mohammadshahi-etal-2022-small
10.18653/v1/2022.emnlp-main.571
+
TextFusion: Privacy-Preserving Pre-trained Model Inference via Token Fusion
@@ -7611,6 +7880,7 @@
2022.emnlp-main.573
feng-boyd-graber-2022-learning
10.18653/v1/2022.emnlp-main.573
+
ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation
@@ -7624,6 +7894,7 @@
2022.emnlp-main.574
li-etal-2022-consisttl
10.18653/v1/2022.emnlp-main.574
+
Better Hit the Nail on the Head than Beat around the Bush: Removing Protected Attributes with a Single Projection
@@ -7648,6 +7919,7 @@
2022.emnlp-main.576
wang-etal-2022-ielm
10.18653/v1/2022.emnlp-main.576
+
ConNER: Consistency Training for Cross-lingual Named Entity Recognition
@@ -7701,6 +7973,7 @@
2022.emnlp-main.580
tan-etal-2022-revisiting
10.18653/v1/2022.emnlp-main.580
+
Towards Summary Candidates Fusion
@@ -7722,6 +7995,7 @@
2022.emnlp-main.582
zhao-calapodescu-2022-multimodal
10.18653/v1/2022.emnlp-main.582
+
TranSHER: Translating Knowledge Graph Embedding with Hyper-Ellipsoidal Restriction
@@ -7784,6 +8058,7 @@
2022.emnlp-main.587.dataset.zip
fei-etal-2022-beyond
10.18653/v1/2022.emnlp-main.587
+
Generalizing over Long Tail Concepts for Medical Term Normalization
@@ -7811,6 +8086,7 @@
2022.emnlp-main.589
song-etal-2022-unsupervised
10.18653/v1/2022.emnlp-main.589
+
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks
@@ -7825,6 +8101,7 @@
2022.emnlp-main.590
leong-etal-2022-bloom
10.18653/v1/2022.emnlp-main.590
+
Disentangling Uncertainty in Machine Translation Evaluation
@@ -7837,6 +8114,7 @@
2022.emnlp-main.591
zerva-etal-2022-disentangling
10.18653/v1/2022.emnlp-main.591
+
Does Your Model Classify Entities Reasonably? Diagnosing and Mitigating Spurious Correlations in Entity Typing
@@ -7876,6 +8154,7 @@
2022.emnlp-main.594
vallurupalli-etal-2022-poque
10.18653/v1/2022.emnlp-main.594
+
Measuring the Mixing of Contextual Information in the Transformer
@@ -7922,6 +8201,7 @@
2022.emnlp-main.598
ravichander-etal-2022-condaqa
10.18653/v1/2022.emnlp-main.598
+
Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer
@@ -7950,6 +8230,7 @@
2022.emnlp-main.600
mohamed-etal-2022-artelingo
10.18653/v1/2022.emnlp-main.600
+
Decoding a Neural Retriever’s Latent Space for Query Suggestion
@@ -8090,6 +8371,7 @@
2022.emnlp-main.610
sadat-caragea-2022-hierarchical
10.18653/v1/2022.emnlp-main.610
+
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
@@ -8203,6 +8485,7 @@
2022.emnlp-main.618
chong-etal-2022-detecting
10.18653/v1/2022.emnlp-main.618
+
Intriguing Properties of Compression on Multilingual Models
@@ -8230,6 +8513,7 @@
2022.emnlp-main.620.dataset.zip
born-etal-2022-sequence
10.18653/v1/2022.emnlp-main.620
+
English Contrastive Learning Can Learn Universal Cross-lingual Sentence Embeddings
@@ -8282,6 +8566,7 @@
2022.emnlp-main.624
qiu-etal-2022-evaluating
10.18653/v1/2022.emnlp-main.624
+
“I’m sorry to hear that”: Finding New Biases in Language Models with a Holistic Descriptor Dataset
@@ -8309,6 +8594,7 @@
2022.emnlp-main.626
wang-etal-2022-understanding-multimodal
10.18653/v1/2022.emnlp-main.626
+
Semantic Novelty Detection and Characterization in Factual Text Involving Named Entities
@@ -8372,6 +8658,7 @@
2022.emnlp-main.631.dataset.zip
dou-etal-2022-improving
10.18653/v1/2022.emnlp-main.631
+
Entropy- and Distance-Based Predictors From GPT-2 Attention Patterns Predict Reading Times Over and Above GPT-2 Surprisal
@@ -8382,6 +8669,7 @@
2022.emnlp-main.632
oh-schuler-2022-entropy
10.18653/v1/2022.emnlp-main.632
+
A Survey of Computational Framing Analysis Approaches
@@ -8416,6 +8704,7 @@
2022.emnlp-main.635
min-etal-2022-dont
10.18653/v1/2022.emnlp-main.635
+
ALFRED-L: Investigating the Role of Language for Action Learning in Interactive Visual Environments
@@ -8432,6 +8721,7 @@
akula-etal-2022-alfred
2022.emnlp-main.636.software.zip
10.18653/v1/2022.emnlp-main.636
+
Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence
@@ -8459,6 +8749,7 @@
2022.emnlp-main.638.software.zip
cho-etal-2022-unsupervised
10.18653/v1/2022.emnlp-main.638
+
Weakly-Supervised Temporal Article Grounding
@@ -8491,6 +8782,7 @@
2022.emnlp-main.640
dong-etal-2022-exploring
10.18653/v1/2022.emnlp-main.640
+
arXivEdits: Understanding the Human Revision Process in Scientific Writing
@@ -8502,6 +8794,7 @@
2022.emnlp-main.641
jiang-etal-2022-arxivedits
10.18653/v1/2022.emnlp-main.641
+
Why Do You Feel This Way? Summarizing Triggers of Emotions in Social Media Posts
@@ -8514,6 +8807,7 @@
2022.emnlp-main.642
zhan-etal-2022-feel
10.18653/v1/2022.emnlp-main.642
+
Analogical Math Word Problems Solving with Enhanced Problem-Solution Association
@@ -8536,6 +8830,7 @@
2022.emnlp-main.644
dalvi-mishra-etal-2022-towards
10.18653/v1/2022.emnlp-main.644
+
Knowledge Transfer from Answer Ranking to Answer Generation
@@ -8549,6 +8844,7 @@
2022.emnlp-main.645
gabburo-etal-2022-knowledge
10.18653/v1/2022.emnlp-main.645
+
Perturbation Augmentation for Fairer NLP
@@ -8563,6 +8859,7 @@
2022.emnlp-main.646
qian-etal-2022-perturbation
10.18653/v1/2022.emnlp-main.646
+
Automatic Document Selection for Efficient Encoder Pretraining
@@ -8670,6 +8967,7 @@
2022.emnlp-main.654
wan-bansal-2022-evaluating
10.18653/v1/2022.emnlp-main.654
+
Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation
@@ -8683,6 +8981,7 @@
2022.emnlp-main.655
sclar-etal-2022-referee
10.18653/v1/2022.emnlp-main.655
+
Algorithms for Weighted Pushdown Automata
@@ -8709,6 +9008,7 @@
2022.emnlp-main.657.software.zip
he-etal-2022-mabel
10.18653/v1/2022.emnlp-main.657
+
Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs
@@ -8723,6 +9023,7 @@
2022.emnlp-main.658
richardson-etal-2022-breakpoint
10.18653/v1/2022.emnlp-main.658
+
Late Fusion with Triplet Margin Objective for Multimodal Ideology Prediction and Analysis
@@ -8735,6 +9036,7 @@
2022.emnlp-main.659
qiu-etal-2022-late
10.18653/v1/2022.emnlp-main.659
+
Leveraging QA Datasets to Improve Generative Data Augmentation
@@ -8747,6 +9049,7 @@
2022.emnlp-main.660
mekala-etal-2022-leveraging
10.18653/v1/2022.emnlp-main.660
+
Meta-Learning Fast Weight Language Models
@@ -8831,6 +9134,7 @@
2022.emnlp-main.667
balachandran-etal-2022-correcting
10.18653/v1/2022.emnlp-main.667
+
Coordinated Topic Modeling
@@ -8873,6 +9177,7 @@
2022.emnlp-main.670
patel-etal-2022-cripp
10.18653/v1/2022.emnlp-main.670
+
Entity-centered Cross-document Relation Extraction
@@ -8944,6 +9249,7 @@
2022.emnlp-main.675
gatti-etal-2022-vistot
10.18653/v1/2022.emnlp-main.675
+
Generative Entity-to-Entity Stance Detection with Knowledge Graph Augmentation
@@ -8955,6 +9261,7 @@
2022.emnlp-main.676
zhang-etal-2022-generative
10.18653/v1/2022.emnlp-main.676
+
Symptom Identification for Interpretable Detection of Multiple Mental Disorders on Social Media
@@ -8980,6 +9287,7 @@
2022.emnlp-main.678
kim-etal-2022-improving
10.18653/v1/2022.emnlp-main.678
+
CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning
@@ -9008,6 +9316,7 @@
2022.emnlp-main.680
lee-etal-2022-specializing
10.18653/v1/2022.emnlp-main.680
+
A Simple Contrastive Learning Framework for Interactive Argument Pair Identification via Argument-Context Extraction
@@ -9088,6 +9397,7 @@
2022.emnlp-main.686
nguyen-etal-2022-adaptive
10.18653/v1/2022.emnlp-main.686
+
Adaptive Token-level Cross-lingual Feature Mixing for Multilingual Neural Machine Translation
@@ -9126,6 +9436,7 @@
2022.emnlp-main.689
yang-etal-2022-low
10.18653/v1/2022.emnlp-main.689
+
Prompt-based Distribution Alignment for Domain Generalization in Text Classification
@@ -9188,6 +9499,7 @@
2022.emnlp-main.694
li-etal-2022-human
10.18653/v1/2022.emnlp-main.694
+
Continual Training of Language Models for Few-Shot Learning
@@ -9214,6 +9526,7 @@
2022.emnlp-main.696
wu-etal-2022-dictionary
10.18653/v1/2022.emnlp-main.696
+
Fine-Tuning Pre-trained Transformers into Decaying Fast Weights
@@ -9249,6 +9562,7 @@
2022.emnlp-main.699
shen-etal-2022-sentbs
10.18653/v1/2022.emnlp-main.699
+
A Fine-grained Chinese Software Privacy Policy Dataset for Sequence Labeling and Regulation Compliant Identification
@@ -9288,6 +9602,7 @@
2022.emnlp-main.702
hong-etal-2022-graph
10.18653/v1/2022.emnlp-main.702
+
DiscoSense: Commonsense Reasoning with Discourse Connectives
@@ -9321,6 +9636,7 @@
2022.emnlp-main.705
hu-etal-2022-mocha
10.18653/v1/2022.emnlp-main.705
+
Variational Autoencoder with Disentanglement Priors for Low-Resource Task-Specific Natural Language Generation
@@ -9365,6 +9681,7 @@
2022.emnlp-main.708
shen-etal-2022-mask
10.18653/v1/2022.emnlp-main.708
+
AMAL: Meta Knowledge-Driven Few-Shot Adapter Learning
@@ -9412,6 +9729,7 @@
2022.emnlp-main.712
kuribayashi-etal-2022-context
10.18653/v1/2022.emnlp-main.712
+
A Generative Model for End-to-End Argument Mining with Reconstructed Positional Encoding and Constrained Pointer Mechanism
@@ -9477,6 +9795,7 @@
2022.emnlp-main.716
mirchandani-etal-2022-fad
10.18653/v1/2022.emnlp-main.716
+
MM-Align: Learning Optimal Transport-based Alignment Dynamics for Fast and Accurate Inference on Missing Modality Sequences
@@ -9506,6 +9825,7 @@
2022.emnlp-main.718
moon-etal-2022-evaluating
10.18653/v1/2022.emnlp-main.718
+
MoSE: Modality Split and Ensemble for Multimodal Knowledge Graph Completion
@@ -9569,6 +9889,7 @@
2022.emnlp-main.723
sun-etal-2022-reorder
10.18653/v1/2022.emnlp-main.723
+
Making Science Simple: Corpora for the Lay Summarisation of Scientific Literature
@@ -9615,6 +9936,7 @@
2022.emnlp-main.727
zerveas-etal-2022-coder
10.18653/v1/2022.emnlp-main.727
+
AdapterShare: Task Correlation Modeling with Adapter Differentiation
@@ -9689,6 +10011,7 @@
2022.emnlp-main.733.software.zip
lee-etal-2022-pneg
10.18653/v1/2022.emnlp-main.733
+
Facilitating Contrastive Learning of Discourse Relational Senses by Exploiting the Hierarchy of Sense Relations
@@ -9722,6 +10045,7 @@
2022.emnlp-main.736
schmidt-etal-2022-dont
10.18653/v1/2022.emnlp-main.736
+
Towards Compositional Generalization in Code Search
@@ -9764,6 +10088,7 @@
2022.emnlp-main.739
raghu-etal-2022-structural
10.18653/v1/2022.emnlp-main.739
+
SLICER: Sliced Fine-Tuning for Low-Resource Cross-Lingual Transfer for Named Entity Recognition
@@ -9775,6 +10100,7 @@
2022.emnlp-main.740
schmidt-etal-2022-slicer
10.18653/v1/2022.emnlp-main.740
+
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
@@ -9827,6 +10153,7 @@
2022.emnlp-main.744.software.zip
jeong-etal-2022-kold
10.18653/v1/2022.emnlp-main.744
+
Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention
@@ -9840,6 +10167,7 @@
2022.emnlp-main.745
li-etal-2022-evade
10.18653/v1/2022.emnlp-main.745
+
The better your Syntax, the better your Semantics? Probing Pretrained Language Models for the English Comparative Correlative
@@ -9852,6 +10180,7 @@
2022.emnlp-main.746
weissweiler-etal-2022-better
10.18653/v1/2022.emnlp-main.746
+
ProofInfer: Generating Proof via Iterative Hierarchical Inference
@@ -9884,6 +10213,7 @@
2022.emnlp-main.748
mukherjee-etal-2022-ectsum
10.18653/v1/2022.emnlp-main.748
+
Cross-domain Generalization for AMR Parsing
@@ -9925,6 +10255,7 @@
2022.emnlp-main.751
albalak-etal-2022-feta
10.18653/v1/2022.emnlp-main.751
+
Do Children Texts Hold The Key To Commonsense Knowledge?
@@ -9947,6 +10278,7 @@
2022.emnlp-main.753
deutsch-etal-2022-limitations
10.18653/v1/2022.emnlp-main.753
+
Sampling-Based Approximations to Minimum Bayes Risk Decoding for Neural Machine Translation
@@ -9968,6 +10300,7 @@
2022.emnlp-main.755
aggarwal-etal-2022-indicxnli
10.18653/v1/2022.emnlp-main.755
+
Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems
@@ -9989,6 +10322,7 @@
2022.emnlp-main.757
jiang-etal-2022-semantic
10.18653/v1/2022.emnlp-main.757
+
XPrompt: Exploring the Extreme of Prompt Tuning
@@ -10030,6 +10364,7 @@
2022.emnlp-main.760
stengel-eskin-van-durme-2022-curious
10.18653/v1/2022.emnlp-main.760
+
SHARE: a System for Hierarchical Assistive Recipe Editing
@@ -10042,6 +10377,7 @@
2022.emnlp-main.761
li-etal-2022-share
10.18653/v1/2022.emnlp-main.761
+
IM2: an Interpretable and Multi-category Integrated Metric Framework for Automatic Dialogue Evaluation
@@ -10057,6 +10393,7 @@
2022.emnlp-main.762.software.zip
jiang-etal-2022-im2
10.18653/v1/2022.emnlp-main.762
+
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
@@ -10072,6 +10409,7 @@
2022.emnlp-main.763
yao-etal-2022-pevl
10.18653/v1/2022.emnlp-main.763
+
Pre-training Language Models with Deterministic Factual Knowledge
@@ -10088,6 +10426,7 @@
2022.emnlp-main.764
li-etal-2022-pre-training
10.18653/v1/2022.emnlp-main.764
+
Finding Skill Neurons in Pre-trained Transformer-based Language Models
@@ -10102,6 +10441,7 @@
2022.emnlp-main.765
wang-etal-2022-finding-skill
10.18653/v1/2022.emnlp-main.765
+
Prompt Conditioned VAE: Enhancing Generative Replay for Lifelong Learning in Task-Oriented Dialogue
@@ -10129,6 +10469,7 @@
2022.emnlp-main.767
don-yehiya-etal-2022-prequel
10.18653/v1/2022.emnlp-main.767
+
Can Transformers Reason in Fragments of Natural Language?
@@ -10140,6 +10481,7 @@
2022.emnlp-main.768
schlegel-etal-2022-transformers
10.18653/v1/2022.emnlp-main.768
+
Textless Speech Emotion Conversion using Discrete & Decomposed Representations
@@ -10196,6 +10538,7 @@
2022.emnlp-main.772
lin-byrne-2022-retrieval
10.18653/v1/2022.emnlp-main.772
+
Instance Regularization for Discriminative Language Model Pre-training
@@ -10235,6 +10578,7 @@
2022.emnlp-main.775
wang-etal-2022-scienceworld
10.18653/v1/2022.emnlp-main.775
+
Improving Embeddings Representations for Comparing Higher Education Curricula: A Use Case in Computing
@@ -10283,6 +10627,7 @@
2022.emnlp-main.779.software.zip
han-etal-2022-balancing
10.18653/v1/2022.emnlp-main.779
+
Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models
@@ -10297,6 +10642,7 @@
2022.emnlp-main.780.software.zip
xia-etal-2022-prompting
10.18653/v1/2022.emnlp-main.780
+
Identifying Physical Object Use in Sentences
@@ -10393,6 +10739,7 @@
2022.emnlp-main.788
pimentel-etal-2022-attentional
10.18653/v1/2022.emnlp-main.788
+
When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems
@@ -10409,6 +10756,7 @@
2022.emnlp-main.789
stengel-eskin-etal-2022-data
10.18653/v1/2022.emnlp-main.789
+
Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt
@@ -10435,6 +10783,7 @@
2022.emnlp-main.791
pujari-etal-2022-three
10.18653/v1/2022.emnlp-main.791
+
Topic Modeling With Topological Data Analysis
@@ -10459,6 +10808,7 @@
2022.emnlp-main.793.software.zip
zhu-etal-2022-predicting
10.18653/v1/2022.emnlp-main.793
+
Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers
@@ -10482,6 +10832,7 @@
2022.emnlp-main.795
sancheti-etal-2022-agent
10.18653/v1/2022.emnlp-main.795
+
COLD: A Benchmark for Chinese Offensive Language Detection
@@ -10578,6 +10929,7 @@
2022.emnlp-main.802.software.zip
ostendorff-etal-2022-neighborhood
10.18653/v1/2022.emnlp-main.802
+
SPE: Symmetrical Prompt Enhancement for Fact Probing
@@ -10592,6 +10944,7 @@
2022.emnlp-main.803
li-etal-2022-spe
10.18653/v1/2022.emnlp-main.803
+
Efficient Large Scale Language Modeling with Mixtures of Experts
@@ -10651,6 +11004,7 @@
2022.emnlp-main.806
ko-etal-2022-discourse
10.18653/v1/2022.emnlp-main.806
+
Learning to Generate Overlap Summaries through Noisy Synthetic Data
@@ -10673,6 +11027,7 @@
2022.emnlp-main.808
jiang-etal-2022-mutual
10.18653/v1/2022.emnlp-main.808
+
Directions for NLP Practices Applied to Online Hate Speech Detection
@@ -10685,6 +11040,7 @@
2022.emnlp-main.809
fortuna-etal-2022-directions
10.18653/v1/2022.emnlp-main.809
+
Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection
@@ -10697,6 +11053,7 @@
2022.emnlp-main.810
di-liello-etal-2022-pre
10.18653/v1/2022.emnlp-main.810
+
OpenCQA: Open-ended Question Answering with Charts
@@ -10711,6 +11068,7 @@
2022.emnlp-main.811
kantharaj-etal-2022-opencqa
10.18653/v1/2022.emnlp-main.811
+
A Systematic Investigation of Commonsense Knowledge in Large Language Models
@@ -10725,6 +11083,7 @@
2022.emnlp-main.812
li-etal-2022-systematic
10.18653/v1/2022.emnlp-main.812
+
Transforming Sequence Tagging Into A Seq2Seq Task
@@ -10739,6 +11098,7 @@
2022.emnlp-main.813
raman-etal-2022-transforming
10.18653/v1/2022.emnlp-main.813
+
CycleKQR: Unsupervised Bidirectional Keyword-Question Rewriting
@@ -10855,6 +11215,7 @@
2022.emnlp-main.822
hangya-etal-2022-improving
10.18653/v1/2022.emnlp-main.822
+
SCROLLS: Standardized CompaRison Over Long Language Sequences
@@ -10874,6 +11235,7 @@
2022.emnlp-main.823
shaham-etal-2022-scrolls
10.18653/v1/2022.emnlp-main.823
+
PAR: Political Actor Representation Learning with Social Context and Expert Knowledge
@@ -10972,6 +11334,7 @@
2022.emnlp-tutorials.1
flanigan-etal-2022-meaning
10.18653/v1/2022.emnlp-tutorials.1
+
Arabic Natural Language Processing
@@ -10981,6 +11344,7 @@
2022.emnlp-tutorials.2
habash-2022-arabic
10.18653/v1/2022.emnlp-tutorials.2
+
Emergent Language-Based Coordination In Deep Multi-Agent Systems
@@ -10992,6 +11356,7 @@
2022.emnlp-tutorials.3
baroni-etal-2022-emergent
10.18653/v1/2022.emnlp-tutorials.3
+
CausalNLP Tutorial: An Introduction to Causality for Natural Language Processing
@@ -11003,6 +11368,7 @@
2022.emnlp-tutorials.4
jin-etal-2022-causalnlp
10.18653/v1/2022.emnlp-tutorials.4
+
Modular and Parameter-Efficient Fine-Tuning for NLP Models
@@ -11014,6 +11380,7 @@
2022.emnlp-tutorials.5
ruder-etal-2022-modular
10.18653/v1/2022.emnlp-tutorials.5
+
Non-Autoregressive Models for Fast Sequence Generation
@@ -11024,6 +11391,7 @@
2022.emnlp-tutorials.6
feng-shao-2022-non
10.18653/v1/2022.emnlp-tutorials.6
+
@@ -11058,6 +11426,7 @@
2022.emnlp-demos.1
jin-etal-2022-cogktr
10.18653/v1/2022.emnlp-demos.1
+
LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models
@@ -11074,6 +11443,7 @@
2022.emnlp-demos.2
geva-etal-2022-lm
10.18653/v1/2022.emnlp-demos.2
+
EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing
@@ -11091,6 +11461,7 @@
2022.emnlp-demos.3
wang-etal-2022-easynlp
10.18653/v1/2022.emnlp-demos.3
+
An Explainable Toolbox for Evaluating Pre-trained Vision-Language Models
@@ -11106,6 +11477,7 @@
2022.emnlp-demos.4
zhao-etal-2022-explainable
10.18653/v1/2022.emnlp-demos.4
+
TweetNLP: Cutting-Edge Natural Language Processing for Social Media
@@ -11124,6 +11496,7 @@
2022.emnlp-demos.5
camacho-collados-etal-2022-tweetnlp
10.18653/v1/2022.emnlp-demos.5
+
JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT
@@ -11135,6 +11508,7 @@
ohta-etal-2022-joeys2t
JoeyS2T is a JoeyNMT extension for speech-to-text tasks such as automatic speech recognition and end-to-end speech translation. It inherits the core philosophy of JoeyNMT, a minimalist NMT toolkit built on PyTorch, seeking simplicity and accessibility. JoeyS2T’s workflow is self-contained, starting from data pre-processing, over model training and prediction to evaluation, and is seamlessly integrated into JoeyNMT’s compact and simple code base. On top of JoeyNMT’s state-of-the-art Transformer-based Encoder-Decoder architecture, JoeyS2T provides speech-oriented components such as convolutional layers, SpecAugment, CTC-loss, and WER evaluation. Despite its simplicity compared to prior implementations, JoeyS2T performs competitively on English speech recognition and English-to-German speech translation benchmarks. The implementation is accompanied by a walk-through tutorial and available on https://github.com/may-/joeys2t.
10.18653/v1/2022.emnlp-demos.6
+
FairLib: A Unified Framework for Assessing and Improving Fairness
@@ -11149,6 +11523,7 @@
2022.emnlp-demos.7
han-etal-2022-fairlib
10.18653/v1/2022.emnlp-demos.7
+
ELEVANT: A Fully Automatic Fine-Grained Entity Linking Evaluation and Analysis Tool
@@ -11160,6 +11535,7 @@
2022.emnlp-demos.8
bast-etal-2022-elevant
10.18653/v1/2022.emnlp-demos.8
+
A Pipeline for Generating, Annotating and Employing Synthetic Data for Real World Question Answering
@@ -11172,6 +11548,7 @@
2022.emnlp-demos.9
maufe-etal-2022-pipeline
10.18653/v1/2022.emnlp-demos.9
+
DeepKE: A Deep Learning Based Knowledge Extraction Toolkit for Knowledge Base Population
@@ -11190,6 +11567,7 @@
2022.emnlp-demos.10
zhang-etal-2022-deepke
10.18653/v1/2022.emnlp-demos.10
+
AnEMIC: A Framework for Benchmarking ICD Coding Models
@@ -11203,6 +11581,7 @@
kim-etal-2022-anemic
Diagnostic coding, or ICD coding, is the task of assigning diagnosis codes defined by the ICD (International Classification of Diseases) standard to patient visits based on clinical notes. The current process of manual ICD coding is time-consuming and often error-prone, which suggests the need for automatic ICD coding. However, despite the long history of automatic ICD coding, there have been no standardized frameworks for benchmarking ICD coding models. We open-source an easy-to-use tool named AnEMIC, which provides a streamlined pipeline for preprocessing, training, and evaluating for automatic ICD coding. We correct errors in preprocessing by existing works, and provide key models and weights trained on the correctly preprocessed datasets. We also provide an interactive demo performing real-time inference from custom inputs, and visualizations drawn from explainable AI to analyze the models. We hope the framework helps move the research of ICD coding forward and helps professionals explore the potential of ICD coding. The framework and the associated code are available here.
10.18653/v1/2022.emnlp-demos.11
+
SPEAR : Semi-supervised Data Programming in Python
@@ -11218,6 +11597,7 @@
2022.emnlp-demos.12
abhishek-etal-2022-spear
10.18653/v1/2022.emnlp-demos.12
+
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements
@@ -11236,6 +11616,7 @@
2022.emnlp-demos.13
von-werra-etal-2022-evaluate
10.18653/v1/2022.emnlp-demos.13
+
KeywordScape: Visual Document Exploration using Contextualized Keyword Embeddings
@@ -11248,6 +11629,7 @@
2022.emnlp-demos.14
voigt-etal-2022-keywordscape
10.18653/v1/2022.emnlp-demos.14
+
MedConQA: Medical Conversational Question Answering System based on Knowledge Graphs
@@ -11264,6 +11646,7 @@
2022.emnlp-demos.15
xia-etal-2022-medconqa
10.18653/v1/2022.emnlp-demos.15
+
Label Sleuth: From Unlabeled Text to a Classifier in a Few Hours
@@ -11282,6 +11665,7 @@
2022.emnlp-demos.16
shnarch-etal-2022-label
10.18653/v1/2022.emnlp-demos.16
+
AGReE: A system for generating Automated Grammar Reading Exercises
@@ -11294,6 +11678,7 @@
2022.emnlp-demos.17
chan-etal-2022-agree
10.18653/v1/2022.emnlp-demos.17
+
BotSIM: An End-to-End Bot Simulation Framework for Commercial Task-Oriented Dialog Systems
@@ -11308,6 +11693,7 @@
2022.emnlp-demos.18
wang-etal-2022-botsim
10.18653/v1/2022.emnlp-demos.18
+
DeepGen: Diverse Search Ad Generation and Real-Time Customization
@@ -11355,6 +11741,7 @@
zhang-etal-2022-automatic-comment
Automatic essay evaluation can help reduce teachers’ workload and enable students to refine their works rapidly. Previous studies focus mainly on giving discrete scores for either the holistic quality or several distinct traits. However, real-world teachers usually provide detailed comments in natural language, which are more informative than single scores. In this paper, we present the comment generation task, which aims to generate comments for specified segments from given student narrative essays. To tackle this task, we propose a planning-based generation model, which first plans a sequence of keywords, and then expands these keywords into a complete comment. To improve the correctness and informativeness of generated comments, we adopt the following two techniques: (1) training an error correction module to filter out incorrect keywords, and (2) recognizing fine-grained structured features from source essays to enrich the keywords. To support the evaluation of the task, we collect a human-written Chinese dataset, which contains 22,399 essay-comment pairs. Extensive experiments show that our model outperforms strong baselines significantly. Moreover, we exert explicit control on our model to generate comments to describe the strengths or weaknesses of inputs with a 91% success rate. We deploy the model at http://coai.cs.tsinghua.edu.cn/static/essayComment/. A demo video is available at https://youtu.be/IuFVk8dUxbI. Our code and data are available at https://github.com/thu-coai/EssayCommentGen.
10.18653/v1/2022.emnlp-demos.21
+
MIC: A Multi-task Interactive Curation Tool
@@ -11367,6 +11754,7 @@
2022.emnlp-demos.22
yu-etal-2022-mic
10.18653/v1/2022.emnlp-demos.22
+
SUMMARY WORKBENCH: Unifying Application and Evaluation of Text Summarization Models
@@ -11378,6 +11766,7 @@
2022.emnlp-demos.23
syed-etal-2022-summary
10.18653/v1/2022.emnlp-demos.23
+
Arabic Word-level Readability Visualization for Assisted Text Simplification
@@ -11391,6 +11780,7 @@
2022.emnlp-demos.24
hazim-etal-2022-arabic
10.18653/v1/2022.emnlp-demos.24
+
LogiTorch: A PyTorch-based library for logical reasoning on natural language
@@ -11402,6 +11792,7 @@
2022.emnlp-demos.25
helwe-etal-2022-logitorch
10.18653/v1/2022.emnlp-demos.25
+
stopes - Modular Machine Translation Pipelines
@@ -11420,6 +11811,7 @@
2022.emnlp-demos.26
andrews-etal-2022-stopes
10.18653/v1/2022.emnlp-demos.26
+
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
@@ -11529,6 +11921,7 @@
2022.emnlp-demos.29
bianchi-etal-2022-twitter
10.18653/v1/2022.emnlp-demos.29
+
Azimuth: Systematic Error Analysis for Text Classification
@@ -11545,6 +11938,7 @@
2022.emnlp-demos.30
gauthier-melancon-etal-2022-azimuth
10.18653/v1/2022.emnlp-demos.30
+
SynKB: Semantic Search for Synthetic Procedures
@@ -11558,6 +11952,7 @@
2022.emnlp-demos.31
bai-etal-2022-synkb
10.18653/v1/2022.emnlp-demos.31
+
Camelira: An Arabic Multi-Dialect Morphological Disambiguator
@@ -11569,6 +11964,7 @@
2022.emnlp-demos.32
obeid-etal-2022-camelira
10.18653/v1/2022.emnlp-demos.32
+
POTATO: The Portable Text Annotation Tool
@@ -11584,6 +11980,7 @@
2022.emnlp-demos.33
pei-etal-2022-potato
10.18653/v1/2022.emnlp-demos.33
+
KGxBoard: Explainable and Interactive Leaderboard for Evaluation of Knowledge Graph Completion Models
@@ -11600,6 +11997,7 @@
2022.emnlp-demos.34
widjaja-etal-2022-kgxboard
10.18653/v1/2022.emnlp-demos.34
+
FALTE: A Toolkit for Fine-grained Annotation for Long Text Evaluation
@@ -11611,6 +12009,7 @@
2022.emnlp-demos.35
goyal-etal-2022-falte
10.18653/v1/2022.emnlp-demos.35
+
SEAL: Interactive Tool for Systematic Error Analysis and Labeling
@@ -11624,6 +12023,7 @@
2022.emnlp-demos.36
rajani-etal-2022-seal
10.18653/v1/2022.emnlp-demos.36
+
Hands-On Interactive Neuro-Symbolic NLP with DRaiL
@@ -11635,6 +12035,7 @@
2022.emnlp-demos.37
pacheco-etal-2022-hands
10.18653/v1/2022.emnlp-demos.37
+
Paraphrastic Representations at Scale
@@ -11660,6 +12061,7 @@
2022.emnlp-demos.39
razeghi-etal-2022-snoopy
10.18653/v1/2022.emnlp-demos.39
+
BMCook: A Task-agnostic Compression Toolkit for Big Models
@@ -11677,6 +12079,7 @@
zhang-etal-2022-bmcook
Recently, pre-trained language models (PLMs) have achieved great success on various NLP tasks and have shown a trend of exponential growth in model size. To alleviate the unaffordable computational costs brought by the size growth, model compression has been widely explored. Existing efforts have achieved promising results in compressing medium-sized models for specific tasks, while task-agnostic compression for big models with over billions of parameters is rarely studied. Task-agnostic compression can provide an efficient and versatile big model for both prompting and delta tuning, leading to a more general impact than task-specific compression. Hence, we introduce a task-agnostic compression toolkit BMCook for big models. In BMCook, we implement four representative compression methods, including quantization, pruning, distillation, and MoEfication. Developers can easily combine these methods towards better efficiency. To evaluate BMCook, we apply it to compress T5-3B (a PLM with 3 billion parameters). We achieve nearly 12x efficiency improvement while maintaining over 97% of the original T5-3B performance on three typical NLP benchmarks. Moreover, the final compressed model also significantly outperforms T5-base (a PLM with 220 million parameters), which has a similar computational cost. BMCook is publicly available at https://github.com/OpenBMB/BMCook.
10.18653/v1/2022.emnlp-demos.40
+
ALToolbox: A Set of Tools for Active Learning Annotation of Natural Language Texts
@@ -11698,6 +12101,7 @@
2022.emnlp-demos.41
tsvigun-etal-2022-altoolbox
10.18653/v1/2022.emnlp-demos.41
+
TextBox 2.0: A Text Generation Library with Pre-trained Language Models
@@ -11715,6 +12119,7 @@
2022.emnlp-demos.42
tang-etal-2022-textbox
10.18653/v1/2022.emnlp-demos.42
+
@@ -11743,6 +12148,7 @@
2022.emnlp-industry.1
fusco-etal-2022-unsupervised
10.18653/v1/2022.emnlp-industry.1
+
DynaMaR: Dynamic Prompt with Mask Token Representation
@@ -11759,6 +12165,7 @@
2022.emnlp-industry.2
sun-etal-2022-dynamar
10.18653/v1/2022.emnlp-industry.2
+
A Hybrid Approach to Cross-lingual Product Review Summarization
@@ -11771,6 +12178,7 @@
2022.emnlp-industry.3
soltan-etal-2022-hybrid
10.18653/v1/2022.emnlp-industry.3
+
Augmenting Operations Research with Auto-Formulation of Optimization Models From Problem Descriptions
@@ -11787,6 +12195,7 @@
2022.emnlp-industry.4
ramamonjison-etal-2022-augmenting
10.18653/v1/2022.emnlp-industry.4
+
Knowledge Distillation based Contextual Relevance Matching for E-commerce Product Search
@@ -11800,6 +12209,7 @@
2022.emnlp-industry.5
liu-etal-2022-knowledge
10.18653/v1/2022.emnlp-industry.5
+
Accelerating the Discovery of Semantic Associations from Medical Literature: Mining Relations Between Diseases and Symptoms
@@ -11811,6 +12221,7 @@
2022.emnlp-industry.6
purpura-etal-2022-accelerating
10.18653/v1/2022.emnlp-industry.6
+
PENTATRON: PErsonalized coNText-Aware Transformer for Retrieval-based cOnversational uNderstanding
@@ -11826,6 +12237,7 @@
2022.emnlp-industry.7
uma-naresh-etal-2022-pentatron
10.18653/v1/2022.emnlp-industry.7
+
Machine translation impact in E-commerce multilingual search
@@ -11836,6 +12248,7 @@
2022.emnlp-industry.8
zhang-misra-2022-machine
10.18653/v1/2022.emnlp-industry.8
+
Ask-and-Verify: Span Candidate Generation and Verification for Attribute Value Extraction
@@ -11850,6 +12263,7 @@
2022.emnlp-industry.9
ding-etal-2022-ask
10.18653/v1/2022.emnlp-industry.9
+
Consultation Checklists: Standardising the Human Evaluation of Medical Note Generation
@@ -11864,6 +12278,7 @@
2022.emnlp-industry.10
savkov-etal-2022-consultation
10.18653/v1/2022.emnlp-industry.10
+
Towards Need-Based Spoken Language Understanding Model Updates: What Have We Learned?
@@ -11876,6 +12291,7 @@
2022.emnlp-industry.11
do-etal-2022-towards
10.18653/v1/2022.emnlp-industry.11
+
Knowledge Distillation Transfer Sets and their Impact on Downstream NLU Tasks
@@ -11890,6 +12306,7 @@
2022.emnlp-industry.12
peris-etal-2022-knowledge
10.18653/v1/2022.emnlp-industry.12
+
Exploiting In-Domain Bilingual Corpora for Zero-Shot Transfer Learning in NLU of Intra-Sentential Code-Switching Chatbot Interactions
@@ -11904,6 +12321,7 @@
2022.emnlp-industry.13
aguirre-etal-2022-exploiting
10.18653/v1/2022.emnlp-industry.13
+
Calibrating Imbalanced Classifiers with Focal Loss: An Empirical Study
@@ -11918,6 +12336,7 @@
wang-etal-2022-calibrating
Imbalanced data distribution is a practical and common challenge in building production-level machine learning (ML) models in industry, where data usually exhibits long-tail distributions. For instance, in virtual AI Assistants, such as Google Assistant, Amazon Alexa and Apple Siri, the “play music” or “set timer” utterance is exposed to an order of magnitude more traffic than other skills. This can easily cause trained models to overfit to the majority classes, categories or intents, leading to model miscalibration. The uncalibrated models output unreliable (mostly overconfident) predictions, which are at high risk of affecting downstream decision-making systems. In this work, we study the calibration of production models in the industry use-case of predicting product return reason codes in customer service conversations of an online retail store; the return reasons also exhibit class imbalance. To alleviate the resulting miscalibration in the production ML model, we streamline the model development and deployment using focal loss (CITATION). We empirically show the effectiveness of model training with focal loss in learning better calibrated models, as compared to standard cross-entropy loss. Better calibration, in turn, enables better control of the precision-recall trade-off for the models deployed in production.
10.18653/v1/2022.emnlp-industry.14
+
Unsupervised training data re-weighting for natural language understanding with local distribution approximation
@@ -11931,6 +12350,7 @@
2022.emnlp-industry.15
garrido-ramas-etal-2022-unsupervised
10.18653/v1/2022.emnlp-industry.15
+
Cross-Encoder Data Annotation for Bi-Encoder Based Product Matching
@@ -11941,6 +12361,7 @@
2022.emnlp-industry.16
chiu-shinzato-2022-cross
10.18653/v1/2022.emnlp-industry.16
+
Deploying a Retrieval based Response Model for Task Oriented Dialogues
@@ -11955,6 +12376,7 @@
2022.emnlp-industry.17
poddar-etal-2022-deploying
10.18653/v1/2022.emnlp-industry.17
+
Tackling Temporal Questions in Natural Language Interface to Databases
@@ -11967,6 +12389,7 @@
2022.emnlp-industry.18
vo-etal-2022-tackling
10.18653/v1/2022.emnlp-industry.18
+
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
@@ -11979,6 +12402,7 @@
2022.emnlp-industry.19
vishwanathan-etal-2022-multi
10.18653/v1/2022.emnlp-industry.19
+
Iterative Stratified Testing and Measurement for Automated Model Updates
@@ -11995,6 +12419,7 @@
2022.emnlp-industry.20
dekeyser-etal-2022-iterative
10.18653/v1/2022.emnlp-industry.20
+
SLATE: A Sequence Labeling Approach for Task Extraction from Free-form Inked Content
@@ -12013,6 +12438,7 @@
2022.emnlp-industry.21
gandhi-etal-2022-slate
10.18653/v1/2022.emnlp-industry.21
+
Gaining Insights into Unrecognized User Utterances in Task-Oriented Dialog Systems
@@ -12027,6 +12453,7 @@
2022.emnlp-industry.22
rabinovich-etal-2022-gaining
10.18653/v1/2022.emnlp-industry.22
+
CoCoID: Learning Contrastive Representations and Compact Clusters for Semi-Supervised Intent Discovery
@@ -12039,6 +12466,7 @@
2022.emnlp-industry.23
cao-etal-2022-cocoid
10.18653/v1/2022.emnlp-industry.23
+
Tractable & Coherent Multi-Document Summarization: Discrete Optimization of Multiple Neural Modeling Streams via Integer Linear Programming
@@ -12049,6 +12477,7 @@
2022.emnlp-industry.24
j-kurisinkel-chen-2022-tractable
10.18653/v1/2022.emnlp-industry.24
+
Grafting Pre-trained Models for Multimodal Headline Generation
@@ -12063,6 +12492,7 @@
2022.emnlp-industry.25
qiao-etal-2022-grafting
10.18653/v1/2022.emnlp-industry.25
+
Semi-supervised Adversarial Text Generation based on Seq2Seq models
@@ -12078,6 +12508,7 @@
2022.emnlp-industry.26
le-etal-2022-semi
10.18653/v1/2022.emnlp-industry.26
+
Is it out yet? Automatic Future Product Releases Extraction from Web Data
@@ -12089,6 +12520,7 @@
2022.emnlp-industry.27
fuchs-etal-2022-yet
10.18653/v1/2022.emnlp-industry.27
+
Automatic Scene-based Topic Channel Construction System for E-Commerce
@@ -12103,6 +12535,7 @@
2022.emnlp-industry.28
lin-etal-2022-automatic-scene
10.18653/v1/2022.emnlp-industry.28
+
SpeechNet: Weakly Supervised, End-to-End Speech Recognition at Industrial Scale
@@ -12121,6 +12554,7 @@
2022.emnlp-industry.29
tang-etal-2022-speechnet
10.18653/v1/2022.emnlp-industry.29
+
Controlled Language Generation for Language Learning Items
@@ -12132,6 +12566,7 @@
2022.emnlp-industry.30
stowe-etal-2022-controlled
10.18653/v1/2022.emnlp-industry.30
+
Improving Text-to-SQL Semantic Parsing with Fine-grained Query Understanding
@@ -12148,6 +12583,7 @@
2022.emnlp-industry.31
wang-etal-2022-improving-text
10.18653/v1/2022.emnlp-industry.31
+
Unsupervised Dense Retrieval for Scientific Articles
@@ -12160,6 +12596,7 @@
2022.emnlp-industry.32
li-etal-2022-unsupervised-dense
10.18653/v1/2022.emnlp-industry.32
+
Learning Geolocations for Cold-Start and Hard-to-Resolve Addresses via Deep Metric Learning
@@ -12170,6 +12607,7 @@
2022.emnlp-industry.33
govind-sohoney-2022-learning
10.18653/v1/2022.emnlp-industry.33
+
Meta-learning Pathologies from Radiology Reports using Variance Aware Prototypical Networks
@@ -12183,6 +12621,7 @@
2022.emnlp-industry.34
sehanobish-etal-2022-meta
10.18653/v1/2022.emnlp-industry.34
+
Named Entity Recognition in Industrial Tables using Tabular Language Models
@@ -12196,6 +12635,7 @@
2022.emnlp-industry.35
koleva-etal-2022-named
10.18653/v1/2022.emnlp-industry.35
+
Reinforced Question Rewriting for Conversational Question Answering
@@ -12210,6 +12650,7 @@
2022.emnlp-industry.36
chen-etal-2022-reinforced
10.18653/v1/2022.emnlp-industry.36
+
Improving Large-Scale Conversational Assistants using Model Interpretation based Training Sample Selection
@@ -12226,6 +12667,7 @@
2022.emnlp-industry.37
schroedl-etal-2022-improving
10.18653/v1/2022.emnlp-industry.37
+
Improving Precancerous Case Characterization via Transformer-based Ensemble Learning
@@ -12242,6 +12684,7 @@
2022.emnlp-industry.38
zhong-etal-2022-improving-precancerous
10.18653/v1/2022.emnlp-industry.38
+
Developing Prefix-Tuning Models for Hierarchical Text Classification
@@ -12253,6 +12696,7 @@
2022.emnlp-industry.39
chen-etal-2022-developing
10.18653/v1/2022.emnlp-industry.39
+
PAIGE: Personalized Adaptive Interactions Graph Encoder for Query Rewriting in Dialogue Systems
@@ -12266,6 +12710,7 @@
2022.emnlp-industry.40
bis-etal-2022-paige
10.18653/v1/2022.emnlp-industry.40
+
Fast Vocabulary Transfer for Language Model Compression
@@ -12278,6 +12723,7 @@
2022.emnlp-industry.41
gee-etal-2022-fast
10.18653/v1/2022.emnlp-industry.41
+
Multimodal Context Carryover
@@ -12296,6 +12742,7 @@
2022.emnlp-industry.42
wanigasekara-etal-2022-multimodal
10.18653/v1/2022.emnlp-industry.42
+
Distilling Multilingual Transformers into CNNs for Scalable Intent Classification
@@ -12308,6 +12755,7 @@
2022.emnlp-industry.43
fetahu-etal-2022-distilling
10.18653/v1/2022.emnlp-industry.43
+
Bringing the State-of-the-Art to Customers: A Neural Agent Assistant Framework for Customer Service Support
@@ -12326,6 +12774,7 @@
2022.emnlp-industry.44
obadinma-etal-2022-bringing
10.18653/v1/2022.emnlp-industry.44
+
Zero-Shot Dynamic Quantization for Transformer Inference
@@ -12337,6 +12786,7 @@
2022.emnlp-industry.45
el-kurdi-etal-2022-zero
10.18653/v1/2022.emnlp-industry.45
+
Fact Checking Machine Generated Text with Dependency Trees
@@ -12350,6 +12800,7 @@
2022.emnlp-industry.46
estes-etal-2022-fact
10.18653/v1/2022.emnlp-industry.46
+
Prototype-Representations for Training Data Filtering in Weakly-Supervised Information Extraction
@@ -12377,6 +12828,7 @@
2022.emnlp-industry.48
hao-etal-2022-cgf
10.18653/v1/2022.emnlp-industry.48
+
Entity-level Sentiment Analysis in Contact Center Telephone Conversations
@@ -12391,6 +12843,7 @@
2022.emnlp-industry.49
fu-etal-2022-entity
10.18653/v1/2022.emnlp-industry.49
+
QUILL: Query Intent with Large Language Models using Retrieval Augmentation and Multi-stage Distillation
@@ -12405,6 +12858,7 @@
2022.emnlp-industry.50
srinivasan-etal-2022-quill
10.18653/v1/2022.emnlp-industry.50
+
Distinguish Sense from Nonsense: Out-of-Scope Detection for Virtual Assistants
@@ -12418,6 +12872,7 @@
2022.emnlp-industry.51
qian-etal-2022-distinguish
10.18653/v1/2022.emnlp-industry.51
+
PLATO-Ad: A Unified Advertisement Text Generation Framework with Multi-Task Prompt Learning
@@ -12435,6 +12890,7 @@
2022.emnlp-industry.52
lei-etal-2022-plato
10.18653/v1/2022.emnlp-industry.52
+
Dense Feature Memory Augmented Transformers for COVID-19 Vaccination Search Classification
@@ -12451,6 +12907,7 @@
2022.emnlp-industry.53
gupta-etal-2022-dense
10.18653/v1/2022.emnlp-industry.53
+
Full-Stack Information Extraction System for Cybersecurity Intelligence
@@ -12461,6 +12918,7 @@
2022.emnlp-industry.54
park-lee-2022-full
10.18653/v1/2022.emnlp-industry.54
+
Deploying Unified BERT Moderation Model for E-Commerce Reviews
@@ -12471,6 +12929,7 @@
2022.emnlp-industry.55
nayak-garera-2022-deploying
10.18653/v1/2022.emnlp-industry.55
+
SimANS: Simple Ambiguous Negatives Sampling for Dense Text Retrieval
@@ -12489,6 +12948,7 @@
zhou-etal-2022-simans
Sampling proper negatives from a large document pool is vital to effectively train a dense retrieval model. However, existing negative sampling strategies suffer from the uninformative or false negative problem. In this work, we empirically show that according to the measured relevance scores, the negatives ranked around the positives are generally more informative and less likely to be false negatives. Intuitively, these negatives are not too hard (may be false negatives) or too easy (uninformative). They are the ambiguous negatives and need more attention during training. Thus, we propose a simple ambiguous negatives sampling method, SimANS, which incorporates a new sampling probability distribution to sample more ambiguous negatives. Extensive experiments on four public and one industry datasets show the effectiveness of our approach. We made the code and models publicly available in https://github.com/microsoft/SimXNS.
10.18653/v1/2022.emnlp-industry.56
+
Revisiting and Advancing Chinese Natural Language Understanding with Accelerated Heterogeneous Knowledge Pre-training
@@ -12506,6 +12966,7 @@
2022.emnlp-industry.57
zhang-etal-2022-revisiting
10.18653/v1/2022.emnlp-industry.57
+
A Stacking-based Efficient Method for Toxic Language Detection on Live Streaming Chat
@@ -12517,6 +12978,7 @@
2022.emnlp-industry.58
oikawa-etal-2022-stacking
10.18653/v1/2022.emnlp-industry.58
+
End-to-End Speech to Intent Prediction to improve E-commerce Customer Support Voicebot in Hindi and English
@@ -12528,6 +12990,7 @@
2022.emnlp-industry.59
goyal-etal-2022-end
10.18653/v1/2022.emnlp-industry.59
+
PILE: Pairwise Iterative Logits Ensemble for Multi-Teacher Labeled Distillation
@@ -12545,6 +13008,7 @@
2022.emnlp-industry.60
cai-etal-2022-pile
10.18653/v1/2022.emnlp-industry.60
+
A Comprehensive Evaluation of Biomedical Entity-centric Search
@@ -12568,6 +13032,7 @@
2022.emnlp-industry.62
morishita-etal-2022-domain
10.18653/v1/2022.emnlp-industry.62
+
Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework
@@ -12581,6 +13046,7 @@
2022.emnlp-industry.63
yoon-etal-2022-biomedical
10.18653/v1/2022.emnlp-industry.63
+
Large-scale Machine Translation for Indian Languages in E-commerce under Low Resource Constraints
@@ -12591,6 +13057,7 @@
2022.emnlp-industry.64
patil-garera-2022-large
10.18653/v1/2022.emnlp-industry.64
+
Topic Modeling by Clustering Language Model Embeddings: Human Validation on an Industry Dataset
@@ -12601,6 +13068,7 @@
2022.emnlp-industry.65
eklund-forsman-2022-topic
10.18653/v1/2022.emnlp-industry.65
+
@@ -12609,6 +13077,71 @@
Abu Dhabi
December 7–11, 2022
+
+ https://2022.emnlp.org
+ 2022.emnlp.handbook.pdf
+
+
+ Keynote 1: The multimodal language faculty and the visual languages of comics
+ NeilCohn
+ 2022.emnlp.keynote1.mp4
+
+
+ Keynote 2: Towards a Foundation for AGI
+ GaryMarcus
+ 2022.emnlp.keynote2.mp4
+
+
+ Industry Track Keynote: Takeaways from a systematic study of 75K models on Hugging Face
+ NazneenRajani
+ 2022.emnlp.keynote-industry.mp4
+
+
+ Opening Session
+ Noah A.Smith
+ YoavGoldberg
+ NizarHabash
+ 2022.emnlp.opening-session.mp4
+
+
+ Business Meeting
+ HangLi
+ MonaDiab
+ SienMoens
+ Noah A.Smith
+ AliceOh
+ MonojitChoudhury
+ NizarHabash
+ 2022.emnlp.business-meeting.mp4
+
+
+ Panel: Careers in NLP
+ BingXiang
+ HatemHaddad
+ AsliCelikyilmaz
+ YunyaoLi
+ 2022.emnlp.panel1.mp4
+
+
+ Panel: Conference Theme and Beyond
+ DipanjanDas
+ Marie-Catherinede Marneffe
+ Mausam
+ YuliaTsvetkov
+ AndreasVlachos
+ Noah A.Smith
+ 2022.emnlp.panel2.mp4
+
+
+ Best Papers Awards Session and Closing Session
+ Noah A.Smith
+ YoavGoldberg
+ AlessandroMoschitti
+ YangLiu
+ TimBaldwin
+ NizarHabash
+ 2022.emnlp.closing-session.mp4
+
2022.findings-emnlp
2022.conll-1
diff --git a/data/xml/2022.findings.xml b/data/xml/2022.findings.xml
index f4877e595e..e2bb697b6b 100644
--- a/data/xml/2022.findings.xml
+++ b/data/xml/2022.findings.xml
@@ -2407,6 +2407,7 @@
dugan-etal-2022-feasibility
10.18653/v1/2022.findings-acl.151
liamdugan/summary-qg
+
Relevant CommonSense Subgraphs for “What if...” Procedural Reasoning
@@ -2716,6 +2717,7 @@
2022.findings-acl.171
ponomareva-etal-2022-training
10.18653/v1/2022.findings-acl.171
+
Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers
@@ -3360,6 +3362,7 @@
Natural Questions
SNLI
SVHN
+
ASSIST: Towards Label Noise-Robust Dialogue State Tracking
@@ -9954,6 +9957,7 @@
2022.findings-emnlp.54
le-scao-etal-2022-language
10.18653/v1/2022.findings-emnlp.54
+
Enhancing Out-of-Distribution Detection in Natural Language Understanding via Implicit Layer Ensemble
@@ -16567,6 +16571,7 @@ Faster and Smaller Speech Translation without Quality Compromise
The shift of public debate to the digital sphere has been accompanied by a rise in online hate speech. While many promising approaches for hate speech classification have been proposed, studies often focus only on a single language, usually English, and do not address three key concerns: post-deployment performance, classifier maintenance and infrastructural limitations. In this paper, we introduce a new human-in-the-loop BERT-based hate speech classification pipeline and trace its development from initial data collection and annotation all the way to post-deployment. Our classifier, trained using data from our original corpus of over 422k examples, is specifically developed for the inherently multilingual setting of Switzerland and outperforms with its F1 score of 80.5 the currently best-performing BERT-based multilingual classifier by 5.8 F1 points in German and 3.6 F1 points in French. Our systematic evaluations over a 12-month period further highlight the vital importance of continuous, human-in-the-loop classifier maintenance to ensure robust hate speech classification post-deployment.
2022.findings-emnlp.548
kotarcic-etal-2022-human
+
diff --git a/data/xml/2022.naacl.xml b/data/xml/2022.naacl.xml
index 4d004f2cc2..543bfe077b 100644
--- a/data/xml/2022.naacl.xml
+++ b/data/xml/2022.naacl.xml
@@ -7981,6 +7981,7 @@
2022.naacl-tutorials.1
malmi-etal-2022-text
10.18653/v1/2022.naacl-tutorials.1
+
Self-supervised Representation Learning for Speech Processing
@@ -7998,6 +7999,7 @@
lee-etal-2022-self
10.18653/v1/2022.naacl-tutorials.2
s3prl/s3prl
+
New Frontiers of Information Extraction
@@ -8013,6 +8015,7 @@
chen-etal-2022-new
10.18653/v1/2022.naacl-tutorials.3
CoNLL++
+
Human-Centered Evaluation of Explanations
@@ -8028,6 +8031,7 @@
2022.naacl-tutorials.4
boyd-graber-etal-2022-human
10.18653/v1/2022.naacl-tutorials.4
+
Tutorial on Multimodal Machine Learning
@@ -8040,6 +8044,13 @@
morency-etal-2022-tutorial
10.18653/v1/2022.naacl-tutorials.5
Visual Question Answering
+
+
+
+
+
+
+
Contrastive Data and Learning for Natural Language Processing
@@ -8052,6 +8063,7 @@
2022.naacl-tutorials.6
zhang-etal-2022-contrastive-data
10.18653/v1/2022.naacl-tutorials.6
+
@@ -8675,6 +8687,61 @@
+
+ 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics
+ Seattle, Washington
+ July 10–15, 2022
+
+
+ https://2022.naacl.org
+ 2022.naacl.handbook.pdf
+
+
+ Keynote 1: Shaping Technology with Moral Imagination: Leveraging the Machinery of Value Sensitive Design
+ BatyaFriedman
+ 2022.naacl.keynote1.mp4
+
+
+ Keynote 2: NLP in Mexican Spanish: One of many stories
+ ManuelMontes-y-Gómez
+ 2022.naacl.keynote2.mp4
+
+
+ Opening Session
+ DanRoth
+ MarineCarpuat
+ IvanMeza-Ruiz
+ Marie-Catherinede Marneffe
+ 2022.naacl.opening-session.mp4
+
+
+ Business Meeting: Diversity and Inclusion Financial Accessibility Report
+ NedjmaOusidhoum
+ 2022.naacl.business-meeting.mp4
+
+
+ Panel: The Place of Linguistics and Symbolic Structures
+ Emily M.Bender
+ DilekHakkani-Tür
+ ChittaBaral
+ ChristopherManning
+ DanRoth
+ 2022.naacl.panel.mp4
+
+
+ Best Papers Awards Session
+ Marie-Catherinede Marneffe
+ 2022.naacl.best-papers.mp4
+
+
+ Closing Session
+ DanRoth
+ ChittaBaral
+ Emily M.Bender
+ DilekHakkani-Tür
+ ChristopherManning
+ 2022.naacl.closing-session.mp4
+
2022.findings-naacl
2022.autosimtrans-1
diff --git a/data/xml/2022.semeval.xml b/data/xml/2022.semeval.xml
index 60783b1363..319e67e9dd 100644
--- a/data/xml/2022.semeval.xml
+++ b/data/xml/2022.semeval.xml
@@ -1113,6 +1113,7 @@
10.18653/v1/2022.semeval-1.87
Hateful Memes
Hateful Memes Challenge
+
TeamOtter at SemEval-2022 Task 5: Detecting Misogynistic Content in Multimodal Memes
@@ -1984,6 +1985,7 @@
2022.semeval-1.155
chen-etal-2022-semeval
10.18653/v1/2022.semeval-1.155
+
EMBEDDIA at SemEval-2022 Task 8: Investigating Sentence, Image, and Knowledge Graph Representations for Multilingual News Article Similarity
diff --git a/data/xml/2022.tacl.xml b/data/xml/2022.tacl.xml
index a7ea87a051..12b0b5382f 100644
--- a/data/xml/2022.tacl.xml
+++ b/data/xml/2022.tacl.xml
@@ -321,6 +321,7 @@
307–324
2022.tacl-1.18
raifer-etal-2022-designing
+
Towards General Natural Language Understanding with Probabilistic Worldbuilding
@@ -455,6 +456,7 @@
484–502
2022.tacl-1.28
sarwar-etal-2022-neighborhood
+
Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval
@@ -574,6 +576,7 @@
639–658
2022.tacl-1.37
morio-etal-2022-end
+
Is My Model Using the Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning
@@ -1079,6 +1082,7 @@
1341–1356
2022.tacl-1.77
lachmy-etal-2022-draw
+
Investigating Reasons for Disagreement in Natural Language Inference
@@ -1089,6 +1093,7 @@
1357–1374
2022.tacl-1.78
jiang-marneffe-2022-investigating
+
The Emergence of Argument Structure in Artificial Languages
@@ -1125,6 +1130,7 @@
1423–1439
2022.tacl-1.81
sartran-etal-2022-transformer
+
Explainable Abuse Detection as Intent Classification and Slot Filling
@@ -1146,6 +1152,7 @@
1455–1472
2022.tacl-1.83
goldman-tsarfaty-2022-morphology
+
FaithDial: A Faithful Benchmark for Information-Seeking Dialogue
diff --git a/data/xml/2023.acl.xml b/data/xml/2023.acl.xml
index bd34dc77aa..03ad180cb3 100644
--- a/data/xml/2023.acl.xml
+++ b/data/xml/2023.acl.xml
@@ -17840,6 +17840,97 @@
+
+ 61st Annual Meeting of the Association for Computational Linguistics
+ Toronto, Canada
+ July 9–14, 2023
+
+
+ https://2023.aclweb.org
+ 2023.acl.handbook.pdf
+
+
+ Keynote 1: Two Paths to Intelligence
+ GeoffreyHinton
+ 2023.acl.keynote1.mp4
+
+
+ Keynote 2: Large Language Models as Cultural Technologies: Imitation and Innovation in Children and Models
+ AlisonGopnik
+ 2023.acl.keynote2.mp4
+
+
+ Lifetime Achievement Award: My Big, Fat 50 Year Journey
+ MarthaPalmer
+ 2023.acl.keynote3.mp4
+
+
+ Opening Session and Presidential Address
+ YangLiu
+ NaoakiOkazaki
+ AnnaRogers
+ JordanBoyd-Graber
+ IrynaGurevych
+ 2023.acl.opening-session.mp4
+
+
+ Business Meeting
+ IrynaGurevych
+ DavidYarowsky
+ Emily M.Bender
+ 2023.acl.business-meeting.mp4
+
+
+ ARR Info Session
+ Mausam
+ 2023.acl.arr-info-session.mp4
+
+
+ Memorial for Dragomir Radev
+ KathyMcKeown
+ SmarandaMuresan
+ RadaMihalcea
+ LoriLevin
+ CathyFinegan-Dollak
+ AlexanderFabbri
+ 2023.acl.memorial-dragomir-radev.mp4
+
+
+ Panel: Large Language Models
+ DanKlein
+ MargaretMitchell
+ RoySchwartz
+ DiyiYang
+ IrynaGurevych
+ 2023.acl.panel.mp4
+
+
+ Lifetime Achievement Award Session
+ IrynaGurevych
+ JoakimNivre
+ YusukeMiyao
+ ChristopherManning
+ RichardSocher
+ IreneLangkilde
+ HinrichSchütze
+ 2023.acl.lifetime-achievement-award.mp4
+
+
+ Best Papers Awards Session
+ AnnaRogers
+ 2023.acl.best-papers.mp4
+
+
+ Closing Session
+ YangLiu
+ JongPark
+ BoxingChen
+ MichaelStrube
+ NaoakiOkazaki
+ KevinDuh
+ ClaireGardent
+ 2023.acl.closing-session.mp4
+
2023.findings-acl
2023.americasnlp-1
diff --git a/data/xml/2023.cl.xml b/data/xml/2023.cl.xml
index 820954d873..0ff603cb3a 100644
--- a/data/xml/2023.cl.xml
+++ b/data/xml/2023.cl.xml
@@ -21,6 +21,7 @@
1–72
2023.cl-1.1
troiano-etal-2023-dimensional
+
Transformers and the Representation of Biomedical Background Knowledge
@@ -36,6 +37,7 @@
73–115
2023.cl-1.2
wysocki-etal-2023-transformers
+
It Takes Two Flints to Make a Fire: Multitask Learning of Neural Relation and Explanation Classifiers
@@ -46,6 +48,7 @@
117–156
2023.cl-1.3
tang-surdeanu-2023-takes
+
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future
@@ -57,6 +60,7 @@
157–198
2023.cl-1.4
klie-etal-2023-annotation
+
Curing the SICK and Other NLI Maladies
@@ -340,6 +344,7 @@
777–840
2023.cl-4.2
rashkin-etal-2023-measuring
+
Generation and Polynomial Parsing of Graph Languages with Non-Structural Reentrancies
@@ -351,6 +356,7 @@
841–882
2023.cl-4.3
bjorklund-etal-2023-generation
+
Capturing Fine-Grained Regional Differences in Language Use through Voting Precinct Embeddings
@@ -361,6 +367,7 @@
883–942
2023.cl-4.4
rosenfeld-hinrichs-2023-capturing
+
Languages Through the Looking Glass of BPE Compression
@@ -372,6 +379,7 @@
943–1001
2023.cl-4.5
gutierrez-vasques-etal-2023-languages
+
Language Embeddings Sometimes Contain Typological Generalizations
@@ -382,6 +390,7 @@
1003–1051
2023.cl-4.6
ostling-kurfali-2023-language
+
diff --git a/data/xml/2023.eacl.xml b/data/xml/2023.eacl.xml
index ef68a7341f..f555bcac6f 100644
--- a/data/xml/2023.eacl.xml
+++ b/data/xml/2023.eacl.xml
@@ -4487,6 +4487,8 @@
With NLP research now quickly being transferred into real-world applications, it is important to be aware of and think through the consequences of our scientific investigation. Such ethical considerations are important in both authoring and reviewing. This tutorial will equip participants with basic guidelines for thinking deeply about ethical issues and review common considerations that recur in NLP research. The methodology is interactive and participatory, including case studies and working in groups. Importantly, the participants will be co-building the tutorial outcomes and will be working to create further tutorial materials to share as public outcomes.
2023.eacl-tutorials.4
benotti-etal-2023-understanding
+
+
10.18653/v1/2023.eacl-tutorials.4
@@ -4525,6 +4527,56 @@
https://2023.eacl.org
2023.eacl.handbook.pdf
+
+ Keynote 1: Going beyond the benefits of scale by reasoning about data
+ EdwardGrefenstette
+ 2023.eacl.keynote1.mp4
+
+
+ Keynote 2: Chatbots for Good and Evil
+ KevinMunger
+ 2023.eacl.keynote2.mp4
+
+
+ Keynote 3: Language Use in Embodied AI
+ JoyceChai
+ 2023.eacl.keynote3.mp4
+
+
+ Opening Session
+ AlessandroMoschitti
+ IsabelleAugenstein
+ AndreasVlachos
+ KathleenMcKeown
+ RobertoBasili
+ MarkoTadić
+ 2023.eacl.opening-session.mp4
+
+
+ Business Meeting
+ PreslavNakov
+ Jochen L.Leidner
+ HaimingLiu
+ AndreasVlachos
+ IsabelleAugenstein
+ 2023.eacl.business-meeting.mp4
+
+
+ Panel: Low-resource Languages in NLP Products
+ ViktoriaKolomiets
+ MarianaRomanyshyn
+ OleksiiMolchanovskyi
+ OlesDobosevych
+ 2023.eacl.panel.mp4
+
+
+ Best Papers Awards Session and Closing Session
+ AlessandroMoschitti
+ IsabelleAugenstein
+ AndreasVlachos
+ 2023.eacl.closing-session.mp4
+
2023.findings-eacl
2023.bsnlp-1
diff --git a/data/xml/2023.emnlp.xml b/data/xml/2023.emnlp.xml
index f1a51d0ae6..54b991b813 100644
--- a/data/xml/2023.emnlp.xml
+++ b/data/xml/2023.emnlp.xml
@@ -7707,6 +7707,7 @@
2023.emnlp-main.550
cao-etal-2023-unnatural
10.18653/v1/2023.emnlp-main.550
+
Detecting and Mitigating Hallucinations in Multilingual Summarisation
@@ -15444,6 +15445,7 @@
2023.emnlp-demo.44
zhang-etal-2023-zhujiu
10.18653/v1/2023.emnlp-demo.44
+
PaperMage: A Unified Toolkit for Processing, Representing, and Manipulating Visually-Rich Scientific Documents
@@ -16716,6 +16718,59 @@
Singapore
December 6–10, 2023
+
+ https://2023.emnlp.org
+ 2023.emnlp.handbook.pdf
+
+
+ Keynote 1: Human-Centric Natural Language Processing
+ JongPark
+ 2023.emnlp.keynote1.mp4
+
+
+ Keynote 2: From Speech to Emotion to Mood: Mental Health Modeling in Real-World Environments.
+ Emily MowerProvost
+ 2023.emnlp.keynote2.mp4
+
+
+ Keynote 3: Academic NLP research in the Age of LLMs: Nothing but blue skies!
+ ChristopherManning
+ 2023.emnlp.keynote3.mp4
+
+
+ Opening Session
+ YujiMatsumoto
+ HoudaBouamor
+ JuanPino
+ KalikaBali
+ HaizhouLi
+ 2023.emnlp.opening-session.mp4
+
+
+ Business Meeting
+ MonaDiab
+ DavidYarowsky
+ JenniferRachford
+ IsabelleAugenstein
+ 2023.emnlp.business-meeting.mp4
+
+
+ Panel: Beyond Text: Inclusive Human Communication with Language Technology
+ Lourdesde Rioja
+ AbrahamGlasser
+ ChengkuoLee
+ María InésTorres
+ MonojitChoudhury
+ 2023.emnlp.panel.mp4
+
+
+ Best Papers Awards Session and Closing Session
+ KalikaBali
+ MichaelStrube
+ YujiMatsumoto
+ JordanBoyd-Graber
+ 2023.emnlp.closing-session.mp4
+
2023.findings-emnlp
2023.arabicnlp-1
diff --git a/data/xml/2023.gem.xml b/data/xml/2023.gem.xml
index 1a7ac29d58..3188c1bba0 100644
--- a/data/xml/2023.gem.xml
+++ b/data/xml/2023.gem.xml
@@ -116,6 +116,7 @@
In the dynamic field of eCommerce, the quality and comprehensiveness of product descriptions are pivotal for enhancing search visibility and customer engagement. Effective product descriptions can address the ‘cold start’ problem, align with market trends, and ultimately lead to increased click-through rates. Traditional methods for crafting these descriptions often involve significant human effort and may lack both consistency and scalability. This paper introduces a novel methodology for automating product description generation using the LLAMA 2.0 7B language model. We train the model on a dataset of authentic product descriptions from Walmart, one of the largest eCommerce platforms. The model is then fine-tuned for domain-specific language features and eCommerce nuances to enhance its utility in sales and user engagement. We employ multiple evaluation metrics—including NDCG, customer click-through rates, and human assessments—to validate the effectiveness of our approach. Our findings reveal that the system is not only scalable but also significantly reduces the human workload involved in creating product descriptions. This study underscores the considerable potential of large language models like LLAMA 2.0 7B in automating and optimizing various facets of eCommerce platforms, offering significant business impact, including improved search functionality and increased sales.
2023.gem-1.8
zhou-etal-2023-leveraging
+
QAMPARI: A Benchmark for Open-domain Questions with Many Answers
@@ -274,6 +275,7 @@
At the staggering pace with which the capabilities of large language models (LLMs) are increasing, creating future-proof evaluation sets to assess their understanding becomes more and more challenging. In this paper, we propose a novel paradigm for evaluating LLMs which leverages the idea that correct world understanding should be consistent across different (Fregean) senses of the same meaning. Accordingly, we measure understanding not in terms of correctness but by evaluating consistency across multiple senses that are generated by the model itself. We showcase our approach by instantiating a test where the different senses are different languages, hence using multilingual self-consistency as a litmus test for the model’s understanding and simultaneously addressing the important topic of multilingualism. Taking one of the latest versions of ChatGPT as our object of study, we evaluate multilingual consistency for two different tasks across three different languages. We show that its multilingual consistency is still lacking, and that its task and world understanding are thus not language-independent. As our approach does not require any static evaluation corpora in languages other than English, it can easily and cheaply be extended to different languages and tasks and could become an integral part of future benchmarking efforts.
2023.gem-1.22
ohmer-etal-2023-separating
+
Text Encoders Lack Knowledge: Leveraging Generative LLMs for Domain-Specific Semantic Textual Similarity
@@ -286,6 +288,7 @@
Amidst the sharp rise in the evaluation of large language models (LLMs) on various tasks, we find that semantic textual similarity (STS) has been under-explored. In this study, we show that STS can be cast as a text generation problem while maintaining strong performance on multiple STS benchmarks. Additionally, we show generative LLMs significantly outperform existing encoder-based STS models when characterizing the semantic similarity between two texts with complex semantic relationships dependent on world knowledge. We validate this claim by evaluating both generative LLMs and existing encoder-based STS models on three newly-collected STS challenge sets which require world knowledge in the domains of Health, Politics, and Sports. All newly-collected data is sourced from social media content posted after May 2023 to ensure the performance of closed-source models like ChatGPT cannot be credited to memorization. Our results show that, on average, generative LLMs outperform the best encoder-only baselines by an average of 22.3% on STS tasks requiring world knowledge. Our results suggest generative language models with STS-specific prompting strategies achieve state-of-the-art performance in complex, domain-specific STS tasks.
2023.gem-1.23
gatto-etal-2023-text
+
To Burst or Not to Burst: Generating and Quantifying Improbable Text
@@ -327,6 +330,7 @@
Data quality is a problem that perpetually resurfaces throughout the field of NLP, regardless of task, domain, or architecture, and remains especially severe for lower-resource languages. A typical and insidious issue, affecting both training data and model output, is data that is repetitive and dominated by linguistically uninteresting boilerplate, such as price catalogs or computer-generated log files. Though this problem permeates many web-scraped corpora, there has yet to be a benchmark to test against, or a systematic study to find simple metrics that generalize across languages and agree with human judgements of data quality. In the present work, we create and release BREAD, a human-labeled benchmark on repetitive boilerplate vs. plausible linguistic content, spanning 360 languages. We release several baseline CRED (Character REDundancy) scores along with it, and evaluate their effectiveness on BREAD. We hope that the community will use this resource to develop better filtering methods, and that our reference implementations of CRED scores can become standard corpus evaluation tools, driving the development of cleaner language modeling corpora, especially in low-resource languages.
2023.gem-1.27
caswell-etal-2023-separating
+
Elo Uncovered: Robustness and Best Practices in Language Model Evaluation
@@ -370,6 +374,7 @@
In the rapidly evolving landscape of Large Language Models (LLMs), introduction of well-defined and standardized evaluation methodologies remains a crucial challenge. This paper traces the historical trajectory of LLM evaluations, from the foundational questions posed by Alan Turing to the modern era of AI research. We categorize the evolution of LLMs into distinct periods, each characterized by its unique benchmarks and evaluation criteria. As LLMs increasingly mimic human-like behaviors, traditional evaluation proxies, such as the Turing test, have become less reliable. We emphasize the pressing need for a unified evaluation system, given the broader societal implications of these models. Through an analysis of common evaluation methodologies, we advocate for a qualitative shift in assessment approaches, underscoring the importance of standardization and objective criteria. This work serves as a call for the AI community to collaboratively address the challenges of LLM evaluation, ensuring their reliability, fairness, and societal benefit.
2023.gem-1.31
tikhonov-yamshchikov-2023-post
+
A Simple yet Efficient Ensemble Approach for AI-generated Text Detection
diff --git a/data/xml/2023.insights.xml b/data/xml/2023.insights.xml
index de647bb05f..801c09f2b4 100644
--- a/data/xml/2023.insights.xml
+++ b/data/xml/2023.insights.xml
@@ -221,6 +221,7 @@
We probe structural and discourse aspects of coreferential relationships in a fine-tuned Dutch BERT event coreference model. Previous research has suggested that no such knowledge is encoded in BERT-based models and the classification of coreferential relationships ultimately rests on outward lexical similarity. While we show that BERT can encode a (very) limited number of these discourse aspects (thus disproving assumptions in earlier research), we also note that knowledge of many structural features of coreferential relationships is absent from the encodings generated by the fine-tuned BERT model.
2023.insights-1.13
de-langhe-etal-2023-bert
+
Estimating Numbers without Regression
@@ -231,6 +232,7 @@
Despite recent successes in language models, their ability to represent numbers is insufficient. Humans conceptualize numbers based on their magnitudes, effectively projecting them on a number line; whereas subword tokenization fails to explicitly capture magnitude by splitting numbers into arbitrary chunks. To alleviate this shortcoming, alternative approaches have been proposed that modify numbers at various stages of the language modeling pipeline. These methods change either the (1) notation in which numbers are written (e.g., scientific vs. decimal), the (2) vocabulary used to represent numbers or the entire (3) architecture of the underlying language model, to directly regress to a desired number. Previous work suggests that architectural change helps achieve state-of-the-art on number estimation but we find an insightful ablation: changing the model’s vocabulary instead (e.g., introducing a new token for numbers in range 10-100) is a far better trade-off. In the context of masked number prediction, a carefully designed tokenization scheme is both the simplest to implement and sufficient, i.e., with similar performance to the state-of-the-art approach that requires making significant architectural changes. Finally, we report similar trends on the downstream task of numerical fact estimation (for Fermi Problems) and discuss reasons behind our findings.
2023.insights-1.14
thawani-etal-2023-estimating
+
diff --git a/data/xml/2023.latechclfl.xml b/data/xml/2023.latechclfl.xml
index 6798865ba6..a56bcd8cd5 100644
--- a/data/xml/2023.latechclfl.xml
+++ b/data/xml/2023.latechclfl.xml
@@ -106,6 +106,7 @@
The subject of this article is the application of NLP and text-mining methods to the analysis of two large bibliographies: a Polish one, based on the catalogs of the National Library in Warsaw, and a German one, created by the Deutsche Nationalbibliothek. The data in both collections are stored in MARC 21 format, allowing the selection of relevant fields that are used for further processing (basically author, title, and date). The Polish corpus (after filtering out non-relevant or incomplete items) comprises 1.4 million records, and the German corpus 7.5 million records. The time span of both bibliographies extends from 1801 to 2021. The aim of the study is to compare the gender distribution of book authors in Polish and German databases over more than two centuries. The proportions of male and female authors since 1801 were calculated automatically, and NLP methods such as document vector embedding based on deep BERT networks were used to extract topics from titles. The gender of the Polish authors was recognized based on the morphology of the first names, and that of the German authors based on a predefined list. The study found that the proportion of female authors has been steadily increasing both in Poland and in German countries (currently around 43%). However, the topics of women’s and men’s writings have remained different since 1801.
2023.latechclfl-1.7
pawlowski-walkowiak-2023-great
+
10.18653/v1/2023.latechclfl-1.7
@@ -157,6 +158,7 @@
Development funds are essential to finance climate change adaptation and are thus an important part of international climate policy. However, the absence of a common reporting practice makes it difficult to assess the amount and distribution of such funds. Research has questioned the credibility of reported figures, indicating that adaptation financing is in fact lower than published figures suggest. Projects claiming a greater relevance to climate change adaptation than they target are referred to as “overreported”. To estimate realistic rates of overreporting in large data sets over time, we propose an approach based on state-of-the-art text classification. To date, assessments of credibility have relied on small, manually evaluated samples. We use such a sample data set to train a classifier with an accuracy of 89.81%±0.83% (tenfold cross-validation) and extrapolate to larger data sets to identify overreporting. Additionally, we propose a method that incorporates evidence of smaller, higher-quality data to correct predicted rates using Bayes’ theorem. This enables a comparison of different annotation schemes to estimate the degree of overreporting in climate change adaptation. Our results support findings that indicate extensive overreporting of 32.03% with a credible interval of [19.81%; 48.34%].
2023.latechclfl-1.11
borst-etal-2023-constructing
+
10.18653/v1/2023.latechclfl-1.11
@@ -168,6 +170,7 @@
In this paper, we present approaches for the automated extraction and disambiguation of a part of the stylistic device Vossian Antonomasia (VA), namely the target entity that is described by the expression. We model the problem as a coreference resolution task and a question answering task and also combine both tasks. To tackle these tasks, we utilize state-of-the-art models in these areas. In addition, we visualize the connection between the source and target entities of VA in a web demo to get a deeper understanding of the interaction of entities used in VA expressions.
2023.latechclfl-1.12
schwab-etal-2023-madonna
+
10.18653/v1/2023.latechclfl-1.12
diff --git a/data/xml/2023.nlposs.xml b/data/xml/2023.nlposs.xml
index bc20d15adc..02ef7a3e2b 100644
--- a/data/xml/2023.nlposs.xml
+++ b/data/xml/2023.nlposs.xml
@@ -53,6 +53,7 @@
2023.nlposs-1.3
beauchemin-2023-deepparse
10.18653/v1/2023.nlposs-1.3
+
PyThaiNLP: Thai Natural Language Processing in Python
diff --git a/data/xml/2023.rail.xml b/data/xml/2023.rail.xml
index 3d10068dd5..03e3228f1e 100644
--- a/data/xml/2023.rail.xml
+++ b/data/xml/2023.rail.xml
@@ -66,6 +66,7 @@
This paper describes the SpeechReporting database, an online collection of corpora annotated for a range of discourse phenomena. The corpora contain folktales from 7 lesser-studied West African languages. Apart from its value for theoretical linguistics, especially for the study of reported speech, the database is an important resource for the preservation of intangible cultural heritage of minority languages and the development and testing of cross-linguistically applicable computational tools.
2023.rail-1.4
aplonova-etal-2023-speechreporting
+
10.18653/v1/2023.rail-1.4
diff --git a/data/xml/2023.tacl.xml b/data/xml/2023.tacl.xml
index 5154e6037f..7ee3677dee 100644
--- a/data/xml/2023.tacl.xml
+++ b/data/xml/2023.tacl.xml
@@ -22,6 +22,7 @@
1–17
2023.tacl-1.1
siriwardhana-etal-2023-improving
+
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement
@@ -44,6 +45,7 @@
34–48
2023.tacl-1.3
valvoda-etal-2023-role
+
Meta-Learning a Cross-lingual Manifold for Semantic Parsing
@@ -68,6 +70,7 @@
68–84
2023.tacl-1.5
chen-etal-2023-opal
+
Helpful Neighbors: Leveraging Neighbors in Geographic Feature Pronunciation
@@ -92,6 +95,7 @@
102–121
2023.tacl-1.7
meister-etal-2023-locally
+
Improving Low-Resource Cross-lingual Parsing with Expected Statistic Regularization
@@ -115,6 +119,7 @@
139–156
2023.tacl-1.9
majewska-etal-2023-cross
+
Modeling Emotion Dynamics in Song Lyrics with State Space Models
@@ -125,6 +130,7 @@
157–175
2023.tacl-1.10
song-beck-2023-modeling
+
FeelingBlue: A Corpus for Understanding the Emotional Connotation of Color in Context
@@ -203,6 +209,7 @@
267–283
2023.tacl-1.16
chen-komachi-2023-discontinuous
+
Efficient Long-Text Understanding with Short-Text Models
@@ -292,6 +299,7 @@
384–403
2023.tacl-1.23
amini-etal-2023-naturalistic
+
Tracking Brand-Associated Polarity-Bearing Topics in User Reviews
@@ -420,6 +428,7 @@
565–581
2023.tacl-1.33
hong-etal-2023-visual-writing
+
Unleashing the True Potential of Sequence-to-Sequence Models for Sequence Tagging and Structure Parsing
@@ -875,6 +884,7 @@
1147–1161
2023.tacl-1.65
unanue-etal-2023-t3l
+
Introduction to Mathematical Language Processing: Informal Proofs, Word Problems, and Supporting Tasks
@@ -885,6 +895,7 @@
1162–1184
2023.tacl-1.66
meadows-freitas-2023-introduction
+
Evaluating a Century of Progress on the Cognitive Science of Adjective Ordering
@@ -960,6 +971,7 @@
1265–1282
2023.tacl-1.72
huang-etal-2023-2
+
PASTA: A Dataset for Modeling PArticipant STAtes in Narratives
@@ -1063,6 +1075,7 @@
1396–1415
2023.tacl-1.79
hu-etal-2023-multi-3
+
Can Authorship Representation Learning Capture Stylistic Features?
@@ -1103,6 +1116,7 @@
1451–1470
2023.tacl-1.82
wilcox-etal-2023-testing
+
Shared Lexical Items as Triggers of Code Switching
@@ -1116,6 +1130,7 @@
1471–1484
2023.tacl-1.83
wintner-etal-2023-shared
+
Learning More from Mixed Emotions: A Label Refinement Method for Emotion Recognition in Conversations
@@ -1144,6 +1159,7 @@
1500–1517
2023.tacl-1.85
guerreiro-etal-2023-hallucinations
+
PaniniQA: Enhancing Patient Education Through Interactive Question Answering
@@ -1164,6 +1180,7 @@
1518–1536
2023.tacl-1.86
cai-etal-2023-paniniqa
+
Discover, Explain, Improve: An Automatic Slice Detection Benchmark for Natural Language Processing
@@ -1178,6 +1195,7 @@
1537–1552
2023.tacl-1.87
hua-etal-2023-discover
+
Pre-train, Prompt, and Recommendation: A Comprehensive Survey of Language Modeling Paradigm Adaptations in Recommender Systems
@@ -1189,6 +1207,7 @@
1553–1571
2023.tacl-1.88
liu-etal-2023-pre
+
An Efficient Self-Supervised Cross-View Training For Sentence Embedding
@@ -1203,6 +1222,7 @@
1572–1587
2023.tacl-1.89
limkonchotiwat-etal-2023-efficient
+
General then Personal: Decoupling and Pre-training for Personalized Headline Generation
@@ -1215,6 +1235,7 @@
1588–1607
2023.tacl-1.90
song-etal-2023-general
+
Removing Backdoors in Pre-trained Models by Regularized Continual Pre-training
@@ -1233,6 +1254,7 @@
1608–1623
2023.tacl-1.91
zhu-etal-2023-removing
+
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
@@ -1252,6 +1274,7 @@
1643–1668
2023.tacl-1.92
fernandes-etal-2023-bridging
+
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR
@@ -1271,6 +1294,7 @@
1669–1685
2023.tacl-1.93
olatunji-etal-2023-afrispeech
+
MissModal: Increasing Robustness to Missing Modality in Multimodal Sentiment Analysis
@@ -1281,6 +1305,7 @@
1686–1702
2023.tacl-1.94
lin-hu-2023-missmodal
+
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision
@@ -1298,6 +1323,7 @@
1703–1718
2023.tacl-1.95
kharitonov-etal-2023-speak
+
ReCOGS: How Incidental Details of a Logical Form Overshadow an Evaluation of Semantic Interpretation
@@ -1309,6 +1335,7 @@
1719–1733
2023.tacl-1.96
wu-etal-2023-recogs
+
Data-driven Parsing Evaluation for Child-Parent Interactions
@@ -1336,6 +1363,7 @@
1754–1771
2023.tacl-1.98
agrawal-etal-2023-qameleon
+
diff --git a/data/xml/2024.cl.xml b/data/xml/2024.cl.xml
index 522df380d4..016a34b47c 100644
--- a/data/xml/2024.cl.xml
+++ b/data/xml/2024.cl.xml
@@ -19,6 +19,7 @@
1–24
2024.cl-1.1
palmer-2024-big
+
Rethinking the Exploitation of Monolingual Data for Low-Resource Neural Machine Translation
@@ -34,6 +35,7 @@
25–47
2024.cl-1.2
pang-etal-2024-rethinking
+
How Is a “Kitchen Chair” like a “Farm Horse”? Exploring the Representation of Noun-Noun Compound Semantics in Transformer-based Language Models
@@ -45,6 +47,7 @@
49–81
2024.cl-1.3
ormerod-etal-2024-kitchen
+
Universal Generation for Optimality Theory Is PSPACE-Complete
@@ -54,6 +57,7 @@
83–117
2024.cl-1.4
hao-2024-universal
+
Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering
@@ -67,6 +71,7 @@
119–155
2024.cl-1.5
chaturvedi-etal-2024-analyzing
+
On the Role of Morphological Information for Contextual Lemmatization
diff --git a/data/xml/2024.eacl.xml b/data/xml/2024.eacl.xml
index 5c0b3e2e01..a12319790a 100644
--- a/data/xml/2024.eacl.xml
+++ b/data/xml/2024.eacl.xml
@@ -3627,11 +3627,58 @@
The 18th Conference of the European Chapter of the Association for Computational Linguistics
St. Julian’s, Malta
- March, 2024
+ March 17-22, 2024
https://2024.eacl.org
+ 2024.eacl.handbook.pdf
+
+
+ Karen Spärck Jones Award Lecture: Human vs. Generative AI in Content Creation Competition: Symbiosis or Conflict?
+ HongningWang
+ 2024.eacl.invited.mp4
+
+
+ Keynote 1: Quality Data for LLMs: Challenges and Opportunities for NLP
+ HinrichSchütze
+ 2024.eacl.keynote1.mp4
+
+
+ Keynote 2: Prompting is not all you need! Or why Structure and Representations still matter in NLP
+ MirellaLapata
+ 2024.eacl.keynote2.mp4
+
+
+ Opening Session
+ MichaelStrube
+ MatthewPurver
+ ClaudiaBorg
+ HaimingLiu
+ 2024.eacl.opening-session.mp4
+
+
+ Business Meeting
+ PreslavNakov
+ RobertoBasili
+ HaimingLiu
+ MatthewPurver
+ 2024.eacl.business-meeting.mp4
+
+
+ Best Papers Awards Session
+ BarbaraPlank
+ 2024.eacl.best-papers.mp4
+
+
+ Closing Session
+ YvetteGraham
+ MichaelStrube
+ PreslavNakov
+ MarkFinlayson
+ BarbaraPlank
+ 2024.eacl.closing-session.mp4
+
2024.findings-eacl
2024.caldpseudo-1
diff --git a/data/xml/2024.sigtyp.xml b/data/xml/2024.sigtyp.xml
index f9de00f525..9f916d86d9 100644
--- a/data/xml/2024.sigtyp.xml
+++ b/data/xml/2024.sigtyp.xml
@@ -36,6 +36,7 @@
Human processing of nonlocal syntactic dependencies requires the engagement of limited working memory for encoding, maintenance, and retrieval. This process creates an evolutionary pressure for language to be structured in a way that keeps the subparts of a dependency closer to each other, an efficiency principle termed dependency locality. The current study proposes that such a dependency locality pressure can be modulated by the surprisal of the antecedent, defined as the first part of a dependency, due to strategic allocation of working memory. In particular, antecedents with novel and unpredictable information are prioritized for memory encoding, receiving more robust representation against memory interference and decay, and thus are more capable of handling longer dependency length. We examine this claim by analyzing dependency corpora of 11 languages, with word surprisal generated from the GPT-3 language model. In support of our hypothesis, we find evidence for a positive correlation between dependency length and the antecedent surprisal in most of the languages in our analyses. A closer look into the dependencies with core arguments shows that this correlation consistently holds for subject relations but not for object relations.
2024.sigtyp-1.1
xu-futrell-2024-syntactic
+
GUIDE: Creating Semantic Domain Dictionaries for Low-Resource Languages
@@ -47,6 +48,7 @@
Over 7,000 of the world’s 7,168 living languages are still low-resourced. This paper aims to narrow the language documentation gap by creating multiparallel dictionaries, clustered by SIL’s semantic domains. This task is new for machine learning and has previously been done manually by native speakers. We propose GUIDE, a language-agnostic tool that uses a GNN to create and populate semantic domain dictionaries, using seed dictionaries and Bible translations as a parallel text corpus. Our work sets a new benchmark, achieving an exemplary average precision of 60% in eight zero-shot evaluation languages and predicting an average of 2,400 dictionary entries. We share the code, model, multilingual evaluation data, and new dictionaries with the research community: https://github.com/janetzki/GUIDE
2024.sigtyp-1.2
janetzki-etal-2024-guide
+
A New Dataset for Tonal and Segmental Dialectometry from the Yue- and Pinghua-Speaking Area
@@ -57,6 +59,7 @@
Traditional dialectology or dialect geography is the study of geographical variation of language. Originated in Europe and pioneered in Germany and France, this field has predominantly been focusing on sounds, more specifically, on segments. Similarly, quantitative approaches to language variation concerned with the phonetic level are in most cases focusing on segments as well. However, more than half of the world’s languages include lexical tones (Yip, 2002). Despite this, tones are still underexplored in quantitative language comparison, partly due to the low accessibility of the suitable data. This paper aims to introduce a newly digitised dataset which comes from the Yue- and Pinghua-speaking areas in Southern China, with over 100 dialects. This dataset consists of two parts: tones and segments. In this paper, we illustrate how we can computationally model tones in order to explore linguistic variation. We have applied a tone distance metric on our data, and we have found that 1) dialects also form a continuum on the tonal level and 2) other than tonemic (inventory) and tonetic differences, dialects can also differ in the lexical distribution of tones. The availability of this dataset will hopefully enable further exploration of the role of tones in quantitative typology and NLP research.
2024.sigtyp-1.3
sung-etal-2024-new
+
A Computational Model for the Assessment of Mutual Intelligibility Among Closely Related Languages
@@ -66,6 +69,7 @@
Closely related languages show linguistic similarities that allow speakers of one language to understand speakers of another language without having actively learned it. Mutual intelligibility varies in degree and is typically tested in psycholinguistic experiments. To study mutual intelligibility computationally, we propose a computer-assisted method using the Linear Discriminative Learner, a computational model developed to approximate the cognitive processes by which humans learn languages, which we expand with multilingual semantic vectors and multilingual sound classes. We test the model on cognate data from German, Dutch, and English, three closely related Germanic languages. We find that our model’s comprehension accuracy depends on 1) the automatic trimming of inflections and 2) the language pair for which comprehension is tested. Our multilingual modelling approach does not only offer new methodological findings for automatic testing of mutual intelligibility across languages but also extends the use of Linear Discriminative Learning to multilingual settings.
2024.sigtyp-1.4
nieder-list-2024-computational
+
Predicting Mandarin and Cantonese Adult Speakers’ Eye-Movement Patterns in Natural Reading
@@ -77,6 +81,7 @@
Please find the attached PDF file for the extended abstract of our study.
2024.sigtyp-1.5
junlin-etal-2024-predicting
+
The Typology of Ellipsis: A Corpus for Linguistic Analysis and Machine Learning Applications
@@ -107,6 +112,7 @@
This paper lays the groundwork for initiating research into Source Language Identification; the task of identifying the original language of a machine-translated text. We contribute a dataset of translations from a typologically diverse spectrum of languages into English and use it to set initial baselines for this novel task.
2024.sigtyp-1.8
reijnaers-pouw-2024-gtnc
+
Sociolinguistically Informed Interpretability: A Case Study on Hinglish Emotion Classification
@@ -130,6 +136,7 @@
In order to draw generalizable conclusions about the performance of multilingual models across languages, it is important to evaluate on a set of languages that captures linguistic diversity. Linguistic typology is increasingly used to justify language selection, inspired by language sampling in linguistics. However, justifications for ‘typological diversity’ exhibit great variation, as there seems to be no set definition, methodology or consistent link to linguistic typology. In this work, we provide a systematic insight into how previous work in the ACL Anthology uses the term ‘typological diversity’. Our two main findings are: 1) what is meant by typologically diverse language selection is not consistent and 2) the actual typological diversity of the language sets in these papers varies greatly. We argue that, when making claims about ‘typological diversity’, an operationalization of this should be included. A systematic approach that quantifies this claim, also with respect to the number of languages used, would be even better.
2024.sigtyp-1.10
poelman-etal-2024-call
+
Are Sounds Sound for Phylogenetic Reconstruction?
@@ -167,6 +174,7 @@
While massively multilingual speech models like wav2vec 2.0 XLSR-128 can be directly fine-tuned for automatic speech recognition (ASR), downstream performance can still be relatively poor on languages that are under-represented in the pre-training data. Continued pre-training on 70–200 hours of untranscribed speech in these languages can help — but what about languages without that much recorded data? For such cases, we show that supplementing the target language with data from a similar, higher-resource ‘donor’ language can help. For example, continued pretraining on only 10 hours of low-resource Punjabi supplemented with 60 hours of donor Hindi is almost as good as continued pretraining on 70 hours of Punjabi. By contrast, sourcing supplemental data from less similar donors like Bengali does not improve ASR performance. To inform donor language selection, we propose a novel similarity metric based on the sequence distribution of induced acoustic units: the Acoustic Token Distribution Similarity (ATDS). Across a set of typologically different target languages (Punjabi, Galician, Iban, Setswana), we show that the ATDS between the target language and its candidate donors precisely predicts target language ASR performance.
2024.sigtyp-1.13
san-etal-2024-predicting
+
ModeLing: A Novel Dataset for Testing Linguistic Reasoning in Language Models
@@ -182,6 +190,7 @@
Large language models (LLMs) perform well on (at least) some evaluations of both few-shot multilingual adaptation and reasoning. However, evaluating the intersection of these two skills—multilingual few-shot reasoning—is difficult: even relatively low-resource languages can be found in large training corpora, raising the concern that when we intend to evaluate a model’s ability to generalize to a new language, that language may have in fact been present during the model’s training. If such language contamination has occurred, apparent cases of few-shot reasoning could actually be due to memorization. Towards understanding the capability of models to perform multilingual few-shot reasoning, we propose modeLing, a benchmark of Rosetta stone puzzles. This type of puzzle, originating from competitions called Linguistics Olympiads, contains a small number of sentences in a target language not previously known to the solver. Each sentence is translated to the solver’s language such that the provided sentence pairs uniquely specify a single most reasonable underlying set of rules; solving requires applying these rules to translate new expressions (Figure 1). modeLing languages are chosen to be extremely low-resource such that the risk of training data contamination is low, and unlike prior datasets, it consists entirely of problems written specifically for this work, as a further measure against data leakage. Empirically, we find evidence that popular LLMs do not have data leakage on our benchmark.
2024.sigtyp-1.14
chi-etal-2024-modeling
+
TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages
@@ -191,6 +200,7 @@
We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. We developed a simple, uniform, and computationally lightweight approach based on the adapters framework using parameter-efficient fine-tuning. We applied the same adapter-based approach uniformly to all tasks and 16 languages by fine-tuning stacked language- and task-specific adapters. Our submission obtained an overall second place out of three submissions, with the first place in word-level gap-filling. Our results show the feasibility of adapting language models pre-trained on modern languages to historical and ancient languages via adapter training.
2024.sigtyp-1.15
dorkin-sirts-2024-tartunlp
+
Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers
@@ -200,6 +210,7 @@
Historical languages present unique challenges to the NLP community, with one prominent hurdle being the limited resources available in their closed corpora. This work describes our submission to the constrained subtask of the SIGTYP 2024 shared task, focusing on PoS tagging, morphological tagging, and lemmatization for 13 historical languages. For PoS and morphological tagging we adapt a hierarchical tokenization method from Sun et al. (2023) and combine it with the advantages of the DeBERTa-V3 architecture, enabling our models to efficiently learn from every character in the training data. We also demonstrate the effectiveness of character-level T5 models on the lemmatization task. Pre-trained from scratch with limited data, our models achieved first place in the constrained subtask, nearly reaching the performance levels of the unconstrained task’s winner. Our code is available at https://github.com/bowphs/SIGTYP-2024-hierarchical-transformers
2024.sigtyp-1.16
riemenschneider-krahn-2024-heidelberg
+
UDParse @ SIGTYP 2024 Shared Task : Modern Language Models for Historical Languages
@@ -208,6 +219,7 @@
SIGTYP’s Shared Task on Word Embedding Evaluation for Ancient and Historical Languages was proposed in two variants, constrained or unconstrained. Whereas the constrained variant disallowed any other data to train embeddings or models than the data provided, the unconstrained variant did not have these limits. We participated in the five tasks of the unconstrained variant and came out first. The tasks were the prediction of part-of-speech, lemmas and morphological features and filling masked words and masked characters on 16 historical languages. We decided to use a dependency parser and train the data using an underlying pretrained transformer model to predict part-of-speech tags, lemmas, and morphological features. For predicting masked words, we used multilingual distilBERT (with rather bad results). In order to predict masked characters, our language model is extremely small: it is a model of 5-gram frequencies, obtained by reading the available training data.
2024.sigtyp-1.17
heinecke-2024-udparse
+
Allen Institute for AI @ SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages
diff --git a/data/xml/2024.tacl.xml b/data/xml/2024.tacl.xml
index c36ce91216..b28b524ab7 100644
--- a/data/xml/2024.tacl.xml
+++ b/data/xml/2024.tacl.xml
@@ -22,6 +22,7 @@
1–18
2024.tacl-1.1
glockner-etal-2024-ambifc
+
Language Varieties of Italy: Technology Challenges and Opportunities
@@ -31,6 +32,7 @@
19–38
2024.tacl-1.2
ramponi-2024-language
+
Benchmarking Large Language Models for News Summarization
@@ -45,6 +47,7 @@
39–57
2024.tacl-1.3
zhang-etal-2024-benchmarking
+
mGPT: Few-Shot Learners Go Multilingual
@@ -59,6 +62,7 @@
58–79
2024.tacl-1.4
shliazhko-etal-2024-mgpt
+
Cultural Adaptation of Recipes
@@ -322,6 +326,7 @@
432–448
2024.tacl-1.24
agrawal-carpuat-2024-text
+
Simultaneous Selection and Adaptation of Source Data via Four-Level Optimization
@@ -408,6 +413,7 @@
562–575
2024.tacl-1.31
staniek-etal-2024-text
+
Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions