Fix/Update AUSLAN dataset #103

Closed
12 changes: 6 additions & 6 deletions src/datasets/AUSLAN.json
@@ -1,15 +1,15 @@
{
"pub": {
"name": "AUSLAN",
"year": 2010,
"publication": "dataset:johnston2010archive",
"url": "https://elar.soas.ac.uk/Collection/MPI55247"
"year": 2008,
"publication": "dataset:johnston2008archive",
"url": "http://hdl.handle.net/2196/00-0000-0000-0000-D7CF-8"
},
"features": [],
"features": ["video", "gloss"],
"language": "Australian",
"#items": null,
"#samples": "1,100 Videos",
"#signers": 100,
"license": null,
"licenseUrl": null
"license": "Attribution",
"licenseUrl": "http://hdl.handle.net/2196/d8a991a5-d8cc-4f85-a5ff-c37279ebb625"
}
9 changes: 9 additions & 0 deletions src/index.md
@@ -730,6 +730,11 @@ and so they have broken the dependency upon costly annotated gloss information i

@shi-etal-2022-open introduce OpenASL, a large-scale American Sign Language (ASL) - English dataset collected from online video sites (e.g., YouTube), and then propose a set of techniques including sign search as a pretext task for pre-training and fusion of mouthing and handshape features to improve translation quality in the absence of glosses and in the presence of visually challenging data.

In the First WMT Shared Task [@muller-etal-2022-findings], they found that about half of the participants chose to represent sign language data as video frames, using a visual feature extractor on the encoder side.
All submitted systems were sequence-to-sequence models based on Transformers [@vaswani2017attention].
<!-- TODO: which? -->


<!-- Really should put MMTLB here, a number of papers cite it including chen2022, which actually builds on it directly, cites it as a source for "mBART is good for SLT", etc. -->

@chen2022TwoStreamNetworkSign present a two-stream network for sign language recognition (SLR) and translation (SLT), utilizing a dual visual encoder architecture to encode RGB video frames and pose keypoints in separate streams.
@@ -792,6 +797,10 @@ and showed similar performance, with the transformer underperforming on the vali
They experimented with various normalization schemes, mainly subtracting the mean and dividing by the standard deviation of every individual keypoint
either concerning the entire frame or the relevant "object" (Body, Face, and Hand).
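
A minimal sketch of one such normalization scheme, for illustration only. It assumes the mean and standard deviation are computed per keypoint and per axis across the frames of a single sequence; the cited work may compute them differently (for example, relative to the frame or to a Body, Face, or Hand group). The name `normalize_keypoints` and the `eps` guard are hypothetical and not taken from the source.

```python
from statistics import mean, pstdev


def normalize_keypoints(frames, eps=1e-8):
    """Standardize every keypoint coordinate across time.

    frames: list of frames, each a list of (x, y) keypoint tuples.
    Returns the same structure where, for each keypoint and axis,
    the mean over frames is subtracted and the result is divided by
    the (population) standard deviation over frames.
    """
    num_keypoints = len(frames[0])
    normalized = [[None] * num_keypoints for _ in frames]
    for k in range(num_keypoints):
        xs = [frame[k][0] for frame in frames]
        ys = [frame[k][1] for frame in frames]
        mean_x, mean_y = mean(xs), mean(ys)
        # eps (an assumption) avoids division by zero for static keypoints
        std_x, std_y = pstdev(xs) + eps, pstdev(ys) + eps
        for t, frame in enumerate(frames):
            normalized[t][k] = (
                (frame[k][0] - mean_x) / std_x,
                (frame[k][1] - mean_y) / std_y,
            )
    return normalized
```

Normalizing relative to an "object" instead would, under the same assumptions, amount to computing the statistics over only the keypoints that belong to that group.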

In the First WMT Shared Task [@muller-etal-2022-findings], the baseline system [@mueller2022sign-sockeye-baselines] used pose inputs.
In addition, they found that about half of the participants [@tarres-etal-2022-tackling;@hufe-avramidis-2022-experimental] chose to represent signed language data as poses.
All submitted systems were sequence-to-sequence models based on Transformers [@vaswani2017attention].

#### Text-to-Pose
Text-to-Pose, also known as sign language production, is the task of producing a sequence of poses that adequately represent
a spoken language text in sign language, as an intermediate representation to overcome challenges in animation.
78 changes: 77 additions & 1 deletion src/references.bib
@@ -496,7 +496,7 @@ @article{dataset:schembri2013building
year = {2013}
}

@inproceedings{dataset:johnston2010archive,
@inproceedings{dataset:johnston2008archive,
address = {The University of the Philippines Visayas Cebu College, Cebu City, Philippines},
author = {Johnston, Trevor},
booktitle = {Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation},
@@ -2352,6 +2352,82 @@ @inproceedings{muller-etal-2022-findings
year = {2022}
}

@inproceedings{hufe-avramidis-2022-experimental,
title = "Experimental Machine Translation of the {S}wiss {G}erman Sign Language via 3{D} Augmentation of Body Keypoints",
author = "Hufe, Lorenz and
Avramidis, Eleftherios",
booktitle = "Proceedings of the Seventh Conference on Machine Translation (WMT)",
month = dec,
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.wmt-1.95",
pages = "983--988"
}

@inproceedings{tarres-etal-2022-tackling,
title = "Tackling Low-Resourced Sign Language Translation: {UPC} at {WMT}-{SLT} 22",
author = "Tarres, Laia and
G{\'a}llego, Gerard I. and
Giro-i-Nieto, Xavier and
Torres, Jordi",
booktitle = "Proceedings of the Seventh Conference on Machine Translation (WMT)",
month = dec,
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.wmt-1.97",
pages = "994--1000"
}

@inproceedings{hamidullah-etal-2022-spatio,
title = "Spatio-temporal Sign Language Representation and Translation",
author = "Hamidullah, Yasser and
van Genabith, Josef and
Espa{\~n}a-Bonet, Cristina",
booktitle = "Proceedings of the Seventh Conference on Machine Translation (WMT)",
month = dec,
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.wmt-1.94",
pages = "977--982"
}

@inproceedings{dey-etal-2022-clean,
title = "Clean Text and Full-Body Transformer: {M}icrosoft{'}s Submission to the {WMT}22 Shared Task on Sign Language Translation",
author = "Dey, Subhadeep and
Pal, Abhilash and
Chaabani, Cyrine and
Koller, Oscar",
booktitle = "Proceedings of the Seventh Conference on Machine Translation (WMT)",
month = dec,
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.wmt-1.93",
pages = "969--976"
}

@inproceedings{shi-etal-2022-ttics,
title = "{TTIC}{'}s {WMT}-{SLT} 22 Sign Language Translation System",
author = "Shi, Bowen and
Brentari, Diane and
Shakhnarovich, Gregory and
Livescu, Karen",
booktitle = "Proceedings of the Seventh Conference on Machine Translation (WMT)",
month = dec,
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.wmt-1.96",
pages = "989--993"
}

@misc{mueller2022sign-sockeye-baselines,
title={Sockeye baseline models for sign language translation},
author={M\"{u}ller, Mathias and Rios, Annette and Moryossef, Amit},
howpublished={\url{https://github.com/bricksdont/sign-sockeye-baselines}},
year={2022}
}



@inproceedings{shi-etal-2022-open,
address = {Abu Dhabi, United Arab Emirates},
author = {Shi, Bowen and