A list of useful resources on sign language recognition, translation, and generation across different sign languages. Created during the hearai.pl project.
Feel free to open an issue with a short description of a new publication, or create a pull request that adds the new resource to the table or fills in a missing description.
- Hear AI project: hearai/hearai
Dataset | Language | Classes/task description | Size | Data type | Annotation type | Language level | Link | Licence | Downloaded |
---|---|---|---|---|---|---|---|---|---|
DGS | German SL | Signers were asked to do one of 20 activities, such as describing something or storytelling | 50 hours | video | iLex, srt (subtitles), openpose (json), elan, cmdi | continuous | DGS | Licence | partially |
PJM | Polish SL | Signers were asked to retell the content of pictures and clips, talk about themselves, and talk about topics that interest them. | 400 hours | video | text | continuous | PJM | no info | partially |
Signor Corpus | Slovenian SL | Most of the recording sessions were performed on the premises of the local deaf clubs; only in some cases did the mobile recording team visit the informant at home. | no info | video | tokenization, iLex, HamNoSys, gloss | continuous | Signor | not available | not available yet due to data protection issues |
Dicta Sign Lexicon | British SL, Greek SL, German SL, French SL | 1000 words and phrases and many videos where signers tell some stories | no info | video | gloss/translation (?) | both | Dicta | no info | no |
Galex | German SL | gardening and landscape vocabulary | 654 technical phrases | video | gloss | isolated | GaLex | no info | no |
LEDASILA | Austrian SL | words, different categories | no info | video | meaning (gloss) and description of the move | isolated | LedaSila | Creative commons | in contact with owners |
RWTH-PHOENIX | German SL | Sign language interpretation of weather forecasts. | 53 GB | video (all recorded videos are at 25 frames per second and the size of the frames is 210 by 260 pixels) | gloss | continuous | Phoenix | seems to be open to use (not specified directly on the page) | yes |
GSLL | Greek SL | 347 different signs/classes | 3,464 videos / 42 GB | video | gloss/translation | isolated | GSLL | publicly available, no info about restrictions | yes |
SIGNUM | German SL | words and phrases (used on a daily basis) | no info | videos | gloss/translation | both | SIGNUM | no info about restrictions | no |
MS-ASL | American SL | 1000 signs | 25000 videos | video | bbox, gloss | isolated | MS ASL | publicly available | yes |
AUTSL | Turkish SL | 226 signs that are performed by 43 different signers | 38336 videos | video recorded using Microsoft Kinect v2 in RGB, depth and skeleton formats | spatial coordinates of the 25 junction points on the signer body, gloss | isolated | AUTSL | public but not uploaded yet (?) | no |
WLASL | American SL | 3126 glosses | 34404 videos | video | gloss, bbox, temporal boundary (start and end frame), dialect of SL | isolated | WLASL | Computational Use of Data Agreement (C-UDA) | yes, partially (without the YouTube videos) |
SPREAD THE SIGN | Polish SL, British SL, German SL, Russian SL, French SL, American SL, Spanish SL | 25030 words in 42 languages | 574273 videos | video | gloss | isolated | Spread the sign | no info | yes |
CSL-Daily | Chinese SL | travel, shopping, medical care and daily-life words | no info | video | spoken language translations and gloss-level annotations | continuous | CSL | you need to sign the agreement with the USTC link, research only, noncommercial use | no |
NMFs-CSL | Chinese SL | 1,067 Chinese sign words (610 confusing words, 457 normal words) | no info | RGB videos | gloss | isolated | NMFs CSL | you need to sign the agreement with the USTC link, research only, noncommercial use | no |
ASLLVD | American SL | many words and phrases | no info | video | gloss labels, sign start and end time codes, start and end handshape labels for both hands, morphological and articulatory classifications of sign type | isolated | ASLLVD | Can be used for research and education purposes. Commercial use is not allowed. | no |
BUHMAP-DB | Turkish SL | non-manual gestures, 8 different classes | 48 annotated videos | video | gloss, ground truth of selected points | isolated | BUHMAP | available and free for academic research purposes | no |
PL-Kinect | Polish SL | 84 words in PSL (PJM). Each gesture is performed 20 times. | 84 videos (?) | point cloud videos | translation/gloss | isolated | Vision PRZ | publicly available, no info about restrictions | no |
LATLAB | American SL | 98 different stories | each video is 0.5-4 minutes long | video | glosses for each sign, an English translation of each passage, and details about the establishment and use of pronominal spatial reference points in space | continuous | LATLAB | no info | no |
Year | Paper | Dataset | Language | Task | Algorithms | Results | Code |
---|---|---|---|---|---|---|---|
2021 | Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks | PHOENIX14T | German SL | Sign Language Production | Progressive Transformers and Mixture Density Networks | BLEU-4 ~ 13.64 | ❌ |
2021 | NetFACS: Using network science to understand facial communication systems | FACS datasets | ❌ | Facial Signals Recognition | NetFACS | ❌ | Github - code in R |
2021 | ANONYSIGN: Novel Human Appearance Synthesis for Sign Language Video Anonymisation | SMILE | German SL | Sign Language Production for Sign Language Video Anonymisation | AnonySign architecture | LPIPS ~ 0.243, FID ~ 49.48 | ❌ |
2021 | Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives | Pre-processed Phoenix14T | German SL | Sign Language Production | Mixture of Motion Primitives architecture | BLEU-4 ~ 12.67 | ❌ |
2021 | On-device Real-time Hand Gesture Recognition | ❌ | American SL | Hand Gesture Recognition | Hand Tracking + NN | Recall=88% | ❌ |
2021 | Development of a software module for recognizing the fingerspelling of the Russian Sign Language based on LSTM | ❌ | Russian SL | Sign Alphabet Recognition | LSTM Neural Network | Precision=91%, Recall=91% | ❌ |
2021 | Artificial Intelligence Technologies for Sign Language | - | - | Sign Language Recognition & Translation | - | - | - |
2021 | A Deep Convolutional Neural Network Approach to Sign Alphabet Recognition | Sign Language MNIST | American Sign Language | Sign Alphabet Recognition | CNN | Accuracy=~94% | Kaggle |
2021 | Efficient sign language recognition system and dataset creation method based on deep learning and image processing | ❌ | Brazilian Sign Language | Sign Language Recognition | XCeption | Accuracy=~80% | ❌ |
2021 | Multi-Modal Zero-Shot Sign Language Recognition | RKS-PERSIAN, ASLVID, isoGD | Persian Sign Language, American Sign Language | Sign Language Recognition | C3D, LSTM, BERT | Accuracy=~68% | ❌ |
2021 | Application of Transfer Learning to Sign Language Recognition using an Inflated 3D Deep Convolutional Neural Network | SIGNUM, MS-ASL | German Sign Language, American Sign Language | Sign Language Recognition | Inception-v3 | Accuracy=49% | Github |
2021 | Skeleton Aware Multi-modal Sign Language Recognition | AUTSL | Turkish Sign Language | Sign Language Recognition | SAM-SLR | Top-1Accuracy=~95%, Top-2Accuracy=~99.7% | Github |
2021 | Word-level Sign Language Recognition with Multi-stream Neural Networks Focusing on Local Regions | WLASL, MS-ASL | American Sign Language | Sign Language Recognition | YOLO3, I3D, ST-GCN | Top-10Accuracy=92.94% | ❌ |
2021 | Automatic Segmentation of Sign Language into Subtitle-Units | MEDIAPI-SKEL | French Sign Language | Sign Language Segmentation | ST-GCN, BiLSTM | Precision=~56%, Recall=~75% | ❌ |
2021 | SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition | MS-ASL, WLASL, NMFs-CSL, SLR500 | American Sign Language, Chinese Sign Language | Sign Language Recognition | SignBERT, Transformers | WLASL2000: top-1Accuracy=~54%, top-5Accuracy=~87% | ❌ |
2021 | PiSLTRc: Position-informed Sign Language Transformer with Content-aware Convolution | PHOENIX-2014, PHOENIX-2014-T, CSL | German Sign Language, Chinese Sign Language | Sign Language Recognition, Sign Language Translation | CNN, Sign Language Transformers, Self Attention Mechanism | PHOENIX2014T: WER=~23%, BLEU-4=~23% | Github |
2020 | Progressive Transformers for End-to-End Sign Language Production | Pre-processed Phoenix14T | German SL | Sign Language Production | Progressive Transformer | BLEU-4 ~ 9.94 | Github |
2020 | HamNoSyS2SiGML: Translating HamNoSys Into SiGML | ❌ | ❌ | Translating HamNoSys Into SiGML | Convert HamNoSys symbols to their Unicode codes | ❌ | Github |
2020 | Video-to-HamNoSys Automated Annotation System | DGS Corpus | Multiple | Convert Pose to HamNoSys | Tree-like-structure | Accuracy=~22% | ❌ |
2020 | Combining Feature Selection with Neural Networks for Polish Sign Alphabet Recognition | ❌ | Polish Sign Language | Sign Alphabet Recognition | VGG16 | ❌ | ❌ |
2020 | Independent sign language recognition with 3D body, hands, and face reconstruction | GSLL | Greek Sign Language | Sign Language Recognition | I3D, SMPL-X | ❌ | ❌ |
2020 | Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation | RWTH-Phoenix | German Sign Language | Sign Language Recognition, Sign Language Translation | SLRT, SLTT (Transformers) | WER=24%, BLEU-4=22% | Github |
2020 | Phonologically-Meaningful Subunits for Deep Learning-Based Sign Language Recognition | RWTH-Phoenix | German Sign Language | Sign Language Recognition | Trajectory Space Factorization, RNN | WER=~27%, Accuracy=~73% | ❌ |
2020 | Real-Time Sign Language Detection using Human Pose Estimation | DGS Corpus | German Sign Language | Sign Language Detection | LSTM | Accuracy=~92% | Github |
2020 | Pose-based Sign Language Recognition using GCN and BERT | WLASL | American Sign Language | Sign Language Recognition | GCN, BERT | Top-1Accuracy=~60%, Top-5Accuracy=~84% | ❌ |
2020 | Spatial-Temporal Graph Convolutional Networks for Sign Language Recognition | ASLLVD | American Sign Language | Sign Language Recognition | ST-GCN | Accuracy=~61% | Github |
2019 | Improving American Sign Language Recognition with Synthetic Data | SYN1...10 | American SL | Sign Language Recognition with Synthetic Data | DeepHand model, K-means Clustering | 58.7% Acc < 71.1% | Github |
2019 | Exploiting Spatial-temporal Relationships for 3D Pose Estimation via Graph Convolutional Networks | Human3.6M, STB | ❌ | 3D Pose Estimation | GCN | distanceError=~39mm | Github |
2018 | Approach to the Sign Language Gesture Recognition Framework Based on HamNoSys Analysis | sEMG, ACC, GYRO | Russian Sign Language | Sign Language Gesture Recognition | ❌ | ❌ | ❌ |
2018 | OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields | MPII, COCO | ❌ | Pose Estimation | CNN, Affinity Fields | AP=~70% | Github |
2018 | Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition | Kinetics, NTU-RGBD | ❌ | Action Recognition | ST-GCN | Top-5Accuracy=~53% | Github |
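
The Results column mixes several metrics. As a rough aid for comparing rows, here is a minimal sketch of two of them: WER (word error rate, used for continuous recognition) and Top-k accuracy (used for isolated sign classification). The function names and the example glosses are illustrative assumptions, not taken from any of the listed papers, and individual papers may compute these metrics with slightly different conventions.

```python
from typing import Sequence


def word_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Token-level edit distance divided by the number of reference tokens."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and hypothesis[:j].
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,          # deletion
                curr[j - 1] + 1,      # insertion
                prev[j - 1] + cost,   # substitution or match
            ))
        prev = curr
    return prev[-1] / len(reference)


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true class is among the k best-scored classes."""
    if len(scores) != len(labels) or not labels:
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up gloss sequences, for illustration only.
    ref = ["MORGEN", "REGEN", "NORD", "WIND"]
    hyp = ["MORGEN", "SONNE", "NORD"]
    print(f"WER = {word_error_rate(ref, hyp):.2f}")

    # Made-up class scores for two samples over three classes.
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
    labels = [2, 1]
    print(f"Top-1 = {top_k_accuracy(scores, labels, 1):.2f}")
    print(f"Top-2 = {top_k_accuracy(scores, labels, 2):.2f}")
```

Lower is better for WER, while higher is better for accuracy and BLEU, so values in the Results column are only comparable between rows that report the same metric on the same dataset.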