| Dataset | Data type | Scenes | Annotation | Task | #Examples / #Classes | SOTA / benchmark |
|---|---|---|---|---|---|---|
| KTH[1] | Trimmed-video | Daily Living | Video-level | Action Recognition | 2,391 / 6 | 98.9%[2] |
| Collective Activity[3] | Trimmed-video | Daily Living | Person/Group-level | Group Activity Recognition | 44 / 5 | 91.0%[4] |
| HOLLYWOOD2[5] | Trimmed-video | Movie | Video-level | Action Recognition | 3,669 / 12 | 73.7%[6] |
| Daphnet Gait[7] | Signal-sequence | Sport | Signal-level | Action Recognition | 1,917,887 / 2 | 94.1%[8] |
| CK[9] | Still-image | Facial Expression | Image-level | Facial Expression Recognition | 327 / 7 | 88.7%[10] |
| MMI[11] | Video/Still-image | Facial Expression | Action Unit | Facial Expression Recognition | 2,900 / 6 | 98.6%[12] |
| Pascal VOC Actions[13] | Still-image | Comprehensive | Image-level | Action Recognition | 11,530 / 20 | 90.2%[14] |
| WISDM[15] | Signal-sequence | Daily Living | Signal-level | Action Recognition | 1,098,213 / 6 | 98.2%[16] |
| HMDB51[17] | Trimmed-video | Daily Living | Video-level | Action Recognition | 6,766 / 51 | 82.1%[18] |
| UCF101[19] | Trimmed-video | Sport | Video-level | Action Recognition | 13,320 / 101 | 98.2%[20] |
| Opportunity[21] | Signal-sequence | Daily Living | Signal-level | Action Recognition | 701,366 / 16 | 91.8%[22] |
| PAMAP2[23] | Signal-sequence | Daily Living | Signal-level | Action Recognition | 2,844,868 / 18 | 91.0%[24] |
| SFEW-2.0[25],[26] | Still-image | Facial Expression | Image-level | Facial Expression Recognition | 1,394 / 7 | 58.1%[27] |
| MPII[28] | Still-image | Comprehensive | Image-level | Pose Estimation | 24,920 / 410 | 92.1%[29] |
| Breakfast Dataset[30] | Trimmed-video | Daily Living | Video-level | Action Recognition | 1,989 / 10 | 45.7%[31] |
| HICO[32] | Still-image | Comprehensive | Image-level | Human-Object Interaction Recognition | 47,774 / 117 | 47.1%[33] |
| ACTIVITYNET-200[34] | Untrimmed-video | Daily Living | Time-interval | Video Understanding | 19,994 / 200 | 91.3%[35] |
| Volleyball[36] | Trimmed-video | Sport | Video-level | Group Activity Recognition | 4,830 / 8 | 92.6%[4] |
| Charades[38] | Trimmed-video | Daily Living | Video-level | Action Recognition | 9,848 / 157 | 43.4%[39] |
| YouTube-8M[40] | Untrimmed-video | Comprehensive | Time-interval | Video Understanding | 6,100,000 / 3,862 | 85.0%[40] |
| THUMOS14[42] | Untrimmed-video | Comprehensive | Time-interval | Video Understanding | 18,404 / 101 | 82.2%[35] |
| Kinetics[44] | Trimmed-video | Comprehensive | Video-level | Action Recognition | 300,000 / 700 | 82.8%[45] |
| Something-Something[46] | Trimmed-video | Daily Living | Video-level | Action Recognition | 220,847 / 174 | 51.6%[45] |
| FCVID[48] | Untrimmed-video | Comprehensive | Video-level | Action Recognition | 91,223 / 239 | 77.6%[49] |
| 20BN-JESTER[50] | Trimmed-video | Hand Gesture | Video-level | Action Recognition | 148,000 / 27 | 94.8%[50] |
| Infrared Visible[52] | Trimmed-video | Daily Living | Video-level | Action Recognition | 1,200 / 12 | 80.2%[52] |
| AVA[54] | Untrimmed-video | Movie | Time-interval | Video Understanding | 57,600 / 80 | 27.2%[39] |
| EPIC-KITCHENS[56] | Trimmed-video | Daily Living | Video-level | Action Recognition | 432 / 149 | 34.5%[45] |
| COIN[58] | Untrimmed-video | Daily Living | Time-interval | Video Understanding | 11,827 / 180 | 88.0%[58] |
| Moments in Time[60] | Trimmed-video | Comprehensive | Video-level | Action Recognition | 1,000,000 / 339 | 32.4%[61] |

  1. Recognizing human actions: a local SVM approach | 2004
  2. Human actions recognition based on 3D deep neural network | 2017
  3. What are they doing?: Collective activity classification using spatio-temporal relationship among people | 2009
  4. Learning Actor Relation Graphs for Group Activity Recognition | 2019
  5. Actions in Context | 2009
  6. Modeling video evolution for action recognition | 2015
  7. Potentials of enhanced context awareness in wearable assistants for Parkinson's disease patients with the freezing of gait syndrome | 2009
  8. Deep recurrent neural networks for human activity recognition | 2017
  9. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression | 2010
  10. Greedy search for descriptive spatial face features | 2017
  11. Induced disgust, happiness and surprise: an addition to the MMI facial expression database | 2010
  12. Dexpression: Deep convolutional neural network for expression recognition | 2015
  13. The PASCAL visual object classes (VOC) challenge | 2010
  14. Contextual action recognition with R*CNN | 2015
  15. Activity recognition using cell phone accelerometers | 2011
  16. Deep activity recognition models with triaxial accelerometers | 2016
  17. HMDB: a large video database for human motion recognition | 2011
  18. End-to-end video-level representation learning for action recognition | 2018
  19. UCF101: A dataset of 101 human actions classes from videos in the wild | 2012
  20. PoTion: Pose motion representation for action recognition | 2018
  21. The Opportunity challenge: A benchmark database for on-body sensor-based activity recognition | 2013
  22. Comparison of feature learning methods for human activity recognition using wearable sensors | 2018
  23. Time series classification using multi-channels deep convolutional neural networks | 2014
  24. A comprehensive study of activity recognition using accelerometers | 2018
  25. Collecting large, richly annotated facial-expression databases from movies | 2012
  26. Emotion recognition in the wild challenge 2014: Baseline, data and protocol | 2014
  27. Covariance pooling for facial expression recognition | 2018
  28. 2D human pose estimation: New benchmark and state of the art analysis | 2014
  29. Multi-scale structure-aware network for human pose estimation | 2018
  30. The language of actions: Recovering the syntax and semantics of goal-directed human activities | 2014
  31. D3TW: Discriminative differentiable dynamic time warping for weakly supervised action alignment and segmentation | 2019
  32. HICO: A benchmark for recognizing human-object interactions in images | 2015
  33. HAKE: Human Activity Knowledge Engine | 2019
  34. ActivityNet: A Large-Scale Video Benchmark for Human Activity Understanding | 2015
  35. UntrimmedNets for weakly supervised action recognition and detection | 2017
  36. A hierarchical deep temporal model for group activity recognition | 2016
  37. Learning Actor Relation Graphs for Group Activity Recognition | 2019
  38. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding | 2016
  39. Long-term feature banks for detailed video understanding | 2019
  40. YouTube-8M: A large-scale video classification benchmark | 2016
  41. YouTube-8M: A large-scale video classification benchmark | 2016
  42. The THUMOS challenge on action recognition for videos “in the wild” | 2017
  43. UntrimmedNets for weakly supervised action recognition and detection | 2017
  44. The Kinetics human action video dataset | 2017
  45. Large-scale weakly-supervised pre-training for video action recognition | 2019
  46. The "Something Something" Video Database for Learning and Evaluating Visual Common Sense | 2017
  47. Large-scale weakly-supervised pre-training for video action recognition | 2019
  48. Exploiting feature and class relationships in video categorization with regularized deep neural networks | 2017
  49. Pivot correlational neural network for multimodal video categorization | 2018
  50. Temporal Relational Reasoning in Videos | 2018
  51. Temporal Relational Reasoning in Videos | 2018
  52. PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities | 2018
  53. PM-GANs: Discriminative Representation Learning for Action Recognition Using Partial-modalities | 2018
  54. AVA: A video dataset of spatio-temporally localized atomic visual actions | 2018
  55. Long-term feature banks for detailed video understanding | 2019
  56. Scaling Egocentric Vision: The EPIC-KITCHENS Dataset | 2018
  57. Large-scale weakly-supervised pre-training for video action recognition | 2019
  58. COIN: A large-scale dataset for comprehensive instructional video analysis | 2019
  59. COIN: A large-scale dataset for comprehensive instructional video analysis | 2019
  60. Moments in Time Dataset: one million videos for event understanding | 2019
  61. Collaborative Spatiotemporal Feature Learning for Video Action Recognition | 2019