This project focuses on distress-related action recognition under a variety of environmental and weather conditions. We hand-crafted synthetic video data using Unreal Engine 5 (UE5), simulating six distress-related human actions: injured waving, jumping, pointing, injured walking (limping), running, and holding something in pain; and six non-distress-related human actions: blowing a kiss, greeting, rumba dancing, salute, silly dancing, and sitting.
We pass this data through a convolutional neural network (CNN) for feature extraction, then use a Long Short-Term Memory (LSTM) network to analyse patterns in the resulting time series. Fully connected layers finally map the learned representations to one of the six distress-related actions, which gives us the action classification.
Dataset: Google Drive Link
| Waving | Jumping | Pointing |
|---|---|---|
| Injured Walking | Running | Holding something in pain |

| Blowing a kiss | Greeting | Rumba Dancing |
|---|---|---|
| Salute | Silly Dancing | Sitting |
- Front view
- Side view
  - Left
  - Right
- Top-down view
- Day
- Night
- Neutral (plain)
- Grassy/Rural
- Road/City
dataset/
├── holding_something_in_pain/
│ ├── holding_something_in_pain_day_rural_right_30fps.mp4
│ ├── holding_something_in_pain_night_grassy_back_30fps.mp4
│ ...
├── jumping/
│ ├── jumping_day_grassy_front_30fps.mp4
│ ...
├── injured_walking/
│ ├── injured_walking_night_plain_left_30fps.mp4
│ ...
...
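Each filename encodes the action, lighting, environment, camera view, and frame rate. Below is a minimal sketch of how such metadata could be recovered when indexing the dataset; the `parse_clip_name` helper and the exact folder/label spellings are illustrative assumptions, not part of the released code.

```python
from pathlib import Path

# Assumed action folder names, split into distress and non-distress groups.
DISTRESS_ACTIONS = [
    "waving", "jumping", "pointing",
    "injured_walking", "running", "holding_something_in_pain",
]
NON_DISTRESS_ACTIONS = [
    "blowing_a_kiss", "greeting", "rumba_dancing",
    "salute", "silly_dancing", "sitting",
]
ALL_ACTIONS = DISTRESS_ACTIONS + NON_DISTRESS_ACTIONS

def parse_clip_name(path: Path) -> dict:
    """Split e.g. 'jumping_day_grassy_front_30fps.mp4' into its metadata fields.

    Assumes the pattern <action>_<time>_<environment>_<view>_<fps>.mp4,
    where the action name itself may contain underscores.
    """
    parts = path.stem.split("_")
    fps, view, environment, time_of_day = parts[-1], parts[-2], parts[-3], parts[-4]
    action = "_".join(parts[:-4])
    return {
        "action": action,
        "time_of_day": time_of_day,
        "environment": environment,
        "view": view,
        "fps": fps,
        "label": ALL_ACTIONS.index(action),
    }

# Example: index every clip by walking the per-action folders under dataset/.
clips = [parse_clip_name(p) for p in Path("dataset").rglob("*.mp4")]
```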
The model architecture combines a convolutional neural network (CNN) with a Long Short-Term Memory (LSTM) network to classify our synthetic video data of distress-related actions. The data consists of 30 fps video segments of the animations rendered in various environments and conditions.
We pass our data through a CNN to extract spatial features from individual frames, which helps the model learn visual patterns.
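As an illustration, one way to implement this per-frame feature extraction is to run a pretrained image backbone over every frame of a clip. The sketch below is not the project's actual implementation: the ResNet-18 backbone and the `(batch, time, channels, height, width)` input layout are assumptions for the example.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class FrameFeatureExtractor(nn.Module):
    """Apply a 2D CNN to each frame of a video clip independently."""

    def __init__(self):
        super().__init__()
        # Assumed backbone: ImageNet-pretrained ResNet-18 (torchvision >= 0.13 API).
        backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        # Drop the classification head; keep the 512-d pooled features.
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])
        self.feature_dim = 512

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        # clips: (batch, time, channels, height, width)
        b, t, c, h, w = clips.shape
        frames = clips.view(b * t, c, h, w)        # fold time into the batch
        feats = self.cnn(frames).flatten(1)        # (b * t, 512)
        return feats.view(b, t, self.feature_dim)  # (batch, time, 512)
```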
We then feed these learned features into an LSTM, which is better suited to the sequential nature of video and can interpret how the features evolve across time steps.
Finally, a fully connected layer maps the learned representations to the labels of the six distress-related actions.
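Putting the pieces together, here is a hedged sketch of the temporal model and classification head. It reuses the `FrameFeatureExtractor` sketched above; the hidden size, single LSTM layer, clip length, and number of output classes are assumptions for illustration rather than the project's configuration.

```python
import torch
import torch.nn as nn

class ActionClassifier(nn.Module):
    """CNN feature extractor -> LSTM over time -> fully connected classifier."""

    def __init__(self, num_classes: int, hidden_size: int = 256, num_layers: int = 1):
        super().__init__()
        self.frames = FrameFeatureExtractor()  # per-frame CNN from the sketch above
        self.lstm = nn.LSTM(
            input_size=self.frames.feature_dim,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
        )
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, clips: torch.Tensor) -> torch.Tensor:
        feats = self.frames(clips)      # (batch, time, 512)
        _, (h_n, _) = self.lstm(feats)  # h_n: (num_layers, batch, hidden)
        return self.fc(h_n[-1])         # logits: (batch, num_classes)

# Example usage on a dummy batch of 16-frame RGB clips.
model = ActionClassifier(num_classes=12)  # 12 classes here; 6 if only distress actions are classified
dummy = torch.randn(2, 16, 3, 224, 224)   # (batch, time, channels, height, width)
logits = model(dummy)                     # shape: (2, 12)
```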