Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences.…
Transformer & CNN Image Captioning model in PyTorch.
Source Code for Captionomaly: A Deep Learning Toolbox for Anomaly Captioning in Surveillance Videos
Implementation of TAAConvLSTM and SAAConvLSTM used in "Attention Augmented ConvLSTM for Environment Prediction"
PyTorch implementation of Self-Attention ConvLSTM
Visualizing YOLOv5's layers using Grad-CAM
Papers, tutorials, and related source code on artificial intelligence for meteorology, ocean, and environmental science.
Easiest way of fine-tuning HuggingFace video classification models
Implementation of Convolutional LSTM in PyTorch.
Learning to Detect Violent Videos using Convolution LSTM (Keras + TensorFlow)
An improvement on the paper "Learning to Detect Violent Videos using Convolution LSTM"
AVSS violence recognition in PyTorch
BiConvLSTM for violence detection in videos
Violence Detection tutorial using pre-trained CNN and LSTM
Code for the paper: "Efficient Two-Stream Network for Violence Detection Using Separable Convolutional LSTM"
This PyTorch repo uses a BiConvLSTM in a spatiotemporal encoder to detect violence in videos. Three benchmark datasets were used in this work: Hockey, Movies, and Violent Flows.
Violence Detection using 3D Convolutional Neural Networks
A large-scale video database for violence detection, containing 2,000 video clips of violent or non-violent behaviours.
Implementation of the model used in the paper "Protest Activity Detection and Perceived Violence Estimation from Social Media Images" (ACM Multimedia 2017)
Neural Machine Translation with universal Visual Representation (ICLR 2020)
Multi-Modal Reasoning Graph for Scene-Text Based Fine-Grained Image Classification and Retrieval
Optimized code based on M2 for faster image captioning training
Code for the paper "Attention on Attention for Image Captioning" (ICCV 2019)
Official Code for 'RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words' (CVPR 2021)
Simple image captioning model
GRIT: Faster and Better Image-captioning Transformer (ECCV 2022)
Code for "Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations" (NeurIPS 2019)