Human Visual Attention Awesome

This repository contains a curated list of research papers and resources on saliency and scanpath prediction, human attention modelling, and human visual search.

❗ Latest Update: 10 September 2024 ❗ This repo is a work in progress. New updates are coming soon, stay tuned! 🚧

📣 Latest News 📣

  • 20 April 2024 Our survey paper has been accepted for publication in the IJCAI 2024 Survey Track!

Our Survey on Human Visual Attention 👀

🔥🔥 Trends, Applications, and Challenges in Human Attention Modelling 🔥🔥

Authors: Giuseppe Cartella, Marcella Cornia, Vittorio Cuculo, Alessandro D'Amelio, Dario Zanca, Giuseppe Boccignone, Rita Cucchiara

📚 Table of Contents

  • Human Attention Modelling

    • Saliency Prediction
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2025 | WACV | SUM: Saliency Unification through Mamba for Visual Attention Modeling | Alireza Hosseini et al. | 📜 Paper / Code :octocat: / Project Page |
| 2024 | WACV | Learning Saliency from Fixations | Yasser Abdelaziz Dahou Djilali et al. | 📜 Paper / Code :octocat: |
| 2023 | CVPR | Learning from Unique Perspectives: User-aware Saliency Modeling | Shi Chen et al. | 📜 Paper |
| 2023 | CVPR | TempSAL - Uncovering Temporal Information for Deep Saliency Prediction | Bahar Aydemir et al. | 📜 Paper / Code :octocat: |
| 2023 | BMVC | Clustered Saliency Prediction | Rezvan Sherkat et al. | 📜 Paper |
| 2023 | NeurIPS | What Do Deep Saliency Models Learn about Visual Attention? | Shi Chen et al. | 📜 Paper / Code :octocat: |
| 2022 | Neurocomputing | TranSalNet: Towards perceptually relevant visual saliency prediction | Jianxun Lou et al. | 📜 Paper / Code :octocat: |
| 2020 | CVPR | STAViS: Spatio-Temporal AudioVisual Saliency Network | Antigoni Tsiami et al. | 📜 Paper / Code :octocat: |
| 2020 | CVPR | How much time do you have? Modeling multi-duration saliency | Camilo Fosco et al. | 📜 Paper / Code :octocat: / Project Page |
| 2018 | IEEE Transactions on Image Processing | Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model | Marcella Cornia et al. | 📜 Paper / Code :octocat: |
| 2015 | CVPR | SALICON: Saliency in Context | Ming Jiang et al. | 📜 Paper / Project Page |
| 2009 | ICCV | Learning to Predict Where Humans Look | Tilke Judd et al. | 📜 Paper |
| 1998 | TPAMI | A Model of Saliency-Based Visual Attention for Rapid Scene Analysis | Laurent Itti et al. | 📜 Paper |
    • Scanpath Prediction
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2024 | ECCV | GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths | Xianyu Chen et al. | 📜 Paper / Code :octocat: |
| 2024 | ECCV | Look Hear: Gaze Prediction for Speech-directed Human Attention | Sounak Mondal et al. | 📜 Paper / Code :octocat: |
| 2024 | CVPR | Beyond Average: Individualized Visual Scanpath Prediction | Xianyu Chen et al. | 📜 Paper |
| 2024 | CVPR | Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers | Zhibo Yang et al. | 📜 Paper / Code :octocat: |
| 2023 | arXiv | Contrastive Language-Image Pretrained Models are Zero-Shot Human Scanpath Predictors | Dario Zanca et al. | 📜 Paper / Code + Dataset :octocat: |
| 2023 | CVPR | Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention | Sounak Mondal et al. | 📜 Paper / Code :octocat: |
| 2022 | ECCV | Target-absent Human Attention | Zhibo Yang et al. | 📜 Paper / Code :octocat: |
| 2022 | TMLR | Behind the Machine's Gaze: Neural Networks with Biologically-inspired Constraints Exhibit Human-like Visual Attention | Leo Schwinn et al. | 📜 Paper / Code :octocat: |
| 2022 | Journal of Vision | DeepGaze III: Modeling free-viewing human scanpaths with deep learning | Matthias Kümmerer et al. | 📜 Paper / Code :octocat: |
| 2021 | CVPR | Predicting Human Scanpaths in Visual Question Answering | Xianyu Chen et al. | 📜 Paper / Code :octocat: |
| 2019 | TPAMI | Gravitational Laws of Focus of Attention | Dario Zanca et al. | 📜 Paper / Code :octocat: |
| 2015 | Vision Research | Saccadic model of eye movements for free-viewing condition | Olivier Le Meur et al. | 📜 Paper |
  • Integrating Human Attention in AI models

    • Image and Video Processing

      • Visual Recognition
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | IJCV | Joint Learning of Visual-Audio Saliency Prediction and Sound Source Localization on Multi-face Videos | Minglang Qiao et al. | 📜 Paper / Code :octocat: |
| 2022 | ECML PKDD | Foveated Neural Computation | Matteo Tiezzi et al. | 📜 Paper / Code :octocat: |
| 2021 | WACV | Integrating Human Gaze into Attention for Egocentric Activity Recognition | Kyle Min et al. | 📜 Paper / Code :octocat: |
| 2019 | CVPR | Learning Unsupervised Video Object Segmentation through Visual Attention | Wenguan Wang et al. | 📜 Paper / Code :octocat: |
| 2019 | CVPR | Shifting more attention to video salient object detection | Deng-Ping Fan et al. | 📜 Paper / Code :octocat: |
      • Graphic Design
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2020 | ACM Symposium on UIST (User Interface Software and Technology) | Predicting Visual Importance Across Graphic Design Types | Camilo Fosco et al. | 📜 Paper / Code :octocat: |
| 2020 | ACM MobileHCI | Understanding Visual Saliency in Mobile User Interfaces | Luis A. Leiva et al. | 📜 Paper |
| 2017 | ACM Symposium on UIST (User Interface Software and Technology) | Learning Visual Importance for Graphic Designs and Data Visualizations | Zoya Bylinskii et al. | 📜 Paper / Code :octocat: |
      • Image Enhancement and Manipulation
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | CVPR | Realistic saliency guided image enhancement | S. Mahdi H. Miangoleh et al. | 📜 Paper / Code :octocat: / Project Page |
| 2022 | CVPR | Deep saliency prior for reducing visual distraction | Kfir Aberman et al. | 📜 Paper / Project Page |
| 2021 | CVPR | Saliency-guided image translation | Lai Jiang et al. | 📜 Paper |
| 2017 | arXiv | Guiding human gaze with convolutional neural networks | Leon A. Gatys et al. | 📜 Paper |
      • Image Quality Assessment
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | CVPR | ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images | Xiangjie Sui et al. | 📜 Paper / Code :octocat: |
| 2021 | ICCV Workshops | Saliency-Guided Transformer Network combined with Local Embedding for No-Reference Image Quality Assessment | Mengmeng Zhu et al. | 📜 Paper |
| 2019 | ACMMM | SGDNet: An End-to-End Saliency-Guided Deep Neural Network for No-Reference Image Quality Assessment | Sheng Yang et al. | 📜 Paper / Code :octocat: |
    • Vision-and-Language Applications

      • Automatic Captioning
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2020 | EMNLP | Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze | Ece Takmaz et al. | 📜 Paper / Code :octocat: |
| 2019 | ICCV | Human Attention in Image Captioning: Dataset and Analysis | Sen He et al. | 📜 Paper / Code :octocat: |
| 2018 | ACM TOMM | Paying More Attention to Saliency: Image Captioning with Saliency and Context Attention | Marcella Cornia et al. | 📜 Paper |
| 2017 | CVPR | Supervising Neural Attention Models for Video Captioning by Human Gaze Data | Youngjae Yu et al. | 📜 Paper / Code :octocat: |
| 2016 | arXiv | Seeing with Humans: Gaze-Assisted Neural Image Captioning | Yusuke Sugano et al. | 📜 Paper |
      • Visual Question Answering
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | EMNLP | GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations | Muhammet Furkan Ilaslan et al. | 📜 Paper / Code :octocat: |
| 2023 | CVPR Workshops | Multimodal Integration of Human-Like Attention in Visual Question Answering | Ekta Sood et al. | 📜 Paper / Project Page |
| 2021 | CoNLL | VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering | Ekta Sood et al. | 📜 Paper / Dataset + Project Page |
| 2020 | ECCV | AiR: Attention with Reasoning Capability | Shi Chen et al. | 📜 Paper / Code :octocat: |
| 2018 | AAAI | Exploring Human-like Attention Supervision in Visual Question Answering | Tingting Qiao et al. | 📜 Paper / Code :octocat: |
| 2016 | EMNLP | Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? | Abhishek Das et al. | 📜 Paper |
    • Language Modelling

      • Machine Reading Comprehension
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | ACL Workshops | Native Language Prediction from Gaze: a Reproducibility Study | Lina Skerath et al. | 📜 Paper / Code :octocat: |
| 2022 | ETRA | Inferring Native and Non-Native Human Reading Comprehension and Subjective Text Difficulty from Scanpaths | David R. Reich et al. | 📜 Paper / Code :octocat: |
| 2017 | ACL | Predicting Native Language from Gaze | Yevgeni Berzak et al. | 📜 Paper |
      • Natural Language Understanding
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | EMNLP | Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding | Shuwen Deng et al. | 📜 Paper / Code :octocat: |
| 2023 | EACL | Synthesizing Human Gaze Feedback for Improved NLP Performance | Varun Khurana et al. | 📜 Paper |
| 2020 | NeurIPS | Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention | Ekta Sood et al. | 📜 Paper / Project Page |
    • Domain-Specific Applications

      • Robotics
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | IEEE RA-L | GVGNet: Gaze-Directed Visual Grounding for Learning Under-Specified Object Referring Intention | Kun Qian et al. | 📜 Paper |
| 2022 | RSS | Gaze Complements Control Input for Goal Prediction During Assisted Teleoperation | Reuben M. Aronson et al. | 📜 Paper |
| 2019 | CoRL | Understanding Teacher Gaze Patterns for Robot Learning | Akanksha Saran et al. | 📜 Paper / Code :octocat: |
| 2019 | CoRL | Nonverbal Robot Feedback for Human Teachers | Sandy H. Huang et al. | 📜 Paper |
      • Autonomous Driving
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2023 | ICCV | FBLNet: FeedBack Loop Network for Driver Attention Prediction | Yilong Chen et al. | 📜 Paper |
| 2022 | IEEE Transactions on Intelligent Transportation Systems | DADA: Driver Attention Prediction in Driving Accident Scenarios | Jianwu Fang et al. | 📜 Paper / Code :octocat: |
| 2021 | ICCV | MEDIRL: Predicting the Visual Attention of Drivers via Deep Inverse Reinforcement Learning | Sonia Baee et al. | 📜 Paper / Code :octocat: / Project Page |
| 2020 | CVPR | "Looking at the right stuff" - Guided semantic-gaze for autonomous driving | Anwesan Pal et al. | 📜 Paper / Code :octocat: |
| 2019 | ITSC | DADA-2000: Can Driving Accident be Predicted by Driver Attention? Analyzed by A Benchmark | Jianwu Fang et al. | 📜 Paper / Code :octocat: |
| 2018 | ACCV | Predicting Driver Attention in Critical Situations | Ye Xia et al. | 📜 Paper / Code :octocat: |
| 2018 | TPAMI | Predicting the Driver's Focus of Attention: the DR(eye)VE Project | Andrea Palazzi et al. | 📜 Paper / Code :octocat: |
      • Medicine
| Year | Conference / Journal | Title | Authors | Links |
|------|----------------------|-------|---------|-------|
| 2024 | MICCAI | Weakly-supervised Medical Image Segmentation with Gaze Annotations | Yuan Zhong et al. | 📜 Paper / Code :octocat: |
| 2024 | AAAI | Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis | Zihao Zhao et al. | 📜 Paper / Code :octocat: |
| 2024 | WACV | GazeGNN: A Gaze-Guided Graph Neural Network for Chest X-ray Classification | Bin Wang et al. | 📜 Paper / Code :octocat: |
| 2023 | WACV | Probabilistic Integration of Object Level Annotations in Chest X-ray Classification | Tom van Sonsbeek et al. | 📜 Paper |
| 2023 | IEEE Transactions on Medical Imaging | Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning | Chong Ma et al. | 📜 Paper |
| 2023 | Transactions on Neural Networks and Learning Systems | Rectify ViT Shortcut Learning by Visual Saliency | Chong Ma et al. | 📜 Paper |
| 2022 | IEEE Transactions on Medical Imaging | Follow My Eye: Using Gaze to Supervise Computer-Aided Diagnosis | Sheng Wang et al. | 📜 Paper / Code :octocat: |
| 2022 | MICCAI | GazeRadar: A Gaze and Radiomics-Guided Disease Localization Framework | Moinak Bhattacharya et al. | 📜 Paper / Code :octocat: |
| 2022 | ECCV | RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification | Moinak Bhattacharya et al. | 📜 Paper / Code :octocat: |
| 2021 | Nature Scientific Data | Creation and validation of a chest X-ray dataset with eye-tracking and report dictation for AI development | Alexandros Karargyris et al. | 📜 Paper / Code :octocat: |
| 2021 | BMVC | Human Attention in Fine-grained Classification | Yao Rong et al. | 📜 Paper / Code :octocat: |
| 2018 | Journal of Medical Imaging | Modeling visual search behavior of breast radiologists using a deep convolution neural network | Suneeta Mall et al. | 📜 Paper |
  • Datasets & Benchmarks 📂📎

How to Contribute 🚀

  1. Fork this repository and clone it locally.
  2. Create a new branch for your changes: `git checkout -b feature-name`.
  3. Make your changes and commit them: `git commit -m 'Description of the changes'`.
  4. Push to your fork: `git push origin feature-name`.
  5. Open a pull request on the original repository, providing a description of your changes.
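The steps above can be sketched as a single shell session; `YOUR-USERNAME` and the `feature-name` branch are placeholders to replace with your own values, and the fork itself is created beforehand via the GitHub UI:

```shell
# 1. Fork aimagelab/awesome-human-visual-attention on GitHub, then clone your fork.
git clone https://github.com/YOUR-USERNAME/awesome-human-visual-attention.git
cd awesome-human-visual-attention

# 2. Create a branch for your changes.
git checkout -b feature-name

# 3. Edit README.md (e.g. add a paper entry), then commit.
git add README.md
git commit -m 'Description of the changes'

# 4. Push the branch to your fork.
git push origin feature-name

# 5. Open a pull request against aimagelab/awesome-human-visual-attention
#    from the GitHub web interface, describing your changes.
```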

This project is in constant development, and we welcome contributions that add the latest research papers in the field, as well as reports of any issues 💥💥.
