---
layout: page
title: Publications
permalink: /publications/
tags: publications
---
VisMin: Visual Minimal-Change Understanding
Rabiul Awal, Saba Ahmadi, Le Zhang, Aishwarya Agrawal
arXiv preprint, arXiv:2407.16772, 2024
[[ArXiv](https://arxiv.org/abs/2407.16772)]
Benchmarking Vision Language Models for Cultural Understanding
Shravan Nayak, Kanishk Jain, Rabiul Awal, Siva Reddy, Sjoerd van Steenkiste, Lisa Anne Hendricks, Karolina Stanczak, Aishwarya Agrawal
arXiv preprint, arXiv:2407.10920, 2024
[[ArXiv](https://arxiv.org/abs/2407.10920)]
Decompose and Compare Consistency: Measuring VLMs' Answer Reliability via Task-Decomposition Consistency Comparison
Qian Yang, Weixiang Yan, Aishwarya Agrawal
arXiv preprint, arXiv:2407.07840, 2024
[[ArXiv](https://arxiv.org/abs/2407.07840)]
An Introduction to Vision-Language Modeling
Florian Bordes et al.
arXiv preprint, arXiv:2405.17247, 2024
[[ArXiv](https://arxiv.org/abs/2405.17247)]
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Oscar Mañas, Pietro Astolfi, Melissa Hall, Candace Ross, Jack Urbanek, Adina Williams, Aishwarya Agrawal, Adriana Romero-Soriano, Michal Drozdzal
arXiv preprint, arXiv:2403.17804, 2024
[[ArXiv](https://arxiv.org/abs/2403.17804)]
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding
Le Zhang, Rabiul Awal, Aishwarya Agrawal
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[ArXiv]