[Benchmark] Support SCAM #1026

JonasLoos · 2025-05-28T17:33:46Z

SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models

[🌐 Homepage] [🤗 Hugging Face Dataset] [📖 arXiv Paper]

👀 Introduction

Typographic attacks exploit the interplay between text and visual content in multimodal foundation models, causing misclassifications when misleading text is embedded within images. However, existing datasets are limited in size and diversity, making it difficult to study such vulnerabilities. We introduce SCAM, the largest and most diverse dataset of real-world typographic attack images to date, containing images across hundreds of object categories and attack words.
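To make the failure mode concrete, here is a minimal sketch of how predictions on typographic attack images could be scored. It is not the code added in this PR: the `Record` fields, the `score` helper, and the sample data are hypothetical and do not reflect the actual SCAM schema or results.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical fields for illustration; not the actual SCAM schema.
    object_label: str  # what the image actually shows
    attack_word: str   # misleading word written inside the image
    prediction: str    # label the model chose


def score(records):
    """Return (accuracy, attack success rate) over a list of records."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to score")
    # Correct: the model named the real object despite the embedded text.
    correct = sum(r.prediction == r.object_label for r in records)
    # Fooled: the model answered with the misleading attack word instead.
    fooled = sum(r.prediction == r.attack_word for r in records)
    return correct / total, fooled / total


# Made-up predictions, purely to exercise the function.
records = [
    Record("clock", "pizza", "clock"),
    Record("mug", "knife", "knife"),
    Record("shoe", "river", "river"),
    Record("lamp", "tiger", "lamp"),
]
accuracy, attack_success = score(records)
print(f"accuracy={accuracy:.2f} attack_success={attack_success:.2f}")
```

Reporting the attack success rate alongside accuracy separates cases where the model read the embedded word from cases where it was simply wrong.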

📈 Evaluation

torchrun --nproc-per-node=1 run.py --data SCAM --model your_model --verbose

JonasLoos added 7 commits May 9, 2025 05:25

add SCAM dataset

a83ba44

remove testing limit

f946331

clean up code

5f78728

scam dataset: cleanup and add shuffling

eab3d73

update SCAM dataset question phrasing

d514530

Merge remote-tracking branch 'upstream/main'

d650a11

fix pre-commit linter issues

0a08d41
