<!DOCTYPE html>
<html class="h-100" lang="en">
<head>
<title>Jason Vega - CS PhD Student @ UIUC</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="style.css">
<link rel="icon" type="image/png" href="favicon.png">
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-T3c6CoIi6uLrA9TneNEoa7RxnatzjcDSCmG1MXxSR1GAsXEV/Dwwykc2MPK8M2HN" crossorigin="anonymous">
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.2/dist/js/bootstrap.bundle.min.js" integrity="sha384-C6RzsynM9kWDrMNeT87bh95OGNyZPhcTNXj1NW7RuBCsyN/o0jlpcV8Qyq46cDfL" crossorigin="anonymous"></script>
</head>
<body class="h-100 bg-light text-dark">
<div class="d-flex flex-column h-100">
<nav class="navbar navbar-expand-sm navbar-dark bg-dark text-light px-3">
<a class="navbar-brand" href="index.html">Jason Vega</a>
</nav>
<div class="container-fluid flex-grow-1">
<div class="row p-4 bg-light bg-gradient text-dark justify-content-center">
<div class="col" id="bioCol">
<div class="row align-items-center">
<div class="col-lg-3 mb-4 mb-lg-0 text-center">
<img src="headshot.jpeg" class="img-fluid img-thumbnail rounded-circle" alt="Headshot of Jason Vega" />
</div>
<div class="col-lg-9">
<div class="row">
<div class="col">
<p>
Hi there! I'm a third-year computer science Ph.D. student at the <a href="https://cs.illinois.edu">University of Illinois Urbana-Champaign</a> working on artificial intelligence research, particularly on topics in trustworthy machine learning. I'm a member of the <a href="https://ggndpsngh.github.io">FOrmally
Certified Automation and Learning (FOCAL) Lab</a>, where I'm advised by Prof. Gagandeep Singh. I graduated from the <a href="https://cse.ucsd.edu">University of California San Diego</a> in June 2022 with a B.S. in Computer Science. My research vision is to enable efficient, ethical development of intelligent systems that are
highly performant yet safe, transparent and ultimately beneficial to humanity.
</p>
</div>
</div>
<div class="row align-items-center justify-content-center">
<div class="col-auto py-2 text-center">
<a class="btn btn-outline-primary" href="CV.pdf">CV</a>
</div>
<div class="col-auto py-2 text-center">
<a class="btn btn-outline-primary" href="https://www.threads.net/@_jasonvega">Threads</a>
</div>
<div class="col-auto py-2 text-center">
<a class="btn btn-outline-primary" href="https://medium.com/@jasonvega14">Medium</a>
</div>
<div class="col-auto py-2 text-center">
<a class="btn btn-outline-primary" href="https://www.linkedin.com/in/jason-vega/">LinkedIn</a>
</div>
<div class="col-auto py-2 text-center">
<a class="btn btn-outline-primary" href="mailto:[email protected]">Email</a>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="row pb-4 px-4 justify-content-center">
<div class="col" id="mainContentCol">
<h2>
Research Interests
</h2>
<ul class="mb-0">
<li>
<b>Safety of Large Language Models (LLMs)</b>
<ul>
<li>
Efficient attacks for bypassing the safety alignment of LLMs
</li>
</ul>
</li>
</ul>
<br>
<h2>
Papers
</h2>
<p>
(* denotes equal contribution)
</p>
<div class="card bg-light text-dark">
<div class="card-body">
<h5 class="card-title">
Stochastic Monkeys at Play: Random Augmentations Cheaply Break LLM Safety Alignment
</h5>
<h6 class="card-subtitle mb-2 text-muted">
<b>Jason Vega</b>, Junsheng Huang*, Gaokai Zhang*, Hangoo Kang*, Minjia Zhang, Gagandeep Singh
</h6>
<h6 class="card-subtitle mb-2 text-muted">
arXiv, 2024; under peer review
</h6>
<p class="card-text">
We show that low-resource, unsophisticated attackers, i.e., <i>stochastic monkeys</i>, can significantly improve their chances of bypassing the safety alignment of state-of-the-art LLMs with just 25 random augmentations per prompt.
</p>
<a href="https://arxiv.org/abs/2411.02785" class="card-link">Paper</a>
</div>
</div>
<br>
<div class="card bg-light text-dark">
<div class="card-body">
<h5 class="card-title">
Bypassing the Safety Training of Open-Source LLMs with Priming Attacks
</h5>
<h6 class="card-subtitle mb-2 text-muted">
<b>Jason Vega*</b>, Isha Chaudhary*, Changming Xu*, Gagandeep Singh
</h6>
<h6 class="card-subtitle mb-2 text-muted">
ICLR 2024, Tiny Papers
</h6>
<p class="card-text">
We investigate the fragility of state-of-the-art open-source LLMs under simple, optimization-free attacks that we call priming attacks (now known as <i>prefilling attacks</i>), which are easy to execute and effectively bypass the alignment instilled by safety training.
</p>
<a href="https://arxiv.org/abs/2312.12321" class="card-link">Paper</a>
<a href="https://github.com/uiuc-focal-lab/llm-priming-attacks" class="card-link">Code</a>
<a href="https://llmpriming.focallab.org" class="card-link">Website</a>
</div>
</div>
<br>
<h2>
Other
</h2>
<ul>
<li>
I grew up in the Bay Area 🌉 and will always be a Californian 🐻 at heart.
</li>
<li>
Outside of research, I enjoy:
<ul>
<li>
Playing the violin 🎻 in the <a href="https://music.illinois.edu/perform/orchestras/philharmonia-orchestra/">UIUC Philharmonia Orchestra</a>
</li>
<li>
Going for a run 🏃
</li>
<li>
Watching films and shows 🎥
</li>
</ul>
</li>
</ul>
</div>
</div>
</div>
</div>
</body>
</html>