<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<!-- Meta tags for social media banners; these should be filled in appropriately, as they are your "business card" -->
<!-- Replace the content tag with appropriate information -->
<meta name="description"
content="Training-Free Model Merging for Multi-target Domain Adaptation. In this paper, we study multi-target domain adaptation of scene understanding models. While previous methods achieved commendable results through inter-domain consistency losses, they often assumed unrealistic simultaneous access to images from all target domains, overlooking constraints such as data transfer bandwidth limitations and data privacy concerns. Given these challenges, we pose the question: How to merge models adapted independently on distinct domains while bypassing the need for direct access to training data? Our solution to this problem involves two components, merging model parameters and merging model buffers (i.e., normalization layer statistics). For merging model parameters, empirical analyses of mode connectivity surprisingly reveal that linear merging suffices when employing the same pretrained backbone weights for adapting separate models. For merging model buffers, we model the real-world distribution with a Gaussian prior and estimate new statistics from the buffers of separately trained models. Our method is simple yet effective, achieving comparable performance with data combination training baselines, while eliminating the need for accessing training data." />
<meta name="keywords"
content="ModelMerging, Training-Free, multi-target domain adaptation, Model Connectivity, Model Averaging" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>
ModelMerging | Project Page
</title>
<link rel="icon" type="image/x-icon" href="static/images/favicon.ico" />
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet" />
<link rel="stylesheet" href="static/css/bulma.min.css" />
<link rel="stylesheet" href="static/css/bulma-carousel.min.css" />
<link rel="stylesheet" href="static/css/bulma-slider.min.css" />
<link rel="stylesheet" href="static/css/fontawesome.all.min.css" />
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css" />
<link rel="stylesheet" href="static/css/index.css" />
<link rel="stylesheet" href="https://unpkg.com/beerslider/dist/BeerSlider.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script src="https://documentcloud.adobe.com/view-sdk/main.js"></script>
<script defer src="static/js/fontawesome.all.min.js"></script>
<script src="static/js/bulma-carousel.min.js"></script>
<script src="static/js/bulma-slider.min.js"></script>
<script src="static/js/index.js"></script>
</head>
<body>
<section class="hero">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h2 class="title is-2 publication-title">
Training-Free Model Merging for Multi-target Domain Adaptation
</h2>
<div class="is-size-5 publication-authors">
<!-- Paper authors -->
<span class="author-block">
<a href="https://github.com/wenyi-li/" target="_blank">Wenyi Li</a><sup>*1</sup>,</span>
<!-- <a href="SECOND AUTHOR PERSONAL LINK" target="_blank" -->
<span class="author-block">
<a href="https://c7w.tech/about/" target="_blank">Huan-ang Gao</a><sup>*1</sup>,</span>
<span class="author-block">
<!-- <a href="SECOND AUTHOR PERSONAL LINK" target="_blank" -->
Mingju Gao<sup>1</sup>,</span>
<span class="author-block">
<!-- <a href="SECOND AUTHOR PERSONAL LINK" target="_blank" -->
Beiwen Tian<sup>1</sup>,</span>
<span class="author-block">
<!-- <a href="SECOND AUTHOR PERSONAL LINK" target="_blank" -->
Rong Zhi<sup>2</sup>,</span><br />
<span class="author-block">
<a href="https://sites.google.com/view/fromandto" target="_blank">Hao Zhao</a><sup>†1</sup>
</span>
</div>
<!-- a margin of 0.5em -->
<div style="margin: 0.5em;"></div>
<div class="is-size-5 publication-authors">
<span class="author-block is-size-6">
<sup>1</sup> Institute for AI Industry Research (AIR), Tsinghua University
<br>
<sup>2</sup> Mercedes-Benz Group China Ltd.
<!-- <div style="margin: 0.1em;"></div> -->
<span class="eql-cntrb"><small><br /><sup>*</sup>Indicates Equal Contribution</small></span>
<!-- a span of 5em-->
<span style="margin: 1em;"></span>
<span class="eql-cntrb"><small><sup>†</sup>Indicates Corresponding Author</small></span>
</div>
<div class="column has-text-centered">
<div class="publication-links">
<!-- Arxiv PDF link -->
<span class="link-block">
<a href="https://arxiv.org/pdf/2407.13771" target="_blank"
class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fas fa-file-pdf"></i>
</span>
<span>Paper</span>
</a>
</span>
<!-- Github link -->
<span class="link-block">
<a href="https://github.com/AIR-DISCOVER/Model-Merging-MTDA" target="_blank"
class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="fab fa-github"></i>
</span>
<span>Code</span>
</a>
</span>
<!-- ArXiv abstract Link -->
<span class="link-block">
<a href="http://arxiv.org/abs/2407.13771" target="_blank" class="external-link button is-normal is-rounded is-dark">
<span class="icon">
<i class="ai ai-arxiv"></i>
</span>
<span>arXiv</span>
</a>
</span>
</div>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- Teaser video-->
<section class="hero teaser">
<div class="container is-max-desktop">
<div class="hero-body">
<!-- <video poster="" id="tree" autoplay controls muted loop height="100%"> -->
<!-- Your video here -->
<!-- <source src="static/videos/banner_video.mp4" type="video/mp4" /> -->
<!-- </video> -->
<!-- centering the image -->
<div class="columns is-centered">
<!-- <div class="column is-four-fifths"> -->
<!-- <div class="publication-video"> -->
<img src="static/images/FigTeaser.jpg" width="70%" />
<!-- </div> -->
<!-- </div> -->
</div>
<!-- <img src="static/images/Teaser_cs1.jpg" width="100%" /> -->
<h2 class="has-text-left is-size-6">
<b>Comparison of Domain Adaptation Settings.</b> (a) Single Target Domain Adaptation (STDA) leverages labeled synthetic data together with unlabeled data from a single target domain for optimal performance in that target domain. (b) Multi-target Domain Adaptation (MTDA) with data access uses data from all target domains jointly to train a single model that excels across all of them. (c) Our setting: MTDA without direct access to training data, where model merging is employed to enhance robustness.
</h2>
</div>
</div>
</section>
<!-- End teaser video -->
<!-- Paper abstract -->
<section class="section hero is-light">
<div class="container is-max-desktop">
<div class="columns is-centered has-text-centered">
<div class="column is-four-fifths">
<h2 class="title is-3">Abstract</h2>
<div class="content has-text-justified">
<p>
In this paper, we study multi-target domain adaptation of scene understanding models. While previous methods achieved commendable results through inter-domain consistency losses, they often assumed unrealistic simultaneous access to images from all target domains, overlooking constraints such as data transfer bandwidth limitations and data privacy concerns. Given these challenges, we pose the question: How to merge models adapted independently on distinct domains while bypassing the need for direct access to training data? Our solution to this problem involves two components, merging model parameters and merging model buffers (i.e., normalization layer statistics). For merging model parameters, empirical analyses of mode connectivity surprisingly reveal that linear merging suffices when employing the same pretrained backbone weights for adapting separate models. For merging model buffers, we model the real-world distribution with a Gaussian prior and estimate new statistics from the buffers of separately trained models. Our method is simple yet effective, achieving comparable performance with data combination training baselines, while eliminating the need for accessing training data.
<br/><br/> Our code release is undergoing a review process within our co-authors' company due to its regulations.
If you run into problems reproducing our results or have questions about our implementation, feel free to contact us :)
</p>
</div>
</div>
</div>
</div>
</section>
<!-- End paper abstract -->
<!-- Method section -->
<section class="hero is-small">
<div class="hero-body">
<div class="container">
<h2 class="title is-3">Method</h2>
<div class="columns is-centered">
<img src="static/images/FigPipeline.jpg" width="55%" />
</div>
<h2 class="has-text-left is-size-6">
<b>Overview of Two-stage Pipeline of Our Proposed Multi-target Domain Adaptation Solution.</b> After training STDA methods on separate domains, we integrate models together using our proposed merging techniques (parameter merging + buffer merging).
</h2>
<br />
<h3 class="title is-4">Parameter Merging</h3>
<div class="columns is-centered">
<img src="static/images/Fig_mid_rebasin.jpg" width="55%" />
</div>
<h2 class="has-text-left is-size-6">
<b>Results of Git Re-Basin and Mid-Point Merging on Different Backbones.</b> In our domain adaptation scenario, Git Re-Basin reduces to a straightforward mid-point merging approach.
</h2>
<br />
<div class="columns is-centered">
<img src="static/images/Fig_3_Ablation.jpg" width="55%" />
</div>
<h2 class="has-text-left is-size-6">
<b>Empirical Analysis of Linear Mode Connectivity.</b> (a) Exploring the linear mode connectivity of two trained ResNet101 backbones targeted at two different domains. (b-e) Ablation studies on synthetic data, self-training architecture, initialization weights, and pretrained weights to identify the cause of the linear mode connectivity.
</h2>
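As the analyses above suggest, when two models are adapted from the same pretrained backbone weights, parameter merging can be as simple as linear interpolation of the weights. A minimal sketch (our illustration; the function name and equal-weight default are assumptions, not the paper's released code):

```python
def linear_merge(sd_a, sd_b, alpha=0.5):
    """Linearly interpolate two model state dicts with identical keys/shapes.

    alpha=0.5 gives the mid-point merge discussed above. Values may be
    floats, NumPy arrays, or torch tensors; any type supporting scalar
    multiplication and addition works.
    """
    assert sd_a.keys() == sd_b.keys(), "models must share an architecture"
    return {k: (1 - alpha) * sd_a[k] + alpha * sd_b[k] for k in sd_a}
```

With PyTorch, this would be applied via `model.load_state_dict(linear_merge(m1.state_dict(), m2.state_dict()))`; note that buffer entries such as BN running statistics call for the separate treatment of the Buffer Merging step.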
<br />
<h3 class="title is-4">Buffer Merging</h3>
<div class="columns is-centered">
<img src="static/images/FigBufferMerging.jpg" width="35%" />
</div>
<h2 class="has-text-left is-size-6">
<b>Illustration of Merging Statistics in Batch Normalization (BN) Layers.</b> Given two sets of data points sampled from a shared Gaussian prior, we are provided with each set's mean, variance, and size, and estimate the statistics of the combined distribution.
</h2>
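Concretely, treating each BN layer's running mean and variance as sample statistics from the two domains, the merged statistics can be estimated by pooling moments. A sketch of this idea (our illustration under the stated assumptions, not the paper's exact released implementation):

```python
def merge_bn_stats(mean1, var1, n1, mean2, var2, n2):
    """Pool the (population) mean and variance of two sample sets of known size.

    Uses E[x^2] = var + mean^2 within each set; the pooled variance is the
    pooled second moment minus the squared pooled mean.
    """
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    second_moment = (n1 * (var1 + mean1 ** 2) + n2 * (var2 + mean2 ** 2)) / n
    return mean, second_moment - mean ** 2
```

As a sanity check, the sets {0, 2} (mean 1, variance 1) and {4, 6} (mean 5, variance 1) pool to the statistics of {0, 2, 4, 6}: mean 3, variance 5.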
</div>
</div>
</section>
<!-- End method section -->
<!-- Results section -->
<section class="hero is-small is-light">
<div class="hero-body">
<div class="container">
<h2 class="title is-3">Results</h2>
<h3 class="title is-4">Quantitative Results</h3>
<!-- quantitative results start here -->
<h2 class="has-text-left is-size-6">
<b>Performance Comparison of Our Method and Baselines.</b> The mIoU (mean Intersection-over-Union) represents the average IoU across 19 categories. 'Enc.' denotes the encoder architecture, with 'R' representing ResNet101 and 'V' indicating MiT-B5. The 'Metric' column specifies whether evaluation was conducted on the Cityscapes ('C') or IDD ('I') dataset. The harmonic mean ('H'), representing adaptation ability across the two domains, is considered the primary metric. Bold text highlights the best harmonic mean results, while underlined text indicates the second-best results. † signifies merging only the backbones while keeping separate decode heads.
</h2>
<br>
<div class="columns is-centered">
<img src="static/images/compare_baseline.jpg" width="90%" />
</div>
<br>
<h2 class="has-text-left is-size-6">
<b>Comparison of Our Method with State-of-the-Art Approaches.</b>
Prior MTDA methods used training setups different from ours; their results are listed for reference only.
† signifies results reproduced by us.
</h2>
<br>
<div class="columns is-centered">
<img src="static/images/compare_sota.jpg" width="50%" />
</div>
<!-- <img src="static/images/Teaser_cs1.jpg" width="100%" /> -->
<br>
<h2 class="has-text-left is-size-6">
<b>Application of Our Model Merging Techniques Across Four Target Domains.</b>
The datasets Cityscapes, IDD, ACDC, and DarkZurich are represented by 'C', 'I', 'A', and 'D', respectively. The mIoU of each dataset and the harmonic mean (H) are reported.
</h2>
<br>
<div class="columns is-centered">
<img src="static/images/four_domains.jpg" width="60%" />
</div>
<br><br>
<h3 class="title is-4">Qualitative Results</h3>
<div class="columns is-centered">
<!-- <div class="column is-four-fifths"> -->
<!-- <div class="publication-video"> -->
<img src="static/images/VisualResults.jpg" width="70%" />
<!-- </div> -->
<!-- </div> -->
</div>
<!-- <img src="static/images/Teaser_cs1.jpg" width="100%" /> -->
<h2 class="has-text-left is-size-6">
<b>Visualization results for GTA to Cityscapes and IDD.</b>
(a) Test images from Cityscapes and IDD. We visualize results of (b) single-target domain adaptation (STDA) trained on the Cityscapes target, (c) STDA trained on the IDD target, and (d) our model merging method. (e) Ground-truth segmentation maps.
</h2>
</div>
</div>
</section>
<!-- End results section -->
<!--BibTex citation -->
<section class="section" id="BibTeX">
<div class="container is-max-desktop content">
<h2 class="title">BibTeX</h2>
If you find our work useful in your research, please consider citing:
<div style="margin: 0.5em;"></div>
<pre><code>@inproceedings{li2024training,
title={Training-Free Model Merging for Multi-target Domain Adaptation},
author={Li, Wenyi and Gao, Huan-ang and Gao, Mingju and Tian, Beiwen and Zhi, Rong and Zhao, Hao},
booktitle={European Conference on Computer Vision},
year={2024},
organization={Springer}
}</code></pre>
</div>
</section>
<!--End BibTex citation -->
<footer class="footer">
<div class="container">
<div class="columns is-centered">
<div class="column is-8">
<div class="content">
<p>
This page was built using the
<a href="https://github.com/eliahuhorwitz/Academic-project-page-template" target="_blank">Academic Project
Page Template</a>
which was adopted from the <a href="https://nerfies.github.io" target="_blank">Nerfies</a> project page.
You are free to borrow the source code of this website; we
just ask that you link back to this page in the footer. <br />
This website is licensed under a
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/" target="_blank">Creative Commons
Attribution-ShareAlike 4.0 International
License</a>.
</p>
</div>
</div>
</div>
</div>
</footer>
<!-- Statcounter tracking code -->
<!-- You can add a tracker to track page visits by creating an account at statcounter.com -->
<!-- End of Statcounter Code -->
<script src="https://unpkg.com/beerslider/dist/BeerSlider.js"></script>
<script>
// Initialize BeerSlider comparison widgets; guard against missing elements
// so the script does not throw on pages without slider markup.
for (let i = 1; i <= 10; i++) {
const el = document.getElementById('slider' + i);
if (el) new BeerSlider(el, { start: '40' });
}
</script>
</body>
</html>