- 刘子宸师兄讲解了提示学习的基础概念,并且确定了目前论文idea需要有哪些方面需要进一步跟进讨论:对于基于GNN的提示学习,图的节点的增加、修改、删除,前两个有一定效果,但是对于第三点目前不确定具体实现的方法,之前有讨论使用掩码的方式,但是具体效果有待进行实验和讨论。
- 包诗峰需要vig模型在视觉提示学习的benchmark上跑通,跑出全量微调和只调分类头的结果。并且在这个过程中熟悉vig和视觉提示学习的代码,以便下一步实现我们的方法。目前已经阅读了ViG的论文,并且进一步阅读提示学习领域相关文章。
-
刘子宸师兄
- ![image-20241014091217668](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241014091217668.png)
-
徐坤仑
- ![image-20241014092506194](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241014092506194.png)
-
艾子翔
- ![image-20241014095254318](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241014095254318.png)
-
- ![image-20241014100502582](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241014100502582.png)
-
梁俊杰
- ![image-20241014104131426](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241014104131426.png)
-
存疑的一些地方
- prompt learning部分参数冻结怎么做到反向传播更新的?
- ViG迁移到提示学习上,要解决的问题是什么?
-
阅读DAM-VP
- 然而,我们认为这些方法忽略了图像数据集的多样性分布特性,为每个数据集中的所有图像使用单一提示并不是最佳选择。在图1中,我们展示了提示收益与数据集多样性之间的关系。这里的收益代表了提示方法相比线性探针设置的精度提升。我们发现,**VP和VPT方法在低多样性数据集上大幅提高了模型精度,但在高多样性数据集上的增益相对较小,这符合直觉。**对于低多样性数据集,例如街景门牌号码(SVHN)数据集,所有图像的内容相似,因此统一的提示就足够了。相反,对于高多样性数据集,如ImageNet数据集,类别间没有预定义的关系,因此很难使用单一提示为所有图像提供合适的先验知识。
- Motivated by this observation, we propose our Diversity- Aware Meta Visual Prompting (DAM-VP). It has two core designs. Firstly, to provide a proper prompt for each image from high-diversity datasets, we propose a clustering-based prompt selection method. In detail, given a pre-trained vi- sual model and a downstream dataset, we use the off-the- shelf clustering method to cluster the feature of the down- stream data into several coarse-grained subsets, and guide each cluster to learn its own prompt separately. Based on the strong homogeneity of the same clustered data, the opti- mization of cluster-specific visual prompts can be greatly facilitated and the data commonalities can be also easily covered. Secondly, we argue the prompt across different clusters or datasets may have some shared pattern, from which the model can be adapted to a new dataset faster and get better performance. This motivates us to introduce a meta-learning-based method that learns a meta prompt and initializes the prompt of each cluster with it. 基于这一观察,我们提出了多样性感知元视觉提示(DAM-VP)。它有两个核心设计。首先,针对高多样性数据集中的每张图像提供合适的提示,我们提出了一种基于聚类的提示选择方法。具体来说,给定一个预训练视觉模型和下游数据集,我们使用现成的聚类方法将下游数据的特征聚类成几个粗粒度子集,并引导每个子集分别学习其特定的提示。基于相同聚类数据的强同质性,聚类特定提示的优化可以很容易地完成,并且数据的共性也能轻松捕获。其次,我们认为跨不同聚类或数据集的提示可能具有一些共享的模式,从而使模型能够更快地适应新的数据集并获得更好的性能。这促使我们引入了一种基于元学习的方法,它通过学习元提示来初始化每个聚类的提示
- 刘子宸师兄
- ![image-20241021091152324](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241021091152324.png)
- ![image-20241021111107005](/Users/baoshifeng/Library/Application Support/typora-user-images/image-20241021111107005.png)