DiversityPartitioner is missing the imbalance term #41
Comments
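Hello. Finding a data distribution that simultaneously satisfies the diversity ratio, the distribution of per-client data volume, and the dataset's attribute distribution is a fairly complex problem. Suppose the amounts of data held by clients form a matrix in which each element is the number of samples of a given class held by a given client. Working it through, finding this distribution is equivalent to solving a nonlinear integer programming problem, which requires calling a solver. I will find a suitable way to incorporate imbalance in a later update.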
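To make the constraints concrete, here is a minimal sketch of a feasibility check in plain Python. It is not FLGo code: the function `is_feasible` and the parameters `client_sizes`, `class_totals`, and `types_per_client` are hypothetical names chosen for illustration. It only verifies a candidate matrix; actually finding one is the integer program that would need a solver.

```python
from typing import List

def is_feasible(counts: List[List[int]],
                client_sizes: List[int],
                class_totals: List[int],
                types_per_client: int) -> bool:
    """Check a candidate client-by-class count matrix (hypothetical helper).

    counts[k][c] is the number of samples of class c held by client k.
    """
    # Every entry must be a non-negative count.
    if any(n < 0 for row in counts for n in row):
        return False
    # Imbalance: each client's total must match its target data volume.
    if [sum(row) for row in counts] != client_sizes:
        return False
    # Dataset attribute distribution: each class's total must match the dataset.
    if [sum(col) for col in zip(*counts)] != class_totals:
        return False
    # Diversity: each client holds exactly `types_per_client` distinct classes.
    # Counting nonzero entries is what makes the problem nonlinear.
    return all(sum(1 for n in row if n > 0) == types_per_client for row in counts)
```

For example, `is_feasible([[30, 10, 0], [0, 20, 40]], [40, 60], [30, 30, 40], 2)` holds, while moving the 10 samples of the second class into the first class for client 0 breaks both the class totals and the diversity constraint.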
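I would also like to ask: can the library currently support a fluctuating client set? That is, define the initial clients at initialization and then, when new clients need to join, allocate data to them and add them to the training process. If this is not currently possible, can I modify the code to achieve this?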
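Hello. 1) At present, datasets cannot be partitioned dynamically: once a task is generated, each client's data distribution is fixed, which also matches realistic settings fairly well. 2) I am not sure exactly what you mean by fluctuating clients. If you mean that some clients are invisible to the server early in training and join dynamically later, you can implement this in the Simulator by setting client activity, for example by setting, for each client, the round at which it becomes active, after which it stays active. The server accesses the active clients in each round through the available_clients attribute, so it cannot reach clients that have not yet joined. If you mean that the server actively adds new clients, I think you can leave the clients unchanged and achieve the same effect through the server's behavior, for example by interacting only with the first 10 clients in self.clients at the start and gradually enlarging the set of clients it interacts with.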
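The two options can be sketched in plain Python, independent of FLGo. The helper names `available_at_round` and `expanding_pool`, the parameters `start_rounds`, `initial`, and `growth`, and the convention that rounds are counted from 0 are all assumptions made for illustration; in FLGo the first idea would live in a Simulator and the second in the server's sampling logic.

```python
from typing import Dict, List

def available_at_round(start_rounds: Dict[int, int], current_round: int) -> List[int]:
    """Option 1: a client becomes active at its start round and stays active."""
    return sorted(cid for cid, start in start_rounds.items() if current_round >= start)

def expanding_pool(clients: List[int], current_round: int,
                   initial: int = 10, growth: int = 5) -> List[int]:
    """Option 2: the server only interacts with a growing prefix of its client list."""
    size = min(len(clients), initial + growth * current_round)
    return clients[:size]
```

With `start_rounds = {0: 0, 1: 0, 2: 3}`, client 2 is hidden from the server until round 3. With 30 clients and the defaults, the server talks to 10 clients in round 0, 20 in round 2, and all 30 from round 4 onward.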
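Set volatility means that the available clients may change over time and that new clients may join training. The ideal case I have in mind is to set up a batch of clients at initialization and then, when a new set of clients needs to join during training, allocate data to them independently and add them to the system. If there is no way to do this now, I will consider the workaround you described. You might also consider supporting this in the code; I think future federated learning work may well start from volatility.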
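Hello. The idea that the available clients may change over time is usually called Intermittent Client Availability in existing FL work; relevant papers include MIFA and F3AST. New clients joining training is called Flexible Participation in an ICML paper whose title I have forgotten. If you want to implement Client Availability, I think this tutorial can help: https://flgo-xmu.github.io/Tutorials/4_Simulator_Customization/4.1_Client_Availability/ . I also think it is feasible to simulate Flexible Participation from the Client Availability angle.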
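Hello. This variable controls whether a client's availability changes over time within a single aggregation round. Here is an example to explain it: simulator1 has roundwise_fixed_availability=True and simulator2 has it set to False (I abbreviate the variable as rfa below). I then slightly modified the fedavg algorithm so that, in each iterate, the server calls self.gv.clock.step(1) to force one time unit to pass. If rfa is True, client availability does not change with time until the next model aggregation; if rfa is False, client availability changes with time, regardless of the aggregation round.
import flgo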
import flgo.algorithm.fedavg as fedavg
import flgo.simulator.base as fsb
import random

class MySimulator1(fsb.BasicSimulator):
    def update_client_availability(self):
        if self.gv.clock.current_time==0:
            self.set_variable(self.all_clients, 'prob_available', [1 for _ in self.clients])
            self.set_variable(self.all_clients, 'prob_unavailable', [int(random.random() >= 0.5) for _ in self.clients])
            return
        pa = [0.1 for _ in self.clients]
        pua = [0.1 for _ in self.clients]
        self.set_variable(self.all_clients, 'prob_available', pa)
        self.set_variable(self.all_clients, 'prob_unavailable', pua)
        self.roundwise_fixed_availability = True

class MySimulator2(fsb.BasicSimulator):
    def update_client_availability(self):
        if self.gv.clock.current_time==0:
            self.set_variable(self.all_clients, 'prob_available', [1 for _ in self.clients])
            self.set_variable(self.all_clients, 'prob_unavailable', [int(random.random() >= 0.5) for _ in self.clients])
            return
        pa = [0.1 for _ in self.clients]
        pua = [0.1 for _ in self.clients]
        self.set_variable(self.all_clients, 'prob_available', pa)
        self.set_variable(self.all_clients, 'prob_unavailable', pua)
        self.roundwise_fixed_availability = False

class Server(fedavg.Server):
    def iterate(self):
        print("The number of currently available clients: {}".format(len(self.available_clients)))
        print("The availability of clients being selected at last round: {}".format([(cid in self.available_clients) for cid in self.selected_clients]))
        self.gv.clock.step(1)  # wait for one time unit
        print('After a time unit...')
        print("The number of currently available clients: {}".format(len(self.available_clients)))
        print("The availability of clients being selected at last round after a second: {}".format([(cid in self.available_clients) for cid in self.selected_clients]))
        self.selected_clients = self.sample()
        models = self.communicate(self.selected_clients)['model']
        self.model = self.aggregate(models)
        return True

class MyFedavg:
    Server = Server
    Client = fedavg.Client

if __name__=='__main__':
    task = 'my_task'
    flgo.init(task, MyFedavg, option={'gpu':0, 'num_steps':1, 'sample':'uniform_available', 'num_rounds':5}, Simulator=MySimulator1).run()
    flgo.init(task, MyFedavg, option={'gpu': 0, 'num_steps': 1, 'sample': 'uniform_available', 'num_rounds': 5}, Simulator=MySimulator2).run()
After running this code, the output on screen is:
# Simulator1: the rfa=True case
2023-09-19 11:00:46,006 fedbase.py run [line:253] INFO Eval Time Cost: 1.4580s
The number of currently available clients: 34
The availability of clients being selected at last round: [False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False]
After a time unit...
The number of currently available clients: 34
The availability of clients being selected at last round after a second: [False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False]
As shown, the clients' availability does not change after one time unit.
# Simulator2: the rfa=False case
The number of currently available clients: 33
The availability of clients being selected at last round: [False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False]
After a time unit...
The number of currently available clients: 35
The availability of clients being selected at last round after a second: [False, False, False, False, False, False, False, True, False, False, False, True, False, False, True, True, False, False, False, False]
As shown, the clients' availability was refreshed after one time unit.
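The difference between the two settings can also be seen in a toy model written in plain Python rather than against FLGo. `ToyAvailability`, `tick`, and `end_round` are hypothetical names, and the redraw rule (each client independently available with probability 0.5 per time unit) is an assumption for illustration only. The point is just when a change in availability becomes visible to the server.

```python
import random
from typing import List

class ToyAvailability:
    """Toy model of round-wise fixed vs. time-varying availability (not FLGo code)."""

    def __init__(self, num_clients: int, roundwise_fixed: bool, seed: int = 0):
        self._rng = random.Random(seed)
        self._n = num_clients
        self.roundwise_fixed = roundwise_fixed
        self._latent = self._redraw()         # state that evolves with time
        self._snapshot = list(self._latent)   # state frozen at the last aggregation

    def _redraw(self) -> List[bool]:
        return [self._rng.random() < 0.5 for _ in range(self._n)]

    def tick(self, units: int = 1) -> None:
        """Let time pass; the latent state changes once per time unit."""
        for _ in range(units):
            self._latent = self._redraw()

    def end_round(self) -> None:
        """An aggregation happened; refresh the frozen view."""
        self._snapshot = list(self._latent)

    @property
    def available(self) -> List[bool]:
        # rfa=True exposes the frozen view, rfa=False exposes the live one.
        return list(self._snapshot if self.roundwise_fixed else self._latent)
```

With `roundwise_fixed=True`, calling `tick(1)` leaves `available` unchanged until `end_round()` is called; with `roundwise_fixed=False`, `available` follows the clock directly.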
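Hello. This happens because the degree of non-IID is too extreme and the number of local epochs is too large, so the model converges very slowly or not at all. You need to reduce the local epochs (or set num_steps directly instead of num_epochs) or lower learning_rate before you can see the loss decrease steadily within a few dozen rounds. Alternatively, replace fedavg with an algorithm optimized for the non-IID problem.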
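As a rough illustration of why num_steps gives tighter control than num_epochs, the sketch below counts the local update steps a client performs per round. It assumes one epoch is a full pass over the client's data in mini-batches; the helper `local_steps` and the example numbers are hypothetical and do not reflect FLGo internals.

```python
import math

def local_steps(num_samples: int, batch_size: int, num_epochs: int) -> int:
    """Mini-batch updates performed in `num_epochs` full passes over the local data."""
    return num_epochs * math.ceil(num_samples / batch_size)
```

Under these assumptions, a client with 600 samples, batch size 64, and 5 local epochs takes `local_steps(600, 64, 5)`, that is 50 local steps per round, whereas num_steps=1 takes exactly one. The more local steps a client runs on non-IID data, the further its model can drift from the global one before aggregation.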
class DiversityPartitioner(BasicPartitioner):
    """`Partition the indices of samples in the original dataset according to numbers of types of a particular
    attribute (e.g. label) . This way of partition is widely used by existing works in federated learning.
It looks like the imbalance term is never used in the call.