Towards a Federated Learning Framework on a Multi-Cloud Environment
Abstract
This paper proposes Multi-FedLS, a cross-silo Federated Learning (FL) framework for multi-cloud environments that aims to minimize both financial cost and execution time. It comprises four modules: Pre-Scheduling, Initial Mapping, Fault Tolerance, and Dynamic Scheduler. Given an application and a multi-cloud environment, the Pre-Scheduling module runs experiments to obtain the expected execution times of the FL tasks and the communication delays. The Initial Mapping module receives these measured values and produces a scheduling map for the server and client VMs. Multi-FedLS then deploys the selected VMs, starts the FL application, and monitors it. The Fault Tolerance (FT) module adds fault tolerance strategies, such as checkpointing and replication, to the FL application and detects some anomalous behaviors. If the communication delay increases unexpectedly or a VM fails, the FT module triggers the Dynamic Scheduler module, which selects a new VM and resumes the affected tasks of the FL application. Preliminary experiments are presented, confirming that some of the proposed strategies are crucial to executing an FL application efficiently in a multi-cloud environment.
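To make the flow between the Initial Mapping and Dynamic Scheduler modules concrete, the following is a minimal sketch under stated assumptions, not the formulation used by Multi-FedLS. The `VM` record, the `score` function, the trade-off weight `alpha`, and the function names are all hypothetical. The weighted sum is a crude stand-in for the cost/time objective, since it mixes seconds with monetary cost without normalization.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    """Hypothetical VM candidate, described by values Pre-Scheduling would measure."""
    name: str
    exec_time_s: float   # expected execution time of the FL task on this VM
    price_per_s: float   # assumed price per second of this VM


def score(vm: VM, alpha: float) -> float:
    """Assumed objective: weighted sum of execution time and financial cost."""
    cost = vm.exec_time_s * vm.price_per_s
    return alpha * vm.exec_time_s + (1.0 - alpha) * cost


def initial_mapping(candidates: dict[str, list[VM]], alpha: float = 0.5) -> dict[str, VM]:
    """For each task (server or client), pick the candidate VM with the lowest score."""
    return {
        task: min(vms, key=lambda vm: score(vm, alpha))
        for task, vms in candidates.items()
    }


def reschedule(task: str, mapping: dict[str, VM], candidates: dict[str, list[VM]],
               failed: VM, alpha: float = 0.5) -> dict[str, VM]:
    """Stand-in for the Dynamic Scheduler: replace a failed or slow VM for one task."""
    remaining = [vm for vm in candidates[task] if vm != failed]
    if not remaining:
        raise RuntimeError(f"no alternative VM available for {task}")
    new_mapping = dict(mapping)  # leave the original scheduling map untouched
    new_mapping[task] = min(remaining, key=lambda vm: score(vm, alpha))
    return new_mapping


# Illustrative values only; they are not measurements from the paper.
candidates = {"client_0": [VM("vm_a", 100.0, 0.01), VM("vm_b", 60.0, 0.05)]}
mapping = initial_mapping(candidates, alpha=0.5)
recovered = reschedule("client_0", mapping, candidates, failed=mapping["client_0"])
```

In this sketch, the Fault Tolerance module would call `reschedule` when it detects a VM failure or an unexpected rise in communication delay; restoring the task state from a checkpoint is left out.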
Origin: Files produced by the author(s)