A Massively Parallel Modularity-Maximizing Algorithm with Provable Guarantees
Résumé
Graph clustering is one of the most basic and popular unsupervised learning problems. Among the different formulations of the problem, the modularity objective has been particularly successful in helping design impactful algorithms; Most notably, the Louvain algorithm has become one of the most used algorithm for clustering graphs. However, one major limitation of the Louvain algorithm is its sequential nature which makes it impractical in distributed environments and on massive datasets. In this paper, we provide a parallel version of Louvain which works in the massively parallel computation model (MPC). We show that it recovers the ground-truth clusters in the classic stochastic block model in only a constant number of parallel rounds, and so for a wider regime of parameters than the standard Louvain algorithm as shown recently in [Cohen-Addad et al. 2020].