A new coreset framework for clustering - Sorbonne Université
Communication Dans Un Congrès Année : 2021

A new coreset framework for clustering

Résumé

Given a metric space, the (k, z)-clustering problem consists of finding k centers such that the sum of the of distances raised to the power z of every point to its closest center is minimized. This encapsulates the famous k-median (z = 1) and k-means (z = 2) clustering problems. Designing small-space sketches of the data that approximately preserves the cost of the solutions, also known as coresets, has been an important research direction over the last 15 years. In this paper, we present a new, simple coreset framework that simultaneously improves upon the best known bounds for a large variety of settings, ranging from Euclidean space, doubling metric, minor-free metric, and the general metric cases: with Γ = min(ε −2 +ε −z , kε −2)polylog(ε −1), this framework constructs coreset with size • O (Γ • k(d + log k)) in doubling metrics, improving upon the recent breakthrough of [Huang, Jiang, Li, Wu, FOCS' 18], who presented a coreset with size O(k 3 d/ε 2). • O(Γ•k•min(d, ε −2 log k)) in d-dimensional Euclidean space, improving on the recent results of [Huang, Vishnoi, STOC' 20], who presented a coreset of size O(k log k • ε −2z • min(d, ε −2 log k)). • O(Γ • k(t + log k)) for graphs with treewidth t, improving on [Baker, Braverman, Huang, Jiang, Krauthgamer, Wu, ICML'20], who presented a coreset of size O(k 2 t/ε 2) for z = 1. • O Γ • k log 2 k + log k ε 4
Fichier principal
Vignette du fichier
main.pdf (689.67 Ko) Télécharger le fichier
Origine Fichiers produits par l'(les) auteur(s)

Dates et versions

hal-03505350 , version 1 (30-12-2021)

Identifiants

Citer

Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn. A new coreset framework for clustering. STOC '21: 53rd Annual ACM SIGACT Symposium on Theory of Computing, Jun 2021, Rome ( Virtual ), Italy. pp.169-182, ⟨10.1145/3406325.3451022⟩. ⟨hal-03505350⟩
89 Consultations
53 Téléchargements

Altmetric

Partager

More