Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces - Sorbonne Université
Conference Papers Year : 2021

Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces

Vincent Cohen-Addad
David Saulpic
Chris Schwiegelshohn

Abstract

In this paper, we consider the problem of finding high dimensional power means: given a set A of n points in R^d , find the point m that minimizes the sum of Euclidean distance, raised to the power z, over all input points. Special cases of problem include the well-known Fermat-Weber problem-or geometric median problem-where z = 1, the mean or centroid where z = 2, and the Minimum Enclosing Ball problem, where z = ∞. We consider these problem in the big data regime. Here, we are interested in sampling as few points as possible such that we can accurately estimate m. More specifically, we consider sublinear algorithms as well as coresets for these problems. Sublinear algorithms have a random query access to the set A and the goal is to minimize the number of queries. Here, we show that O ε −z−3 samples are sufficient to achieve a (1+ε)-approximation, generalizing the results from Cohen, Lee, Miller, Pachocki, and Sidford [STOC '16] and Inaba, Katoh, and Imai [SoCG '94] to arbitrary z. Moreover, we show that this bound is nearly optimal, as any algorithm requires at least Ω ε −z+1 queries to achieve said approximation. The second contribution are coresets for these problems, where we aim to find find a small, weighted subset of the points which approximates cost of every candidate point c ∈ R d up to a (1 ± ε) factor. Here, we show that O ε −2 points are sufficient, improving on the O dε −2 bound by Feldman and Langberg [STOC '11] and the O ε −4 bound by Braverman, Jiang, Krauthgamer, and Wu [SODA 21].
Fichier principal
Vignette du fichier
NeurIPS-2021-improved-coresets-and-sublinear-algorithms-for-power-means-in-euclidean-spaces-Paper.pdf (359.56 Ko) Télécharger le fichier
Origin Files produced by the author(s)

Dates and versions

hal-03944707 , version 1 (18-01-2023)

Identifiers

  • HAL Id : hal-03944707 , version 1

Cite

Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn. Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces. Advances in Neural Information Processing Systems, Dec 2021, Virtual, France. pp.21085--21098. ⟨hal-03944707⟩
39 View
11 Download

Share

More