Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces
Abstract
In this paper, we consider the problem of finding high-dimensional power means: given a set A of n points in R^d, find the point m that minimizes the sum of Euclidean distances, raised to the power z, over all input points. Special cases of this problem include the well-known Fermat-Weber problem, or geometric median problem, where z = 1; the mean or centroid, where z = 2; and the Minimum Enclosing Ball problem, where z = ∞. We consider these problems in the big-data regime, where we are interested in sampling as few points as possible while still estimating m accurately. More specifically, we consider sublinear algorithms as well as coresets for these problems. Sublinear algorithms have random query access to the set A, and the goal is to minimize the number of queries. Here, we show that O(ε^{-z-3}) samples are sufficient to achieve a (1+ε)-approximation, generalizing the results of Cohen, Lee, Miller, Pachocki, and Sidford [STOC '16] and Inaba, Katoh, and Imai [SoCG '94] to arbitrary z. Moreover, we show that this bound is nearly optimal, as any algorithm requires at least Ω(ε^{-z+1}) queries to achieve said approximation. Our second contribution is coresets for these problems, where we aim to find a small, weighted subset of the points that approximates the cost of every candidate point c ∈ R^d up to a (1 ± ε) factor. Here, we show that O(ε^{-2}) points are sufficient, improving on the O(dε^{-2}) bound of Feldman and Langberg [STOC '11] and the O(ε^{-4}) bound of Braverman, Jiang, Krauthgamer, and Wu [SODA '21].
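For concreteness, the objective discussed above can be written as cost_z(c) = Σ_{a ∈ A} ||a − c||^z. Below is a minimal Python sketch, not taken from the paper, of this cost function together with the naive uniform-sampling estimator that the sublinear-algorithm results concern; the function names, parameters, and sample size are illustrative assumptions only.

```python
import numpy as np

def power_mean_cost(A, c, z):
    """Exact cost of a candidate center c: sum over A of the Euclidean distance to c, raised to the power z."""
    return np.sum(np.linalg.norm(A - c, axis=1) ** z)

def sampled_cost_estimate(A, c, z, num_samples, rng=None):
    """Estimate the cost from a uniform sample of the points, rescaled to the size of the full set."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.integers(len(A), size=num_samples)
    sample_cost = np.sum(np.linalg.norm(A[idx] - c, axis=1) ** z)
    return len(A) / num_samples * sample_cost

# Illustrative check: for z = 2 the minimizer is the centroid,
# so we compare the exact and sampled costs at that point.
rng = np.random.default_rng(0)
A = rng.normal(size=(100_000, 20))
c = A.mean(axis=0)
print(power_mean_cost(A, c, z=2))
print(sampled_cost_estimate(A, c, z=2, num_samples=2_000, rng=rng))
```

The paper's results concern how many such samples (as a function of ε and z) suffice for the estimate, and the resulting minimizer, to be accurate up to a (1+ε) factor.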