Theses

Sliced-Wasserstein distance for large-scale machine learning : theory, methodology and extensions

Abstract: Many methods for statistical inference and generative modeling rely on a probability divergence to effectively compare two probability distributions. The Wasserstein distance, which emerges from optimal transport, has been an attractive choice, but suffers from computational and statistical limitations in large-scale settings. Several alternatives have therefore been proposed, including the Sliced-Wasserstein distance (SW), a metric that has been increasingly used in practice thanks to its computational benefits. However, little is known about its theoretical properties. This thesis further explores the use of SW in modern statistical and machine learning problems, with a twofold objective: 1) provide new theoretical insights to understand in depth SW-based algorithms, and 2) design novel tools inspired by SW to improve its applicability and scalability. We first prove a set of asymptotic properties of the estimators obtained by minimizing SW, as well as a central limit theorem whose convergence rate is dimension-free. We also design a novel likelihood-free approximate inference method based on SW, which is theoretically grounded and scales well with the data size and dimension. Given that SW is commonly estimated with a simple Monte Carlo scheme, we then propose two approaches to alleviate the inefficiencies caused by the induced approximation error: on the one hand, we extend the definition of SW to introduce the Generalized Sliced-Wasserstein distances and illustrate their advantages in generative modeling applications; on the other hand, we leverage concentration-of-measure results to formulate a new deterministic approximation of SW, which is computationally more efficient than the usual Monte Carlo technique and comes with nonasymptotic guarantees under a weak dependence condition. Finally, we define the general class of sliced probability divergences and investigate their topological and statistical properties; in particular, we establish that the sample complexity of any sliced divergence does not depend on the problem dimension.
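For readers unfamiliar with the "simple Monte Carlo scheme" mentioned in the abstract, the sketch below illustrates how SW is typically estimated in practice: random projection directions are drawn uniformly on the unit sphere, both samples are projected onto each direction, and the one-dimensional Wasserstein distances between the projections are averaged. This is a minimal illustration under stated assumptions (NumPy, equal sample sizes, order-p distance), not the implementation from the thesis; the function name `sliced_wasserstein` is hypothetical.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, p=2, seed=None):
    """Monte Carlo estimate of the order-p Sliced-Wasserstein distance between
    two empirical distributions given by samples X and Y of shape (n, d).
    Assumes X and Y contain the same number of points (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions uniformly on the unit sphere S^{d-1}.
    theta = rng.standard_normal((n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples onto each direction (one-dimensional pushforwards).
    X_proj = X @ theta.T  # shape (n, n_projections)
    Y_proj = Y @ theta.T
    # In one dimension, the Wasserstein distance between two empirical measures
    # with the same number of points compares their sorted samples.
    X_sorted = np.sort(X_proj, axis=0)
    Y_sorted = np.sort(Y_proj, axis=0)
    w_p = np.mean(np.abs(X_sorted - Y_sorted) ** p, axis=0)  # W_p^p per projection
    # Average over projections (Monte Carlo step), then take the p-th root.
    return np.mean(w_p) ** (1.0 / p)

# Example: two Gaussian samples in dimension 50, shifted by one unit.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 50))
Y = rng.standard_normal((500, 50)) + 1.0
print(sliced_wasserstein(X, Y, n_projections=200))
```

The approximation error discussed in the abstract comes from replacing the expectation over all directions by the average over `n_projections` random draws; the Generalized Sliced-Wasserstein distances and the deterministic approximation proposed in the thesis are two ways to mitigate this error.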

https://tel.archives-ouvertes.fr/tel-03533097
Contributor: Abes Star
Submitted on: Tuesday, January 18, 2022 - 4:23:17 PM
Last modification on: Wednesday, January 19, 2022 - 3:05:23 AM

File

106842_NADJAHI_2021_archivage....
Version validated by the jury (STAR)

Identifiers

  • HAL Id: tel-03533097, version 1


Citation

Kimia Nadjahi. Sliced-Wasserstein distance for large-scale machine learning : theory, methodology and extensions. Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT050⟩. ⟨tel-03533097⟩

Metrics

Record views: 190
File downloads: 155