Subsampling under distributional constraints
Abstract
Complex models are frequently employed to describe physical and mechanical phenomena. In this setting, we have an input X in a general space and an output Y = f(X), where f is a very complicated function whose computational cost for every new input is very high. We are given two sets of observations of X, S_1 and S_2, of different sizes, such that only f(S_1) is available. We tackle the problem of selecting a subsample S_3 ⊂ S_2 of smaller size on which to run the complex model f, such that the distribution of f(S_3) is close to that of f(S_1). We suggest three algorithms to solve this problem and show their efficiency using simulated data sets and the Airfoil self-noise data set.
Origin: Files produced by the author(s)