Dual tree traversal on integrated GPUs for astrophysical N-body simulations - Sorbonne Université
Journal article. International Journal of High Performance Computing Applications. Year: 2019

Dual tree traversal on integrated GPUs for astrophysical N-body simulations

Abstract

In astrophysical N-body simulations, O(N) fast multipole methods (FMMs) with dual tree traversal (DTT) on multi-core CPUs are faster than O(N log N) CPU tree-codes but can still be outperformed by GPU ones. In this paper, we aim to combine the best algorithm, namely FMM with DTT, with the most powerful hardware currently available, namely GPUs. In the astrophysical context requiring low accuracies and non-uniform particle distributions, we show that such a combination can be achieved thanks to a hybrid CPU-GPU algorithm on integrated GPUs: while the DTT is performed on the CPU cores, the far- and near-field computations are all performed on the GPU cores. We show how to efficiently expose the interactions resulting from the DTT to the GPU cores, how to deploy both the far- and near-field computations on GPU, and how to overlap the parallel DTT on CPU with GPU computations. Based on the falcON code and using OpenCL on AMD Accelerated Processing Units and on Intel integrated GPUs, this first heterogeneous deployment of DTT for FMM outperforms standard multi-core CPUs, and matches GPU and high-end CPU performance, hence being more cost- and power-efficient.
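As a rough illustration of the traversal the abstract refers to, the following minimal Python sketch walks two copies of a tree simultaneously and sorts cell pairs into a far-field list and a near-field list. It is not the falcON implementation or the paper's algorithm: the 1D binary tree, the names `Cell`, `build`, `dual_tree_traversal`, `theta` and `leaf_size`, and the simple opening criterion are all assumptions made for illustration. The two output lists stand in for the interactions that a hybrid scheme would hand over to the GPU cores.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Hypothetical tree cell over 1D points (illustration only)."""
    center: float
    radius: float
    points: list
    children: list = field(default_factory=list)


def build(points, leaf_size=2):
    """Recursively split the points in two halves until cells are small."""
    lo, hi = min(points), max(points)
    cell = Cell(center=(lo + hi) / 2, radius=(hi - lo) / 2, points=points)
    if len(points) > leaf_size:
        pts = sorted(points)
        mid = len(pts) // 2
        cell.children = [build(pts[:mid], leaf_size),
                         build(pts[mid:], leaf_size)]
    return cell


def dual_tree_traversal(a, b, theta, far, near):
    """Classify the (target a, source b) pair, or split one cell and recurse."""
    dist = abs(a.center - b.center)
    if a.radius + b.radius < theta * dist:
        # Well separated (assumed criterion): one cell-cell far-field interaction.
        # A cell paired with itself has dist == 0 and never passes this test.
        far.append((a, b))
    elif not a.children and not b.children:
        # Two leaves that are too close: direct particle-particle near field.
        near.append((a, b))
    elif not b.children or (a.children and a.radius >= b.radius):
        # Split a: either b is a leaf, or a is the larger of the two cells.
        for child in a.children:
            dual_tree_traversal(child, b, theta, far, near)
    else:
        # Otherwise split b, which is guaranteed to have children here.
        for child in b.children:
            dual_tree_traversal(a, child, theta, far, near)


if __name__ == "__main__":
    pts = [0.0, 0.1, 0.2, 0.35, 5.0, 5.1, 5.3, 9.0, 9.2, 9.9]
    root = build(pts)
    far, near = [], []
    dual_tree_traversal(root, root, 0.5, far, near)
    print(len(far), "far-field pairs,", len(near), "near-field pairs")
```

Starting from the pair (root, root), every ordered pair of particles ends up in exactly one far-field or near-field entry, which is what makes the two lists a complete description of the work left to do after the traversal.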
Main file
article-HAL.pdf (751.7 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02073710, version 1 (23-07-2020)

Cite

Pierre Fortin, Maxime Touche. Dual tree traversal on integrated GPUs for astrophysical N-body simulations. International Journal of High Performance Computing Applications, 2019, 33 (5), pp.960-972. ⟨10.1177/1094342019840806⟩. ⟨hal-02073710⟩
