Conference papers

Parallel dual tree traversal on multi-core and many-core architectures for astrophysical N-body simulations

Abstract: In astrophysical N-body simulations, Dehnen's algorithm, implemented in the serial falcON code and based on a dual tree traversal, is faster than serial Barnes-Hut tree-codes, but is outperformed by parallel CPU and GPU tree-codes. In this paper, we present a parallel dual tree traversal, implemented in the pfalcON code, targeting multi-core CPUs and many-core architectures (Xeon Phi). We focus on both performance and portability, while preserving Dehnen's original algorithm. We first use task parallelism, with either OpenMP or Intel TBB, for the dual tree traversal. We then rely on the SPMD (single-program, multiple-data) model for the SIMD vectorization of the near-field part, using the Intel SPMD Program Compiler. We compare the pfalcON performance to related work, and finally obtain performance results that match one of the best current tree-code implementations on GPU.
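To make the term "dual tree traversal" concrete, here is a minimal serial sketch in Python. It is not the pfalcON or falcON implementation: it assumes a 1D binary tree, unit masses, a monopole-only far-field approximation, and a simple opening-angle test, and every name in it (`Cell`, `traverse`, `run`, `direct_energy`, `theta`, `EPS`) is hypothetical. The point is only the control flow: pairs of cells are visited recursively, well-separated pairs are handled by one cell-cell interaction, and leaf-leaf pairs fall back to direct summation.

```python
import math
import random

EPS = 1e-2  # Plummer softening length (illustrative value, an assumption)


class Cell:
    """Node of a 1D binary tree over the interval [lo, hi)."""

    def __init__(self, xs, idx, lo, hi, leaf_size=4, depth=0):
        self.lo, self.hi, self.idx = lo, hi, idx
        self.com = sum(xs[i] for i in idx) / len(idx)  # centre of mass, unit masses
        self.children = []
        if len(idx) > leaf_size and depth < 40:  # depth guard against near-duplicates
            mid = 0.5 * (lo + hi)
            halves = (([i for i in idx if xs[i] < mid], lo, mid),
                      ([i for i in idx if xs[i] >= mid], mid, hi))
            for part, a, b in halves:
                if part:  # empty halves are dropped
                    self.children.append(Cell(xs, part, a, b, leaf_size, depth + 1))


def pair_energy(d):
    """Softened potential energy of two unit masses at separation d."""
    return -1.0 / math.sqrt(d * d + EPS * EPS)


def traverse(a, b, xs, theta, out):
    """Account for every body pair between cells a and b exactly once."""
    if a is b:  # self-interaction of one cell
        if a.children:
            for i, ci in enumerate(a.children):
                for cj in a.children[i:]:
                    traverse(ci, cj, xs, theta, out)
        else:  # leaf: direct sum over distinct pairs inside the leaf
            for k, i in enumerate(a.idx):
                for j in a.idx[k + 1:]:
                    out["energy"] += pair_energy(xs[i] - xs[j])
                    out["near"] += 1
        return
    ra, rb = 0.5 * (a.hi - a.lo), 0.5 * (b.hi - b.lo)
    dist = abs(0.5 * (a.lo + a.hi) - 0.5 * (b.lo + b.hi))
    npairs = len(a.idx) * len(b.idx)
    if theta * dist > ra + rb:  # well separated: one cell-cell (far-field) term
        out["energy"] += npairs * pair_energy(a.com - b.com)
        out["far"] += npairs
    elif not a.children and not b.children:  # two leaves: near-field direct sum
        for i in a.idx:
            for j in b.idx:
                out["energy"] += pair_energy(xs[i] - xs[j])
        out["near"] += npairs
    elif not b.children or (a.children and ra >= rb):  # split the larger cell
        for c in a.children:
            traverse(c, b, xs, theta, out)
    else:
        for c in b.children:
            traverse(a, c, xs, theta, out)


def run(xs, theta):
    """Build the tree over [0, 1) and traverse it against itself."""
    root = Cell(xs, list(range(len(xs))), 0.0, 1.0)
    out = {"energy": 0.0, "near": 0, "far": 0}
    traverse(root, root, xs, theta, out)
    return out


def direct_energy(xs):
    """Reference O(N^2) sum over all distinct pairs."""
    n = len(xs)
    return sum(pair_energy(xs[i] - xs[j]) for i in range(n) for j in range(i + 1, n))


if __name__ == "__main__":
    rng = random.Random(0)
    points = [rng.random() for _ in range(500)]
    print(run(points, 0.5), direct_energy(points))
```

With `theta = 0` no pair of cells is ever accepted as well separated, so the traversal degenerates to direct summation. In a parallel version along the lines the abstract describes, the recursive `traverse` calls are the natural candidates for tasks, and the leaf-leaf direct sums are the near-field part that would be vectorized; neither is attempted in this sketch.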

https://hal.sorbonne-universite.fr/hal-00947130
Contributor: Benoit Lange
Submitted on: Friday, May 30, 2014 - 2:34:38 PM
Last modification on: Saturday, December 4, 2021 - 4:05:37 AM
Long-term archiving on: Saturday, August 30, 2014 - 10:44:24 AM

File

RR_hal-00947130_V2.pdf
Files produced by the author(s)

Citation

Benoit Lange, Pierre Fortin. Parallel dual tree traversal on multi-core and many-core architectures for astrophysical N-body simulations. 20th International Conference Euro-Par 2014 Parallel Processing, Aug 2014, Porto, Portugal. pp.716-727, ⟨10.1007/978-3-319-09873-9_60⟩. ⟨hal-00947130v2⟩

Metrics

Record views: 1264
File downloads: 1003