Journal "Software Engineering"
A journal of theoretical and applied science and technology
ISSN 2220-3397

Issue No. 1, 2020

DOI: 10.17587/prin.11.14-20
Modern Parallel Programming Tools in a Distributed Memory Model
M. B. Kuzminsky, kus@free.net, Zelinsky Institute of Organic Chemistry RAS, Moscow, 119991, Russian Federation; A. M. Chernetsov, chernetsovam@mpei.ru, National Research University Moscow Power Engineering Institute, Moscow, 111250, Russian Federation, and an@ccas.ru, Dorodnicyn Computing Centre CSC RAS, Moscow, 119333, Russian Federation
Corresponding author: Kuzminsky Mikhail B., Ph. D., Senior Researcher, Zelinsky Institute of Organic Chemistry RAS, Moscow, 119991, Russian Federation, E-mail: kus@free.net
Received on September 02, 2019
Accepted on October 21, 2019

The paper provides an overview of implementations of the MPI parallelization tools, focused primarily on the field of high-performance computing (HPC). The low-level communication software used in various MPI implementations (uDAPL, OFED, OFI, etc.), which significantly affects the achieved performance, is considered. For the MPI implementations most widely used in HPC (OpenMPI, MVAPICH2, Intel MPI), the performance achieved by their modern versions is analyzed on high-speed interconnects with remote direct memory access (RDMA), namely 100 Gbit Ethernet (RoCE/iWARP), Intel Omni-Path, and InfiniBand EDR, connecting the most widely used computing nodes based on x86-64 processors. The performance characteristics considered include achieved throughput and message transmission latency, including data obtained with the widely used MPI microbenchmarks OMB (Ohio State University Micro-Benchmarks) and IMB (Intel MPI Benchmarks), taking into account their dependence on the number of messages and their sizes. These latency and throughput data also characterize the actual performance of the interconnect hardware itself. Particular attention is paid to one-sided RMA communications, supported since MPI 2.0, which help to increase performance and enable promising PGAS parallelization models. By contrast, the widespread application-level benchmark suite SPEC MPI 2007, whose results depend on a huge number of hardware and software parameters not related solely to interconnects and parallelization tools, is less relevant for HPC and is not analyzed in the paper.
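As an illustration of the one-sided RMA communications emphasized in the review, the sketch below times fence-synchronized MPI_Put operations between two ranks, loosely following the idea of the OMB put-latency test. It is not code from the paper; the window size, iteration count, and fence-based synchronization are illustrative assumptions (production microbenchmarks such as OMB typically use passive-target MPI_Win_lock/MPI_Win_flush synchronization instead).

#include <mpi.h>
#include <stdio.h>

/* A minimal sketch of a one-sided (RMA) latency measurement: rank 0
 * repeatedly issues MPI_Put into a window exposed by rank 1 and times
 * the fence-synchronized exchanges. Sizes and counts are illustrative. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);
    if (nranks < 2) {
        if (rank == 0) fprintf(stderr, "needs at least 2 ranks\n");
        MPI_Finalize();
        return 1;
    }

    const int SIZE = 8;          /* message size in bytes (illustrative) */
    const int ITER = 1000;       /* iteration count (illustrative) */
    char src[8] = {0};
    char *win_buf;
    MPI_Win win;

    /* Every rank exposes SIZE bytes; other ranks may write into this
     * memory directly, with no matching receive on the target. */
    MPI_Win_allocate(SIZE, 1, MPI_INFO_NULL, MPI_COMM_WORLD,
                     &win_buf, &win);

    MPI_Win_fence(0, win);
    double t0 = MPI_Wtime();
    for (int i = 0; i < ITER; i++) {
        if (rank == 0)
            MPI_Put(src, SIZE, MPI_BYTE, 1 /* target rank */,
                    0 /* target displacement */, SIZE, MPI_BYTE, win);
        MPI_Win_fence(0, win);   /* completes the epoch on all ranks */
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("avg fence-synchronized put: %.2f us\n",
               (t1 - t0) / ITER * 1e6);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

A typical run would be "mpicc put_latency.c -o put_latency" followed by "mpirun -np 2 ./put_latency". Note that MPI_Win_fence adds collective synchronization overhead to every iteration, so the reported time bounds rather than equals the raw hardware put latency that dedicated tests isolate.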

Keywords: parallel programming tools, MPI, OpenMPI, MVAPICH2, Intel MPI, performance testing
pp. 14–20
For citation:
Kuzminsky M. B., Chernetsov A. M. Modern Parallel Programming Tools in a Distributed Memory Model, Programmnaya Ingeneria, 2020, vol. 11, no. 1, pp. 14–20.