Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 2, 2026

DOI: 10.17587/prin.17.59-66
Dynamic Analysis of Automatic Vectorization and Runtime Memory Disambiguation in the LCC Compiler for the "Elbrus" Architecture
D. N. Levchenko [1, 2], Software Engineer, Postgraduate Student, levchenko_d@mcst.ru
A. V. Ermolitsky [1], Head of the Language Compiler division, era@mcst.ru
M. I. Neiman-Zade [1, 2], Head of the Department of Programming Systems, Associate Professor at Department of Informatics and Computer Engineering, muradnz@mcst.ru
[1] AO "MCST", Moscow, 117437, Russian Federation
[2] Moscow Institute of Physics and Technology (National Research University), Moscow, 117303, Russian Federation
Corresponding author: Dmitry N. Levchenko, Software Engineer, AO "MCST", Moscow, 117437, Russian Federation, E-mail: levchenko_d@mcst.ru
Received on September 19, 2025
Accepted on October 21, 2025

The article addresses the problem of evaluating the efficiency of two compiler optimizations, autovectorization and runtime memory disambiguation (RTMD), in the LCC compiler for the Elbrus VLIW architecture. These optimizations are crucial for the performance of applications that have loops with floating-point computations. However, they often introduce additional overhead, and in single-pass compilation without dynamic information the compiler's heuristics may mispredict the execution profile. As a result, optimizations intended to accelerate program execution can instead cause noticeable slowdowns. To mitigate this issue, the authors propose a combined method for detecting such inefficiencies: loops are instrumented during compilation, and a runtime support library gathers execution statistics. The tool collects data on loop executions, analyzes whether vectorization and RTMD were effective, and generates detailed reports with recommendations for software developers. These recommendations include the targeted insertion of pragmas, which allow programmers to disable unprofitable optimizations at critical points. Compared to traditional profile-guided optimization, this approach avoids problems such as cumulative errors in profiling information and the need for repeated recompilation. Experimental results on the SPEC CPU 2006 and SPEC CPU 2017 rate benchmark suites showed measurable improvements, with some benchmarks achieving speedups of up to 7.4%. The method has been integrated into the LCC compiler and will be available in version 1.30, offering developers a practical tool for balancing compiler heuristics against runtime performance characteristics on the Elbrus platform.
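As a rough illustration of the two mechanisms named in the abstract, the C sketch below shows a hand-written analogue of runtime memory disambiguation (an address-overlap check that selects between an alias-free loop version and a conservative fallback) together with simple per-loop counters of the kind an instrumentation pass might maintain. It is not LCC's generated code or its runtime support library. The names `loop_stats`, `saxpy_stats`, `ranges_disjoint`, `saxpy`, and `check_looks_unprofitable`, as well as the trip-count heuristic, are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-loop statistics record (not LCC's actual data layout). */
typedef struct {
    unsigned long fast_path_entries; /* overlap check passed, alias-free version ran */
    unsigned long slow_path_entries; /* overlap check failed, fallback version ran */
    unsigned long total_iterations;  /* iterations summed over all loop entries */
} loop_stats;

static loop_stats saxpy_stats;

/* Returns 1 when the n-element float ranges starting at a and b do not overlap. */
static int ranges_disjoint(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* dst[i] += k * src[i], with a runtime check deciding which version runs. */
void saxpy(float *dst, const float *src, float k, size_t n)
{
    saxpy_stats.total_iterations += n;
    if (ranges_disjoint(dst, src, n)) {
        /* No aliasing: restrict tells the compiler it may vectorize freely. */
        float *restrict d = dst;
        const float *restrict s = src;
        saxpy_stats.fast_path_entries++;
        for (size_t i = 0; i < n; i++)
            d[i] += k * s[i];
    } else {
        /* Possible aliasing: keep the original sequential semantics. */
        saxpy_stats.slow_path_entries++;
        for (size_t i = 0; i < n; i++)
            dst[i] += k * src[i];
    }
}

/* Assumed heuristic: when the average trip count is below min_trip, the cost
   of the runtime check is unlikely to be amortized by the faster loop body. */
int check_looks_unprofitable(const loop_stats *s, unsigned long min_trip)
{
    assert(s != NULL);
    unsigned long entries = s->fast_path_entries + s->slow_path_entries;
    if (entries == 0)
        return 0;
    return s->total_iterations / entries < min_trip;
}
```

Calling `saxpy` on two separate arrays takes the alias-free path, while a call such as `saxpy(c + 1, c, k, n)` falls back to the conservative loop. A report built from such counters could flag loops where `check_looks_unprofitable` holds as candidates for a pragma that disables the optimization, which is the kind of recommendation the abstract describes.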

Keywords: compilers, loop optimizations, autovectorization, runtime memory disambiguation (RTMD), loop instrumentation, Elbrus architecture, VLIW, profile information
pp. 59-66
For citation:
Levchenko D. N., Ermolitsky A. V., Neiman-Zade M. I. Dynamic Analysis of Automatic Vectorization and Runtime Memory Disambiguation in the LCC Compiler for the "Elbrus" Architecture, Programmnaya Ingeneria, 2026, vol. 17, no. 2, pp. 59-66. DOI: 10.17587/prin.17.59-66. (in Russian).