ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 6. Vol. 30. 2024
DOI: 10.17587/it.30.318-328
V. A. Egunov, Cand. Sc., Assistant Professor, A. G. Kravets, Dr. Sc., Professor,
Volgograd State Technical University, Volgograd, Russian Federation
The New Method for Increasing the Efficiency of Vectorization of BLAS Operations
The paper addresses the problem of improving software efficiency on computing architectures whose instruction sets include vector extensions. Modern compilers can vectorize computations automatically, converting a program from a scalar representation to a vector implementation. The paper analyzes the effectiveness of automatic vectorization in modern compilers and discusses its inherent problems. A new algorithm for vectorizing computations is proposed that significantly increases the efficiency of the resulting software.
Keywords: program efficiency, vectorization, optimization, automatic vectorization, SSE, AVX
P. 318-328
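To make the distinction between a scalar representation and a vector implementation concrete, the sketch below shows a BLAS level-1 AXPY operation (y = alpha * x + y) written twice in C: once as a plain scalar loop that a compiler may or may not auto-vectorize, and once vectorized by hand with SSE intrinsics. This is only an illustration of the general idea, not the algorithm proposed in the paper; the function names saxpy_scalar and saxpy_sse are hypothetical, and the code assumes an x86 or x86-64 target.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86/x86-64 target */

/* Scalar reference: y[i] = alpha * x[i] + y[i] for i in [0, n). */
void saxpy_scalar(size_t n, float alpha, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}

/* Hand-vectorized variant (illustrative only): 4 floats per iteration. */
void saxpy_sse(size_t n, float alpha, const float *x, float *y)
{
    const __m128 va = _mm_set1_ps(alpha);  /* broadcast alpha to all 4 lanes */
    size_t i = 0;

    /* Main loop: unaligned loads/stores, so x and y need no special alignment. */
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(x + i);
        __m128 vy = _mm_loadu_ps(y + i);
        vy = _mm_add_ps(_mm_mul_ps(va, vx), vy);
        _mm_storeu_ps(y + i, vy);
    }

    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}
```

Both functions compute the same result; the second makes explicit the choices (vector width, unaligned access, tail handling) that an auto-vectorizer must otherwise infer on its own.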