ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 6. Vol. 30. 2024

DOI: 10.17587/it.30.318-328

V. A. Egunov, Cand. Sc., Assistant Professor, A. G. Kravets, Dr. Sc., Professor,
Volgograd State Technical University, Volgograd, Russian Federation

The New Method for Increasing the Efficiency of Vectorization of BLAS Operations

The problem of increasing the efficiency of software for computing architectures that support vector extensions of the instruction set is considered. Modern compilers can perform automatic vectorization of calculations, converting programs from a scalar representation to a vector implementation. The paper analyzes the effectiveness of automatic vectorization performed by modern compilers and discusses the problems inherent in this approach. A new algorithm for vectorization of calculations is proposed, which makes it possible to significantly increase the efficiency of the resulting software.
Keywords: program efficiency, vectorization, optimization, automatic vectorization, SSE, AVX

P. 318-328
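
As a point of reference for what converting a program from a scalar representation to a vector implementation looks like in practice, the following minimal C sketch shows a BLAS Level-1 AXPY operation (y := a*x + y) in scalar form and in a manually vectorized form using the AVX intrinsics mentioned in the keywords. This is an illustrative example only, not the algorithm proposed in the paper; the function names axpy_scalar and axpy_avx are hypothetical, and compilation with AVX support (e.g., -mavx with GCC or Clang) is assumed.

/* Illustrative sketch (not the authors' algorithm): scalar vs. manually
 * vectorized AXPY (y := a*x + y), a BLAS Level-1 operation.
 * Assumes an AVX-capable CPU and compilation with AVX enabled. */
#include <immintrin.h>
#include <stddef.h>

/* Scalar reference version: the compiler may or may not auto-vectorize it. */
void axpy_scalar(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Manually vectorized version: processes 8 floats per iteration with
 * 256-bit AVX registers, then handles the remaining elements scalarly. */
void axpy_avx(size_t n, float a, const float *x, float *y)
{
    __m256 va = _mm256_set1_ps(a);          /* broadcast a into all 8 lanes */
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 vx = _mm256_loadu_ps(x + i); /* load 8 elements of x         */
        __m256 vy = _mm256_loadu_ps(y + i); /* load 8 elements of y         */
        vy = _mm256_add_ps(_mm256_mul_ps(va, vx), vy);
        _mm256_storeu_ps(y + i, vy);        /* store 8 updated elements     */
    }
    for (; i < n; i++)                      /* scalar tail loop             */
        y[i] = a * x[i] + y[i];
}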

