Informacionnye Tehnologii, 2025, vol. 31, no. 12, pp. 649-658


ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 12. Vol. 31. 2025

DOI: 10.17587/it.31.649-658

S. G. Bobkov, PhD, Professor, Deputy Head of the Department, D. N. Zmejev, Researcher, A. V. Klimov, Senior Researcher, N. N. Levchenko, PhD, Leading Researcher, Department for Design Problems in Microelectronics of the Center for Advanced Microelectronics, NRC "Kurchatov institute", Moscow, Russian Federation

Optimization of the Matching Process in Computing Systems Implementing a Dataflow Computing Model with a Dynamically Formed Context

Received on 31.05.2025
Accepted on 27.06.2025

The paper considers the drop in real performance of supercomputer systems as the number of computing cores increases. The results of the top three systems in the TOP500 list are analyzed. The main approaches to increasing real performance are outlined.
A dataflow computing model with a dynamically formed context is considered, and the architecture of a parallel dataflow computing system implementing this model is described. Together, the model and the architecture represent one approach to improving the real performance of computing systems. The operating principles of hardware ternary content-addressable memory, which implements the dataflow computing model most efficiently, are described. This efficiency follows from a core concept of the model, the token workspace, which requires an incoming token to be compared simultaneously with all tokens present in that space. A key problem of content-addressable memory is analyzed: high power consumption during matching operations. Methods for optimizing the matching process are proposed and divided into three groups: hardware, software, and hardware-software. The methods aim to reduce the overall number of "parasitic" comparisons as well as the number of compared bits. The effectiveness of each method is analyzed. One effective way to reduce the number of matching operations in the key content-addressable memory of the matching processor of the parallel dataflow computing system is to use special "Double grouped" tokens. These tokens not only reduce the total number of task tokens but also lower the load on the communication network, reduce the number of comparisons, and free up execution units by transferring part of the load to the matching processor.
Results obtained by running various tasks on the behavioral block-register model of the system and on the emulator are presented. They demonstrate the effectiveness of the proposed methods.
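To make the idea of ternary matching and "parasitic" comparisons more concrete, the following is a minimal software sketch and not the authors' hardware design. It assumes a toy model in which each stored entry is a (value, care mask) pair and an incoming key matches when all "care" bits agree. The class `ToyTernaryCAM`, the `bank_bits` parameter, and the banking scheme itself are hypothetical illustrations of one generic way to skip comparisons; the abstract does not specify the methods at this level of detail.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTernaryCAM:
    """Toy software model of a ternary CAM (hypothetical, for illustration)."""
    bank_bits: int = 0              # low key bits used to select a bank (0 = flat)
    entries: list = field(default_factory=list)   # (value, care_mask) pairs
    comparisons: int = 0            # entry-level comparisons performed so far

    def _bank(self, key: int) -> int:
        # Assumption: the lowest `bank_bits` bits of a key select its bank.
        return key & ((1 << self.bank_bits) - 1)

    def store(self, value: int, care_mask: int) -> None:
        bank_mask = (1 << self.bank_bits) - 1
        # Bank-select bits must not be "don't care", or banking would miss hits.
        if care_mask & bank_mask != bank_mask:
            raise ValueError("bank-select bits must be 'care' bits")
        self.entries.append((value, care_mask))

    def search(self, key: int) -> list:
        hits = []
        for index, (value, care_mask) in enumerate(self.entries):
            if self._bank(value) != self._bank(key):
                continue  # entry lies in another bank, so it is not compared
            self.comparisons += 1
            # Ternary match: bits outside care_mask are ignored.
            if (key ^ value) & care_mask == 0:
                hits.append(index)
        return hits


def demo():
    stored = [(0b1010, 0b1111), (0b1011, 0b1111),
              (0b0110, 0b0111), (0b0001, 0b1111)]
    flat, banked = ToyTernaryCAM(bank_bits=0), ToyTernaryCAM(bank_bits=2)
    for value, care_mask in stored:
        flat.store(value, care_mask)
        banked.store(value, care_mask)
    key = 0b1110
    return flat.search(key), flat.comparisons, banked.search(key), banked.comparisons
```

In this sketch, `demo()` finds the same matching entry in both configurations, while the banked variant performs fewer entry comparisons because entries in other banks are never examined. A real content-addressable memory compares all selected entries in parallel in hardware, so the counter here is only a rough stand-in for the switching activity that drives power consumption.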
Keywords: content-addressable memory, parallel dataflow computing system, dataflow computing model, matching process

Acknowledgements: The work was carried out within the state assignment of NRC "Kurchatov institute".

P. 649-658

