Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 8, 2024

DOI: 10.17587/prin.15.387-401
Distributed Caching System with Strict Consistency Guarantee for Critical Information Infrastructure
V. O. Repin, Postgraduate Student, hiddenstmail@gmail.com, A. A. Sidorov, Head of Department, anatolii.a.sidorov@tusur.ru, Tomsk State University of Control Systems and Radioelectronics, Tomsk, 634050, Russian Federation
Corresponding author: Anatoly A. Sidorov, Head of Department, Tomsk State University of Control Systems and Radioelectronics, Tomsk, 634050, Russian Federation, E-mail: anatolii.a.sidorov@tusur.ru
Received on May 21, 2024
Accepted on June 25, 2024

The paper considers the role of caching in modern information systems. It presents typical architectural models for network applications and the consistency guarantees they provide to the end user. A comparative analysis of existing solutions outlines the distinguishing features of the proposed system. The developed architecture is described, including the architecture of an individual cluster node and the architecture and implementation of software development kits for different programming languages. The specifics of designing complex client libraries for distributed systems are considered. The importance of consistency models is noted, and their interrelationships and hierarchy are described on the basis of existing research. The two popular consensus algorithms, Raft and Paxos, are briefly compared to establish a basis for designing the overall architecture. Industry-leading caching systems emphasize performance and deliberately relax the consistency of stored data. Using such systems in an environment that must preserve the order of real-world events is difficult or requires additional tools to maintain data consistency. This paper proposes a caching system that guarantees strict data consistency and exposes a single interface for working with the data, similar to that of the industry leaders.

Keywords: cache, distributed systems, linearizability, consistency, fault-tolerance, architecture, distributed cache, cache eviction algorithms, critical, critical information infrastructure
pp. 387—401
For citation:
Repin V. O., Sidorov A. A. Distributed Caching System with Strict Consistency Guarantee for Critical Information Infrastructure, Programmnaya Ingeneria, 2024, vol. 15, no. 8, pp. 387—401. DOI: 10.17587/prin.15.387-401.
This research was funded by the Ministry of Science and Higher Education of the Russian Federation, project FEWM-2023-0013.