Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 4, 2023

DOI: 10.17587/prin.14.165-174
Finding Compiler Bugs Duplicates by Generating Witness Programs
D. S. Stepanov, Senior Lecturer, stepanov0995@gmail.com, V. M. Itsykson, PhD, Eng., Professor, itsykson@yandex.ru, Peter the Great St. Petersburg Polytechnic University (SPbPU), Saint Petersburg, 195251, Russian Federation
Corresponding author: Daniil S. Stepanov, Senior Lecturer, Peter the Great St. Petersburg Polytechnic University (SPbPU), Saint Petersburg, 195251, Russian Federation, E-mail: stepanov0995@gmail.com
Received on February 09, 2023
Accepted on February 27, 2023

Programming language compilers are complex software projects whose quality directly affects the quality of the programs they produce. Compilers are therefore held to elevated requirements for software quality characteristics such as functional suitability, reliability, performance, and safety. Quality is ensured by a range of methods: testing by users, manual testing, and tools for automatic bug detection. Because so many bug-finding methods are in use, the same bug is often detected repeatedly, especially when its cause is trivial, and the test cases that trigger it can look completely different from one another. Such test cases are called duplicates. Identifying them is a pressing problem, since finding them manually takes a large amount of human effort, and algorithms that find duplicates automatically would greatly simplify compiler development and maintenance. The main idea of the approach presented in this article is that the causes of the same bug are located in the same place in the compiler source code. To find this place, the method of generating witness programs is used: for each test program that triggers a bug, similar programs that do not trigger it are generated. Metrics are then computed from the compiler's source code coverage, and a list of source files potentially responsible for the compiler failure is formed. If the lists for two test programs are similar according to a proximity metric, the test programs are considered duplicates. The proposed approach was developed and implemented for the compiler of the Kotlin programming language. Testing has shown that the approach is applicable to the problem of finding duplicate compiler bugs.
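
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the coverage metric or the proximity metric, so the sketch assumes the Ochiai formula from spectrum-based fault localization for scoring files and Jaccard similarity of the top-k file lists for comparing two bugs. The file names, the value of k, and the threshold are hypothetical, and the coverage sets stand in for data that a coverage tool would produce.

```python
from math import sqrt


def ochiai_scores(failing_cov, witness_covs):
    """Score each compiler source file covered by the failing run.

    failing_cov:  set of files covered while compiling the bug-triggering program.
    witness_covs: list of sets, one per witness program (compiles without the bug).
    Ochiai is an assumption here; the abstract only says "metrics".
    """
    total_failed = 1  # exactly one failing run: the original test program
    scores = {}
    for path in failing_cov:
        ef = 1  # failing runs that cover this file
        ep = sum(1 for cov in witness_covs if path in cov)  # passing runs that cover it
        scores[path] = ef / sqrt(total_failed * (ef + ep))
    return scores


def suspicious_files(failing_cov, witness_covs, k=2):
    """Return the k most suspicious files (ties broken by file name)."""
    scores = ochiai_scores(failing_cov, witness_covs)
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:k]


def jaccard(list_a, list_b):
    """Proximity of two file lists; Jaccard similarity is an assumed choice."""
    a, b = set(list_a), set(list_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def are_duplicates(list_a, list_b, threshold=0.5):
    """Treat two test programs as duplicates if their file lists are close enough."""
    return jaccard(list_a, list_b) >= threshold


# Hypothetical coverage data for three bug-triggering programs and their witnesses.
bug_a = suspicious_files(
    {"Parser.kt", "TypeChecker.kt", "Inliner.kt"},
    [{"Parser.kt", "TypeChecker.kt"}, {"Parser.kt"}],
)
bug_b = suspicious_files(
    {"Parser.kt", "TypeChecker.kt", "Inliner.kt", "Codegen.kt"},
    [{"Parser.kt", "Codegen.kt"}, {"Parser.kt", "TypeChecker.kt", "Codegen.kt"}],
)
bug_c = suspicious_files(
    {"Parser.kt", "Codegen.kt"},
    [{"Parser.kt"}],
)

print(bug_a, bug_b, are_duplicates(bug_a, bug_b))
print(bug_a, bug_c, are_duplicates(bug_a, bug_c))
```

Files that every witness program also covers receive a low score, so the lists are dominated by files that only the failing compilation reaches. Two test programs whose lists share those files are reported as duplicates.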

Keywords: compiler testing, compiler bug duplicates, bug isolation
pp. 165–174
For citation:
Stepanov D. S., Itsykson V. M. Finding Compiler Bugs Duplicates by Generating Witness Programs, Programmnaya Ingeneria, 2023, vol. 14, no. 4, pp. 165–174. DOI: 10.17587/prin.14.165-174 (in Russian).
References:
  1. Chen Y., Groce A., Zhang C. et al. Taming compiler fuzzers, Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, New York, NY, USA, ACM, 2013, pp. 197–208. DOI: 10.1145/2491956.2462173.
  2. Gonzalez T. F. Clustering to minimize the maximum intercluster distance, Theor Comput Sci., 1985, vol. 38, pp. 293–306. DOI: 10.1016/0304-3975(85)90224-5.
  3. Holmes J., Groce A. Causal Distance-Metric-Based Assistance for Debugging after Compiler Fuzzing, 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE), IEEE, 2018, pp. 166–177. DOI: 10.1109/ISSRE.2018.00027.
  4. Myers E. W. An O(ND) difference algorithm and its variations, Algorithmica, 1986, vol. 1, no. 1–4, pp. 251–266. DOI: 10.1007/BF01840446.
  5. Wong W. E., Gao R., Lo Y. et al. A Survey on Software Fault Localization, IEEE Transactions on Software Engineering, 2016, vol. 42, no. 8, pp. 707–740. DOI: 10.1109/TSE.2016.2521368.
  6. Zakari A., Lee S., Abreu R. et al. Multiple fault localization of software programs: A systematic literature review, Inf Softw Technol., 2020, vol. 124, pp. 106312. DOI: 10.1016/j.infsof.2020.106312.
  7. Soremekun E., Kirschner L., Bohme M. et al. Locating faults with program slicing: an empirical analysis, Empir Softw Eng., 2021, vol. 26, no. 3, pp. 51. DOI: 10.1007/s10664-020-09931-7.
  8. Chang B.-Y. E., Chlipala A., Necula G. et al. Type-based verification of assembly language for compiler debugging, Proceedings of the 2005 ACM SIGPLAN international workshop on Types in language design and implementation, New York, NY, USA, ACM, 2005, pp. 91–102. DOI: 10.1145/1040294.1040303.
  9. Hemmert K. S., Tripp J., Hutchings B. et al. Source level debugger for the Sea Cucumber synthesizing compiler, 11th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, FCCM 2003, IEEE Comput. Soc., 2003, pp. 228–237. DOI: 10.1109/FPGA.2003.1227258.
  10. Krebs N., Schmitz L. Jaccie: A Java-based compiler-compiler for generating, visualizing and debugging compiler components, Sci Comput Program., 2014, vol. 79, pp. 101–115. DOI: 10.1016/j.scico.2012.03.001.
  11. Chen J., Han J., Sun P. et al. Compiler bug isolation via effective witness test program generation, Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, New York, NY, USA, ACM, 2019, pp. 223–234. DOI: 10.1145/3338906.3338957.
  12. Sohn J., Yoo S. FLUCCS: using code and change metrics to improve fault localization, Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, New York, NY, USA, ACM, 2017, pp. 273–283. DOI: 10.1145/3092703.3092717.
  13. Abreu R., Zoeteweij P., van Gemund A. J. C. On the Accuracy of Spectrum-based Fault Localization, Testing: Academic and Industrial Conference Practice and Research Techniques - MUTATION (TAICPART-MUTATION 2007), IEEE, 2007, pp. 89–98. DOI: 10.1109/TAIC.PART.2007.13.
  14. JaCoCo - Java Code Coverage Library, available at: https://www.jacoco.org/jacoco/trunk/index.html (date of access 09.02.2023).
  15. Clover java and groovy code coverage tool homepage, available at: https://www.atlassian.com/software/clover/overview (date of access 09.02.2023).
  16. Cobertura java code coverage utility homepage, available at: http://cobertura.github.io/cobertura/ (date of access 09.02.2023).
  17. Stepanov D., Itsykson V. Backend Bug Finder - a platform for effective compiler fuzzing, Information and Control Systems, 2022, no. 6, pp. 31–40. DOI: 10.31799/1684-8853-2022-6-31-40.