Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue N8 2017 year

DOI: 10.17587/prin.8.345-353
Program Intermediate Representation Techniques
V. A. Vasenin, vasenin@msu.ru, M. A. Krivchikov, maxim.krivchikov@gmail.com, Lomonosov Moscow State University, Moscow, 119234, Russian Federation
Corresponding author: Krivchikov Maxim A., Senior Researcher, Lomonosov Moscow State University, Moscow, 119234, Russian Federation, E-mail: maxim.krivchikov@gmail.com
Received on April 21, 2017
Accepted on May 30, 2017

The concept of an intermediate representation of a program originates from software translation theory. In the present paper the classification of high-level intermediate representations is proposed, together with the review of current state-of-the-art research in intermediate representations. The paper presents a part of a research project aimed at development of intermediate representation for high-level specification of formal semantics for domain specific programming languages. The review is mostly concerned with mathematical representation of intermediate representations. The five proposed intermediate representation classes are:

  • low-level intermediate representations used mostly in compilers;
  • bytecode-based application virtual machines for imperative programming languages;
  • bytecode-based application virtual machines for functional programming languages;
  • internal graph-based intermediate representations;
  • high-level, programming language-specific intermediate representations based on canonical forms for the AST.
The review contains 22 intermediate representations. Looking at them in chronological order we can observe the following shift in problem statements. Earlier intermediate representations ([4, 6, 7, 11—14, 19, 21] and classical examples of SSA, RTL and three-address code) are used primarily as a fixed data structure, in other words, an interface between different stages of compilation pipeline. Recent intermediate representations ([2, 15—18, 20, 22, 23] and, to lesser degree, [4], mentioned previously) are more concerned with type system. These representations are used to simplify implementation of the type checker. In addition, we shall mention intermediate representations aimed at formal verification procedures ([1, 10, 24] and [22, 23] from previous group). Industrial intermediate representations (in comparison with academic research) usually have more primitive instructions or commands. The increase is mostly due to primitive type constants, operations and relations. We shall also note the similarity between intermediate representations for the object-oriented programming languages (e.g. [4, 6]) and dynamically-typed programming languages ([8, 12]): in both cases instructions of the dynamical (run-time) type checking are present. Related work includes the article [25], which contains systematic review of intermediate representations, and report [26] with rather detailed comparison of type system expressiveness for three typed intermediate representations.

Keywords: programming languages, intermediate representation, program translation, formal semantics, software engineering, review, classification
pp. 354–353
For citation:
Vasenin V. A., Krivchikov M. A. Program Intermediate Representation Techniques, Programmnaya Ingeneria, 2017, vol. 8, no. 8, pp. 345—353