Journal "Software Engineering" (Programmnaya Ingeneria) | Program Intermediate Representation Techniques

Main

New Issue

Archive

Most cited articles

Editor in chief

Editorial board

For the authors

Publishing ethic

Peer reviewing

Publishing House

Old site

Russian

Issue N8 2017 year

DOI: 10.17587/prin.8.345-353

Program Intermediate Representation Techniques

V. A. Vasenin, vasenin@msu.ru, M. A. Krivchikov, maxim.krivchikov@gmail.com, Lomonosov Moscow State University, Moscow, 119234, Russian Federation

Corresponding author: Krivchikov Maxim A., Senior Researcher, Lomonosov Moscow State University, Moscow, 119234, Russian Federation, E-mail: maxim.krivchikov@gmail.com

Received on April 21, 2017

Accepted on May 30, 2017

The concept of an intermediate representation of a program originates from software translation theory. In the present paper the classification of high-level intermediate representations is proposed, together with the review of current state-of-the-art research in intermediate representations. The paper presents a part of a research project aimed at development of intermediate representation for high-level specification of formal semantics for domain specific programming languages. The review is mostly concerned with mathematical representation of intermediate representations. The five proposed intermediate representation classes are:

low-level intermediate representations used mostly in compilers;
bytecode-based application virtual machines for imperative programming languages;
bytecode-based application virtual machines for functional programming languages;
internal graph-based intermediate representations;
high-level, programming language-specific intermediate representations based on canonical forms for the AST.

The review contains 22 intermediate representations. Looking at them in chronological order we can observe the following shift in problem statements. Earlier intermediate representations ([4, 6, 7, 11—14, 19, 21] and classical examples of SSA, RTL and three-address code) are used primarily as a fixed data structure, in other words, an interface between different stages of compilation pipeline. Recent intermediate representations ([2, 15—18, 20, 22, 23] and, to lesser degree, [4], mentioned previously) are more concerned with type system. These representations are used to simplify implementation of the type checker. In addition, we shall mention intermediate representations aimed at formal verification procedures ([1, 10, 24] and [22, 23] from previous group). Industrial intermediate representations (in comparison with academic research) usually have more primitive instructions or commands. The increase is mostly due to primitive type constants, operations and relations. We shall also note the similarity between intermediate representations for the object-oriented programming languages (e.g. [4, 6]) and dynamically-typed programming languages ([8, 12]): in both cases instructions of the dynamical (run-time) type checking are present. Related work includes the article [25], which contains systematic review of intermediate representations, and report [26] with rather detailed comparison of type system expressiveness for three typed intermediate representations.

Keywords: programming languages, intermediate representation, program translation, formal semantics, software engineering, review, classification

pp. 354–353

For citation:
Vasenin V. A., Krivchikov M. A. Program Intermediate Representation Techniques, Programmnaya Ingeneria, 2017, vol. 8, no. 8, pp. 345—353