# Intermediate representation

> Mediated Wiki article. Canonical URL: https://mediated.wiki/source/Intermediate_representation
> Markdown URL: https://mediated.wiki/source/Intermediate_representation.md
> Source: https://en.wikipedia.org/wiki/Intermediate_representation
> Source revision: 1317312839
> License: Creative Commons Attribution-ShareAlike 4.0 International (https://creativecommons.org/licenses/by-sa/4.0/)

Data structure or code used by a compiler

"Intermediate form" redirects here. For the use of the term in biology, see [Transitional fossil](/source/Transitional_fossil).

An **intermediate representation** (**IR**) is the [data structure](/source/Data_structure) or code used internally by a [compiler](/source/Compiler) or [virtual machine](/source/Virtual_machine) to represent [source code](/source/Source_code). An IR is designed to be conducive to further processing, such as [optimization](/source/Compiler_optimization) and [translation](/source/Program_transformation).[1] A "good" IR must be *accurate* – capable of representing the source code without loss of information[2] – and *independent* of any particular source or target language.[1] An IR may take one of several forms: an in-memory [data structure](/source/Data_structure), or a special [tuple](/source/Tuple)- or [stack](/source/Stack_(abstract_data_type))-based [code](/source/Bytecode) readable by the program.[3] In the latter case it is also called an *intermediate language*.

Most modern compilers and interpreters use an IR. For example, the [CPython interpreter](/source/CPython) transforms the linear, human-readable text of a program into an intermediate [graph structure](/source/Graph_(data_structure)) that allows [flow analysis](/source/Flow_analysis) and rearrangement before execution. An intermediate representation of this kind also allows compiler systems such as the [GNU Compiler Collection](/source/GNU_Compiler_Collection) and [LLVM](/source/LLVM) to accept many different source languages and to [generate code](/source/Code_generation_(compiler)) for many different target [architectures](/source/Instruction_set).
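
As a rough illustration of the CPython case, the standard-library `ast` and `dis` modules expose two of the interpreter's intermediate forms: a tree and a stack-based bytecode. This is only a sketch of what can be inspected from Python itself. The internal control-flow graph that CPython builds between these two stages is not shown, and the variable names in the sample statement are made up for the example.

```python
import ast
import dis

source = "total = price * quantity + tax"

# Tree-shaped IR: the parser turns the linear text into an abstract syntax tree.
tree = ast.parse(source)
print(ast.dump(tree.body[0]))

# Linear, stack-based IR: the compiler lowers the tree to bytecode.
code = compile(tree, filename="<example>", mode="exec")
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
```

The exact opcode names printed depend on the CPython version, which is why no expected output is listed here.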

## Intermediate language

An **intermediate language** is the language of an [abstract machine](/source/Abstract_machine) designed to aid in the analysis of [computer programs](/source/Computer_program). The term comes from the use of such languages in [compilers](/source/Compiler), where the source code of a program is translated into a form more suitable for code-improving transformations before being used to generate [object](/source/Object_file) or [machine](/source/Machine_language) code for a target machine. The design of an intermediate language typically differs from that of a practical [machine language](/source/Machine_language) in three fundamental ways:

- Each instruction represents exactly one fundamental operation; e.g. "shift-add" [addressing modes](/source/Addressing_mode) common in [microprocessors](/source/Microprocessors) are not present.

- [Control flow](/source/Control_flow) information may not be included in the instruction set.

- The number of [processor registers](/source/Processor_register) available may be large, even limitless.

A popular format for intermediate languages is [three-address code](/source/Three-address_code).
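
A minimal sketch of how an expression might be lowered to three-address code follows. It reuses Python's own parser for convenience; the function name `to_three_address` and the temporary names `t1`, `t2`, ... are assumptions made for this illustration, not part of any real compiler. Each emitted instruction performs exactly one operation, and temporaries are drawn from an unbounded supply, mirroring the first and third points in the list above.

```python
import ast
import itertools

def to_three_address(expr_source):
    """Lower an arithmetic expression to a list of three-address instructions."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    temps = (f"t{i}" for i in itertools.count(1))  # unlimited virtual registers
    code = []

    def lower(node):
        # Leaves are used directly as operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        # Each binary operation becomes one instruction with a fresh result.
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            left = lower(node.left)
            right = lower(node.right)
            dest = next(temps)
            code.append(f"{dest} = {left} {ops[type(node.op)]} {right}")
            return dest
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    result = lower(ast.parse(expr_source, mode="eval").body)
    return code, result

code, result = to_three_address("a + b * (c - 4)")
for line in code:
    print(line)
# t1 = c - 4
# t2 = b * t1
# t3 = a + t2
```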

The term also refers to languages used as intermediates by implementations of some [high-level programming languages](/source/High-level_programming_language), which do not output object or machine code themselves but emit only the intermediate language. The intermediate-language output is then passed to a compiler for that language, which produces the finished object or machine code. This is usually done to ease the process of [optimization](/source/Optimization_(computer_science)) or to increase [portability](/source/Porting) by using an intermediate language that has compilers for many [processors](/source/Central_processing_unit) and [operating systems](/source/Operating_systems), such as [C](/source/C_(programming_language)). Languages used for this purpose fall in complexity between high-level languages and [low-level](/source/Low-level_programming_language) languages, such as [assembly languages](/source/Assembly_language).

### Languages

Though not explicitly designed as an intermediate language, [C](/source/C_(programming_language))'s nature as an abstraction of [assembly](/source/Assembly_language) and its ubiquity as the *de facto* [system language](/source/System_programming_language) in [Unix-like](/source/Unix-like) and other operating systems have made it a popular intermediate language: [Eiffel](/source/Eiffel_(programming_language)), [Sather](/source/Sather), [Esterel](/source/Esterel), some [dialects](/source/Programming_language_dialect) of [Lisp](/source/Lisp_(programming_language)) (Lush, [Gambit](/source/Gambit_(Scheme_implementation))), [Squeak](/source/Squeak)'s Smalltalk-subset Slang, [Nim](/source/Nim_(programming_language)), [Cython](/source/Cython), [SystemTap](/source/SystemTap), [Vala](/source/Vala_(programming_language)), V, and others make use of C as an intermediate language. Variants of C have been designed to provide C's features as a portable [assembly language](/source/Assembly_language), including [C--](/source/C--) and the [C Intermediate Language](/source/C_Intermediate_Language).
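
The idea of using C as an intermediate language can be sketched with a toy translator that emits C source text instead of machine code. The helper `emit_c` and its tiny input language (arithmetic over named `double` parameters) are hypothetical and far simpler than what the languages listed above actually do.

```python
import ast

def emit_c(name, params, expr_source):
    """Emit C source for a function returning an arithmetic expression over doubles."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def expr(node):
        if isinstance(node, ast.Name):
            if node.id not in params:
                raise ValueError(f"unknown variable: {node.id}")
            return node.id
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return repr(float(node.value))  # numeric literals become C doubles
        if isinstance(node, ast.BinOp) and type(node.op) in ops:
            # Parenthesize every operation so C precedence cannot change the meaning.
            return f"({expr(node.left)} {ops[type(node.op)]} {expr(node.right)})"
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    body = expr(ast.parse(expr_source, mode="eval").body)
    args = ", ".join(f"double {p}" for p in params)
    return f"double {name}({args}) {{\n    return {body};\n}}\n"

c_source = emit_c("area", ["w", "h"], "w * h / 2")
print(c_source)
# double area(double w, double h) {
#     return ((w * h) / 2.0);
# }
```

The generated text would then be handed to any C compiler, which takes over optimization and code generation for the target platform.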

Any language targeting a [virtual machine](/source/Virtual_machine) or [p-code machine](/source/P-code_machine) can be considered an intermediate language:

- [Java bytecode](/source/Java_bytecode)

- Microsoft's [Common Intermediate Language](/source/Common_Intermediate_Language) is an intermediate language designed to be shared by all compilers for the [.NET Framework](/source/.NET_Framework), before static or dynamic compilation to machine code.

- While most intermediate languages are designed to support statically typed languages, the [Parrot intermediate representation](/source/Parrot_intermediate_representation) is designed to support dynamically typed languages—initially Perl and Python.

- [TIMI](/source/IBM_i#TIMI) is used by compilers on the [IBM i](/source/IBM_i) platform.

- [O-code](/source/O-code) for [BCPL](/source/BCPL)

- [MATLAB](/source/MATLAB) precompiled code

- [Microsoft P-Code](/source/Microsoft_P-Code)

- [Pascal](/source/Pascal_(programming_language)) [p-code](/source/P-code)
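
What the bytecodes and p-codes listed above have in common is that they are instructions for a machine implemented in software. The following sketch shows a toy stack machine of that general kind; the opcode names `PUSH`, `ADD`, `SUB` and `MUL` are invented for the example and do not correspond to any of the listed instruction sets.

```python
def run(program):
    """Execute a list of (opcode, operand) pairs on a toy stack machine."""
    stack = []
    binary = {
        "ADD": lambda a, b: a + b,
        "SUB": lambda a, b: a - b,
        "MUL": lambda a, b: a * b,
    }
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode in binary:
            # Binary operations pop two operands and push one result.
            right = stack.pop()
            left = stack.pop()
            stack.append(binary[opcode](left, right))
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# Code for the expression (2 + 3) * 4
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
print(result)  # 20
```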

The [GNU Compiler Collection](/source/GNU_Compiler_Collection) (GCC) uses several intermediate languages internally to simplify portability and [cross-compilation](/source/Cross-compilation). Among these languages are

- the historical [Register Transfer Language](/source/Register_Transfer_Language) (RTL)

- the tree language [GENERIC](/source/GNU_Compiler_Collection#GENERIC_and_GIMPLE)

- the [SSA](/source/SSA_(computing))-based [GIMPLE](/source/GIMPLE). (Lower-level than GENERIC; input for most optimizers; has a compact "bytecode" notation.)
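
GIMPLE is described above as SSA-based. In static single-assignment form, every variable is assigned exactly once, so a reassigned variable is split into numbered versions. The sketch below renames variables in straight-line code only; it is not GCC's algorithm, the tuple format and the `_n` suffixes are assumptions for the example, and real SSA construction must also insert φ-functions where control-flow paths merge.

```python
def to_ssa(statements):
    """Rename variables in straight-line code so each name is assigned exactly once."""
    version = {}  # variable name -> most recent version number

    def use(operand):
        # Constants pass through; variables read their latest version.
        # Version 0 stands for the value a variable has on entry.
        if not operand.isidentifier():
            return operand
        return f"{operand}_{version.get(operand, 0)}"

    result = []
    for dest, left, op, right in statements:
        # Operands are read before the destination gets its new version.
        left, right = use(left), use(right)
        version[dest] = version.get(dest, 0) + 1
        result.append(f"{dest}_{version[dest]} = {left} {op} {right}")
    return result

ssa = to_ssa([
    ("x", "a", "+", "1"),
    ("x", "x", "*", "2"),
    ("y", "x", "-", "a"),
])
for line in ssa:
    print(line)
# x_1 = a_0 + 1
# x_2 = x_1 * 2
# y_1 = x_2 - a_0
```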

GCC can also generate the following IRs as a final target:

- [HSA Intermediate Layer](/source/HSA_Intermediate_Layer)

- [LLVM Intermediate Representation](/source/LLVM#Intermediate_representation) (converted from GIMPLE in the now-defunct llvm-gcc, which used LLVM optimizers and code generation)

The [LLVM](/source/LLVM) compiler framework is based on the [LLVM IR](/source/LLVM#Intermediate_representation) intermediate language, of which the compact, binary serialized representation is also referred to as "bitcode" and has been productized by Apple.[4][5] Like GIMPLE Bytecode, LLVM Bitcode is useful in link-time optimization. Like GCC, LLVM also targets some IRs meant for direct distribution, including Google's [PNaCl](/source/Native_Client) IR and [SPIR](/source/Standard_Portable_Intermediate_Representation). A further development within LLVM is the use of *Multi-Level Intermediate Representation* ([MLIR](/source/MLIR_(software))) with the potential to generate code for different heterogeneous targets, and to combine the outputs of different compilers.[6]

The ILOC intermediate language[7] is used in classes on compiler design as a simple target language.[8]

## Other

[Static analysis](/source/Static_program_analysis) tools often use an intermediate representation. For instance, [Radare2](/source/Radare2) is a toolbox for binary file analysis and reverse engineering. It uses the intermediate languages ESIL[9] and REIL[10] to analyze binary files.

## See also

- [Abstract syntax tree](/source/Abstract_syntax_tree) – Tree representation of the abstract syntactic structure of source code

- [BURS](/source/BURS)

- [Bytecode](/source/Bytecode) – Instruction set designed to be run by a software interpreter

- [Graph rewriting](/source/Graph_rewriting) – Creating a new graph from an existing graph

- [Interlingual machine translation](/source/Interlingual_machine_translation) – Type of machine translation

- [Pivot language](/source/Pivot_language) – Intermediary language between different languages

- [Source-to-source compiler](/source/Source-to-source_compiler) – Translator of computer source code

- [Symbol table](/source/Symbol_table) – Data structure used by a language translator such as a compiler or interpreter

- [Term rewriting](/source/Term_rewriting) – Replacing subterm in a formula with another term

- [UNCOL](/source/UNCOL)

## References

1. Walker, David. ["CS320: Compilers: Intermediate Representation"](http://www.cs.princeton.edu/courses/archive/spr03/cs320/notes/IR-trans1.pdf) (Lecture slides). Retrieved 12 February 2016.

2. Chow, Fred (22 November 2013). ["The Challenge of Cross-language Interoperability"](https://queue.acm.org/detail.cfm?id=2544374). *ACM Queue*. **11** (10). Retrieved 12 February 2016.

3. Toal, Ray. ["Intermediate Representations"](http://cs.lmu.edu/~ray/notes/ir/). Retrieved 12 February 2016.

4. ["Bitcode (iOS, watchOS)"](https://news.ycombinator.com/item?id=9684223). Hacker News. 10 June 2015. Retrieved 17 June 2015.

5. ["LLVM Bitcode File Format"](http://llvm.org/docs/BitCodeFormat.html). llvm.org. Retrieved 17 June 2015.

6. ["MLIR"](https://mlir.llvm.org/).

7. ["An ILOC Simulator"](http://www.engr.sjsu.edu/wbarrett/Parser/simManual.htm) [Archived](https://web.archive.org/web/20090507084132/http://www.engr.sjsu.edu/wbarrett/Parser/simManual.htm) 2009-05-07 at the [Wayback Machine](/source/Wayback_Machine) by W. A. Barrett 2007, paraphrasing Keith Cooper and Linda Torczon, "Engineering a Compiler", [Morgan Kaufmann](/source/Morgan_Kaufmann), 2004. [ISBN](/source/ISBN_(identifier)) [1-55860-698-X](https://en.wikipedia.org/wiki/Special:BookSources/1-55860-698-X).

8. ["CISC 471 Compiler Design"](http://www.cis.udel.edu/~pollock/471/project2spec.pdf) by Uli Kremer

9. Radare2 Contributors. ["ESIL"](https://web.archive.org/web/20150818235122/http://radare.gitbooks.io/radare2book/content/esil.html). Radare2 Project. Archived from [the original](https://radare.gitbooks.io/radare2book/content/esil.html) on 18 August 2015. Retrieved 17 June 2015.

10. Sebastian Porst (7 March 2010). ["The REIL language – Part I"](http://blog.zynamics.com/2010/03/07/the-reil-language-part-i/). zynamics.com. Retrieved 17 June 2015.

## External links

- The Stanford SUIF Group

---
Adapted from the Wikipedia article [Intermediate representation](https://en.wikipedia.org/wiki/Intermediate_representation) by Wikipedia contributors ([contributor history](https://en.wikipedia.org/wiki/Intermediate_representation?action=history)). Available under [Creative Commons Attribution-ShareAlike 4.0 International](https://creativecommons.org/licenses/by-sa/4.0/). Changes may have been made.
