A pure internal representation with linear types

3.2. A pure internal representation with linear types

The classical compilation community has found great advantages in sharing a common standardised IR format. Indeed, whilst the exact syntax and constructs vary from language to language, and whilst, at the other end of the compiler stack, the specific assembly instructions that are emitted differ between hardare targets, much of the compiler middleware can be broadly shared between use cases. This gave rise to the LLVM Lattner, 2004Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In International Symposium on Code Generation and Optimization, 2004. CGO 2004.. IEEE, 75--86. doi: 10.1109/CGO.2004.1281665, and more recently, the MLIR Lattner, 2021Chris Lattner, Mehdi Amini, Uday Bondhugula, Albert Cohen, Andy Davis, Jacques Pienaar, River Riddle, Tatiana Shpeisman, Nicolas Vasilache and Oleksandr Zinenko. 2021. MLIR: Scaling Compiler Infrastructure for Domain Specific Computation. In 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), February 2021. IEEE, 2--14. doi: 10.1109/CGO51591.2021.9370308 projects, which provide common compiler IRs, along with all the infrastructure compilers typically require: IR transformation tooling, translation into hardware-specific assembly, efficient serialisations, in-memory formats etc.

The idea of adopting LLVM for quantum was championed by QIR QIR Al., 2021 QIR Alliance. 2021. QIR Specification v0.1. Retrieved on 31/12/24 from https://www.qir-alliance.org/, a standard introducing quantum primitives into the LLVM IR. This was subsequently adopted by many quantum hardware providers for its superior expressive power compared to circuit-based formats QIR Al., 2023 QIR Alliance. 2023. NVIDIA Joins the QIR Alliance as the Effort Enters Year Two. Retrieved on 02/01/2025 from https://www.qir-alliance.org/posts/year_one_in_review/. Building on top of QIR, an IR specifically for quantum-classical programs was proposed in Mark K., 2025Seyon Sivarajah, Alan Lawrence, Alec Edgington, Douglas Wilson, Craig Roy, Luca Mondada, Lukas Heidemann, Ross Duncan Mark Koch. 2025. HUGR: A Quantum-Classical Intermediate Representation. In Fifth International Workshop on Programming Languages for Quantum Computing (PLanQC 2025), with additional soundness guarantees based among others on the no-cloning principle of quantum information. In parallel, projects with similar aims have also emerged McCask., 2021Alexander McCaskey and Thien Nguyen. 2021. A MLIR Dialect for Quantum Assembly Languages. In 2021 IEEE International Conference on Quantum Computing and Engineering (QCE), October 2021. IEEE. doi: 10.1109/QCE52317.2021.00043 Ittah, 2022David Ittah, Thomas Häner, Vadym Kliuchnikov and Torsten Hoefler. 2022. QIRO: A Static Single Assignment-based Quantum Program Representation for Optimization. ACM Transactions on Quantum Computing 3, 3 (June 2022, 1--32). doi: 10.1145/3491247 that make use of the full MLIR and LLVM toolchain.

Describing the specifics of any one quantum programming language or compiler IR is beyond our scope and would force us to restrict our considerations to a narrow subset of the options that are still being actively explored and developed. For the purposes of this thesis, it is sufficient to introduce a simplified “toy” IR that we will call minIR. It captures all the expressiveness that we require for hybrid programs whilst remaining sufficiently abstract to be applicable to a variety of IRs and, by extension, programming languages.

We will provide formal graph-based semantics in the next section, but will first give an overview of minIR in a perhaps more conventional, textual form. The exact details of the text-based syntax are therefore of no importance.

MinIR is built from statements of the form

x, y, ... := op(a, b, c, ...)

to be understood as an operation op applied on the inputs a, b, c, ... and producing the outputs x, y, .... The number of inputs and outputs is determined by the type signature of the operation op. Purely for syntactic convenience, we will often use “parametric” operations written as op<T1, ...>. These denote different operations that we group together for their similar semantics but might differ by e.g. their type signature. In those cases we might also omit the parameters within <> if they can be unambiguously infered from the context.

The list of all valid operations along with their signatures must be defined by the user. A valid set of type signatures could be:

# A typical quantum gate
h:                  Qubit -> Qubit
# A classical XOR gate on two inputs
xor<Bit, Bit>:      Bit, Bit -> Bit
# The same XOR gate, but with three inputs
xor<Bit, Bit, Bit>: Bit, Bit, Bit -> Bit
# Measurements consume a qubit and produce a bit
measure:            Qubit -> Bit

Bit and Qubit are the defined types. The operations are h, measure and two variants of xor, that differentiate themselves by the type signatures.

In minIR, each variable used either as an input (right-hand side) or an output (left-hand side) to an operation is a value with a fixed type. The type can always be infered from the type signature of the operation that produces and/or consumes the value. Throughout, we can thus omit type parameters without ambiguity. We will use TT to refer to the set of defined types and strings of types in TT^\ast whenever we consider the input or output types of an operation signature.

Every value must appear exactly once on the left hand side of an IR statement, as an output. Values are immutable and represent a piece of data in the computation at one moment in time. This IR format is known in compiler speak as single static assignment (SSA) Rosen, 1988B. K. Rosen, M. N. Wegman and F. K. Zadeck. 1988. Global Value Numbers and Redundant Computations. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages - POPL ’88. ACM Press, 12--27. doi: 10.1145/73560.73562. This represents a notable departure from quantum circuits, where operations are typically thought of as “placed” on a qubit (the “lines” in the circuit representation) that sits there for the entire duration of the computation. In the value semantics of SSA, on the other hand, qubits only exist in the form of the data they encode: when applying an operation, the (quantum) data is “given” to the operation and new data is returned.

To make the difference clear, compare the program representations of the following computation:

Quantum circuit

H C X X

minIR (SSA)
in : q0_0, q1_0 out : q0_2, q1_2

q0_1       := h(q0_0)
q0_2, q1_1 := cx(q0_1, q1_0)
q1_2       := x(q1_1)

In value semantics, it becomes much harder to track qubits across their life span. This has very practical implications: without the convenient naming scheme, it would for example be non-trivial to count how many qubits are required in the SSA representation of the computation above. However, it is a drastically simpler picture from the point of view of the compiler and the optimiser – hence its popularity in classical compilers. When operations are defined based on the qubit they act on, the compiler must carefully track the ordering of these operations: operations on the same qubit must always be applied in order. Through multi-qubit gates, this also imposes a (partial) ordering on operations across different qubits. Conversely, the fact that two operations, separated in the computation by a million other operations, will eventually be applied on the same physical entity is totally irrelevant information to the optimiser.

This is precisely what value semantics abstracts away: the notion of physical qubit disappears and the ordering of statements becomes irrelevant. All that matters is to connect each use of a value (i.e. occurrence as an input on the right hand side of an IR statement) with its unique definition on the left hand side of another statement.

In minIR, a program is thus defined by a set of statements, with no explicit ordering defined between them. The program is well-defined if

  1. every value is defined exactly once (SSA),
  2. the dataflow graph connecting operations defining values with their uses is acyclic (acylicity) and,
  3. it can be successfully typed, i.e. each value can be assigned a type such that the type signature of each operation in the program is satisfied (well-typedness).

The lack of explicit statement ordering differentiates minIR (and HUGR) from most classical IRs, which unless specified otherwise typcially assume that instructions may have side effects and thus cannot be reordered. All quantum operations (and the classical operations we are interested in) are all side-effect free, which significantly simplifies our IR.

Linear Types and Structured Control Flow #

We have studied so far how to express the data flow of a program, i.e. how outputs of previous computations feed into the inputs of subsequent ones. An equally essential function of an IR is to capture the control flow of a program: the order of execution of the program instructions, which is how an IR can express loops, conditionals, function calls etc.

The most popular – and simplest – way of expressing control flow in classical IRs is with the introduction of branch statements1. For instance, LLVM IR provides a conditional branch statement

br i1 %cond, label %iftrue, label %iffalse

that will jump to the iftrue label if %cond is true and to the iffalse label otherwise.

This is a both simple and versatile approach to control flow that can be used to express any higher level language constructs. However, in the context of quantum computing, the combination of value-based semantics, conditional branching and the no-cloning theorem is proving to be a toxic brew.

Indeed, quantum data introduces an additional condition on the well-formedness of IR programs: every value that represents a quantum state must not only be defined exactly once, but must also be used exactly once. We call a type that satisfies this constraint a linear type2:

  1. every value with a linear type is used exactly once (linearity).

In the absence of conditional branching, this is a very simple verification pass to perform on our IR, the exact opposite of checking that each value is defined exactly once.

However, as conditional branching is introduced into the IR, the linearity constraint would have to be relaxed to allow for a single use in each mutually exclusive branch of the program. The following two uses of b should be allowed (in pseudo-IR):

b := h(a)
<if cond> c := x(b)
<else>    d := h(b)

This is now a much harder invariant to check on the IR and would be extremely error prone! Instead, we resort to structured control flow to express control flow at a higher abstraction level and maintain the linearity constraint in its simplest form.

Regions #

We introduce a hierarchical nesting of IR blocks that we will call regions, following the MLIR terminology. Regions delimit a set of instructions that can then be passed to higher order control flow constructs (functions, conditionals, loops etc.) and be used as their bodies3. To simplify, minIR restricts every value to only be available within its defining region. In particular, unlike nested scopes in most programming languages, regions do not have access to values of regions within which they are nested.

The values to a region are passed in as arguments to the region, as in function calls in programming languages. Let us call UTU \in T^\ast the types that the region uses, i.e. takes as inputs and DTD \in T^\ast the types it defines. We require every such region to have a designated, unique in: () -> U and out: D -> () operation4. We introduce the following text syntax for regions:

region_name := def {
    arg1, arg2, ... := in()
    # region body using only values arg1, arg2, ...
    ...
    # and defining values res1, res2, ...
    () := out(res1, res2, ...)
}

We can then use region_name as any other value in minIR to refer to the region. Its type will be a special Region<U, D> type that is specific to the input and output types of the region. The only way we can obtain a value of this type is from such a region definition5. Note that regions here are parametrised on type strings instead of mere types. Whenever we use type strings as parameters or operation type signatures, they should be viewed as splatted, i.e. the pair U, D should be read as the concatenation of types in UU and DD.

Regions allow us to define (very minimal) subroutines! We can define a call operation to make use of them:

call<U, D>: Region<U, D>, U -> D

We can thus define a region to obtain a Region value that we then pass on to call operations to invoke the subroutine. Finally, the out statement of the region designates the values that the subroutine returns. These are in turn passed to the outputs of the call operation where the subroutine was invoked. In summary, a subroutine definition and invocation may look like this:

f := {
  a, b := in()
  c, d = cx(a, b)
  out(c, d)
}

z, w := call(f, x, y)

The effect of the call operation to a subroutine is to redirect the control flow to the set of operations within the region. We can obtain the sequence of operations that result from the invocation by replacing all uses of values output by the region with the input values to the call op, and conversely, swapping out the values defined by the outputs of the call op for the values in the subroutine in the out statement. In pictures:

The following subroutine invocation

c a l l r e g o i i u n o t n

results in the following control flow

c a l l r e o g u i i t n o n

The left diagram represents the IR, with arrows connecting value definitions to their uses (the arrow with rounded edges is the Region value passed to call). The right diagram on the other hand shows the simplified IR, where the subroutine was inlined, i.e. the call operation was replaced with the region as its body.

Notice that because of the precise type signatures we imposed on the call operations and the region definition, the types of the values that are rewired between the left and right diagrams will always match.

Conditionals #

Using the same Region value, we can model conditional control flow and loops similarly. We introduce for this a new boolean type Bool that can hold the values True or False:

ifelse<U, D>: Bool, U, Region<U, D>, Region<U, D> -> D
dowhile<T>: Region<T, (T, Bool)>, T -> T

The ifelse operation is passed four inputs: a boolean condition, a set of input values and two Regions. Depending on the value of the boolean condition, one of the two Regions will be evaluated by passing it the input values. The dowhile operation is similar, but instead of being provided a boolean conditional as input, a Bool value can be computed by evaluating the Region passed as input. At the first iteration, the region passed to the dowhile operation is evaluated on the input value of type T . Each region evaluation returns a new value of T. If the returned boolean condition is true, the Region is re-evaluated on the new value of T. The loop terminates when the returned boolean condition is false. The outputs of the dowhile operation are the values of T returned by the last evaluation of the region.

The ifelse operation

i f _ r i e f g - i c o o o u n d t e i f i o n i u p f t u e p t l u s s t e s B o o e l l s e e l e s l e s - o e c u _ o t r d e e g i o n

The dowhile operation

i d o n o u p w t u h p t i u s l t e s l o r o r e p e g - t i c u o o r n d n e r e p e a t ?

This approach can be extended to support virtually any control flow primitives, as required by the available programming language abstractions. For the purposes of minIR, we will contend ourselves with the three constructs just introduced.


All the concepts introduced here also embed themselves very easily within the MLIR-based quantum IRs, as well as the (proper) HUGR IR. In this sense, our toy IR serves as the minimum denominator across IRs and compiler technologies, so that proposals and contributions we are about to make can be applied regardless of the underlying technical details.

By waiving goodbye to the circuit model, we have been able to integrate much of the theory of traditional compiler design and has brought us much closer to traditional compiler research and the large scale software infrastructure that is already available. This naturally gives us access to all the classical optimisation and program transformation techniques that were developed over decades. Using structured control flow, we were also able to model linear resources such as qubits well – by using value semantics and SSA, checking that no-cloning is not violated is as simple as checking that each linear value is used exactly once.

Finally, this new design is also extremely extensible. Not only does it support arbitrary operations, the type system is very flexible, too. There is dedicated support for linear types, but this does not have to be restricted to qubits: lists of qubits could be added or even, depending on the target hardware, higher dimensional qudits, continuous variable quantum data, etc.


  1. You may know this from prehistoric times as the goto statement, in languages such as Fortran, C, and, yes, even Go↩︎

  2. The terminology comes from “linear” logic Girard, 1987Jean-Yves Girard. 1987. Linear logic. Theoretical Computer Science 50, 1 (1--101). doi: 10.1016/0304-3975(87)90045-4. We apologise for slamming additional semantics on an already very overloaded term. ↩︎

  3. i.e. a region could be used as the body of a function, or the code within an if block. ↩︎

  4. Here and in the following, () designates the empty type string, or unit type. In our context it designates functions that take no input or return no output. ↩︎

  5. This is like the label of a block in LLVM IR. You can also view it as a function pointer, but note that the code that it points to is always fixed in the IR! ↩︎