5.5. Bounding the search space size
We show in this section that under some assumptions on the GTSs that hold in the use cases of interest to quantum compilation, there is a provable gap between the size of the search space of all reachable graphs in the GTS and the size of the corresponding confluently persistent data structure .
Let us introduce first the notion of overwriting rewrites.
For two rewrites and , we say that overwrites , written , if the deletion set of includes a vertex of the vertex set of the replacement graph of
The definition can identically be applied to events. In this case, the overwriting events are precisely given by the parent-child relation: the set of all overwriting events of is by definition the set of parents of .
Our argument relies on the comparison of asymptotic bounds for the sizes of two sets and , which we now define. Consider a GTS and an input graph . A graph is reachable from within depth if there is a sequence of rewrites in the GTS from to such that all subsequences formed of overwriting rewrites have length at most .
The set is the set of all graphs reachable within depth . We derive:
- a lower bound on the size of , the space of all graphs reachable in at most rewrites from some input , and
- an upper bound on the size of the equivalent confluently persistent data
structure , i.e. such that
In order to obtain bounds, we will introduce hypotheses that the GTSs must satisfy. Throughout this section, we will illustrate and motivate the restrictions that they impose in the following two use cases.
Use case 1: -complete GTS #
The first GTS we consider can be defined on any graph domain that has a notion of graph size (e.g. based on number of nodes, number of edges, etc.1) and for any graph size . The GTS is such that for any subgraph of size , there is at least one transformation rule in the GTS that matches . We will call this case CompleteGTS.
This is the use case of quantum superoptimisation discussed in section 3.1 and used for benchmarking in section 4.7. In those cases, the transformation rules are obtained by enumerating all small circuits up to a certain size , thus guaranteeing that any subcircuit of size will be matched by the GTS.
Note that there is also an (obvious) upper bound on the number of transformation rules that can match on any given subgraph: the total number of transformation rules in the GTS.
Use case 2: single-rule GTS in a uniform domain #
At the other extreme of the GTS spectrum, we can consider a GTS made of a single (arbitrary) transformation rule. In this case, we require that graphs are drawn from a domain uniformly at random, so that for any subgraph , all patterns of size are equally likely. We will call this case SingleRuleGTS.
In this case, we will not show that our hypotheses hold for all inputs, but rather that they hold with a high probability. We will phrase our statements as a function of and will require that they hold with probability for randomly drawn .
This regime is interesting as it is the simplest instance of problem domains for which few assumptions can be made about the GTS themselves, but all inputs are expected to be equally likely.
Lower bound on the naive search tree #
The event history of the set of graphs defines a tree , where are the nodes and is the parent of if there is a rewrite rewriting to in the GTS. Paths in are sequences of rewrites. We call the naive search tree of the GTS. We wish to derive a lower bound for .
Graph partitioning #
For fixed search depth , let be the largest integer such that for all graphs , there exists disjoint subgraphs of that satisfies the following property, for all
there exists a rewrite in the GTS that can be applied to .
For a fixed GTS, a fixed depth and a family of input graphs , we have the scaling .
We conjecture that this scaling holds for many GTSs of interest:
CompleteGTS and for SingleRuleGTS probabilistically.In CompleteGTS, it suffices to partition any input into disjoint subgraphs of size at least . Each subgraph will match a rule of the GTS by definition.
For SingleRuleGTS, let . Let be the size of the left hand side of the GTS rule. By assumption, for any subgraph of size of an input , there is a constant probability that the rule matches . For a subgraph of size , the probability of the rule not matching in is . Picking
for some ensures that whenever (i.e. for large enough),
where was chosen. It follows that a partition of into disjoint subgraphs of size at least satisfies the hypothesis with probability .
Lower bound on #
Fix the tree depth and the GTS. Any rewrite from the GTS removes at most a constant number of vertices from the graph it applies on. Thus, Any graph in is at least of size . Let be the smallest value of for a graph . Whenever hypothesis 1 applies, we have .
For each and , pick a rewrite that applies to and let be the set of all such rewrites
We can consider the subtree that only contains graphs obtained by applying rewrites in .
For , the search tree will contain graphs: for each subgraph of the input graph , we can choose to either apply or not2.
By repeating times the search tree of size for , we obtain a lower bound
We frame this result as the following proposition.
As a result, for any GTS satisfying Hypothesis 1 the size of grows at least exponentially with input graph size and search depth .
Upper bound on the factorised search space #
Consider two rewrites and . If neither overwrites the other, i.e. and , then the order in which they are applied is irrelevant (see also Proposition 5.2). The persistent data structure uses this symmetry explicitly when exploring the set of reachable graphs.
This drastically reduces the size of the event history of . The event history defines a directed acyclic graph that is the equivalent for to the naive search tree of . The vertices of are the flattened histories of events in :
with an edge from to if there is a parent-child relation . We call the factorised search space of .
By construction, any graph in the naive search tree maps injectively to a subgraph of the factorised search space, given by the subgraph of induced by the rewrites on the path from to in .
Whereas our earlier discussion was focused on proving a lower bound for the size of the search tree, we now show an upper bound on the number of graphs in the factorised search space.
Graph covering #
Instead of considering partitions of graphs in as we did above, we now consider coverings of graphs in , i.e. a set of subgraphs such that their union is but that might not be disjoint.
Let , and be parameters and fix a covering for each graph such that:
- for all and all , there are at most applicable rewrites to the -th covering set of . Furthermore, all rewrites within are mutually exclusive, i.e. they modify a shared subgraph so that it is never possible to apply more than one rewrite among the (up to ) applicable ones3;
- for all and for all rewrites that apply to , there is such that applies to (i.e. the matching subgraph of is fully contained within one of the coverings). Furthermore, the matching subgraph of overlaps with at most distinct ;
- for all rewrites that apply to the covering set of a graph , the image must be a subgraph of the -th covering subgraph of .
The first condition is satisfied whenever the size of the coverings can be bounded: in that case can be chosen based on the number of distinct subgraphs that can be contained in a covering set and the number of rules that can apply to each. The second condition is related to the connectivity between the covering sets: can thus often be derived by considering how many neighbours a covering set has, and how many of those neighbours can a match of a GTS rule span.
The third condition above can be understood as “rewrites must preserve the coverings”. In other words, the coverings are chosen such that a graph mutation produced by the application of a rewrite on is always contained within a single covering subgraph of .
For a fixed GTS, a fixed depth and a family of input graphs , we have the scaling and and .
These conditions along with Hypothesis 2 are somewhat restrictive and future
work should explore how to relax them. For our use cases CompleteGTS and
SingleRuleGTS, we restrict our considerations to the special case where the
graph domain is quantum circuits. We make the further simplifying assumptions
(similar results can be obtained with variations on these assumptions)
- transformation rules have two-qubit circuits as left and right hand sides, and
- the number of qubits on the inputs is fixed (i.e. the number of gates on each qubit scales with circuit size).
Define to be the largest number of gates on any one qubit in the left hand sides of the GTS transformation rules. Consider a partition of the gates on each qubit into sequences of gates. We can obtain a covering of a quantum circuit by considering covering sets defined for all such that for all matches of a left hand side of a transformation rule of the GTS, if and only if there is a gate in such that is in the -th sequence of gate on some qubit in . Imposing the condition that rewrites must preserve coverings, the covering of the input fixes the covering of all reachable graphs in the GTS.
CompleteGTS and for SingleRuleGTS.Let be the input circuit with qubits. Consider the covering of as constructed above. By construction , where is the maximum number of gates on a qubit of .
The covering set contains the set of gates composed of the -th sequence of gates for each qubit in . Furthermore, for all , if is a match of a two-qubit rule that contains , then may contain at most other gates. Hence by construction, . This is a constant and thus there is a constant such that for all there are most matches of a two-qubit rule that intersect .
Finally, any match spans two qubits; the gates on each qubit (at most
) may belong to at most two distinct sequences of gates of that
qubit. Thus, any match spans at most distinct covering sets.
These arguments made no assumption on properties of the rule set and thus apply
equally to CompleteGTS and SingleRuleGTS.
Upper bound on #
The preservation of coverings under rewrites allows us to consider a covering of : for each , let be the set of graphs in that are the result of a rewrite of its -th covering subgraph. Every graph in is the result of a rewrite on some covering subgraph , or is the input graph . So, from a bound for all , we can obtain a bound
The bound can be obtained recursively: upper bounds by definition the number of rewrites in any covering subgraph of the root graph , and thus the number of graphs in . We then proceed by induction for .
A rewrite overlaps with at most other covering subgraphs. It can overwrite at most one previous rewrite for each subgraph, and thus will have at most parent graphs in sets . Each of the sets is of size at most . Furthermore, there are at most rewrites in any covering subgraph. We thus obtain the recursion:
Unrolling the recursion, we can write this as
Recalling that by construction , we obtain:
The factorised search space size is in .
Discussion and empirical exploratory analysis #
We have derived bounds on the size of the search spaces and shown that under some assumptions on the properties of the GTS, the factorised search space grows linearly in the size of the input graph . This stands in stark contrast to the lower bound of the naive search tree, which scales exponentially with the size of the input graph.
However, when considering the overall optimisation problem of finding the optimal solution over the set of reachable graphs in a GTS, the exponential overhead does not disappear: it is rather shifted to the extraction phase that relies on a SAT solver. It is therefore an open question whether the factorised search space can be used to improve optimisation problems on GTSs.
To this end, we devise a simple numerical experiment that assesses the potential of using the unfolding construction as presented in this chapter in the context of quantum computation optimisation.
The toy problem. We consider a very simple circuit optimisation problem that is desiged to require a deep search space (i.e. a large number of rewrites) to be solved. This will exacerbate the scaling difference between an optimiser that must traverse the naive search space and another that relies on the factorised representation instead.
The inputs are quantum circuits composed of two-qubit and single-qubit rotation gates. The angles of the rotations are not relevant and set randomly. They are of the following form:
i.e. each pair of subsequent qubits have 2 gates at either end and 10 rotation in-between, on the control qubits of the gates. These circuits admit a very simple optimisation that can be expressed by the following two transformation rules:
Given the objective of minimising the number of gates, the optimiser must commute the leftmost gates through all of the rotation gates, until the two on each qubit are adjacent and cancel out. We study the performance of the optimisers as we increase the number of qubits in the circuit.
Optimisers. We define two optimisers. Badger is a backtracking search through the naive search space of reachable graphs in the GTS: starting from the input, the search space is expanded by computing all possible rewrites at a given state. States with the lowest cost function are processed first. This is similar to an A* search Hart, 1968. 1968. A Formal Basis for the Heuristic Determination of Minimum Cost Paths. IEEE Transactions on Systems Science and Cybernetics 4, 2 (100--107). doi: 10.1109/tssc.1968.300136.
Seadog on the other hand performs the backtracking search on the factorised search space instead: when expanding a state of the search space, only rewrites that overlap with the last rewrite are considered and added to the search space, as discussed in section 5.4. In a second phase, the search space is encoded as a SAT problem that is solved using Z3 Moura, 2008. 2008. Z3: An Efficient SMT Solver. In Tools and Algorithms for the Construction and Analysis of Systems. Springer Berlin Heidelberg, 337--340. doi: 10.1007/978-3-540-78800-3_24.
The Badger optimiser is released and publicly available as part of the open-source TKET package4. The Seadog optimiser on the other hand is still in early development; more benchmarks and a release will follow.
Results. We ran the experiment on an Apple M3 Max CPU (4.05GHz) for inputs between ( gates) and qubits ( gates). Both optimisers ran on a single core. For each instance, we set a timeout of seconds and report the relative gate reduction, i.e.
The results are shown in the figures on the right.
Discussion. On the left, we observe that both optimisers are able to find the optimum for circuits with up to 30 CX gates. Beyond that point, the time limit starts impacting Badger performance, which drops continuously and reaches 0% for inputs of 50 CX gates and above. Seadog on the other hand does not time out and is able to explore the entire (factorised) search space exhaustively up until 70 CX gates.
Observe that the Badger optimiser reaches the time limit for as few as 10 CX. Indeed, the complete naive search space size can be calculated to have states (each pair of qubits can be in one of 12 states). For we get states, but this already reaches over states for .

CX gate count reduction (left) and runtime (right) for the Badger and Seadog optimisers. 100% gate count reduction is optimal. A timeout was set to 2 seconds.
Size of factorised search space for Seadog.
On the other hand, the factorised search space will only contain states for each qubit pair. This results in a linear scaling of the search space size, as can clearly be seen in the second figure.
Where the runtime exceeds the 2 second timeout, this is due to pre- and post-optimisation steps such as memory allocation/deallocation, I/O, file parsing etc that are included in the measurements. The quadratic runtime scaling that we observe in Seadog is due to a hash function that is run on every state of the search space to detect and discard duplicates: as the number of states in the search space grows linearly with input size and each state requires a hash in linear time, the overall runtime grows quadratically. Future work may be able to address this issue by designing updateable hash functions that do not require the full graph to be rehashed when applying a local rewrite.
Future work should also investigate how to scale Seadog to larger input sizes on a broader class of problems. We have observed that the SAT-based extraction phase of Seadog corresponds to less than 1% of the runtime budget (under 15ms for all input sizes). Whilst being asymptotically exponential in the worst case, it is thus not currently a bottleneck. On the other hand, the number of states visited per second in the exploration phase is currently up to slower for Seadog compared to Badger. Further investigations into the causes of this are still required, but we expect that large performance improvements can be realised on the current implementation and as a result could scale to larger inputs.
-
The only constraint on the notion of graph size is that it must be compatible with the subgraph relation: if , then ↩︎
-
Note that this counting is already an act of clemency: we are not counting all permutations of the rewrites, which would be considered separately by a naive exploration that applies one graph rewrite at a time. In this case, the search tree for would contain graphs. ↩︎
-
This can always be made to hold by replacing any set of mutually disjoint rewrites with their cartesian product, in effect viewing the application of multiple disjoint rewrites as one large rewrite. Thi comes at the cost of a larger value for ↩︎
-
As a Python package on PyPI and a rust crate on crates.io). ↩︎