Chapter 6
A Framework for Scalable Graph Rewriting
With this chapter we have arrived at the core of this thesis: a proposal for a new quantum compilation framework. In summary, our claim is that given
- the challenge of scaling up quantum programs sizes to make the most of the computational capabilities of upcoming hardware (cf. sections 2.1 and 2.3) and
- the modularity and expressiveness that quantum compilers will require to simultaneoulsy express higher level abstractions, hardware primitives and interleaved quantum classical computation (cf. sections 3.3, 2.4, and 2.5),
graph rewriting is uniquely positioned to serve as the backbone of a quantum compilation framework.
Our proposal draws much from the design and techniques of classical compilers (cf. sections 5.1 and 3.3). Quantum, however, distinguishes itself in two ways, forming the cornerstones of our design. The focus on small, local graph transformations for quantum optimisation is justified by the groundbreaking work by Clément et al Cléme., 2023. 2023. A Complete Equational Theory for Quantum Circuits. In 2023 38th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), June 2023. IEEE, 1--13. doi: 10.1109/lics56636.2023.10175801 Cléme., 2024. 2024. Minimal Equational Theories for Quantum Circuits. In Proceedings of the 39th Annual ACM/IEEE Symposium on Logic in Computer Science, July 2024. ACM, 1--14. doi: 10.1145/3661814.3662088. They showed that the rich algebraic structure of quantum circuits can be fully captured and expressed using local rewrite rules1. Our compiler can therefore restrict program manipulation to local transformations without losing expressiveness. This design choice in turn opens the door for large scale optimisation and compilation on parallel or even distributed hardware.
Equally important, the linear types of quantum computing (cf. section 3.3) significantly constrain the space of possible program transformations. Our contributions in this thesis highlight how these restrictions can be leveraged to create quantum-specific variants of classical compilation techniques that scale much more favourably. This makes approaches that are too expensive for classical compilers (cf. section 5.1) perfectly feasible2 in the context of quantum compilation.
6.1. Inspiration: Superoptimisation
This needs to be reorganised
Optimisations in compilers are typically built as passes. Compiler authors then design sequences of these passes to create the default optimisation pipelines that most end-users rely on. This is not easy. For a large project like LLVM, the end result looks something like this – a thousand lines of meticulously commented code, carefully crafted and hand tuned to handle every optimisation edge case and perform well on thousands of benchmarks. This is the phase-ordering problem Click, 1995. 1995. Combining analyses, combining optimizations. ACM Transactions on Programming Languages and Systems 17, 2 (March 1995, 181--196). doi: 10.1145/201059.201061 – one of the hardest problems in classical compilation.
Quantum compilers are not immune to this, either. The same pass ordering sorcery can be found just as well in TKET, a leading quantum compiler Sivara., 2020. 2020. t|ket⟩: a retargetable compiler for NISQ devices. Quantum Science and Technology 6, 1 (November 2020, 014003). doi: 10.1088/2058-9565/ab8e92. It does not look quite as scary yet, but the code is growing and some passes are already being called three times within the default sequence. This is hard to maintain and any change and new feature comes at the risk of degrading the compiler performance on established use cases.
If we are serious about adopting classical compiler tooling for quantum computations, we will need to find a more convincing solution to this problem. Fortunately, classical compilation is a mature field that has already experienced (and solved!) most of the challenges that quantum compilers have faced, and will ever face, in peephole optimisation. Our proposal, described in chapter 6, combines two modern compilation techniques that mitigate this and which we will review now: Superoptimisation and Equality saturation.
Large compiler projects support a multitude of program representations and large sets of operations/instructions to be able to generate code for a wide array of devices and architectures. Designers of optimising compiler passes must thus necessarily put constraints on the program input format that the pass will accept. In all likelihood, the pass will furthermore only be applicable for certain cost functions or for a limited set of hardware target. This all leads to a proliferation of special purpose compiler passes that are bug-prone and must be carefully ordered to perform well across all the use cases of interest. On top of this, new instruction sets, architectures or new cost functions require new sets of compiler passes, in effect rebuilding the entire compilation pipeline.
As early as 1979, Fraser suggested deriving peephole optimisations automatically Fraser, 1979. 1979. A compact, machine-independent peephole optimizer. In Proceedings of the 6th ACM SIGACT-SIGPLAN symposium on Principles of programming languages - POPL ’79. ACM Press, 1--6. doi: 10.1145/567752.567753, an idea Massalin coined superoptimisation Massal., 1987. 1987. Superoptimizer: a look at the smallest program. In Proceedings of the second international conference on Architectual support for programming languages and operating systems, October 1987. ACM, 122--126. doi: 10.1145/36206.36194. Instead of hard-coding the peephole optimisations as passes to be ordered and executed, useful valid program transformations are synthesised automatically. Early work used probabilistic verification, which meant that program transformations were generated ahead of time, and once verified manually, added to the compiler for use in optimisation Massal., 1987. 1987. Superoptimizer: a look at the smallest program. In Proceedings of the second international conference on Architectual support for programming languages and operating systems, October 1987. ACM, 122--126. doi: 10.1145/36206.36194 Sands, 2011. 2011. Super-optimizing LLVM IR. (November 2011). Retrieved on 13/01/2025 (LLVM Developer's meeting) from http://llvm.org/devmtg/2011-11/Sands_Super-optimizingLLVMIR.pdf. With advances in theorem proving techniques, program transformation synthesis and verification was further automated, resulting in end-to-end automatically generated and proven compiler transformations Bansal, 2006. 2006. Automatic generation of peephole superoptimizers. ACM SIGARCH Computer Architecture News 34, 5 (October 2006, 394--403). doi: 10.1145/1168919.1168906 Sasnau., 2017. 2017. Souper: A Synthesizing Superoptimizer. CoRR abs/1711.04422.
A particularly simple superoptimising compiler design was proposed in Jia, 2019. 2019. TASO: optimizing deep learning computation with automatic generation of graph substitutions. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, October 2019. ACM, 47--62. doi: 10.1145/3341301.3359630, specially for the purposes of computation graph optimisation Fang, 2020. 2020. Optimizing DNN computation graph using graph substitutions. Proceedings of the VLDB Endowment 13, 12 (August 2020, 2734--2746). doi: 10.14778/3407790.3407857 in deep learning. The set of all small programs, up to a threshold, is generated ahead of time and partitioned into disjoint classes of equivalent programs. This concisely expresses every possible peephole optimisation up to the specified size: for every small enough subset of instructions of an input program, its equivalence class can be determined. Any replacement of that set of instructions with another program in the same equivalence class is a valid transformation, and thus a potential peephole optimisation.
This framework was adapted to quantum circuit optimisation in Xu, 2022. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433 and separately in Xu, 2023. 2023. Synthesizing Quantum-Circuit Optimizers. Proceedings of the ACM on Programming Languages 7, PLDI (June 2023, 835--859). doi: 10.1145/3591254. On top of excellent performance, this approach is extremely flexible. For any supplied cost function, the compiler can explore all valid program transformations to find the sequence of transforms that minimise cost. This keeps the cost function-specific logic separate from the transformation semantics of the program, making it straightforward to replace or update the optimisation objective.
Furthermore, the set of supported primitives and architectures can easily be extended at any time by supplementing the equivalence classes with new programs. Any input using this extended set of primitives can thus be successfully compiled and any output program format constraints can be enforced by restricting the set of valid transformations.
The adaptation of superoptimisation to quantum optimisation of Xu, 2022. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433 and Xu, 2023. 2023. Synthesizing Quantum-Circuit Optimizers. Proceedings of the ACM on Programming Languages 7, PLDI (June 2023, 835--859). doi: 10.1145/3591254 is however showing scaling difficulties: unlike classical superoptimisation which is usually designed to optimise small subroutines within programs, e.g. focusing on arithmetic instructions, single instruction multiple data (SIMD) etc., the technique should in principle be able to optimise quantum programs in their entirety. This leads to much larger programs that superoptimisation does not scale well to. An orthogonal scaling challenge has also been observed: in Xu, 2022. 2022. Quartz: Superoptimization of Quantum Circuits. In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation, June 2022. Association for Computing Machinery, 625--640. doi: 10.1145/3519939.3523433, optimisation performance was observed to improve markedly with larger sets of rewrite rules. However, each rewrite rule requires a separate run of pattern matching to find all possible applications. As a result, performance peaks at around 50 000 rewrite rules, after which the additional overhead from pattern matching becomes dominant, deteriorating the compilation results.
6.2. Overview
The framework we propose can be summarised by the diagram below. It is composed of two main components: pattern matching is tasked with finding all possible applications of rewrite rules from the set of pre-defined rules. Importantly, for this computation, we introduce an offline pre-processing step that builds a data structure that is amenable to pattern matching from the supplied set of rewrite rules. This allows us to obtain significant speedups in the exploration phase of the framework. The design of the pattern matching engine is described in more detail in section 6.3, while the original contributions with a detailed presentation of the pre-processing computation and an analysis of the asymptotic speedup are presented in chapter 4.
The second component is charged with the application and prioritisation of the discovered rewrites. Depending on the cost function and heuristics employed, it may select a subset of rewrites or any combination of multiple rewrites to produce zero, one or many new candidate programs that are equivalent to the input program. We discuss several rewriting strategies in section 6.4. Beyond that, we argue that optimisation based on local rewriting must be done in parallel to scale to large input sizes. For this purpose, section 6.4 presents a refinement of our framework for the distributed setting. To enable this, a new distributed data structure inspired by equality saturation is presented in chapter 5.
The simple heuristic driving our optimisation loop, consisting of pattern matching and rewrite application, relies on a priority queue, which orders all candidate programs by increasing cost function. This allows us to keep track of all the best candidates seen so far, with the top of the queue always containing the current best optimised program. Optimisation can thus be terminated at any time based on a timeout. This corresponds to a simple backtracking search, which greedily chooses the best candidate at each step, and retreats to optimising previous candidates once all better candidates have been exhausted. Improvements to the search heuristic are an obvious area for future work, which we do not explore in this thesis but briefly discuss in our concluding chapter, chapter 7.
Overview of the proposed quantum compiler optimisation framework. The two main components (white square boxes in the figure) are discussed in details in sec. 4.2 and 4.3 respectively.
6.3. Pattern matching engine
As discussed in the review of superoptimisation (sec. 3.5), the adaptation of the technique to quantum computing benefits from the use of thousands of rewrite rules. However, the performance reaches a ceiling as the number of rewrite rules grows, when pattern matching becomes the bottleneck. This was the main motivation for the work in Mondada, 2024. 2024. Scalable Pattern Matching in Computation Graphs. In 15th International Workshop on Graph Computation Models, which speeds up pattern matching by introducing a pre-computed state automaton data structure. The result is pattern matching in a runtime that is independent on the number of rewrite rules being considered – reaching a 20x speedup on datasets of practical interest.
We are intersted in a fast pattern matching engine because it sits at the center of the “rewriter” component that appears in the diagram above – and in particular on which the offline pre-processing phase is based. Let us zoom in a bit on the rewriter component to understand. The workings of the “Rewriter” (offline) and “Pattern matching” (online) components of the overview diagram presented earlier can be refined into the following schema.
Detailed view of the “Rewriter” and “Pattern matching” components of the optimisation framework.
Instead of taking ready-to-apply rewrite rules as input to the offline pre-computation phase, we now take a large set of quantum programs, typically enumerated automatically. To construct rewrite rules, we cluster the programs into sets of equivalent programs. This is particularly simple in the case of small quantum circuits, as the program can be equivalently represented by a unitary, which can for instance be hashed and put into buckets to gather semantically equivalent programs together.
The valid rewrite rules correspond to transformations between programs within the same equivalence class. One may consider all possible rewrite rules within a class of size , or alternatively pick a representative and only consider rules to and from . An arbitrary rewrite between two elements of a class can then be expressed by composing two rewrite rules to and from 1.
The set of all patterns of interest are then all left hand sides of the rewrite rules – i.e. every program in any of the equivalence classes. These are passed off to the pattern matching engine, that will first pre-process the patterns offline to build a pattern matcher, and then use the matcher to find all matches in the data. With a simple map from the left hand sides to the right hand sides of the rewrite rules, we can then convert every pattern match into a rewrite to be considered during optimisation.
The inner workings of the pattern matching algorithm are discussed in-depth for the use case of portgraph matching in chapter 4. In this section, we instead present pattern matching abstractions and a novel user interface that surfaces the capabilities of the pattern matching engine in a declarative and extensible way. This allows the user to define pattern matching languages that are tailored to their domain of application and the rewrite rules of interest, without going into the details of the pattern matching engine.
Indexing schemes #
We abstract away the domain of definition of patterns and the host data, i.e. the data being matched, using a structure that we call an indexing scheme. The main object of interest within indexing schemes are index maps, partial maps from a key set to a value set Pattern matches will be expressed as instances of index maps – and are thus typically chosen so that there are (obvious) injective maps and for any pattern and data in the application domain. For simplicity of presentation – and in a blatant abuse of notation – we will assume in this section that these injective maps are the identity, i.e. that always and . As a result, whenever the domain of defintion of contains and is injective on it, the index map restircted to can be viewed as a pattern embedding of into :
Note that and is often a valid choice – this corresponds to concrete patterns that will match data exactly. Keeping the interface more general, however, allows for more expressive pattern languages that may capture entire subsets of as a single pattern.
Index maps enable pattern matching through the repeated expansion of their domain of definition: starting from the empty indexing map with , new index maps are created using an user-provided expansion function: Informally, this provides all the ways in which can be extended. This can be specified in terms of the following three properties:
-
for all (valid extensions).
-
For all valid embedding such that , there exists such that (preserve all embeddings).
-
for all (progress must be made).
where we introduced as a shorthand to mean that is equal to when restricted to :
Using this, the pattern matcher can proceed by keeping track of a set of index maps that satisfy the invariant This holds trivially for . The expansion step maintains this invariant, as a direct corollary of property 2. Inductive application of property 3 guarantees that all index maps with at most elements will be discovered after iterations. Writing for the index maps after the -th iteration and for the maximal cardinality of the patterns, we can thus formulate the following “completeness” property for our pattern matcher:
To make sure, on the other hand, that only valid embeddings2 are present in , we introduce constraints.
Constraints and Predicates #
At its simplest, a constraint is a boolean-valued (higher-order) function that is used to filter out invalid index maps: Given a set of constraints , the idea is then to interleave index map expansions with pruning using the constraints To obtain and efficient (and correct) implementation of pattern matching based on such constraints, it is however useful to specify some properties that the constraints should satisfy, in particular with respect to the extension of index maps3: is it valid to evaluate a constraint on index maps that are only partial embeddings? If so, on which partial maps can the constraint be successfully evaluated? And do index map extensions preserve constraint validity, or must constraints be evaluated repeatedly after each extension?
We constrain the expressivity of constraints with the introduction of predicates. A predicate is also a boolean-valued function4, but it is instead evaluated on a tuple of values in : We refer to as the arity of the predicate. We then (re-)define constraints as tuples of a predicate (of arity ), along with keys . A constraint can then only be evaluated on an index map if . It is then given by Not only does this define a clear condition of when it is valid to evaluate a constraint on a partial index map, it also ensures that for all index maps such that and , we have
This completes a bird’s eye view on how pattern embeddings are constructed and matched by the pattern matching engine. The key point is that this interface will allow us to match multiple patterns simultaneously by grouping patterns that share constraints together. Where necessary, pattern embeddings that satisfy conflicting constraints will be tracked separately using more than one set of index maps.
Making pattern matching efficient (aka. “implementation details”) #
The title of this section may tempt you to skip the next few paragraphs – don’t! There is more consideration sthat go into this than first meets the eye.
A reasonable proxy for the runtime of the pattern matcher is the total number of constraints that must be evaluated. This is the accumulation5 of the total number of constraints that the patterns being matched require, aggregated over all the index maps that we must keep track of throughout the computation. Our overall strategy to minimise this is summarised as the combination of the following three principles
-
Minimise the number of index maps to keep track of by maximising index map overlaps between patterns.
-
Express patterns using constraints that are shared or can be evaluated jointly across multiple patterns.
-
Prune away candidate index maps that do not extend to valid embeddings as early as possible.
To an extent, these are of course application domain dependent and can only be achieved with the willing cooperation of a dedicated user6 – principle 1 is for instance best achieved through some form of canonicalisation of the patterns, such that similar patterns can share the same keys. However, it remains up to the interface designer to encourage this behaviour and ensure that the pattern matching engine can make the most of a well thought out implementation. Continuing our example, we have already seen that our interface requires each pattern and data to be mapped explicitly to common key and value sets , which is the first step towards canonical identifiers.
We further restrict the number of index maps that must be kept track of by refining the expansion function we introduced earlier to be able to specify the key that an index map should be extended to. The set must satisfy the following modified properties:
-
for all (valid extensions).
-
for all valid embedding such that and , there exists such that (preserve all embeddings of ).
-
for all . (include ).
The most obvious change is in the third property, where instead of requiring that the domain of definition is enlarged in some way, we now require explicitly that be added to the domain of definition. The other change is a weakening of property 2. Unlike the original expansion function, in this case the expansion is no longer “complete”, in the sense that for some with , it may be that for any , even though .
With the function, we can extend index maps to cover exactly the subset of required to evaluate a constraint of interest, thus avoiding to extend index map beyond the necessary keys (and accumulate multiple versions of them in ).
Now, how do we choose which constraints we consider, and in which order? This is key not only to minimising the number of index maps when using the function, but also to maximise the number of constraints that can be shared across patterns.
Rather than relying on rigid decompositions of patterns into sets of constraints, we achieve this by selecting and applying one constraint at at time from the patterns. As we progress, we can pick constraints that will result in the best performance, dynamically and specially for the patterns under consideration.
Concretely, we establish a protocol whereby patterns are asked to nominate constraints that they would like to evaluate. The results are pooled together and the constraint that is shared by the most patterns is applied. The patterns can then be updated, conditioned on the application of the constraint and the process can be repeated until all patterns have been reduced to tautologies. Further benefitting performance, it suffices to run this once during the pre-processing phase ahead of any pattern matching, so that this optimisation implements principle 2 without incurring any runtime pattern matching cost.
To further maximise constraint overlaps, we introduce constraint classes. They are sets of constraints that the user can specify to indicate that the constraints share a logical relation – in statistics words, that the constraints are not independent. A particularly useful property of constraint classes would be a relation of pairwise mutual exclusion between elements of the class: the satisfaction of any of the constraints in the set precludes any other from being satisfied. More trivially, a constraint class can also group together constraints that should be evaluated jointly, for a more efficient implementation.
Patterns having an overlapping constraint class is thus a useful generalisation of shared constraints that is much more broadly applicable. This undoubtedly begs the question of how various constraints and constraint classes, possibly of greatly differing “value”, should be compared with and prioritised over one another – besides by counting the number of patterns that share them.
We propose a measure to quantify the “utility” of a constraint for matching a given pattern. Using this we can prioritise highly valuable constraints, which, when their evaluation is negative, result in early prunings of irrelevant index maps (principle 3). We define the expansion factor of a constraint as the expected relative change in the number of the index maps that results from conditioning on : This value is of course hair rippingly hard to compute precisely in most cases – presumably, it would require a model of the precise distribution of the index maps during pattern matching that would have to depend on the distribution that the data being matched is drawn from, along with all the previous constraints that have been applied.
We could spend the rest of this thesis (and then some) doing these computations, or we can choose to ignore all these intricacies, fix to be the subset of the index maps on which is defined, and just take the frequentist’s interpretation of the ratio within the expectation value as the probability that the constraint is satisfied: with drawn uniformly at random. Much easier! From here on, we take the latter as the definition of .
By asking the user to provide some heuristic approximation of , we can gauge the effect on the number of index maps of picking a constraint and conditioning on it as roughly
We propose a further generalisation of this metric to quantify the value of any subset of constraints in a constraint class. For a constraint class , we define the expansion factor of as In the (uninteresting) case of pairwise independent constraints in , this simplifies to , as one would expect.
Being a strict generalisation of the single constraint case, in the following (and in our implementation), we always consider expansion factors over sets of constraints. Importantly, however, we restrict expansion factors to only be defined over sets of constraints within a shared constraint class – this simplifies significantly the implementation as it does not require to handle any arbitrary combination of constraints, and makes constraint classes an important tool to express constraint dependencies in a way that the pattern matcher can reason about.
Lower values of will result in a smaller number of index maps. For every constraint we are considering applying, we can query the constraint classes that it is part of and group constraints by class. We can then obtain the expansion factor associated with the constraints in each of the constraint classes and pick the class with the smallest expansion factor. The following pseudo-code summarises the constraint selection procedure. Given a list of patterns, it returns a list of constraints from one constraint class such that the expansion factor is minimised.
fn best_constraints(patterns: [Pattern]) -> [Constraint] {
// For each pattern, collect all constraints that it nominates
// and group by constraint class
class_to_constraints = {}
for p in patterns {
for c in nominate_constraints(p) {
for cls in constraint_classes(c) {
constraints = class_to_constraints[cls] or []
constraints.append(c)
class_to_constraints[cls] = constraints
}
}
}
// Find class with smallest expansion factor
best_constraints = None
best_ef = None
for (cls, constraints) in class_to_constraints {
ef = expansion_factor(cls, constraints)
if best_ef == None || ef < best_ef {
best_constraints = constraints
best_ef = ef
}
}
return best_constraints
}
Finally, after selecting the constraints to be applied, the patterns are updated by applying the constraints one-by-one and constructing the pattern matcher for the (now simplified) patterns recursively:
fn build_matcher(patterns: [Pattern], root = None) -> Tree {
if root == None {
root = new_tree()
}
for c in best_constraints {
new_patterns = [condition_on(p, c) for p in patterns]
child = root.add_child(c)
build_matcher(new_patterns, child)
}
return root
}
Of course, the real implementation must keep track of additional information, such as the set of bound keys and the pattern matching at each node. Nodes that represent the same pattern matching state are also merged to reduce the size of the data structure.
This has certainly all been very abstract – and probably rather confusing.
To illustrate the ideas we just presented, the next section presents
an implementation of this interface for string pattern matching.
Among others this will include concrete implementations for the various functions
(nominate_constraints, constraint_classes, expansion_factor, condition_on etc.)
called in the pseudo-code.
An example #
The pattern matching framework we are proposing can of course model the matching problems on graphs and compiler IRs that we are mostly interested in. It can however be applied just as well to other domains.
Among the simplest possible use cases of this is pattern matching on strings.
Over the course of the previous paragraphs, we introduced all the main ideas
and concepts using a mathematical language.
In this section, we will use string matching as an excuse to revisit the same
ideas, this time staying much closer to the perspective of the user and the
actual programming interface as implemented in the Rust library portmatching7.
I suspect the reader will fall into one of two camps – either excited that
the rumblings in this thesis might at last show some early signs of coherence,
or utter disbelief that the presentation is about to become more applied still.
In either case, rest assured that no knowledge of Rust is expected – we will
mimick the real interface as much as possible, but the “code” we will write
will be straight-forward pseudo-code8
Let us start with the indexing scheme. For this we propose integer valued key
and value sets , .
To distinguish between and , we will write their
elements as PatternPos(i) and DataPos(i) respectively.
We use the indices to refer to the position of characters within the pattern
and data strings.
For a pattern and a data string with and characters
respectively, a pattern embedding is thus a map
that maps PatternPos(i) to DataPos(s + i) for some
and such that the pattern
matches the characters in between positions and , i.e. .
To define the indexing scheme, we provide the following two functions:
fn required_bindings(key: PatternPos) -> [PatternPos] {
if key == PatternPos(0) {
return []
} else {
return [PatternPos(0)]
}
}
fn list_bind_options(
key: PatternPos, data: String, known_bindings: Map<PatternPos, DataPos>
) -> [DataPos] {
if key == PatternPos(0) {
// Every position in data is valid
return [DataPos(i) for i in [0, 1, .. data.len() - 1] ]
} else {
PatternPos(pattern_offset) = key
DataPos(root_binding) = map.get(PatternPos(0))
if root_binding + key < data.len() {
// Valid position is obtained by adding offset to root binding
return [DataPos(root_binding + pattern_offset)]
} else {
// The only valid position would be beyond the end of data
return []
}
}
}
The first is independent of the data string being matched.
It is thus sufficient to call it once for each pattern key in the offline
precomputation phase.
The keys returned are the dependency of the key passed as argument:
they must be bound before key can itself be bound.
This defines a dependency graph of all key bindings, from which a total order
of key bindings can be determined (assuming the dependency graph is acyclic).
In our case, we declare that before any index can be bound,
first a binding for the index must be provided.
The latter index, on the other hand, can be bound without requiring any
prior bindings.
The second function captures the binding logic proper and corresponds to the
function introduced in the previous section.
The user implementing this function may safely assume that all required bindings,
as returned by required_bindings, have already been bound and can be accessed
using the binding map passed as third argument.
We see in the implementation why we made every other binding depend on
PatternPos(0): once the latter is bound (to any position in the data string),
all other pattern posistions resolve to a unique DataPos9, making the implementation
of list_bind_options very simple.
We now turn to the set of valid predicates, from which pattern constraints are defined. To add a bit of spice here (and incidentally support string matching capabilities more powerful than regular expressions), we consider a somewhat extended pattern language. Suppose our data strings are drawn from an alphabet , i.e. . Instead of taking pattern strings in , we instead extend the alphabet with a (disjoint) set of symbols and take patterns in . Characters of a pattern in must match the character they are mapped to in exactly. Characters in , meanwhile, can match any character; however, any two occurences of the same symbol in a pattern must map to identical characters in .
In our implementation, we fix to be lowercase letters 'a' – 'z' and identify
symbols in by prefixing a character with $, e.g. $x and $y.
For instance, the pattern ab$xc$x will match fooabccc starting from
position 3 (i.e. mapping PatternPos(0) to DataPos(3)),
but will not match abccd, as $x would have to be assigned once to c and
once to d.
To capture these matching semantics we introduce two predicates10
ConstVal(c)for allcinBindingEq
The first predicate is of arity 1: given an index DataPos(i), it will evaluate
to true if d[i] == c is satisfied.
This corresponds to matching characters of the pattern string within .
On the other hand, BindingEq, of arity 2, compares the characters at
two positions DataPos(i) and DataPos(j) for equality: d[i] == d[j].
Such a predicate is used for every occurence of a symbol of in the
pattern beyond the first one.
We provide this predicate evaluation in the simple function check.
fn check(predicate, args: [DataPos], data: String) -> bool {
if predicate == ConstVal(c) {
[DataPos(i)] = args
return d[i] == c
} else { // predicate == BindingEq
[DataPos(i), DataPos(j)] = args
return d[i] == d[j]
}
}
We then define our pattenrs by a set of constraints that must be satisfied.
Each constraint is expressed by a predicate along with one or two pattern keys
PatternPos(i), indicating which characters the predicate applies to.
Continuing the example of the pattern, ab$xc$x, we can express it by the
set of constraints
[
(ConstVal('a'), PatternPos(0)),
(ConstVal('b'), PatternPos(1)),
(ConstVal('c'), PatternPos(3)),
(BindingEq(PatternPos(2), PatternPos(4))
]
Note that in this latter representation the name of the variable ` is no longer specified – this simplifies the pattern. It will also increase the overlap of contraints between patterns by removing duplicates that differ in variable name only.
The task of the pattern matcher is then, given some data string, to
find a binding of the positions between PatternPos(0) and PatternPos(4)
to some data positions DataPos(i) … DataPos(i + 4) for some i
such that all constraints above are satisfied11.
A concrete decomposition of the pattern would include the ordering in which
the constraints should be evaluated. For all but the simplest cases, however,
fixing such a constraint decomposition and the evaluation order is likely
to yield inefficient pattern matchers.
As we discussed, the API instead expects the pattern decomposition to be
provided through the functions nominate_constraints, constraint_classes,
expansion_factor and condition_on.
For the implementations of nominate_constraints and condition_on, we
rely on the ConstraintPattern
type provided by portmatching.
Provided with the set of constraints above, an instance of ConstraintPattern
will keep track of which constraints have already been satisfied through
successive calls to condition_on. Calls to nominate_constraints will return
the set of constraints that are still left to be satisfied.
To conclude, we must indicate which constraints should be grouped together
and how to estimate their value.
We note that when evaluated on the same DataPos(i), any two ConstVal
constraints are mutually exclusive.
We can interpret a BindingEq arg1 arg2 constraint similarly, by viewing it as
identical to a ConstVal(c) arg2 constraint, where c is the character at
the position of arg1.
We hence choose to group together constraints that apply to the same position.
We introduce for this the classes AtPositionClass(i), for integer i.
fn constraint_classes(predicate, args: [DataPos]) -> [ConstraintClass] {
if predicate == ConstVal {
[DataPos(i)] = args
return [AtPositionClass(i)]
} else { // predicate == BindingEq
[DataPos(i), DataPos(j)] = args
// assuming i < j, and that DataPos(i) is bound before DataPos(j)
return [AtPositionClass(j)]
}
}
To simplify here, we chose to only assign BindingEq constraints to a single
class – this allows us to assume that the first key was already bound, and
therefore treat the constraint in the same way as ConstVal constraints.
We could also assign BindingEq to the two classes corresponding to the
two positions it applies to, but it then becomes harder to give a good estimate
for the expansion_factor.
Assuming every character in is equally likely to occur, this gives
the straight-forward approximation of the expansion factor 12
fn expansion_factor(constraints: [Constraint]) -> f64 {
return 1.0 / 26.0 * constraints.len()
}
There is still the branching factor!
Rewriting rule used in practice #
-
A bit like travelling by TGV. ↩︎
-
Given that we called the previous property completeness, we could call this soundness. ↩︎
-
It would be possible in theory to delay all evaluations of the constraints until the end of the pattern matching process, but not without incurring a blow up of . ↩︎
-
…and will also be used to filter index maps, just like constraints. ↩︎
-
Or an “integral”, as an engineer would call it. ↩︎
-
Thankfully, this often happens to be ourselves. ↩︎
-
Call it pseudo-Rust if you will – pseudo-code with a Rust-inspired syntax, but simple enough for anyone to follow. The complete, working implementation of this example is available within the portmatching crate in the module
portmatching::concrete::string. ↩︎ -
That may or may not exist in . ↩︎
-
To be precise,
ConstValis an entire family of predicates. ↩︎ -
Note: that the bound
DataPosindices will form a contiguous interval is already guaranteed by our design of the indexing scheme – replacing with an indexing scheme allowing non-contiguous but monotonic indices, we would obtain pattern matching on string subsequences. ↩︎ -
A better approximation in
expansion_factorshould take into account that multipleBindingEqconstraints can result in the same constraint, if the data contains multiple repetitions of the same character. One could for example treat the two constraint types independently, and assume thatBindingEqconstraints are satisfied with probability , independently of one another. ↩︎