Topological approximate Bayesian computation for parameter inference of an angiogenesis model

Abstract Motivation Inferring the parameters of models describing biological systems is an important problem in the reverse engineering of the mechanisms underlying these systems. Much work has focused on parameter inference of stochastic and ordinary differential equation models using Approximate Bayesian Computation (ABC). While there is some recent work on inference in spatial models, this remains an open problem. Simultaneously, advances in topological data analysis (TDA), a field of computational mathematics, have enabled spatial patterns in data to be characterized. Results Here, we focus on recent work using TDA to study different regimes of parameter space for a well-studied model of angiogenesis. We propose a method for combining TDA with ABC to infer parameters in the Anderson–Chaplain model of angiogenesis. We demonstrate that this topological approach outperforms ABC approaches that use simpler statistics based on spatial features of the data. This is a first step toward a general framework of spatial parameter inference for biological systems, for which there may be a variety of filtrations, vectorizations and summary statistics to be considered. Availability and implementation All code used to produce our results is available as a Snakemake workflow from github.com/tt104/tabc_angio.


Introduction
When analysing mathematical models of biological systems, we often aim to reverse engineer the parameters of the model by fitting to observed data.The Bayesian formalism provides a principled way to perform parameter inference that quantifies our uncertainty in the model parameters (see, for example, Kirk et al., 2015), but traditionally requires us to be able to write down an analytical function (the likelihood function) that returns the likelihood of a parameter vector given the observed data.
However, for many models of interest, there is no straightforward way to write down the likelihood function associated with the model.This is often due to the intractability of deriving a closed form expression for the model likelihood.In such situations, it may nevertheless be possible to apply a simulation-based inference approach termed Approximate Bayesian Computation (ABC; see, for example, Sisson et al., 2018), that substitutes a kernel on some statistics of the data for the model likelihood, and evaluates the fit of the model at a given set of parameter values through simulations.For given parameter realisations, the model is simulated, and the statistics of the simulated data compared with the same statistics of the observed data.Informally, regions of parameter space that correspond to simulated data sets whose statistics are "more similar" to those of the observed data will be associated with higher posterior probability than regions corresponding to simulated data sets with statistics that are "less similar" (where "similarity" is quantified using a pre-specified distance function).
Applying ABC, we can derive an approximate posterior distribution over the model parameters using standard sampling techniques such as rejection sampling.This approximate posterior distribution expresses our uncertainty in the model parameters, given the model and the observed data set.Recently, ABC parameter inference and model selection has been successfully developed for reaction-diffusion models (Warne et al., 2019).However, performing parameter inference for more general spatial models has been largely unexplored.
Topological data analysis (TDA) is a relatively new area of computational mathematics that quantifies the shape of data by computing topological properties of the data.There are various approaches of topological inference, e.g., level sets or mode clusters (Wasserman, 2018).The most prominent algorithm in TDA is persistent homology (PH; Carlsson, 2009;Edelsbrunner and Harer, 2010).PH takes in data and a metric, and outputs topological features (e.g., connected components and loops) and their persistence across different scales of the data.The computation crucially depends on the choice of filtration, which is a nested sequence of spaces built on the data, that is indexed by a scale parameter (Edelsbrunner and Harer, 2010;Ghrist, 2018).There are many software implementations for persistent homology (Otter et al., 2017); however, the software used is often selected based on the types of filtrations available within it.The choice of filtration for applications is an active area of research, and there is no one-size-fits-all filtration for biological applications (Stolz-Pretzer, 2019).The persistence of the topological features as well as where topological features appear and die in the filtration may provide insight into biological processes and models.
In previous work with spatial models of biological processes (Murray, 2003), TDA has been applied to test for spatial randomness (Robins and Turner, 2016), automatically detect zebra-fish patterns (McGuirl et al., 2020), characterise immune cell infiltration by changes in a chemotaxis parameter (Vipond et al., 2021), and cluster parameter regimes for angiogenesis (Nardini et al., 2021).Now we wish to address the inverse problem of recovering model parameters given some observed data, in the Bayesian formalism.ABC enables us to perform parameter inference in a statistical model on the basis of data summaries, even when there is no clear way to define a likelihood function for the model.One key challenge in ABC is the choice of summary statistic, as the statistic must capture the relevant information about the model parameters in the data to allow the parameters to be learnt.Here we show that TDA provides informative data summaries that enable parameter inference to be performed successfully in a spatial model.In particular, we consider as a case study the Anderson-Chaplain model of angiogenesis (Anderson and Chaplain, 1998).
In previous work in the literature Maroulas et al. (2020) model persistence diagrams as Poisson point processes and use this to allow a posterior to be inferred on a persistence diagram given some observed data and a suitable prior.This allows a posterior on topological features to be defined, and a scheme for performing Bayesian classification is developed, but it does not consider the case of performing inference on a parametric model, given an observed set of topological features.
In Sgouralis et al. (2017), Bayesian inference is applied in the processing of the data, but not in a topological context or for parameter inference in the model of interest.Instead various performance measures are evaluated for a small set of selected parameter combinations, not considering a distribution over parameters or a Bayesian posterior.
In this paper we first describe the model and data generation process applied, before describing TDA and ABC in general terms, and their specific application to the Anderson-Chaplain model.We demonstrate our suggested approach for parameter inference on simulated data from the Anderson-Chaplain model and compare the outputs to the results produced by other non-topological statistics.

Model Data
The Anderson-Chaplain model (Anderson and Chaplain, 1998) is a wellstudied spatio-temporal model of angiogenesis.Angiogenesis is the growth of new blood vessels from pre-existing vasculature.The model combines a system of partial differential reaction equations with discrete dynamics.The model considers production and consumption of fibronectin, the secretion of tumour angiogenic factors (TAF) from a tumour, and new vasculature forms from endothelial tip cells in response to gradients of fibronectin and TAF.
The Anderson-Chaplain model of angiogenesis has two key parameters, ρ and χ, coefficients for haptotaxis and chemotaxis respectively.These determine the relative contribution of fibronectin driven haptotaxis and TAF driven chemotaxis to the movement of tip cells in the model.Other parameters determine the dynamics of the distribution of fibronectin and TAF, and we keep these fixed as in Nardini et al. (2021).
Data were generated by simulating the Anderson-Chaplain model on a two-dimensional square lattice of resolution 201 by 201 (as in Anderson and Chaplain, 1998) using the implementation provided in Nardini et al. (2021), with a linear chemoattractant distribution that increases with the coordinate along the x axis.This produces sets of binary images (see figure 1) which are then further processed using the methods described below.

Topological Data Analysis
To characterise the k dimensional features of a topological space X we can consider the homology group in dimension k, H k (X), composed of elements that intuitively correspond to equivalence classes of cycles that can be continuously deformed into one another on X.In dimension one, the generators of the homology group correspond to one dimensional holes in X, or loops, while in dimension zero the generators of the homology group correspond to the connected components of X.
The topological spaces we are interested in can be represented using finite sets of simplices known as simplicial complexes K that are constructed by joining together individual simplices, potentially of different dimension, and are closed under the operation of taking faces.A zero dimensional simplex corresponds to a single vertex, a one dimensional simplex an edge, and a two dimensional simplex a triangle.Given a real valued function on K we can define a filtration as a sequence of homology groups in a given dimension k, with homomorphisms induced by inclusion where Ka = f −1 (−∞, a] and a 0 < a 1 < . . .< an, and Ka i ⊆ Ka j for i < j.Persistent homology then tracks the birth and death of elements of the homology groups as a varies.By choosing an appropriate definition of the simplicial complex and filtration built from the data, persistent homology can provide information about the topological features in data. We build the simplicial complex and filtration from the final timepoint of model simulation data following Nardini et al. (2021).All cells in the two-dimensional square lattice that have vasculature present are assigned a value of one, and zero elsewhere.The centroid of each non-zero cell is a 0-simplex.The simplicial complex is built on these 0-simplices based on so-called Moore neighborhoods: if any of the eight cells surrounding a vertex are also nonzero, then we connect them via 1-simplices (edges) for two points pairwise connected, or 2-simplices for three points pairwise connected by an edge.The union of these simplices form a simplicial complex.There are different ways to study vascular data at multiple scales using filtrations (Bendich et al., 2016;Stolz et al., 2020).Here, we construct sequences of filtered simplicial complexes using a sweeping plane filtration Bendich et al. (2016); Nardini et al. (2021).In the sweeping plane filtration, we move a vertical line from left to right across the 2D lattice domain and include simplices in the filtration only to the left of this line.This filtration can be considered a sublevel set filtration corresponding to a height function h : X → R on this simplicial complex.

Approximate Bayesian Computation
In Bayesian inference we aim to derive the posterior distribution of the parameters of a model given some observed data.To do so we first define a prior distribution on the model parameters, treating them as random variables.This describes our belief in the distribution of the parameters before having observed any data.We then perform a so called Bayesian update of the model having observed some data.This is done using the likelihood of the observed data given the model and parameters.From this we arrive at a posterior distribution that describes the conditional distribution of the parameters given the observed data.If we denote the model parameters by θ, and the data by x, we can first write the prior as p(θ), and the likelihood of the data as p(x|θ).In the Bayesian framework we apply Bayes rule to update the prior distribution having observed the data, giving us the posterior distribution as where p(x) is known as the evidence or marginal likelihood, and plays a key role in Bayesian model selection.Evaluation of the marginal likelihood is often computationally expensive or intractable.However, in many settings (e.g. when sampling from the posterior using Markov chain Monte Carlo techniques), it is sufficient to be able to write down the posterior up to proportionality p(θ|x) ∝ p(x|θ)p(θ). (3) This approach relies on the ability to calculate both the prior of the parameters p(θ), which is generally tractable, and the likelihood p(x|θ).However in many models of interest it is not tractable or not possible to directly evaluate p(x|θ), for example in population genetics (Beaumont et al., 2002), random graph models (Thorne and Stumpf, 2012) and some models of dynamical systems (Toni et al., 2009;Liepe et al., 2014).To allow us to perform Bayesian inference in these situations, an approach named Approximate Bayesian Computation (ABC) was developed, based on initial work in Fu and Li (1997) and Tavaré et al. (1997), developed further in Beaumont et al. (2002) and Marjoram et al. (2003), and expanded in many works, see e.In an ABC framework, we rely on the observation that given the ability to sample realisations y from p(x|θ), we can rewrite the posterior as where and by relaxing this to we can generate samples from an approximate posterior (which we shall refer to as the ABC posterior) by using a suitably small in Algorithm 1. Often when applying the rejection algorithm we fix the number of samples S and select such that the set of samples θs with ds < is some fraction αS.
The ABC rejection sampler algorithm requires us to define a distance on the data, D(x, y), and in some cases this may itself be intractable.It is then possible to substitute a summary statistic of the data, g(x) in place of the data itself, leading to a distance on these summary statistics D(g(x), g(y)) being considered.In the case where g is a sufficient statistic for the model, as → 0 this will be equivalent to applying a distance on Algorithm 1 ABC rejection sampler algorithm 1: for s ∈ 1, . . ., S do 2: Sample θs ∼ p(θ) 3: Simulate y ∼ p(y| θs) 4: Calculate ds ← D(g(y), g(x)) 5: end for 6: Return samples θs where ds < the x and y themselves.Often this is not the case, and this is another avenue through which ABC produces an approximation to the posterior rather than a true evaluation of the posterior itself.

Topological statistics for Approximate Bayesian Computation
In In some cases when calculating the persistence of the topological features of a filtration, it is possible for some features to persist indefinitely, so that their death in the filtration is represented as +∞.In our application, this causes information about certain topological features to be lost, for example loops and some connected components, as although we know when they are born in the filtration, we have no measure of their extent.For this reason, Nardini et al. (2021) computed persistence of a left to right sweeping plane filtration and right to left sweeping plane filtration of the simplicial complex built from the simulated model data (see Nardini et al. (2021) for details).By viewing the left to right filtration as a sublevel set filtration and the right to left filtration as a superlevel set filtration, more information (e.g., only finite bars that capture the extent of topological features) can be extracted as a consequence of duality and symmetry theorems (Cohen-Steiner et al., 2009).

Extended persistence
Here we propose a more elegant solution that applies the extended persistence of Cohen-Steiner et al. (2009), which forces all topological features to be of finite length.Extended persistence was developed to study cavities and protrusions in protein docking (Agarwal et al., 2006;Cohen-Steiner et al., 2009).Since then, Yim and Leygonie (2021) optimised spectral wavelets for graph classification using extended persistence, and extended a differentiability result for ordinary persistence to extended persistence.
In standard persistence, the sublevel sets Xa = f −1 (−∞, a] of the manifold X are nested and PH is defined through the corresponding linear sequence of homology groups.In extended persistence, we compute the homology of the sublevel sets, as well as the relative homology with respect to the superlevel sets X a = f −1 [a, ∞).This is motivated by the fact that the relative homology group H k (X, X a ) of dimension k is isomorphic to the cohomology group of Xa of dimension dim − k, denoted H dim−k (Xa), where dim is the dimension of the manifold Xa (Cohen-Steiner et al., 2009).
For a set of values a 0 , . . ., an that bound and fit between the critical points of f , the extended persistence in dimension k is defined as the persistence of the homology groups and relative homology groups as where H k (X, X a ) denotes the relative homology group of X and X a in dimension k (Edelsbrunner and Harer, 2010).This extended persistence can be broken down into multiple components (Cohen-Steiner et al., 2009), the ordinary part, formed of topological features that are both born and die within the homology groups of the sublevel sets of X, the relative part of features that are born and die in the relative homology groups, and the extended part of features that are born in the ordinary homology groups and die in the relative homology groups in the filtration.The birth time b of a feature may be larger than its death time d due to the possibility that the feature dies in the relative homology group H(X, X d ) with d < b.The extended part can be further divided into topological features that have b < d, termed extended+, and those with d < b, termed extended−.

Persistence images
The output of applying PH to a data set is often represented as a persistence diagram, that for a given dimension k consists of a plot of points (b, d), where b is the time of birth and d is the time of death d of each dimension k topological feature in the filtration.To allow for the straightforward application of methods from machine learning to these diagrams, Adams et al. (2017) developed the concept of a persistence image.This allows a persistence diagram to be represented as a vector in R n , so that for example it can be used in methods such as K-means clustering, as in Nardini et al. (2021).
To generate the persistence image corresponding to a persistence diagram represented as a multiset of points (b, d), the points are first transformed to give a multiset B of birth and persistence coordinates (b, d − b) (for extended persistence, we require a slightly different formulation -see below).We note that the persistent image formulation of Adams et al. (2017) ignores all infinite persistent features.A persistence surface in R 2 → R is then defined as the weighted sum of kernels applied to each birth/persistence coordinate f (x, y) = (b,p)∈B g(b, p)h(x, y; b, p). (8) From the persistence surface defined in eqn.8, an m × m array of values is created by discretizing f (x, y) into an m by m grid in a suitable range.This array can then by flattened to give a vector in R m 2 .As in Adams et al. (2017) we apply a Gaussian kernel for h with mean µ = (b, p) and fixed standard deviation σ.
We remark that extended persistence only has finite persistence; therefore no information (i.e., the infinite bars in ordinary persistence) is lost in the persistence images for extended persistent homology.

TABC
We use a set of topological statistics derived from the extended persistence of a filtration over the simplicial complex representing the data as the summary statistics in an ABC framework, in a method we title TABC, to perform topological posterior inference on the Anderson-Chaplain model of angiogenesis.In the TABC methodology, the summary statistics used in ABC are the persistence images in each dimension produced by the by the four components of the extended persistence of a filtration.To allow persistence images to be generated for the extended persistence, in components of the extended persistence with points in the persistence diagram (b, d) with d < b, we flip the coordinates to consider instead (d, b), which when transformed into a birth/persistence coordinate then represents the duration of persistence of the feature in the relative part, or the gap between birth in the ordinary homology and death in the relative homology of the feature in the extended− part.We generate persistence images of dimension 50 by 50 with a constant weight function for the persistence surface and the kernel of the persistence images set as a Gaussian distribution with standard deviation σ = 1, as we found this to work well.As the distance metric in the ABC algorithm we applied the Euclidean distance between the statistics.In our implementation we use the GUDHI library (http://gudhi.gforge.inria.fr/) to construct simplicial complexes, generate extended persistence diagrams and produce persistence images (with standard weighting g = 1).

Image-based statistics
For comparison we also consider four statistics based on the binary image data produced by the simulations, that were chosen with the aim of differentiating the different classes of behaviours observed in Nardini et al. (2021), without overlapping with features that could be considered as topological descriptors (for example numbers of connected components).These statistics are: • Mean X coordinate The mean X value of occupied pixels.
• Mean Y coordinate The mean Y value of occupied pixels.
• Maximum X coordinate The maximum X value of an occupied pixel.
• Mass The fraction of occupied pixels.
As with the topological statistics, we applied the Euclidean distance between vectors of statistics as the distance in the ABC rejection algorithm.

Results
We apply the TABC approach described above to parameter inference in the Anderson-Chaplain model.Taking 10000 samples from the prior on the two model parameters we simulated the Anderson-Chaplain model of angiogenesis for each sampled parameter pair.
To validate our approach we drew a further 100 parameter sets from the model prior and simulated data from each to take on the role of the observed data.A representative subset of these simulated data sets can be seen in figure 1, and cover a range of different behaviours.
Given these data, we applied the TABC approach described above to derive samples of 500 parameter values from the ABC posterior.To investigate the ability of our topological approach to accurately capture the relevant behaviour of the model, we generated ABC posterior predictive samples by simulating the model using parameter values drawn at random from the ABC posterior.These are shown in figure 1, and demonstrate that TABC enables the effective recovery of parameters that replicate the qualitative behaviour of the observed data.
It can be seen that the ABC posterior distributions for the two parameters demonstrate a degree of unidentifiability, in that in most cases the posterior follows a ridge shape with a strong correlation between the two parameters.This aligns with the results found in Nardini et al. (2021), where it was discovered that there were distinct classes of behaviour that occupied diagonal sections of the parameter space, as do our posterior distributions.Being able to identify such uncertainty in our parameter estimates is one of the key benefits of a Bayesian analysis, and it also provides insights into the behaviour of the model.For example we can see that ABC posterior predictive samples in figure 1 are representative of a given class of model behaviour, and that draws from across the potentially wide distribution of parameters indicated by the posterior will follow this behaviour.
The known parameter values used to generate the data on which the posterior distributions are based are marked in figure 1, and can be seen to be within the bulk of the ABC posterior mass.
To further quantify the efficacy of our approach, we compared statistics of the posterior distributions obtained from TABC with those generated by an ABC approach using only the image-based statistics described in section 3.7.We quantified the accuracy of the inferred parameters by taking the  mean root sum of squared errors (RSSE) between the posterior samples and the "true" parameters used to generate the data, as shown in table 1.Here the mean RSSE achieved by the topological posterior over the 100 simulated data sets is below that of the posterior generated using image-based statistics.We also calculated the mean entropy of the posterior distributions produced for each observed data point using both TABC, and ABC with image-based statistics.As can be seen in table 1, the entropy for the posterior derived from the topological features is lower than that derived from the image-based statistics.Taken together, the RSSE and entropy results suggest that the topological statistics used in TABC retain more of the information in the original data set, and hence that TABC is able to more accurately infer the parameters used to generate the data, than ABC using image-based statistics alone.

Conclusions
We have developed an approach for performing ABC in a topological context that is able to derive posterior distributions over model parameters that can accurately reproduce multiple different classes of behaviour and structure observed within the data.We applied extended persistence, which strictly quantifies more topological features than ordinary persistence.
Other topological shape statistics have focussed on sweeping across data in multiple different directions (Turner et al., 2014;Curry et al., 2018;Crawford et al., 2020).Their utility for parameter inference and model selection will be explored in future studies.
Evaluating the ABC posterior distributions we obtain, we find that by considering topological features in the data through the TABC approach we are able to reduce the posterior uncertainty in the parameter values, and to infer posterior distributions that are more closely focused around the parameters used to generate the data.
While we use persistence images here, there are other potential approaches to summarising topological data analysis for use in parameter inference.For example it is possible to directly derive distances between persistence diagrams in a number of ways (Atienza et al., 2020;Bubenik, 2015;Carrière et al., 2017Carrière et al., , 2015;;Chazal et al., 2014;Di Fabio and Ferri, 2015;Kerber et al., 2017;Lacombe et al., 2018;Royer et al., 2021), and these could be substituted for the Euclidean distance between the vectors of persistence images that we apply.In future work we will investigate the possibility of applying a distance function on persistence diagrams in the ABC likelihood and how this influences the efficiency of the algorithm.
For simplicity we have also only considered the simplest form of the ABC algorithm -many other increasingly sophisticated approaches exist, including Markov Chain Monte Carlo algorithms, Sequential Monte Carlo methods (Sisson et al., 2007) and rare event schemes (Prangle et al., 2018).It would be expected that for models with larger numbers of parameters, significant improvements in efficiency could be obtained by applying one of these approaches rather than a rejection sampler based ABC approach.Doing so would not require any changes to the topological aspects of TABC, only the encompassing sampling mechanism.
As with some other applications of ABC (e.g.Russell-Buckland et al., 2019), a potential strength of our approach is that it enables a form of qualitative inference to be performed; in our case by allowing combinations of parameters that result in model behaviour that is topologically similar to the observed data to be identified.Although we consider a specific application, to parameter inference in the Anderson-Chaplain model of angiogenesis, the TABC approach may be adapted to be widely applicable to parametric models having topological features in the data that are informative about model parameters, including in situations where a mixture of topological statistics and other complementary statistics could be used.
previous workNardini et al. (2021) applied topological statistics of simulated data (2-D binary images) to quantify different regimes in the parameter space of the Anderson-Chaplain model of angiogenesis.By constructing simplicial complexes from the output data of a spatial model, and using the same filtration asNardini et al. (2021), PH can be applied to describe the presence of topological features in the simulated data.

Fig. 1 .
Fig. 1.Visualisations of simulation output from the Anderson-Chaplain model for five parameter sets sampled from a uniform prior on the model parameters.The first column shows the observed data, while the second shows a contour plot of the posterior density inferred by applying the TABC methodology, with the red cross indicating the known parameter values used to generate the observed data.The remaining three columns show simulations of parameter values drawn from the ABC posterior predictive distribution.

Table 1 .
Mean of the root sum of squared errors and entropy of the posterior distribution inferred from simulated data for 100 parameter sets drawn from a uniform prior.Values for both the TABC based posterior and ABC on the image-based statistics are shown.