Thesis

Geometric and topological inference from random samples

Abstract:

We study statistical inference problems in geometry and topology inspired by data science. In general, we consider an independent and identically distributed random sample X = (X_1, ..., X_n) drawn from a measure µ over a set M ⊂ R^D. We then consider the class of problems of inferring a property P of M using only X. The following pairs (M, P) are considered:

(1) M = Manifold, P = Dimension. We apply principal component analysis (PCA) locally, and infer the dimension of M by counting how many of the variances fall under a threshold. While this method is widely used, a rigorous mathematical theorem guaranteeing its correctness had not been established. We prove such a theorem.
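The local PCA dimension estimate described in (1) can be sketched as follows. This is a minimal illustration rather than the thesis's own procedure: the function names, the ball-shaped neighbourhood, and the threshold of 0.05 times the squared radius are all assumptions made for the example, and the sample is a deterministic set of points on a unit circle in R^3 instead of a random one.

```python
import numpy as np

def local_pca_variances(points, center, radius):
    """Eigenvalues of the covariance of the sample points within `radius` of `center`."""
    dists = np.linalg.norm(points - center, axis=1)
    neighbours = points[dists <= radius]
    centred = neighbours - neighbours.mean(axis=0)
    cov = centred.T @ centred / len(neighbours)
    return np.linalg.eigvalsh(cov)  # ascending order

def estimate_dimension(points, center, radius, threshold):
    """Ambient dimension minus the number of local variances below `threshold`."""
    variances = local_pca_variances(points, center, radius)
    n_small = int(np.sum(variances < threshold))
    return points.shape[1] - n_small

# Hypothetical test data: 400 evenly spaced points on the unit circle in R^3.
angles = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)])

radius = 0.3
threshold = 0.05 * radius**2  # assumed choice, not taken from the thesis
dimension = estimate_dimension(circle, circle[0], radius, threshold)
print(dimension)
```

The variance along the tangent direction scales like the squared radius, while the variances in the normal directions are of higher order, so a threshold proportional to the squared radius separates the two groups on this example.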

(2) M = Manifold, P = Tangent spaces. The local PCA algorithm above can also be used to infer tangent spaces: a tangent space is estimated as the linear span of the top principal components. Again, for this standard and well-known algorithm, we prove theorems guaranteeing its correctness.
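The tangent space estimate described in (2) can be sketched in the same spirit. Again this is only an illustration under stated assumptions: the function name, the ball-shaped neighbourhood, and the test data (evenly spaced points on a unit circle in R^3, whose tangent line at (1, 0, 0) is spanned by (0, 1, 0)) are hypothetical and not taken from the thesis.

```python
import numpy as np

def estimate_tangent_basis(points, center, radius, dim):
    """Orthonormal basis (as columns) of the span of the top `dim` local principal components."""
    dists = np.linalg.norm(points - center, axis=1)
    neighbours = points[dists <= radius]
    centred = neighbours - neighbours.mean(axis=0)
    cov = centred.T @ centred / len(neighbours)
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]   # reorder so the largest variance comes first

# Hypothetical test data: 400 evenly spaced points on the unit circle in R^3.
angles = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles), np.zeros_like(angles)])

basis = estimate_tangent_basis(circle, circle[0], radius=0.3, dim=1)
true_tangent = np.array([0.0, 1.0, 0.0])
# Principal directions are defined up to sign, so compare absolute cosines.
alignment = abs(float(basis[:, 0] @ true_tangent))
print(alignment)
```

Because eigenvectors are only determined up to sign, the comparison with the true tangent direction uses the absolute value of the inner product.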

(3) M = Stratified space, P = Singular points. A stratified space generalises manifolds, and possesses singularities at which there is no local resemblance to a Euclidean space. We present a fast algorithm that detects singularities using local hypothesis testing. The kernel method used in the algorithm is significantly faster than previous topological methods. Experimental results on both synthetic and real data are presented. Furthermore, we prove a theorem that guarantees the algorithm’s correctness in the case of a union of two manifolds.

(4) M = Manifold, P = Homotopy type. A standard way to infer the homotopy type of a manifold from a finite sample is to construct a simplicial complex at a small distance threshold. However, instead of stopping at a small threshold, we consider arbitrarily large connectivity thresholds and study the anomalous topology arising from this. In particular, we study the very specific case of the circle M = S^1, and show that Čech complexes arising from finite samples on M are homotopy equivalent to bouquets of high-dimensional spheres with high probability.
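The standard small-threshold step mentioned in (4) can be illustrated with a short sketch. Note the substitutions: it builds a Vietoris-Rips complex rather than the Čech complexes studied in the thesis, it uses Euclidean distances between deterministic, evenly spaced points on the circle, and it only computes the first two Betti numbers over the rationals. All function names are hypothetical.

```python
import itertools
import numpy as np

def rips_complex(points, threshold):
    """Vertex count, edges, and triangles of the Vietoris-Rips complex at `threshold`."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    edges = [(i, j) for i, j in itertools.combinations(range(n), 2)
             if dist[i, j] <= threshold]
    edge_set = set(edges)
    triangles = [(i, j, k) for i, j, k in itertools.combinations(range(n), 3)
                 if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set]
    return n, edges, triangles

def _rank(matrix):
    # matrix_rank fails on arrays with no entries, so treat that case as rank 0.
    return 0 if matrix.size == 0 else int(np.linalg.matrix_rank(matrix))

def betti_0_and_1(points, threshold):
    """Betti numbers b0 and b1 (rational coefficients) from boundary matrix ranks."""
    n, edges, triangles = rips_complex(points, threshold)
    edge_index = {edge: idx for idx, edge in enumerate(edges)}
    d1 = np.zeros((n, len(edges)))  # boundary map from edges to vertices
    for col, (i, j) in enumerate(edges):
        d1[i, col], d1[j, col] = -1.0, 1.0
    d2 = np.zeros((len(edges), len(triangles)))  # boundary map from triangles to edges
    for col, (i, j, k) in enumerate(triangles):
        d2[edge_index[(j, k)], col] = 1.0
        d2[edge_index[(i, k)], col] = -1.0
        d2[edge_index[(i, j)], col] = 1.0
    r1, r2 = _rank(d1), _rank(d2)
    return n - r1, len(edges) - r1 - r2

# Hypothetical test data: 12 evenly spaced points on the unit circle in R^2.
angles = np.linspace(0.0, 2.0 * np.pi, 12, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])

small = betti_0_and_1(circle, threshold=0.6)  # links nearest neighbours only
large = betti_0_and_1(circle, threshold=2.5)  # exceeds the diameter of the circle
print(small, large)
```

At the small threshold the complex is a single cycle, so it has one connected component and one independent loop, matching the circle. At a threshold above the diameter every simplex is present and the loop is filled in. The intermediate and large-threshold behaviour of Čech complexes, which is the subject of (4), is not reproduced by this sketch.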

Authors

Institution:
University of Oxford
Division:
MPLS
Department:
Mathematical Institute
Role:
Author


DOI:
Type of award:
DPhil
Level of award:
Doctoral
Awarding institution:
University of Oxford


Language:
English
Deposit date:
2025-12-25
ARK identifier:
