Thesis
Geometric and topological inference from random samples
- Abstract:
-
We study statistical inference problems in geometry and topology inspired by data science. In general, we consider an independent, identically distributed random sample X = (X_1, ..., X_n) drawn from a measure µ over a set M ⊆ R^D, and study the class of problems of inferring a property P of M using only X. The following pairs (M, P) are considered:
(1) M = Manifold, P = Dimension. We apply PCA (principal component analysis) locally and infer the dimension of M by counting how many of the variances fall under a threshold. Although this method is widely used, no rigorous mathematical theorem guaranteeing its correctness had been established. We prove such a theorem.
(2) M = Manifold, P = Tangent spaces. The local PCA algorithm above can also be used to infer tangent spaces: a tangent space is estimated as the linear span of the top principal components. Again, for this standard and well-known algorithm, we prove theorems guaranteeing its correctness.
(3) M = Stratified space, P = Singular points. A stratified space generalises a manifold and possesses singularities, at which there is no local resemblance to a Euclidean space. We present a fast algorithm that detects singularities using local hypothesis testing. The kernel method used in the algorithm is significantly faster than previous topological methods. Experimental results on both synthetic and real data are presented. Furthermore, we prove a theorem that guarantees the algorithm’s correctness in the case of a union of two manifolds.
(4) M = Manifold, P = Homotopy type. A standard way to infer the homotopy type of a manifold from a finite sample is to construct a simplicial complex at a small distance threshold. However, instead of stopping at a small threshold, we consider arbitrarily large connectivity thresholds and study the anomalous topology that arises. In particular, we study the specific case of the circle M = S^1, and show that Čech complexes arising from finite samples on M are homotopy equivalent to bouquets of high-dimensional spheres with high probability.
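As a rough illustration of the local PCA procedure described in items (1) and (2), the following is a minimal sketch, not the thesis's own implementation. The function names, the neighbourhood radius, and the relative threshold (10% of the largest local variance) are all assumptions made for this example; the abstract does not specify how the threshold is chosen.

```python
import numpy as np

def local_pca(points, center, radius):
    """Eigen-decompose the covariance of the sample points within `radius` of `center`.

    Returns the variances (eigenvalues) in descending order and the matching
    principal directions as columns.
    """
    nbrs = points[np.linalg.norm(points - center, axis=1) <= radius]
    cov = np.cov(nbrs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def estimate_dimension(eigvals, threshold):
    """Count the variances under the threshold and treat the rest as tangent directions."""
    n_small = int(np.sum(eigvals < threshold))
    return len(eigvals) - n_small

def estimate_tangent_space(eigvecs, dim):
    """Span of the top `dim` principal components, as an orthonormal basis (columns)."""
    return eigvecs[:, :dim]

# Hypothetical test case: 2000 uniform samples from a unit circle embedded in R^3.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
sample = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

center = np.array([1.0, 0.0, 0.0])
eigvals, eigvecs = local_pca(sample, center, radius=0.3)

# Assumed threshold rule: 10% of the largest local variance.
dim = estimate_dimension(eigvals, threshold=0.1 * eigvals[0])
basis = estimate_tangent_space(eigvecs, dim)
print("estimated dimension:", dim)
print("estimated tangent basis:", basis.ravel())
```

In this toy setting the tangent line at (1, 0, 0) is the y-axis, so the estimated basis should be close to ±(0, 1, 0). The radius and threshold interact: a neighbourhood that is too large lets curvature inflate the normal variances, which is the kind of effect a correctness theorem has to control.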
Authors
- DOI:
- Type of award:
- DPhil
- Level of award:
- Doctoral
- Awarding institution:
- University of Oxford
- Language:
-
English
- Deposit date:
-
2025-12-25
- ARK identifier:
Terms of use
- Copyright holder:
- Sung Hyun Lim
- Copyright date:
- 2023