## File(s) under embargo

## 11

day(s)until file(s) become available

# Homological Representatives in Topological Persistence

Harnessing the power of data has been a driving force for computing in recently years. However, the non-vectorized or even non-Euclidean nature of certain data with complex structures also poses new challenges to the data science community. Topological data analysis (TDA) has proven effective in several scenarios for alleviating the challenges, by providing techniques that can reveal hidden structures and high-order connectivity for data. A central technique in TDA is called persistent homology, which provides intervals tracking the birth and death of topological features in a growing sequence of topological spaces. In this dissertation, we study the representative problem for persistent homology, motivated by the observation that persistent homology does not pinpoint a specific homology class or cycle born and dying with the persistence intervals. Furthermore, studying the representatives also leads us to new findings for related problems such as persistence computation.

First, we look into the representative problem for (standard) persistence homology and term the representatives as persistent cycles. We define persistent cycles as cycles born and dying with given persistence intervals and connect the definition to interval decomposition for persistence modules. We also look into the computation of optimal (minimum) persistent cycles which have guaranteed quality. We prove that it is NP-hard to compute minimum persistent p-cycles for the two types of intervals in persistent homology in general dimensions (p>1). In view of the NP-hardness results, we then identify a special but important class of inputs called weak (p+1)-pseudomanifolds whose minimum persistent p-cycles can be computed in polynomial time. The algorithms are based on a reduction to minimum (s,t)-cuts on dual graphs.

Second, we propose alternative persistent cycles capturing the dynamic changes of homological features born and dying with persistence intervals, which the previous persistent cycles do not reveal. We focus on persistent homology generated by piecewise linear (PL) functions and base our definition on an extension of persistence called the levelset zigzag persistence. We define a sequence of cycles called levelset persistent cycles containing a cycle between each consecutive critical points within the persistence interval. Due to the NP-harness results proven previously, we propose polynomial-time algorithms computing optimal sequences of levelset persistent p-cycles for weak (p+1)-pseudomanifolds. Our algorithms draw upon the idea of relating optimal cycles to min-cuts in a graph that we exploited earlier for standard persistent cycles. Note that levelset zigzag poses non-trivial challenges for the approach because a sequence of optimal cycles instead of a single one needs to be computed in this case.

Third, we investigate the computation of zigzag persistence on graph inputs, motivated by the fact that graphs model real-world circumstances in many applications where they may constantly change to capture dynamic behaviors of phenomena. Zigzag persistence, an extension of the standard persistence incorporating both insertions and deletions of simplices, is one appropriate instrument for analyzing such changing graph data. However, unlike standard persistence which admits nearly linear-time algorithms for graphs, such results for the zigzag version improving the general $O(m^\omega)$ time complexity are not known, where $\omega< 2.37286$ is the matrix multiplication exponent. We propose algorithms for zigzag persistence on graphs which run in near-linear time. Specifically, given a filtration of length m on a graph of size n, the algorithm for 0-dimension runs in $O(m\log^2 n+m\log m)$ time and the algorithm for 1-dimension runs in $O(m\log^4 n)$ time. The algorithm for 0-dimension draws upon another algorithm designed originally for pairing critical points of Morse functions on 2-manifolds. The correctness proof of the algorithm, which is a major contribution, is achieved with the help of representatives. The algorithm for 1-dimension pairs a negative edge with the earliest positive edge so that a representative 1-cycle containing both edges resides in all intermediate graphs.

## Funding

### TRIPODS+X:RES:Collaborative Research: Improving Templated Microstructures via Topological Data Analysis

Directorate for Computer & Information Science & Engineering

Find out more...### AF: Small: Expanding the Reach of Topological Data Analysis

Directorate for Computer & Information Science & Engineering

Find out more...### TRIPODS: Topology, Geometry, and Data Analysis (TGDA@OSU):Discovering Structure, Shape, and Dynamics in Data

Directorate for Computer & Information Science & Engineering

Find out more...