Seminar: Topological date analysis
Science today is marked by the collection of huge sets of data. The bottleneck is more and more the evaluation of this data.
In particular, one has to retrieve the decisive qualitative information in an efficient way (typically from noisy and high dimensional data which is presented in inappropriate coordinates).
Topology is (from the point of view of geometry in pure mathematics) the area which does precisely such a kind of job. In the last decades there has now also been a very active development to implement this in practical terms.
Example questions: given a scan of living tissue on the scale of cells: distinguish the different components (the membranes), detect connections (in particular their time evolution), but do this with noisy data and suppress irrelevant artefacts.
One suggestion to develop and apply algebraic topology to solve these problems will be the theme of the seminar, where we try to get all the way to some more or less real applications. The larger part, however, will be dedicated to develop the algebraic topology and geometry basics.
More precisely, our tool are homology groups (there is one for each integer n). Very roughly, these count n-dimensional holes in a topological space. E.g., an n-dimensinal sphere has precisely one n-dimensional hole, whereas a disk (of any dimension) has no hole at all (trivial homology).
To apply this to point sets (this is, what data measurement will produce), we construct a sequence of interesting topological spaces from this point set and apply homology to each of the spaces in the sequence. The persistent homology (our main tool) focuses on those homology features which persist for a long time in the sequence of spaces.
The course does not require previous knowledge of topology, it is in particular this which will be covered and learned in the seminar. We will then also see how this is used in practice, and learn about some fundamental theoretical features of persistent homology.
Examples for applications:
The most significant topological basics consist of
Specific for topological data analysis are
Nr | Thema | Quelle | Name | Termin | ||
0 | Kennenlernen in einer Göttinger Kneipe |
|
||||
1 | simplicial complexes | EH, (Z) | ||||
2 | homologie: intro | EH, (Z) | ||||
3 | Homologie II | EH, (Z) | ||||
4 | Simplicial complexes associated with point clouds I | EH III.2-III.4 | ||||
5 | Simplicial complexes associated with point clouds II | EH III.2-III.4 | ||||
6 | Simplicial complexes associated with point clouds III | EH III.2-III.4 | ||||
7 | Persistente Homologie | EH VII.1, Z, 6.1 | ||||
8 | Algorithmen für (persistente) Homologie | EH IV.2, VII.1, Z Kap 7 (?) | ||||
9 | Software (??) | (Rips), (GU) | ||||
10 | stability of persistent homology | EH VIII.2 | ||||
11 | Kombinatorische Morse-Theorie für Entrauschen | BLW 2-3 (evtl Z Kap 5) | ||||
12 | Optimaler Entrauschungsalgorithmus für Funktionen auf Flächen | BLW 4–5 | ||||
13 | Anwendung: 1-dimensionale Gen-Expression | EH IX.1 | ||||
14 | application: pockets in proteins | IX.2, GJ1 | ||||
15 | Application: a subtype of breast cancer | NLC | ||||
16 | Application: coverage in sensor networks | SG | ||||
Inhalt=geometrische, abstrakte Simplizialkomplexe, geometrische Realisierung, Triangulierung, Unterteilung, simpliziale Approximation.
Je nach Interesse behandelt wir dies mehr oder weniger ausführlich, mit 1-3 Vorträgen.
Eventuell: Algorithmus: Homologie von Unterkomplexen von R3 (EH V.4, DE): Grundidee=für die spezielle (aber praktisch relevante) Situation einer Teilmenge des R3 effizienten Algorithmus zur Homologieberechnung. Inhalt=Schwerpunkt der Algorithmus; ggf. knappe Behandlung von Alexander-Dualität.; mögliche Erweiterung: Nutzung für effiziente Algorithmen zu Persistenz (EH VII.2)
Als Hintergrund vielleicht ganz kurz die klassissche Morse-Theorie anreissen, wie in Z Kap 5 aufgebaut.
Literatur
(AGHLM) Attali, Glisse, Hornus,Lazarus, Morozov: Persistence-sensitive
simplication of functions on surfaces in linear time
(BLW) Bauer, Lange, Wardetzky: Optimal topological simplification of
discrete functions on surfaces.
(C) Carlsson, Gunnar: Topology and Data (Übersichtsartikel).
(DE) Delfinado and Edelsbrunner, H.: An efficient algorithm for Betti
Numbers of Simplicial Complexes
(E) Edelsbrunner, H.: The union of balls and its dual shape, Discrete
Comp. Geom. 13, 415-440 (1995)
(EH) Edelsbrunner, H. and J. Harer. Computational Topology. An
Introduction. Amer. Math. Soc., Providence, Rhode Island, 2009.
(Lehrbuch).
(EMP) Edelsbrunner, Moroov, Pascucci: Persistence-sensitive simplification
functions on 2-manifolds
(F1) R. Forman: Morse theory for cell complexes. Advances in Mathematics,
134(1):90–145, 1998.
(F2) R. Forman. A user’s guide to discrete Morse theory. Séminaire
Lotharingien de Combinatoire, B48c: 1–35, 2002.
(GJ1) Giesen and John: The flow complex: a data structure for geometric
modelling
(GJ2) Giesen and John: Computing the weighted flow complex
(H) A. Hatcher, Algebraic Topology. https://www.math.cornell.
edu/ hatcher/AT/AT.pdf
(M) Munkres, James: Elements of algebraic topology
(NLC) M. Nicolaua, A. J. Levineb, G. Carlsson, Topology based
data analysis identifies a subgroup of breast cancers with a unique
mutational profile and excellent survival, PNAS, 108 (17) 7265– 7270,
2011
(SG) V. de Silva, R. Ghrist, Coverage in sensor networks via persistent
homology, Algebraic and Geometric Topology 7, 339–358, 2007
(Z) Zomorodian, Afra: Computational topology (Lehrbuch)