-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME.rmd
94 lines (65 loc) · 4.67 KB
/
README.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
---
output: github_document
---
```{r, include=FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-"
)
devtools::load_all()
```
[![CRAN](http://www.r-pkg.org/badges/version/tdaunif)](https://cran.r-project.org/package=tdaunif)
[![downloads](https://cranlogs.r-pkg.org/badges/tdaunif)](https://cran.r-project.org/package=tdaunif)
# **tdaunif**: Uniform manifold samplers for topological data analysis
This R package contains sampling functions for topological manifolds embedded (or immersed) in Euclidean space.
## motivation
### consolidation
Statistical topology and topological data analysis include a variety of methods for detecting topological structure from point cloud data sets, most famously persistent homology. (See [Weinberger (2011)](https://www.ams.org/notices/201101/rtx110100036p.pdf) for a brief introduction.)
These methods are often validated by applying them to point clouds sampled from spaces with known topology.
Functions that generate such samples are therefore valuable to developers of topological--statistical software.
Such samplers have not been easy to find in the R package ecosystem. Handfuls exist in several different packages, but no single package or code repository has assembled a large collection for convenient use. That is the goal of this package.
### uniformity
The simplest validations use points sampled uniformly from the surfaces of manifolds embedded in Euclidean space. Uniformity here is with respect to the surface area of the embedded manifold, but such uniform sampling is not trivial to do when the manifold can only be expressed via parameterization from a simpler parameter space, as when [a square is rolled and bent into a torus](https://en.wikipedia.org/wiki/Torus#Flat_torus). Uniform sampling on the parameter space may then produce uneven sampling on the manifold, regions of greater compression being oversampled relative to regions of greater expansion.
One especially important motivation for this package is therefore to include uniform samplers for popular embeddings of topological manifolds, including functions that allow users to build their own with minimal effort (see `help(manifold, package = "tdaunif")`).
## usage
### installation
The latest release of **tdaunif** can be installed from [CRAN](https://cran.r-project.org/):
```{r eval=FALSE}
install.packages("tdaunif")
```
You can install the current development version of **tdaunif** from GitHub using [the **remotes** package](https://github.com/r-lib/remotes):
```{r eval=FALSE}
remotes::install_github("tdaverse/tdaunif", build_vignettes = TRUE)
```
To stabilize this document, this chunk saves the default graphics settings (as required by CRAN) and initializes the random number generator:
```{r stabilize}
oldpar <- par(no.readonly = TRUE)
set.seed(0)
```
### illustration
[An intuitive embedding of the Klein bottle into 4-space](https://en.wikipedia.org/wiki/Klein_bottle#3D_pinched_torus_/_4D_M%C3%B6bius_tube) is adapted from the popular "donut" embedding of the torus in 3-space. In **tdaunif** this is called the "tube" parameterization. We can sample uniformly from this embedding (without noise) and examine the coordinate projections as follows:
```{r klein bottle}
x <- sample_klein_tube(n = 360, ar = 3, sd = 0)
pairs(x, asp = 1, pch = 19, cex = .5)
```
Compare these neat projections to those of the same sample with noise added in the ambient Euclidean space:
```{r klein bottle with noise}
x <- add_noise(x, sd = .2)
pairs(x, asp = 1, pch = 19, cex = .5)
```
For a thorough discussion of this sampler, see [my blog post](https://corybrunson.github.io/2019/02/01/sampling/) describing the method presented in [Diaconis, Holmes, and Shahshahani (2013)](https://projecteuclid.org/euclid.imsc/1379942050).
### another illustration
A subset of samplers allow stratified sampling, which relies on an analytic solution to the problem of length or area distortion created by most parameterizations. For example, the planar triangle sampler can be stratified to produce a more uniform arrangement of points than a properly uniform sample would produce:
```{r planar triangle, uniformly and with stratification}
equilateral <- cbind(c(0,0), c(0.5,sqrt(3)/2), c(1,0))
x <- sample_triangle_planar(n = 720, triangle = equilateral)
y <- sample_triangle_planar(n = 720, triangle = equilateral, bins = 24)
par(mfrow = c(1L, 2L))
plot(x, asp = 1, pch = 19, cex = .5)
plot(y, asp = 1, pch = 19, cex = .5)
```
See [Arvo (1995)](https://dl.acm.org/doi/10.1145/218380.218500) and Arvo's notes from [Siggraph 2001](https://www.cs.princeton.edu/courses/archive/fall04/cos526/papers/course29sig01.pdf) for a detailed treatment of this technique.
```{r restore}
par(oldpar)
```