{phutil}

The goal of the {phutil} package is to provide utility functions for persistence data analysis. In particular, the package defines a data structure for hosting persistence data. The package is part of the tdaverse suite of packages, which aims to provide a consistent interface for topological data analysis in R that nicely integrates with the tidymodels ecosystem.

Installation

You can install the development version of {phutil} from GitHub with:

# install.packages("pak")
pak::pak("tdaverse/phutil")

The ‘persistence’ class

The package currently provides a new data structure for hosting persistence data. The data set noisy_circle_points is a simulated data set consisting of 100 points sampled from a circle with additive Gaussian noise using a standard deviation of $0.05$ :

library(phutil)
head(noisy_circle_points)
#>               x          y
#> [1,]  0.6651721  0.6363172
#> [2,] -0.7481066 -0.6901262
#> [3,] -0.8288544 -0.5519689
#> [4,] -0.7650181 -0.7436499
#> [5,]  0.6337296 -0.7607464
#> [6,] -0.6077663 -0.7036492

The data set noisy_circle_ripserr is a persistence diagram computed using the ripserr package with the ripserr::vietoris_rips() and stored as an object of class ‘PHom’, which is a light wrapper around a data frame with 3 variables:

head(noisy_circle_ripserr)
#>   dimension birth       death
#> 1         0     0 0.008162723
#> 2         0     0 0.008548402
#> 3         0     0 0.009238432
#> 4         0     0 0.014004707
#> 5         0     0 0.020555701
#> 6         0     0 0.024621462

The data set noisy_circle_tda_rips is a persistence diagram computed using the TDA package with the TDA::ripsDiag() and stored as a list of length 1 containing the object diagram of class ‘diagram’, which is a matrix with 3 columns:

head(noisy_circle_tda_rips$diagram)
#>      dimension Birth     Death
#> [1,]         0     0 1.6322000
#> [2,]         0     0 0.3677392
#> [3,]         0     0 0.3325817
#> [4,]         0     0 0.3226673
#> [5,]         0     0 0.2096916
#> [6,]         0     0 0.1953531

The data structure adopted by the tdaverse suite of packages and provided by this package is designed to be a common interface for persistence data. This allows for a seamless integration of persistence data across different packages and ensures that the data is always in a consistent format.

Specifically, the ‘persistence’ class is a list with the following components:

pairs: A list of 2-column matrices containing birth-death pairs. The -th element of the list corresponds to the -th homology dimension. If there is no pairs for a given dimension but there are pairs in higher dimensions, the corresponding element(s) is/are filled with a numeric matrix.
metadata: A list of length 6 containing information about how the data was computed:
- orderered_pairs: A boolean indicating whether the pairs in the pairs list are ordered (i.e. the first column is strictly less than the second column).
- data: The name of the object containing the original data on which the persistence data was computed.
- engine: The name of the package and the function of this package that computed the persistence data in the form "package_name::package_function".
- filtration: The filtration used in the computation in a human-readable format (i.e. full names, capitals where need, etc.).
- parameters: A list of parameters used in the computation.
- call: The exact call that generated the persistence data.

The as_persistence() function is used to convert persistence data from different packages into the ‘persistence’ class. For example, the persistence data computed using the ripserr package can be converted into the ‘persistence’ class as follows:

as_persistence(noisy_circle_ripserr)
#> 
#> ── Persistence Data ────────────────────────────────────────────────────────────
#> ℹ There are 99 and 2 pairs in dimensions 0 and 1 respectively.
#> ℹ Computed from a Vietoris-Rips/Cubical filtration using `ripserr::<vietoris_rips/cubical>()`.
#> ! With unknown parameters.

Similarly, the persistence data computed using the TDA package can be converted into the ‘persistence’ class as follows:

as_persistence(noisy_circle_tda_rips)
#> 
#> ── Persistence Data ────────────────────────────────────────────────────────────
#> ℹ There are 100 and 2 pairs in dimensions 0 and 1 respectively.
#> ℹ Computed from a Vietoris-Rips filtration using `TDA::ripsDiag()`.
#> ℹ With the following parameters: maxdimension = 1 and maxscale = 1.6322.

Distances between persistence diagrams

The package also provides a function to compute the bottleneck distance between two persistence diagrams. The function bottleneck_distance() takes two persistence diagrams as input and computes the bottleneck distance between them:

library(phutil)
diag1 <- persistence_sample[[1]]
diag2 <- persistence_sample[[2]]
bottleneck_distance(diag1, diag2)
#> [1] 0.06197453

One can also compute the Wasserstein distance between two persistence diagrams using the wasserstein_distance() function:

wasserstein_distance(diag1, diag2)
#> [1] 1.53403

It is also possible to compute these distances between all pairs of persistence diagrams in a list using e.g. wasserstein_pairwise_distances():

wasserstein_pairwise_distances(persistence_sample[1:5])
#>          1        2        3        4
#> 2 1.534030                           
#> 3 2.301812 1.852241                  
#> 4 2.354967 2.362616 1.922799         
#> 5 2.374818 2.707366 1.985518 3.154120

It returns an object of class ‘dist’. It is parallelized using OpenMP and relies under the hood on the Hera C++ library which is BSD-licensed. The code interfacing R and C++ is generated by the header-only {cpp11} package which is MIT-licensed.

Contributions

Code of Conduct

Please note that the {phutil} package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Acknowledgments

This project was funded by an ISC grant from the R Consortium and done in coordination with Jason Cory Brunson and with guidance from Bertrand Michel and Paul Rosen. It builds upon conversations with Mathieu Carrière and Vincent Rouvreau who are among the authors of the GUDHI library. Package development also benefited from the support of colleagues at the Department of Mathematics Jean Leray and the use of equipment at Nantes University.

Name	Name	Last commit message	Last commit date
Latest commit astamm Merge pull request #33 from tdaverse/vignette Apr 27, 2025 2088778 · Apr 27, 2025 History 144 Commits
.github	.github	Fix known Windows CI issue with quarto.	Apr 26, 2025
R	R	Fix all TODOs and all but one FIXME in vignette.	Apr 26, 2025
data-raw	data-raw	Pre-compute large PDs and export them. Reformat vignette. Use HTML ou…	Apr 25, 2025
data	data	Pre-compute large PDs and export them. Reformat vignette. Use HTML ou…	Apr 25, 2025
inst/tinytest	inst/tinytest	Fix all TODOs and all but one FIXME in vignette.	Apr 26, 2025
man	man	Default tolerance to sqrt(.Machine$double.eps). Closes #32 .	Apr 25, 2025
sandbox	sandbox	Move not working validation example in sandbox folder, excluded from …	Apr 27, 2025
src	src	Added unit tests for persistence set class. Added air formatting.	Apr 24, 2025
tests	tests	Add toy data. Add persistence object description.	Mar 31, 2025
vignettes	vignettes	Merge branch 'vignette' of https://github.com/tdaverse/phutil into vi…	Apr 27, 2025
.Rbuildignore	.Rbuildignore	Move not working validation example in sandbox folder, excluded from …	Apr 27, 2025
.gitignore	.gitignore	Switch to quarto for vignettes.	Apr 26, 2025
CODE_OF_CONDUCT.md	CODE_OF_CONDUCT.md	Updated README for acknowledgments and distances. Closes #26 .	Apr 24, 2025
DESCRIPTION	DESCRIPTION	Switch to quarto for vignettes.	Apr 26, 2025
LICENSE	LICENSE	Switch to MIT license. Closes #21 .	Apr 24, 2025
LICENSE.md	LICENSE.md	Switch to MIT license. Closes #21 .	Apr 24, 2025
NAMESPACE	NAMESPACE	Merge main branch.	Apr 24, 2025
README.Rmd	README.Rmd	Avoid use of tidyr in autoplot.bench_mark().	Apr 25, 2025
README.md	README.md	Avoid use of tidyr in autoplot.bench_mark().	Apr 25, 2025
_pkgdown.yml	_pkgdown.yml	Add CI, website and coverage report.	Apr 1, 2025
codecov.yml	codecov.yml	Add CI, website and coverage report.	Apr 1, 2025
phutil.Rproj	phutil.Rproj	Start from plt class.	Mar 31, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Licenses found

Repository files navigation

{phutil}

Installation

The ‘persistence’ class

Distances between persistence diagrams

Contributions

Code of Conduct

Acknowledgments

About

Licenses found

Releases

Packages

Contributors 2

Languages

License

tdaverse/phutil

Folders and files

Latest commit

History

Repository files navigation

{phutil}

Installation

The ‘persistence’ class

Distances between persistence diagrams

Contributions

Code of Conduct

Acknowledgments

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages