This repository was archived by the owner on Jun 9, 2023. It is now read-only.
Add additional covers: LandmarkBallCover & NeighborhoodCover #16
Open
yaraskaf wants to merge 25 commits into peekxc:master from yaraskaf:landmark-ballcover
+558
−298
Implementation of ball cover of lensed data space using landmark point set. Fixed bug in landmarks.cpp
Addition of ability to specify a radius rather than a number of balls to construct a LandmarkBallCover. Modified LandmarkBallCover.R to support use of epsilon parameter, added new landmark function to landmarks.R to find landmarks by radius rather than number.
…eccentricity as seed for landmark selection Modified construct_cover function in LandmarkBallCover.R to include selection of seed based on maximum eccentricity.
…umber of points When every ball has the same number of points, apply() returns a matrix rather than a list of lists, causing a crash. Replaced this function with splitting the matrix by column in this case.
…ompute eps-landmark set Replaced 1D euclidean distance with proxy::dist(), allowing arbitrary dist_method from proxy::pr_DB
Updated validate() method in LandmarkBallCover.R to ensure appropriate values of parameters for cover construction.
Implementation of cover by k-nearest neighbor sets.
…a cover set Fixed bug where some points are excluded from cover sets.
Allow neighborhoods of size greater than k when >k points have the same lens value. Now all points with the same lensed value end up in the same pullback set.
…ver.R Update headers, add reference to Dlotko paper.
Remove commented-out code that computed landmarks in only one-dimensional data.
In this case, just take the unique balls as the cover sets. Duplicate lensed values should not form distinct centers.
Moved calculation of k-neighborhoods to NeighborhoodCover.R, eliminated unnecessary code, removed k-nhd functionality from landmarks.R
…ation Refactor code for epsilon-landmarks in landmarks.R, update documentation
Change list() to c() to avoid bug with pasting lists of parameters.
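The apply() crash described in the commit notes above can be reproduced in isolation. The following is a minimal sketch on a made-up distance matrix (the names `d`, `eps`, `members`, and `as_list` are illustrative and not taken from the PR): when every ball holds the same number of points, apply() simplifies its result to a matrix, and splitting that matrix by column recovers one index vector per ball.

```r
# Hypothetical toy distances: rows are points, columns are ball centers.
d <- matrix(c(0.0, 0.5, 2.0, 2.5,
              2.0, 2.5, 0.0, 0.5),
            nrow = 4, dimnames = list(NULL, c("ball1", "ball2")))
eps <- 1

# Each ball contains exactly two points, so apply() returns a 2 x 2 matrix
# instead of a list with one index vector per ball.
members <- apply(d, 2, function(col) which(col <= eps))
is.matrix(members)

# Splitting by column index restores the list-of-vectors shape.
as_list <- split(members, col(members))
```

Using lapply() over the column indices avoids the simplification altogether, which may be preferable when the downstream code always expects a list.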
Thanks @yaraskaf! One thing to note, since it came up in #7, is that Dłotko calculated landmark sets from the point cloud X in the original data space, whereas the ball cover performs this calculation on the lensed (filtered) points f(X) in Z. That is to say, this is a type of cover for use with a mapper construction, rather than the alternative construction discussed in the Ball Mapper paper. (This caused confusion on our end, so I thought it worth stating explicitly.)
New Covers
LandmarkBallCover
As discussed in #7, the existing BallCover.R constructs an epsilon-ball around each lensed point, then unions all intersecting balls together. It would be useful to have a version of this cover that constructs an epsilon-landmark set in the lensed space, then uses balls centered at these landmarks as the cover sets. This is the purpose of LandmarkBallCover.R.
The landmark set can be chosen by specifying either
- Radius: the epsilon parameter sets the radius of each cover set
- Number: the num_sets parameter sets the desired number of balls in the cover
The seed to use for the landmark selection algorithm can also be specified as one of
- Specific point: set the index of the seed using the seed_index parameter
- Random point: parameter seed_method="RAND"
- Max eccentricity: parameter seed_method="ECC" selects the lensed point with the highest eccentricity
NeighborhoodCover
This is an additional cover type where the open sets are formed by k-neighborhoods about a landmark set. Instead of specifying a radius or a number of cover sets, the number of points (neighbors) per set is specified via the k parameter. It also has the same options to select the seed as LandmarkBallCover.
This might be useful when the lensed data has areas of high and low density: there will be more cover sets in high-density areas, since sets are determined by a number of neighbors rather than a distance-based radius.
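To make the two cover types concrete, here is a rough base-R sketch on made-up one-dimensional lens values. It is not the code in this PR: the helpers `ecc_seed`, `maxmin_landmarks`, `ball_cover`, and `nbhd_cover` are hypothetical, eccentricity is assumed to mean the mean distance to all points, landmarks are assumed to be chosen by greedy maxmin selection, and stats::dist stands in for the proxy::dist call mentioned in the commits.

```r
# Toy lensed values f(X): two clusters on the real line (made-up data).
f_X <- matrix(c(0, 0.5, 1, 4, 4.5, 5), ncol = 1)
D <- unname(as.matrix(dist(f_X)))  # stats::dist as a stand-in for proxy::dist

# Assumed eccentricity seed: the point with the largest mean distance to all points.
ecc_seed <- function(D) which.max(rowMeans(D))

# Hypothetical greedy maxmin selection, stopped by a count (n) or a radius (eps).
maxmin_landmarks <- function(D, seed, n = NULL, eps = NULL) {
  stopifnot(xor(is.null(n), is.null(eps)))
  lm <- seed
  min_d <- D[, seed]                  # distance from each point to its nearest landmark
  repeat {
    if (!is.null(n) && length(lm) >= n) break
    if (!is.null(eps) && max(min_d) <= eps) break
    nxt <- which.max(min_d)           # farthest point from the current landmarks
    if (min_d[nxt] == 0) break        # every point already coincides with a landmark
    lm <- c(lm, nxt)
    min_d <- pmin(min_d, D[, nxt])
  }
  lm
}

# Landmark ball cover: one closed eps-ball per landmark.
ball_cover <- function(D, lm, eps) lapply(lm, function(i) which(D[, i] <= eps))

# Neighborhood cover sketch: the k nearest points around each landmark,
# adding landmarks (farthest uncovered point first) until all points are covered.
nbhd_cover <- function(D, seed, k) {
  covered <- rep(FALSE, nrow(D))
  min_d <- rep(Inf, nrow(D))
  sets <- list()
  nxt <- seed
  repeat {
    nb <- order(D[, nxt])[seq_len(k)]
    sets[[length(sets) + 1L]] <- sort(nb)
    covered[nb] <- TRUE
    min_d <- pmin(min_d, D[, nxt])
    if (all(covered)) break
    cand <- which(!covered)
    nxt <- cand[which.max(min_d[cand])]
  }
  sets
}

seed   <- ecc_seed(D)
lm_eps <- maxmin_landmarks(D, seed, eps = 1)   # landmarks chosen by radius
lm_n   <- maxmin_landmarks(D, seed, n = 2L)    # landmarks chosen by count
balls  <- ball_cover(D, lm_eps, eps = 1)
nbhds  <- nbhd_cover(D, seed, k = 3L)
```

The sketch ignores ties in the lens values; the commits above note that the actual NeighborhoodCover lets a neighborhood grow beyond k points when more than k points share the same lens value.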
Demonstration
The following code was run after installing the current version of this landmark-ballcover branch.
Simple example of functionality
First consider a simple test case to illustrate the difference between the covers:
The existing ball cover unions intersecting cover sets, resulting in two disconnected 0-simplices:
The landmark ball cover generates the following using the same epsilon:
The same output can be obtained by specifying a number of sets instead, i.e. use_cover("landmark_ball", num_sets=4L).
A different complex is generated using the neighborhood cover:
Seed options
For both new cover methods, there are three options for specifying the seed points:
(1) User-specified: use_cover("landmark_ball", epsilon=1.1, seed_index=2L) uses the second point in f_X as the seed:
(2) Eccentricity: use_cover("landmark_ball", epsilon=1.1, seed_method="ECC") uses the point in f_X with the highest eccentricity as the seed:
(3) Random: use_cover("landmark_ball", epsilon=1.1, seed_method="RAND") uses a random point in f_X as the seed:
Noisy circle
Both new covers can also recover the homology of a larger data set (over specific parameter ranges):
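As a rough, package-independent illustration of that claim, the sketch below samples a noisy circle, builds an epsilon-landmark ball cover, and inspects the 1-skeleton of its nerve. Everything here is an assumption made for illustration: the sample size, noise level, and epsilon are arbitrary, the lens is taken to be the identity, landmarks come from a greedy maxmin loop, and Mapper's per-set clustering step is skipped. It does not reproduce the figures from this PR.

```r
set.seed(1)
theta <- runif(200, 0, 2 * pi)
X <- cbind(cos(theta), sin(theta)) + matrix(rnorm(400, sd = 0.05), ncol = 2)
f_X <- X                                  # assumption: identity lens
D <- unname(as.matrix(dist(f_X)))

# Greedy maxmin landmarks until every point lies within eps of a landmark.
eps <- 0.5
lm <- 1L
min_d <- D[, 1]
while (max(min_d) > eps) {
  nxt <- which.max(min_d)
  lm <- c(lm, nxt)
  min_d <- pmin(min_d, D[, nxt])
}
cover <- lapply(lm, function(i) which(D[, i] <= eps))

# 1-skeleton of the nerve: two balls are adjacent when they share a point.
m <- length(cover)
A <- matrix(FALSE, m, m)
for (i in seq_len(m)) for (j in seq_len(m)) {
  if (i != j) A[i, j] <- length(intersect(cover[[i]], cover[[j]])) > 0
}

# Connected components by propagating the smallest label among neighbors.
comp <- seq_len(m)
repeat {
  new <- sapply(seq_len(m), function(i) min(comp[c(i, which(A[i, ]))]))
  if (identical(new, comp)) break
  comp <- new
}
b0 <- length(unique(comp))
n_edges <- sum(A) / 2
cycle_rank <- n_edges - m + b0            # independent cycles in the graph
```

The cycle rank of the graph is only an upper bound on the first Betti number of the full nerve, since triple intersections fill in triangles. Whether the result looks like a single loop depends on epsilon, which is consistent with the "specific parameter ranges" caveat.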
Other Modifications
The file landmarks.R was also updated to contain the epsilon-landmark selection algorithm from Dłotko's paper "Ball Mapper: A Shape Summary for Topological Data Analysis" (2019).
Calling the function as landmarks(f_X, n=N) selects N landmarks using the originally implemented landmark method. Using landmarks(f_X, eps=epsilon) selects however many landmarks are needed to create an epsilon-net using the Dłotko algorithm.
Naming Issue
@corybrunson and I were thinking that it may be better to use the name "ball cover" for the landmark ball cover instead. Like we talked about in #7, the landmark ball cover might be the expected functionality for a method referencing a ball cover.
The existing BallCover.R could either be removed or renamed to something like DisjointBallCover; then LandmarkBallCover could take the name BallCover. But I wanted to get your opinion before doing any refactoring, since it may make the commit history of "BallCover.R" a little confusing to look back through.