-
Notifications
You must be signed in to change notification settings - Fork 6
Ball Cover Functionality #7
Comments
The BallCover creates a ball of radius So when I can add more documentation on that. Is there a specific kind of functionality you're looking for? I can always add other covers. |
Thank you! That definitely makes more sense. @corybrunson, could you please chime in about what our original thoughts were? |
@20choudharysa thanks for flagging me. @peekxc thanks for the explanation. If i understand it correctly, the cover sets are then determined (not algorithmically, just mathematically) by single-linkage clustering on the lensed data, using the Our expectations were different: We expected the set of balls, one for each data point, to be the cover. This would be like performing this construction by Dłotko, except without bothering to find a more practical subset of the original points whose balls still covered the whole data set. As discussed there, a key feature of the resulting cover is that the sets overlap. As i understand |
Yep, that seems right, the balls centered at the points in the same connected components of the SL/eps-neighborhood graph/VR-complex/etc. forms the sets
I originally hastily made this cover after reading some notes from Carlsson (somewhere...) on Mapper where I guess originally in Mappers design they tried a couple different types of covers, one of which included a ball cover. I had some issues however in implementation, at the time I wasn't sure how to determine where to place the eps-balls while guaranteeing their union forms a valid cover. I ended up just placing a ball at each point, which as you pointed out will only produce -simplices, and yes this seems not too desirable aside from the pure simplification (I haven't done too many experiments). My work focuses on the fixed interval type of covers, so that's where I've spent most of my time. In retrospect, perhaps a better strategy would be to add an additional parameter which allows for overlap. I'm open to suggestions on this. One idea would be to iteratively choose landmark points (see example usage in the new vignette here) to form eps-balls around, continuing until every point is within eps or at least one landmark, or the user could specify the number of landmarks.
Seems fairly straightforward. I'll look into this now. |
Oh neat, i didn't know about that part of the origin story. I very much like the idea of following Dłotko's procedure. Their generation of epsilon nets seems to me to be equivalent to the generation of landmark points, on the assumption that "minimizes" should be "maximizes" in your vignette (line 128 in the |
Yeah actually I was just thinking that, trying to determine if they are indeed in the same. I'm a little confused by what dist is in Algorithm 2 of Dłotko's paper, since C is a set. If it's the minimum from p to any points in the set, then it seems equivalent. I used the argmax to summarize section 2.3 of this. |
That's what i'm trained to think of as the distance from a point to a set. : ) So, yes, i believe they're equivalent (which of course makes sense given that they're trying to accomplish the same thing). |
Gotcha I could reformat the Ball cover to do something like that (sorry I'm new to things like eps-nets). I'll need to piece together some things first, not sure if I'll end up needing the RANN package dependency for this still or not. Edit: It might be a few days before I can get to this. Feel free to submit a PR @20choudharysa if you want |
A good introduction to the use of \epsilon-nets is: |
@peekxc is the ball cover a priority for you? If not, since you've already implemented landmarking, this could be useful exercise on our end. Though it would be some weeks before we get to it. |
@corybrunson It's currently 3rd on the priority list, I'm not sure when I can get to it. Shouldn't be too hard if euclidean distance is desired: could just augment landmark procedure to accept either an integer N or a distance epsilon to determine when to stop the calculation of new landmarks. Then the procedure could be used as-is I think, combined with e.g. the RANN package for fixed-radius searching. My other two priorities:
|
@peekxc I'd be very interested in item 1 as well, especially for the landmark maxmin method. p-norms, or their pth powers would be great. |
Just as an FYI-- @corybrunson and I have nearly finished an implementation of the BallCover as discussed above, including the optimization (selection of a "landmark" set) described in the Dłotko article. We'll make that available via PR soon! |
@yaraskaf Sounds good, I look forward to it! Once you submit the PR, could you please mark it as editable by me that way I can make it play nice with the upcoming nerve computation? I've made some moderately-sized changes in the upcoming version on the nerve computation to make it more general and efficient to address (2) above, which will affect how the Ball Cover interacts with the Mapper instances. I did end up modifying the BallCover on the upcoming version to use general metrics (and to actually do something useful, lol). It's not super efficient though; just uses maxmin landmark selection (in C++ for euclidean, otherwise in R), and then proxy's pairwise dist overload. Implementation looks something like (in the cover construction method):
Not sure if this is similar to what you have or completely different? @mpcasey2718 I can add any specific distance measures to the maxmin procedure. The upcoming version includes support for theoretically any distance measure (even user-supplied) in pure R. Doing the same in C++ requires specifically coding up the other distances needed, which maybe one day I'll code up the major ones in an efficient way like this so that I can do something like this. But first I gotta pass my classes :) |
@peekxc Great, thanks for your response! I'll definitely make the PR editable by you. The basic cover construction is pretty similar to the code you posted, but just to give you a little more detail on the functionality we implemented:
Does that sound reasonable to you? |
That sounds awesome! |
Hi again,
@corybrunson, @20demkowiczbr and I have been using the
BallCover
and we are having difficulty understanding its functionality. We created a data set containing three points (unit distance apart) and performed three mapper constructions with three differentepsilon
values (0.5, 1, 1.5). We are unsure as to why with an epsilon value of 1 and 1.5, the simplicial complex obtained is a 0-simplex.Thanks
Created on 2019-08-06 by the reprex package (v0.3.0)
The text was updated successfully, but these errors were encountered: