Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

custom clustering now takes position matrix instead of distance^2 matrix #87

Open
wants to merge 48 commits into
base: custom_cluster
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 21 commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
f1f52ed
neighbor spelling (assuming we're sticking to American spelling!)
AdamOrmondroyd May 19, 2022
1c40b21
custom clustering now takes position matrix instead of distance^2 mat…
AdamOrmondroyd May 23, 2022
634a4b8
corrected order of dimensions of position_matrix
AdamOrmondroyd May 24, 2022
d98cd08
renamed calculate_similarity_matrix() to calculate_position_matrix()
AdamOrmondroyd May 24, 2022
a04231a
renamed calculate_similarity_matrix() to calculate_position2_matrix()
AdamOrmondroyd May 24, 2022
d9571a3
Merge branch 'ormorod-custom_cluster' of github.com:Ormorod/PolyChord…
AdamOrmondroyd May 24, 2022
9568985
corrected index in default_cluster
AdamOrmondroyd May 25, 2022
92b3809
added new dimension to cluster input as position_matrix is not square…
AdamOrmondroyd May 26, 2022
db02cc2
changed cluster interfaces to add extra dimension for position_matrix
AdamOrmondroyd May 26, 2022
876de66
Merge branch 'master' into ormorod-custom_cluster
williamjameshandley May 26, 2022
f5f8c93
Updated CI to work on pull request to all branches
williamjameshandley May 26, 2022
eeb83ac
changed cluster interfaces to add extra dimension for position_matrix
AdamOrmondroyd May 26, 2022
4d68c9d
changed distance2_matrix to no longer be square
AdamOrmondroyd May 26, 2022
b6885cb
added extra dimension to cluster argument for non-square distance2_ma…
AdamOrmondroyd May 26, 2022
4f638f2
Merge branch 'ormorod-custom_cluster' of github.com:Ormorod/PolyChord…
AdamOrmondroyd May 26, 2022
7da5423
renamed m to nDims, n to nPoints, and position_matrix (and distance2_…
AdamOrmondroyd May 26, 2022
fa5c42d
changed neighbour to ~~correct~~ UK spelling
AdamOrmondroyd May 26, 2022
5184835
forgot to change the interface in clustering itself
AdamOrmondroyd May 26, 2022
3ed846c
another clustering interface using the wrong dimension of points
AdamOrmondroyd May 26, 2022
e4622f5
swapped indices over
AdamOrmondroyd May 26, 2022
1b90073
Merge branch 'ormorod-custom_cluster' of github.com:Ormorod/PolyChord…
AdamOrmondroyd May 26, 2022
23d8db8
Revert "swapped indices over"
AdamOrmondroyd May 26, 2022
4a9bf0a
now uses c indexing in c and python, and fortran indexing in fortran
AdamOrmondroyd May 26, 2022
2f0c339
Merge branch 'PolyChord:master' into master
AdamOrmondroyd May 28, 2022
f68dcb3
similarity matrix renamed to distance^2 in comments
AdamOrmondroyd Jun 13, 2022
da78633
wrap_cluster() adds one to the cluster list, as Python algos number c…
AdamOrmondroyd Jun 30, 2022
512fca1
Merge branch 'ormorod-custom_cluster' of github.com:Ormorod/PolyChord…
AdamOrmondroyd Jun 30, 2022
be3467a
example cluster now returns -1
AdamOrmondroyd Jun 30, 2022
eb412c3
Merge branch 'master' into ormorod-custom_cluster
AdamOrmondroyd Jul 29, 2022
9fbf7a3
Merge branch 'custom_cluster' of github.com:PolyChord/PolyChordLite i…
AdamOrmondroyd Jul 29, 2022
14efda1
example to test custom clustering
AdamOrmondroyd Jul 29, 2022
3029e60
equal weights of each Gaussian peak, corrected comments
AdamOrmondroyd Jul 29, 2022
88572e3
remove mutual_nearest_neighbours function
AdamOrmondroyd Jul 29, 2022
b49fb80
replace sklearn dependence with compute_knn
AdamOrmondroyd Jul 29, 2022
f8dc320
compute_knn() computed the full nn ordering, then returned the first …
AdamOrmondroyd Jul 29, 2022
4d95077
realised the for loop in the copied knn clustering is actually a whil…
AdamOrmondroyd Jul 29, 2022
2d99c89
Removed unused num_clusters_old from native clustering
AdamOrmondroyd Jul 29, 2022
1a561de
comment reminds that clustering algorithm should be 0-indexed
AdamOrmondroyd Jul 29, 2022
f67543d
updated run_pypolychord.ipynb with py2nb
AdamOrmondroyd Jul 29, 2022
8954de3
correct docstring to clarify cluster function must number clusters fr…
AdamOrmondroyd Sep 22, 2022
ebf3b91
further correction to clustering in docstring
AdamOrmondroyd Sep 22, 2022
e1102fb
change cluster function example comments to docstring
AdamOrmondroyd Sep 22, 2022
4d3adcc
remove unnecessary np.sqrt import
AdamOrmondroyd Sep 22, 2022
70628a5
Correct root=root -> root=0 in make_resume_file (#92)
AdamOrmondroyd Feb 7, 2023
6cf0fc4
PWD->CURDIR
williamjameshandley Feb 16, 2023
3084a81
Updated PWD to CURDIR in setup.py
williamjameshandley Feb 16, 2023
1854372
Update python3.6 CI (#96)
AdamOrmondroyd Feb 20, 2023
cfc77c4
Merge branch 'master' into ormorod-custom_cluster
AdamOrmondroyd Feb 20, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion .github/workflows/CI.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ on:
push:
branches: [ master ]
pull_request:
branches: [ master ]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you merge PolyChord:master into Ormorod:ormorod-custom_cluster so that the diff is cleaner?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also be good to add a python test of clustering if you have time.

schedule:
- cron: "0 0 * * *"

Expand Down
8 changes: 4 additions & 4 deletions Makefile_cray
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,13 @@ LDSHARED = $(LD) -shared
# Archive tool
AR = ar rv

FFLAGS += -fpp -fpic -qopenmp -dynamic
CXXFLAGS += -fpic -qopenmp -dynamic
FFLAGS += -cpp -fpic -fopenmp -dynamic
CXXFLAGS += -fpic -fopenmp -dynamic
CFLAGS += -fpic
#setting for Edison/Cori on NERSC
ifneq ($(NERSC_HOST),)
FFLAGS += -axMIC-AVX512,CORE-AVX2
CXXFLAGS += -axMIC-AVX512,CORE-AVX2
FFLAGS += -ffree-line-length-none -fallow-argument-mismatch
CXXFLAGS += -ffree-line-length-none -fallow-argument-mismatch
endif

ifdef IPO
Expand Down
20 changes: 10 additions & 10 deletions pypolychord/_pypolychord.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -117,24 +117,24 @@ void dumper(int ndead, int nlive, int npars, double* live, double* dead, double*
/* Callback to the cluster */
static PyObject *python_cluster = NULL;

void cluster(double* distance2_matrix, int* cluster_list, int n)
void cluster(double* points, int* cluster_list, int nDims, int nPoints)
{
/* create a python version of distance2_matrix */
npy_intp shape[] = {n,n};
PyObject *array_distance2_matrix = PyArray_SimpleNewFromData(2, shape, NPY_DOUBLE, distance2_matrix);
if (array_distance2_matrix ==NULL) throw PythonException();
PyArray_CLEARFLAGS(reinterpret_cast<PyArrayObject*>(array_distance2_matrix), NPY_ARRAY_WRITEABLE);
/* create a python version of points */
npy_intp shape[] = {nPoints,nDims};
PyObject *array_points = PyArray_SimpleNewFromData(2, shape, NPY_DOUBLE, points);
if (array_points ==NULL) throw PythonException();
PyArray_CLEARFLAGS(reinterpret_cast<PyArrayObject*>(array_points), NPY_ARRAY_WRITEABLE);

/* create a python version of cluster_list */
npy_intp shape1[] = {n};
npy_intp shape1[] = {nPoints};
PyObject *array_cluster_list = PyArray_SimpleNewFromData(1, shape1, NPY_INT, cluster_list);
if (array_cluster_list==NULL) {Py_DECREF(array_distance2_matrix); throw PythonException();}
if (array_cluster_list==NULL) {Py_DECREF(array_points); throw PythonException();}

/* Compute cluster_list from the cluster */
PyObject_CallFunctionObjArgs(python_cluster,array_distance2_matrix,array_cluster_list,NULL);
PyObject_CallFunctionObjArgs(python_cluster,array_points,array_cluster_list,NULL);

/* Garbage collect */
Py_DECREF(array_cluster_list); Py_DECREF(array_distance2_matrix);
Py_DECREF(array_cluster_list); Py_DECREF(array_points);
}


Expand Down
15 changes: 9 additions & 6 deletions pypolychord/polychord.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@ def default_dumper(live, dead, logweights, logZ, logZerr):
pass


def default_cluster(distance2_matrix):
return np.zeros(distance2_matrix.shape[0],dtype=int)
def default_cluster(points):
return np.zeros(points.shape[0],dtype=int)


def run_polychord(loglikelihood, nDims, nDerived, settings,
Expand Down Expand Up @@ -112,8 +112,8 @@ def run_polychord(loglikelihood, nDims, nDerived, settings,

Parameters
----------
distance2_matrix: numpy.array
squared distances between points. Shape (nPoints, nPoints)
points: numpy.array
positions of points. Shape (nPoints, nDims)

Returns
-------
Expand Down Expand Up @@ -197,8 +197,11 @@ def wrap_loglikelihood(theta, phi):
def wrap_prior(cube, theta):
theta[:] = prior(cube)

def wrap_cluster(distance2_matrix, cluster_list):
cluster_list[:] = cluster(distance2_matrix)
def wrap_cluster(points, cluster_list):
cluster_list[:] = cluster(points)

settings.grade_dims = [int(d) for d in settings.grade_dims]
settings.nlives = {float(logL):int(nlive) for logL, nlive in settings.nlives.items()}

# Run polychord from module library
_pypolychord.run(wrap_loglikelihood,
Expand Down
4 changes: 2 additions & 2 deletions run_pypolychord.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,8 +41,8 @@ def dumper(live, dead, logweights, logZ, logZerr):

#| Optional cluster function allow user-defined clustering

def cluster(distance2_matrix):
npoints = distance2_matrix.shape[0]
def cluster(points):
npoints = points.shape[0]
clusters = np.ones(npoints, dtype=int)

# <do some clustering algorithm to assign clusters>
Expand Down
10 changes: 5 additions & 5 deletions src/polychord/c_interface.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -46,14 +46,14 @@ void run_polychord(
double (*c_loglikelihood_ptr)(double*,int,double*,int),
void (*c_prior_ptr)(double*,double*,int),
void (*c_dumper_ptr)(int,int,int,double*,double*,double*,double,double),
void (*c_cluster_ptr)(double*,int*,int),
void (*c_cluster_ptr)(double*,int*,int,int),
Settings s, MPI_Comm& comm)
#else
void run_polychord(
double (*c_loglikelihood_ptr)(double*,int,double*,int),
void (*c_prior_ptr)(double*,double*,int),
void (*c_dumper_ptr)(int,int,int,double*,double*,double*,double,double),
void (*c_cluster_ptr)(double*,int*,int),
void (*c_cluster_ptr)(double*,int*,int,int),
Settings s)
#endif
{
Expand Down Expand Up @@ -125,7 +125,7 @@ void run_polychord(
double (*c_loglikelihood_ptr)(double*,int,double*,int),
void (*c_prior_ptr)(double*,double*,int),
void (*c_dumper_ptr)(int,int,int,double*,double*,double*,double,double),
void (*c_cluster_ptr)(double*,int*,int),
void (*c_cluster_ptr)(double*,int*,int,int),
Settings s)
{
int flag;
Expand Down Expand Up @@ -228,5 +228,5 @@ void default_prior(double* cube, double* theta, int nDims)

void default_dumper(int,int,int,double*,double*,double*,double,double) {}

void default_cluster(double* distance2_matrix, int* cluster_list, int n)
{ for(int i=0;i<n;i++) cluster_list[i] = 0; }
void default_cluster(double* points, int* cluster_list, int nDims, int nPoints)
{ for(int i=0;i<nPoints;i++) cluster_list[i] = 0; }
24 changes: 12 additions & 12 deletions src/polychord/calculate.f90
Original file line number Diff line number Diff line change
Expand Up @@ -79,34 +79,34 @@ function calculate_posterior_point(settings,point,logweight,evidence,volume) res
end function calculate_posterior_point


!> This function computes the similarity matrix of an array of data.
!> This function computes the distance^2 matrix of an array of data.
!!
!! Assume that the data_array can be considered an indexed array of vectors
!! Assume that the points can be considered an indexed array of vectors
!! V = ( v_i : i=1,n )
!!
!! The similarity matrix can be expressed very neatly as
!! The distance^2 matrix can be expressed very neatly as
!! d_ij = (v_i-v_j) . (v_i-v_j)
!! = v_i.v_i + v_j.v_j - 2 v_i.v_j
!!
!! The final term can be written as a data_array^T data_array, and the first
!! The final term can be written as a points^T points, and the first
!! two are easy to write. We can therefore calculate this in two lines with
!! instrisic functions
function calculate_similarity_matrix(data_array) result(similarity_matrix)
function calculate_distance2_matrix(points) result(distance2_matrix)

real(dp), intent(in), dimension(:,:) :: data_array
real(dp), intent(in), dimension(:,:) :: points

real(dp), dimension(size(data_array,2),size(data_array,2)) :: similarity_matrix
real(dp), dimension(size(points,1),size(points,1)) ::distance2_matrix

integer :: i


similarity_matrix = spread( &
[ ( dot_product(data_array(:,i),data_array(:,i)), i=1,size(data_array,2) ) ], &
dim=2,ncopies=size(data_array,2) )
distance2_matrix = spread( &
[ ( dot_product(points(i,:),points(i,:)), i=1,size(points,1) ) ], &
dim=2,ncopies=size(points,1) )

similarity_matrix = similarity_matrix + transpose(similarity_matrix) - 2d0 * matmul( transpose(data_array),data_array )
distance2_matrix = distance2_matrix + transpose(distance2_matrix) - 2d0 * matmul(points,transpose(points))

end function calculate_similarity_matrix
end function calculate_distance2_matrix



Expand Down
56 changes: 32 additions & 24 deletions src/polychord/clustering.f90
Original file line number Diff line number Diff line change
Expand Up @@ -5,10 +5,10 @@ module KNN_clustering
contains

!> This function returns a clustering from a similarity matrix based on
!! 'nearest neighbor' clustering.
!! 'nearest neighbour' clustering.
!!
!! Points belong to the same cluster if they are in either of each others k
!! nearest neighbor sets.
!! nearest neighbour sets.
!!
!! The algorithm computes the k nearest neihbor sets from the similarity
!! matrix, and then tests
Expand Down Expand Up @@ -43,7 +43,7 @@ recursive function NN_clustering(distance2_matrix) result(cluster_list)
! 10 degrees of separation are usually fine, we'll expand this if necessary
k = min(nlive,10)

! compute the k nearest neighbors for each point
! compute the k nearest neighbours for each point
knn(:k,:) = compute_knn(distance2_matrix,k)

! Set up the cluster list
Expand Down Expand Up @@ -115,7 +115,7 @@ function do_clustering_k(knn) result(c)
! If they're not in the same cluster already...
if(c(i)/=c(j)) then
! ... check to see if they are within each others k nearest neihbors...
if( neighbors( knn(:,i),knn(:,j) ) ) then
if( neighbours( knn(:,i),knn(:,j) ) ) then

! If they are then relabel cluster_i and cluster_j to the smaller of the two
where(c==c(i).or.c==c(j))
Expand All @@ -138,10 +138,10 @@ function compute_knn(distance2_matrix,k) result(knn)
!> The data to compute on
real(dp), intent(in),dimension(:,:) :: distance2_matrix

!> The number of nearest neighbors to compute
!> The number of nearest neighbours to compute
integer, intent(in) :: k

! The indices of the k nearest neighbors to output
! The indices of the k nearest neighbours to output
integer, dimension(k,size(distance2_matrix,1)) :: knn

integer :: nPoints,i,j
Expand All @@ -155,7 +155,7 @@ function compute_knn(distance2_matrix,k) result(knn)
knn=0

do i=1,nPoints
! Find the k nearest neighbors for each point
! Find the k nearest neighbours for each point
distance2s = huge(1d0)
do j=1,nPoints
! If the distance between i and j is too large to be considered,
Expand All @@ -175,22 +175,22 @@ function compute_knn(distance2_matrix,k) result(knn)
end function compute_knn


! Return whether they're each others n nearest neighbor list
function neighbors(knn1,knn2) result(same_list)
! Return whether they're each others n nearest neighbour list
function neighbours(knn1,knn2) result(same_list)
implicit none
integer,intent(in), dimension(:) :: knn1
integer,intent(in), dimension(:) :: knn2

logical :: same_list

! Check to see if they're in each others neighbors lists
! Check to see if they're in each others neighbours lists
same_list= any(knn1==knn2(1)) .or. any(knn2==knn1(1))

end function neighbors
end function neighbours



! Return the number of matches in the n nearest neighbor list
! Return the number of matches in the n nearest neighbour list
function matches(knn1,knn2)
implicit none
integer,intent(in), dimension(:) :: knn1
Expand Down Expand Up @@ -254,7 +254,7 @@ module cluster_module
function do_clustering(settings,RTI,cluster,sub_dimensions)
use settings_module, only: program_settings
use run_time_module, only: run_time_info,add_cluster
use calculate_module, only: calculate_similarity_matrix
use calculate_module, only: calculate_distance2_matrix
use KNN_clustering, only: NN_clustering
implicit none

Expand All @@ -266,10 +266,10 @@ function do_clustering(settings,RTI,cluster,sub_dimensions)
integer,dimension(:),optional,intent(in) :: sub_dimensions

interface
function cluster(distance2_matrix) result(cluster_list)
function cluster(points) result(cluster_list)
import :: dp
real(dp), intent(in), dimension(:,:) :: distance2_matrix
integer, dimension(size(distance2_matrix,1)) :: cluster_list
real(dp), intent(in), dimension(:,:) :: points
integer, dimension(size(points,1)) :: cluster_list
end function
end interface

Expand All @@ -278,14 +278,16 @@ function cluster(distance2_matrix) result(cluster_list)


! Similarity matrix
real(dp),dimension(sum(RTI%nlive),sum(RTI%nlive)) :: distance2_matrix
real(dp),dimension(settings%nDims,sum(RTI%nlive)) :: points
integer,dimension(sum(RTI%nlive)) :: clusters

integer :: num_clusters
integer :: num_old_clusters

! number of live points
integer :: nlive
! number of dimensions
integer :: nDims

integer :: i_cluster

Expand All @@ -295,24 +297,30 @@ function cluster(distance2_matrix) result(cluster_list)
! Get the number of old clusters
num_old_clusters=RTI%ncluster

! get the number of dimensions
nDims = settings%nDims

i_cluster = 1
do while(i_cluster<=num_old_clusters)

nlive = RTI%nlive(i_cluster) ! Get the number of live points in a temp variable

if(nlive>2) then
! Calculate the similarity matrix for this cluster
! get the position matrix for this cluster
if(present(sub_dimensions)) then
distance2_matrix(:nlive,:nlive) =&
calculate_similarity_matrix(RTI%live(sub_dimensions,:nlive,i_cluster))
points(:nlive, :nDims) =&
transpose(RTI%live(sub_dimensions,:nlive,i_cluster))
else
distance2_matrix(:nlive,:nlive) =&
calculate_similarity_matrix(RTI%live(settings%h0:settings%h1,:nlive,i_cluster))
points(:nlive, :nDims) =&
transpose(RTI%live(settings%h0:settings%h1,:nlive,i_cluster))
williamjameshandley marked this conversation as resolved.
Show resolved Hide resolved
end if

clusters(:nlive) = cluster(distance2_matrix(:nlive,:nlive))
clusters(:nlive) = cluster(points(:nlive, :nDims))

! default to KNN clustering
if (any(clusters(:nlive)<=0)) then
clusters(:nlive) = NN_clustering(distance2_matrix(:nlive,:nlive))
clusters(:nlive) = NN_clustering(&
calculate_distance2_matrix(points(:nlive, :nDims)))
end if
num_clusters = maxval(clusters(:nlive))

Expand Down
Loading