Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Correlation matrix #29

Open
wants to merge 90 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
90 commits
Select commit Hold shift + click to select a range
0ef94ec
Add matrix_evtcorrelator.cl
pierrepaleo May 11, 2023
e5290eb
Start MatrixEventCorrelator
pierrepaleo May 11, 2023
31117fb
Add n_times, which can be different than n_frames
pierrepaleo May 11, 2023
0a718c6
Add get_index()
pierrepaleo May 11, 2023
b7973aa
Fix get_index()
pierrepaleo May 12, 2023
a1529ef
Rename simplest kernel
pierrepaleo May 12, 2023
d6737e8
Add build_correlation_matrix_flattened
pierrepaleo May 12, 2023
8371fdd
WIP: build_correlation_matrix_flattened_wg
pierrepaleo May 12, 2023
e636093
WIP: build_correlation_matrix_flattened_wg
pierrepaleo May 15, 2023
aca1502
Good result with build_correlation_matrix_image() kernel
pierrepaleo May 16, 2023
7d4ae70
Each thread works on at most N_frames/2 items horizontally
pierrepaleo May 22, 2023
2fcbd4f
Work on all q-bins simultaneously
pierrepaleo May 22, 2023
da4f99f
Add sparse.py module
pierrepaleo May 26, 2023
77af79c
Add test_sparse - data path to be changed
pierrepaleo May 26, 2023
e93084e
Merge branch 'event_matrix' of github.com:silx-kit/dynamix into event…
pierrepaleo May 26, 2023
028f331
Add OpenCL kernels from computation from time representation
pierrepaleo May 26, 2023
cf3a5bd
Add space_compact_to_time_compact_stage2_sort()
pierrepaleo May 26, 2023
27dda6a
time-compaction starts to be OK
pierrepaleo May 26, 2023
85b6ca4
Add some comments to opencl kernel
pierrepaleo May 26, 2023
6d8503d
Move some OpenCL code in various files
pierrepaleo May 30, 2023
345d7af
Start GPU class SpaceToTimeCompaction
pierrepaleo May 30, 2023
ac03da7
WIP SpaceToTimeCompaction
pierrepaleo May 30, 2023
83efdcc
Move compile_kernels() to utils
pierrepaleo May 31, 2023
6da1a2d
Fix some OpenCL kernel calls
pierrepaleo May 31, 2023
df651d1
SpaceToTimeCompaction OK
pierrepaleo May 31, 2023
1ddd630
Clean-up
pierrepaleo May 31, 2023
1f024fe
Rename kernels
pierrepaleo May 31, 2023
1e35e9c
Debuggin build_correlation_matrix_v2b
pierrepaleo Jun 6, 2023
4964d9f
v2 and v3 OK
pierrepaleo Jun 6, 2023
1526a03
v1 OK, can be accelerated up to X2
pierrepaleo Jun 7, 2023
2b92fe9
v4 (times repr.) OK
pierrepaleo Jun 7, 2023
16710cc
Clean-up
pierrepaleo Jun 8, 2023
dccfb83
Use two classes: SMatrixEventCorrelator and TMatrixEventCorrelator
pierrepaleo Jun 29, 2023
d197d3e
Split 'matrix_evtcorrelator.cl' into several files
pierrepaleo Jun 29, 2023
4e76ae5
Fix dtype
pierrepaleo Jun 29, 2023
46bc08e
Factorize get_timings()
pierrepaleo Jun 29, 2023
4900510
Clean-up
pierrepaleo Jun 29, 2023
d44958f
Add comments on work balancing for kernels
pierrepaleo Jun 29, 2023
a5163c3
Clean-up
pierrepaleo Jul 3, 2023
b215f26
d_sums should have size (n_bins, n_frames)
pierrepaleo Jul 3, 2023
6357846
build_correlation_matrix_v3: compute on all bins
pierrepaleo Jul 3, 2023
2e8435b
Add flat_to_square() for correlation matrix
pierrepaleo Jul 3, 2023
2361212
Add build_flattened_scalar_correlation_matrix() kernel
pierrepaleo Jul 3, 2023
76bcdb8
Move build_flattened_scalar_correlation_matrix to utils.cl
pierrepaleo Jul 3, 2023
0424d48
Add SMatrixEventCorrelator._correlate_sums
pierrepaleo Jul 3, 2023
b9f3af9
Use more consistent dtypes in opencl kernels
pierrepaleo Jul 4, 2023
8607172
Refactoring, use base array setter and more consistent dtypes
pierrepaleo Jul 4, 2023
d7a9f3d
WIP: SpaceToTimeCompactionV2
pierrepaleo Jul 5, 2023
8727625
Refactor base correlator: set_dtype
pierrepaleo Jul 6, 2023
a5b29c6
TMatrixEventCorrelator: use t_XX internal variables
pierrepaleo Jul 6, 2023
1aa8193
Times-based matrix correlator: do an optional sort stage
pierrepaleo Jul 6, 2023
ba77609
Add concatenate_space_compacted_data()
pierrepaleo Jul 6, 2023
4decabe
Add dense_to_space_multi
pierrepaleo Jul 6, 2023
b87ecbd
Apply black
pierrepaleo Jul 6, 2023
33d0c05
Add test_from_space_to_times_opencl_v2
pierrepaleo Jul 27, 2023
1048346
Use pre-sort for TMatrixEventCorrelator
pierrepaleo Jul 27, 2023
45ac8a7
Add comment
pierrepaleo Jul 27, 2023
5a86019
Fix error message
pierrepaleo Jul 28, 2023
b722a76
get_index(): prevent overflow for large sizes
pierrepaleo Jul 28, 2023
27bad49
Fix output dtype for scalar corr matrix output
pierrepaleo Aug 2, 2023
39b2317
Add compute_final_twotimes_function
pierrepaleo Aug 2, 2023
ce65b0b
Add some documentation
pierrepaleo Aug 2, 2023
19e1f95
Fixes
pierrepaleo Sep 5, 2023
08c6ee6
"Start" final reduction in opencl
pierrepaleo Sep 5, 2023
dc32807
Add get_timings() to SpaceToTimeCompaction
pierrepaleo Sep 5, 2023
de9fb97
Alias SpaceToTimeCompaction to SpaceToTimeCompactionV2
pierrepaleo Sep 5, 2023
f2b610c
Fix
pierrepaleo Sep 5, 2023
85207ea
Rename base class
pierrepaleo Sep 11, 2023
aaa9f22
factorize compute_normalized_ttcf()
pierrepaleo Sep 11, 2023
fdac26c
Remove code related to former implementations of matrix calculation (…
pierrepaleo Sep 11, 2023
b3f8c8f
Update setup.py so that fortran files can be compiled
pierrepaleo Sep 11, 2023
4149dd1
Clean-up
pierrepaleo Sep 11, 2023
c8b8a85
Clean-up
pierrepaleo Sep 11, 2023
54753ce
Start benchmark_event_matrix
pierrepaleo Sep 11, 2023
34ce67e
WIP benchmark_event_matrix.py
pierrepaleo Sep 11, 2023
6c1934c
WIP: benchmark_event_matrix
pierrepaleo Sep 12, 2023
55b3f45
Fix setup.py to embed .h files
pierrepaleo Sep 12, 2023
df9cc53
Fix std calculation
pierrepaleo Sep 12, 2023
fa4b4ad
Benchmark on several datasets
pierrepaleo Sep 12, 2023
94e04b9
Validate dataset04
pierrepaleo Sep 12, 2023
aad7a7c
Add get_normalized_ttcf()
pierrepaleo Sep 14, 2023
d461985
build_correlation_matrices() to return both num and denom
pierrepaleo Sep 14, 2023
d4ec3fb
API changes
pierrepaleo Sep 15, 2023
cea359a
Benchmarking
pierrepaleo Sep 18, 2023
95939e7
Fix typo
pierrepaleo Sep 18, 2023
451c65c
Clean-up
pierrepaleo Sep 18, 2023
aa26023
Add get_g2_and_std_v1()
pierrepaleo Sep 19, 2023
444339c
Remove unused function
pierrepaleo Sep 19, 2023
ce9c8f7
Send scale factors to device
pierrepaleo Sep 19, 2023
f94fc8c
Add get_correlation_function_gpu()
pierrepaleo Sep 19, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
171 changes: 131 additions & 40 deletions dynamix/correlator/common.py
Original file line number Diff line number Diff line change
@@ -1,15 +1,29 @@
import numpy as np
from os import linesep

import pyopencl.array as parray
from pyopencl.tools import dtype_to_ctype
from silx.opencl.common import pyopencl as cl
from silx.opencl.processing import OpenclProcessing, KernelContainer
from ..utils import _compile_kernels


class BaseCorrelator(object):
class BaseCorrelator:
"Abstract base class for all Correlators"

default_extra_options = {
# dtype for indices offsets
# must be extended to unsigned long (np.uint64) if total_nnz > 4294967295
"offset_dtype": np.uint32,
# dtype for intermediate result (correlation in integer).
# must be extended to unsigned long for large nnz_per_frame and/or events counts
"res_dtype": np.uint32,
"sums_dtype": np.uint32,
# dtype for q-mask.
# must be extended to int if number of q-bins > 127
"qmask_dtype": np.int8,
}


def __init__(self):
self.nframes = None
self.shape = None
Expand All @@ -19,12 +33,30 @@ def __init__(self):
self.weights = None
self.scale_factors = None

def _set_parameters(self, shape, nframes, qmask, scale_factor, extra_options):

def _set_dtype(self, dtype):
# Configure data types - important for OpenCL kernels,
# as some operations are better performed on integer types (ex. atomic),
# but some overflow can occur for large/non-sparse data.

# data type for data. Usually the data values lie in a small range (less than 255)
self.dtype = dtype
# Other data types
self._offset_dtype = self.extra_options["offset_dtype"]
self._qmask_dtype = self.extra_options["qmask_dtype"]
self._res_dtype = self.extra_options["res_dtype"]
self._sums_dtype = self.extra_options["sums_dtype"]
self._pix_idx_dtype = np.uint32 # won't change (goes from 0 to N_x*N_y)
self._output_dtype = np.float32 # won't change


def _set_parameters(self, shape, nframes, qmask, scale_factor, extra_options, dtype):
self._configure_extra_options(extra_options)
self._set_dtype(dtype)
self.nframes = nframes
self._set_shape(shape)
self._set_qmask(qmask=qmask)
self._set_scale_factor(scale_factor=scale_factor)
self._configure_extra_options(extra_options)

def _set_shape(self, shape):
if np.isscalar(shape):
Expand All @@ -36,21 +68,21 @@ def _set_shape(self, shape):
def _set_qmask(self, qmask=None):
self.qmask = None
if qmask is None:
self.bins = np.array([1], dtype=np.int32)
self.bins = np.array([1], dtype=self._qmask_dtype)
self.n_bins = 1
self.qmask = np.ones(self.shape, dtype=np.int32)
self.qmask = np.ones(self.shape, dtype=self._qmask_dtype)
else:
self.qmask = np.ascontiguousarray(qmask, dtype=np.int32)
self.qmask = np.ascontiguousarray(qmask, dtype=self._qmask_dtype)
self.n_bins = self.qmask.max()
self.bins = np.arange(1, self.n_bins + 1, dtype=np.int32)
self.bins = np.arange(1, self.n_bins + 1, dtype=self._qmask_dtype)
self.output_shape = (self.n_bins, self.nframes)

def _set_weights(self, weights=None):
if weights is None:
self.weights = np.ones(self.shape, dtype=self.output_dtype)
self.weights = np.ones(self.shape, dtype=self._output_dtype)
return
assert weights.shape == self.shape
self.weights = np.ascontiguousarray(weights, dtype=self.output_dtype)
self.weights = np.ascontiguousarray(weights, dtype=self._output_dtype)
raise ValueError("Advanced weighting is not implemented yet")

def _set_scale_factor(self, scale_factor=None):
Expand All @@ -75,15 +107,14 @@ def _configure_extra_options(self, extra_options):
"""
:param extra_options: dict
"""
self.extra_options = {}
if extra_options is not None:
self.extra_options.update(extra_options)
self.extra_options = self.default_extra_options.copy()
self.extra_options.update(extra_options or {})


class OpenclCorrelator(BaseCorrelator, OpenclProcessing):

def __init__(
self, shape, nframes, qmask=None, dtype=np.int8, weights=None,
self, shape, nframes, qmask=None, dtype=np.uint8, weights=None,
scale_factor=None, extra_options={},
ctx=None, devicetype="all", platformid=None, deviceid=None,
block_size=None, memory=None, profile=False
Expand Down Expand Up @@ -129,8 +160,11 @@ def __init__(
the same normalization is used for all bins), or an array which
must have the same length as the number of bins.
extra_options: dict, optional
Advanced extra options. None available yet.

Advanced extra options. Available options are:
- "offset_dtype": np.uint32
- "res_dtype": np.uint32,
- "sums_dtype": np.uint32,
- "qmask_dtype": np.int8,

Other parameters
-----------------
Expand All @@ -147,24 +181,35 @@ def __init__(
self._allocate_memory()

def _set_parameters(self, shape, nframes, dtype, qmask, weights, scale_factor, extra_options):
BaseCorrelator._set_parameters(self, shape, nframes, qmask, scale_factor, extra_options)
self._set_dtype(dtype)
BaseCorrelator._set_parameters(self, shape, nframes, qmask, scale_factor, extra_options, dtype)
self._set_cl_dtypes()
self._set_weights(weights)
self.is_cpu = (self.device.type == "CPU")
self.extra_options = self.default_extra_options.copy().update((extra_options or {}))

def _set_dtype(self, dtype="f"):
# add checks ?
self.dtype = dtype
self.output_dtype = np.float32 # TODO custom ?
self.sums_dtype = np.uint32 # TODO custom ?
def _set_cl_dtypes(self):
self.c_dtype = dtype_to_ctype(self.dtype)
self.c_sums_dtype = dtype_to_ctype(self.sums_dtype)
self.c_sums_dtype = dtype_to_ctype(self._sums_dtype)
self.idx_c_dtype = "int" # TODO custom ?
self._res_c_dtype = dtype_to_ctype(self._res_dtype)
self._dtype_compilation_flags = [
"-DDTYPE=%s" % self.c_dtype,
"-DOFFSET_DTYPE=%s" % (dtype_to_ctype(self._offset_dtype)),
"-DQMASK_DTYPE=%s" % (dtype_to_ctype(self._qmask_dtype)),
"-DRES_DTYPE=%s" % self._res_c_dtype
]


def _allocate_memory(self):
# self.d_norm_mask = parray.to_device(self.queue, self.weights)
if self.qmask is not None:
self.d_qmask = parray.to_device(self.queue, self.qmask)
self._allocated = {}
# send scale_factors to device
self.allocate_array("d_scale_factors", (self.n_bins,), dtype=np.float64)
for bin_idx, scale_factor in enumerate(self.scale_factors.values()):
self.d_scale_factors[bin_idx] = scale_factor


def _set_data(self, arrays):
"""
Expand Down Expand Up @@ -194,22 +239,68 @@ def _reset_arrays(self, arrays_names):
setattr(self, array_name, old_array)
setattr(self, old_array_name, None)


# -----------

def allocate_array(self, array_name, shape, dtype=np.float32):
"""
Allocate a GPU array on the current context/stream/device,
and set 'self.array_name' to this array.

Parameters
----------
array_name: str
Name of the array (for book-keeping)
shape: tuple of int
Array shape
dtype: numpy.dtype, optional
Data type. Default is float32.
"""
if not self._allocated.get(array_name, False):
new_device_arr = parray.zeros(self.queue, shape, dtype)
setattr(self, array_name, new_device_arr)
self._allocated[array_name] = True
return getattr(self, array_name)

def set_array(self, array_name, array_ref):
"""
Set the content of a device array.

Parameters
----------
array_name: str
Array name. This method will look for self.array_name.
array_ref: array (numpy or GPU array)
Array containing the data to copy to 'array_name'.
"""
if isinstance(array_ref, parray.Array):
current_arr = getattr(self, array_name, None)
setattr(self, "_old_" + array_name, current_arr)
setattr(self, array_name, array_ref)
elif isinstance(array_ref, np.ndarray):
self.allocate_array(array_name, array_ref.shape, dtype=array_ref.dtype)
getattr(self, array_name).set(array_ref)
else:
raise ValueError("Expected numpy array or pyopencl array")
return getattr(self, array_name)

def get_array(self, array_name):
return getattr(self, array_name, None)


# -----------


# Overwrite OpenclProcessing.compile_kernel, as it does not support
# kernels outside silx/opencl/resources
def compile_kernels(self, kernel_files=None, compile_options=None):
kernel_files = kernel_files or self.kernel_files

allkernels_src = []
for kernel_file in kernel_files:
with open(kernel_file) as fid:
kernel_src = fid.read()
allkernels_src.append(kernel_src)
allkernels_src = linesep.join(allkernels_src)

compile_options = compile_options or self.get_compiler_options()
try:
self.program = cl.Program(self.ctx, allkernels_src).build(options=compile_options)
except (cl.MemoryError, cl.LogicError) as error:
raise MemoryError(error)
else:
self.kernels = KernelContainer(self.program)
_compile_kernels(self, kernel_files=kernel_files, compile_options=compile_options)


def get_timings(self):
if not(self.profile):
raise RuntimeError("Need to instantiate this class with profile=True")
evd = lambda e: (e.stop - e.start)/1e6
return {e.name: evd(e) for e in self.events}


6 changes: 3 additions & 3 deletions dynamix/correlator/dense.py
Original file line number Diff line number Diff line change
Expand Up @@ -199,12 +199,12 @@ def _allocate_arrays(self):
self.d_sums = parray.zeros(
self.queue,
self.output_shape,
self.sums_dtype
self._sums_dtype
)
self.d_sums_f = parray.zeros(
self.queue,
self.output_shape,
self.output_dtype,
self._output_dtype,
)
self.d_output = parray.zeros(
self.queue,
Expand All @@ -217,7 +217,7 @@ def _normalize_sums(self):
self.d_sums_f[:] *= self.scale_factors[0]
else:
for i, factor in enumerate(self.scale_factors.values()):
self.d_sums_f[i] /= np.array([factor], dtype=self.output_dtype)[0]
self.d_sums_f[i] /= np.array([factor], dtype=self._output_dtype)[0]
self.d_sums_f.finish()

def correlate(self, frames):
Expand Down
Loading