AutocorrelationCUDA

Based on the previous work of Lapo Falcone and Gianpaolo Cugola. The original work, with a more in-depth explanation, can be found here.

Overview

This CUDA application efficiently calculates the autocorrelation function for a matrix of sensors using a multi-tau approach. It is specifically tailored for fluorescence correlation spectroscopy (FCS) data analysis, a powerful technique for studying the dynamics of fluorescent molecules in solution.
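For reference, the quantity being estimated is the intensity autocorrelation commonly used in FCS (whether this application also applies the normalization shown on the right is not stated here):

    G(\tau) = \langle I(t) \, I(t+\tau) \rangle, \qquad g(\tau) = G(\tau) / \langle I(t) \rangle^{2}

where I(t) is the photon-count signal of a single sensor and \tau is the lag time. A multi-tau correlator evaluates these quantities at lag times that grow roughly geometrically, typically by coarsening the time resolution from one group of lags to the next, which keeps the number of computed lags manageable even for long acquisitions.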

Features

  • Fast computation of the autocorrelation function using CUDA parallel processing
  • Support for processing a full time series or packets of data, allowing flexibility in data handling
  • Customizable parameters to fine-tune the analysis for your specific experiment

Requirements

Before using this application, make sure the following requirement is met:

  • The NVIDIA CUDA compiler (nvcc)

Getting Started

Building the Application

  1. Clone the repository to your local machine
  2. Make sure that the ARCH flags in the Makefile reflect your GPU architecture
  3. Run the make command

Running the Application

To calculate the autocorrelation, you need to provide your input data in a compatible format. The input data should be a matrix of sensor readings, where each column represents one sensor's time series.

The application can be run using the following command:

./bin/main -p [PACKET_LENGTH] -l [NUM_BINS] -g [BIN_SIZE] -i [INPUT_FILE] -r -o [OUTPUT_FILE]

All the available flags are:

[--debug, -d]           Activate debug prints

[--results, -r]         Prints to stdout the results of the autocorrelation

[--packets, -p]         Number of time instants used per packet

[--input_file, -i]      Name of the input file containing the sensor data
                        for the calculation of the autocorrelation

[--output-file, -o]     Name of the output file. The correlation results will be saved as a CSV file

[--iterations, -I]      Number of times the calculation is repeated. If greater than one,
                        the autocorrelation is computed multiple times on the same data

[--sensors, -s]         Number of sensors present in the matrix of sensors

[--groups, -l]          Number of bins used for each correlator of the sensor matrix

[--group_size, -g]      Size of the bins for each correlator

[--help, -h]            Print this help message

Correlator Data Structure

The correlator data structure is designed with optimization in mind. The most crucial parameters are bin_size and num_bins: they not only define the fundamental structure of the correlator, but also let you trade result accuracy against computation speed by adjusting their values. The main fields are listed below (a simplified sketch follows the list).

  • shift_registers: This array holds the actual sensor data that is inserted into the correlator.
  • correlations: It stores the computed correlation results.
  • shift_register_positions: This array keeps track of the position where the next value should be inserted for each bin. It enables a round-robin (circular-buffer) insertion scheme, eliminating the need to physically shift the contents of the array.
  • accumulators: It contains the values that are forwarded to the next bin of the correlator.
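
A rough, host-side sketch of how these pieces fit together for a single sensor is shown below. The field names follow the list above, but the element types, the per-sensor split, and the exact sizes are assumptions for illustration, not the repository's actual definitions:

    #include <vector>

    // Illustrative correlator state for ONE sensor (assumed layout; the real
    // implementation flattens all sensors into shared device arrays).
    struct Correlator {
        int num_bins;   // --groups / -l     : number of bins (lag groups)
        int bin_size;   // --group_size / -g : number of lags per bin

        // For every bin: the most recent bin_size samples at that bin's time resolution.
        std::vector<float> shift_registers;           // num_bins * bin_size entries
        // For every bin and lag: the accumulated products (the correlation).
        std::vector<float> correlations;              // num_bins * bin_size entries
        // For every bin: the slot where the next sample is written (round-robin).
        std::vector<int>   shift_register_positions;  // num_bins entries
        // For every bin: the partial sum forwarded to the next bin.
        std::vector<float> accumulators;              // num_bins entries

        // Round-robin insertion into one bin: overwrite the oldest slot and advance
        // the write position instead of shifting every element of the register.
        void insert(int bin, float value) {
            int& pos = shift_register_positions[bin];
            shift_registers[bin * bin_size + pos] = value;
            pos = (pos + 1) % bin_size;
        }
    };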

This organization is also illustrated by the correlator diagram in the repository.

The memory layout of the shift_registers and correlations arrays is carefully arranged to minimize bank conflicts: when computing correlations, each thread of a warp accesses a different memory bank, so the accesses are not serialized. This design choice helps the application make full use of the GPU's parallel memory system.

The memory layout is visualized in the layout diagram in the repository.
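
A minimal indexing sketch of such a conflict-free, interleaved layout is given below. The [bin][slot][sensor] ordering and the function name are assumptions used for illustration, not the repository's actual indexing:

    // Assumed layout: [bin][slot][sensor], with the sensor index varying fastest.
    // If each of the 32 threads of a warp handles a different sensor, the warp
    // reads 32 consecutive 32-bit words, which map to 32 distinct memory banks,
    // so the accesses proceed without bank conflicts.
    __device__ __forceinline__
    int sr_index(int bin, int slot, int sensor, int num_sensors, int bin_size)
    {
        return (bin * bin_size + slot) * num_sensors + sensor;
    }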
