Differences between the CPU and GPU implementations
The `ppafm` package contains two implementations of the probe particle model: a C++ implementation that runs on the CPU, and an OpenCL implementation that enables GPU acceleration. The two versions differ in certain ways, which are listed on this page. In general, the CPU version serves as the reference implementation, which the GPU version tries to match while offering higher computational throughput.
The following features are currently not implemented for the GPU version:
- Kelvin probe force microscopy (KPFM) simulations
- Inelastic tunneling spectroscopy (IETS) simulations
- Command-line interface (CLI)
The following are currently not implemented for the CPU version:
- Graphical user interface (GUI). However, note that you can save the `params.ini` file from the GUI to use in the CLI for the CPU version.
- Ability to use mismatching grids in input files (e.g. Hartree potential and tip electron density).
The GPU version was written primarily with execution speed in mind, so it uses faster approximations in some parts of the computation. As a result, the two versions may not produce identical results in some cases, but the difference in the final image should not be significant. If you find an example where the difference is larger than you think it should be, please file an issue about it.
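Before filing an issue, it can help to put a number on the discrepancy. The following is a minimal sketch, assuming you have exported the same scan from both versions as 2D arrays of equal shape. The function name `relative_difference` and the sample values are hypothetical and not part of ppafm.

```python
def relative_difference(image_cpu, image_gpu):
    """Largest pixel-wise difference, normalized by the CPU image's value range.

    Both arguments are 2D images given as lists of rows (hypothetical inputs).
    """
    if [len(row) for row in image_cpu] != [len(row) for row in image_gpu]:
        raise ValueError("images must have the same shape")
    flat_cpu = [value for row in image_cpu for value in row]
    flat_gpu = [value for row in image_gpu for value in row]
    max_diff = max(abs(a - b) for a, b in zip(flat_cpu, flat_gpu))
    value_range = max(flat_cpu) - min(flat_cpu)
    # Fall back to the absolute difference for a completely flat image.
    return max_diff / value_range if value_range > 0 else max_diff


# Made-up pixel values standing in for one CPU and one GPU result.
cpu_image = [[0.0, 1.0], [2.0, 4.0]]
gpu_image = [[0.0, 1.0], [2.0, 3.9]]
print(f"relative difference: {relative_difference(cpu_image, gpu_image):.3f}")
```

Normalizing by the value range of the reference image makes the number comparable across scans with different contrast. What counts as too large is a judgment call that depends on the system being simulated.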
The following differences exist in the numerical implementation in the GPU version compared to the CPU version:
- The gradient in the GPU version is computed using finite differences, which scale as O(N), rather than the fast Fourier transform, which scales as O(N log N), used in the CPU version.
- The GPU version resamples the input file grids to match the specified force-field grid using linear interpolation. This enables the use of a coarser and smaller force-field grid regardless of the density and size of the input grid, which can significantly speed up the calculation and, as a side benefit, allows the use of mismatching input grids. However, in some cases the interpolation error can result in visible artifacts in the final images. If this is the case, try adjusting the `pixPerAngstrome` parameter to a higher value.
- The GPU version uses single-precision floating point throughout, whereas the CPU version uses double precision.
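The effect of these choices can be illustrated with a toy 1D example. The sketch below is not ppafm code: it assumes a hypothetical periodic sine potential on a 10 Å cell and measures the error of a central finite-difference gradient and of linear interpolation at cell midpoints for several grid densities. The function names and the `points_per_angstrom` argument are illustrative only. It also shows the rounding introduced by storing a value in single precision.

```python
import math
import struct

LENGTH = 10.0  # hypothetical periodic cell length in angstroms


def potential(x):
    """Toy periodic potential (an assumption for illustration)."""
    return math.sin(2.0 * math.pi * x / LENGTH)


def potential_gradient(x):
    """Analytic derivative of the toy potential."""
    k = 2.0 * math.pi / LENGTH
    return k * math.cos(k * x)


def grid_errors(points_per_angstrom):
    """Return (gradient error, interpolation error) for a given grid density."""
    n = int(round(LENGTH * points_per_angstrom))
    h = LENGTH / n
    values = [potential(i * h) for i in range(n)]
    grad_err = 0.0
    interp_err = 0.0
    for i in range(n):
        # values[i - 1] wraps around for i == 0, matching the periodic cell.
        left, right = values[i - 1], values[(i + 1) % n]
        # Central finite difference: one pass over the grid, O(N) overall.
        grad = (right - left) / (2.0 * h)
        grad_err = max(grad_err, abs(grad - potential_gradient(i * h)))
        # Linear interpolation halfway between two neighboring grid points.
        midpoint = 0.5 * (values[i] + right)
        interp_err = max(interp_err, abs(midpoint - potential((i + 0.5) * h)))
    return grad_err, interp_err


def to_float32(x):
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


if __name__ == "__main__":
    for density in (5, 10, 20):
        grad_err, interp_err = grid_errors(density)
        print(f"{density:>2} points/angstrom: gradient error {grad_err:.2e}, "
              f"interpolation error {interp_err:.2e}")
    print("float32 rounding error of 0.1:", abs(to_float32(0.1) - 0.1))
```

Both the central difference and linear interpolation are second-order accurate, so in this toy setting each error shrinks roughly fourfold when the grid density doubles. This is the same qualitative reason a denser grid can reduce interpolation artifacts, although real force fields are far less smooth than a single sine wave.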