CLIJ was successfully tested on a variety of Intel, Nvidia and AMD GPUs. See the full list of tested systems.
No. Common Intel Core and AMD Ryzen processors contain built-in GPUs which are compatible with CLIJ. However, because dedicated graphics cards come with their own GDDR memory, a dedicated GPU can deliver additional speed-up.
CLIJ was successfully tested on Windows 10, macOS, Fedora Linux and Ubuntu Linux. Current GPU and OpenCL drivers must be installed.
Maybe. As Windows 7 and 8 were discontinued before the first CLIJ release, we didn't test them. Theoretically, both systems support OpenCL and GPU vendors provide drivers for them. Thus, depending on the vendor, it might work without issues. However, we had reports of serious crashes on Windows 7 systems with several different GPUs.
Yes, if the GPU-configuration allows shared usage. There have been reports of failing CLIJ initialization on Windows Server 2019 when multiple people can access it remotely via the "Remote Desktop Session Host". A potential solution is to replace the Java Runtime Environment that is delivered with Fiji.
- Delete the "java" folder inside your Fiji.app directory (or move it out of this directory if you want to keep a copy).
Then there are two options for replacing the Java Runtime Environment.
- Install a current Java version.
- Set the environment variable "JAVA_HOME" (see also ImageJ FAQ) in the Control Panel › System and Security › System › Advanced Settings › Advanced › Environment Variables. System-administrator privileges may be necessary if you want this to work for everyone using the Windows Server.
Thanks to Thomas Zobel for finding this out!
The second option is similar but uses a combination of two JDKs from AdoptOpenJDK.
See the original solution on the forum.
- First install the OpenJDK 16 (HotSpot).
- Then install the OpenJDK 8 and tick the option "set JAVA_HOME" in the installer.
- Check your system environment variables (Control Panel › System and Security › System › Advanced Settings › Advanced › Environment Variables). JAVA_HOME should be like C:\Program Files\AdoptOpenJDK\jdk-8.0.292.10-hotspot
- Remove the entries related to the OpenJDK 16 from the Path variable.
- In the Fiji.app directory, if you don't have an ImageJ.cfg file, start Fiji and run the Edit > Options > Memory and Threads menu, which creates it. You can cancel/close the window. The file should have been created. Close Fiji.
- Edit ImageJ.cfg with a text editor.
- Replace the second line, which was something like bin/jre/javaw.exe, with the equivalent from the OpenJDK 16, which should be something like C:\Program Files\AdoptOpenJDK\jdk-16.0.0.36-hotspot\bin\javaw.exe
  Importantly, don't put quotes " " around this file path. Save the file and close it.
- You can now start Fiji and test CLIJ.
In order to exploit GPU-accelerated image processing, one should:
- Run as many operations as possible in a block without back and forth pulling/pushing image data to/from GPU memory.
- Process images larger than 10 MB (rule of thumb, depends on actual CPU/GPU hardware). Background: Image processing on the CPU can be pretty fast if the accessed memory is smaller than the cache of the CPU. When processing exceeds this cache size, using GPU might become beneficial.
- Process many images of the same size and type subsequently, because in that way compiled GPU-code can be reused.
- Reuse memory. Releasing and allocating memory takes time. Try to reuse memory if possible.
- Use a dedicated graphics card. When choosing a GPU, check the memory bandwidth. Image processing is usually memory-bound: the faster the memory access, the faster images can be processed. The computing power / clock rate of the GPU and the number of compute cores are of secondary interest.
- Some CLIJ filters marked with "Box" in their name are implemented as separable filters (Gaussian blur, minimum, maximum, mean filters). Separable filters are faster than others (e.g. those marked with "Sphere").
- Further speedup can be achieved by combining filters on OpenCL kernel level. This means implementing OpenCL kernels containing whole workflows. This custom OpenCL code can be distributed as custom CLIJ plugin. A plugin template can be found here: https://github.com/clij/clij-plugin-template/
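The remark on separable filters in the list above can be illustrated with a small CPU-side sketch in plain Python. This is not CLIJ or OpenCL code, and all function names are made up for this illustration. It shows that a 2D box mean with clamped borders gives the same result when computed as one pass along X followed by one pass along Y, which reads 2(2r+1) pixels per output pixel instead of (2r+1)^2.

```python
# Plain-Python illustration of a separable box mean filter (not CLIJ code).
# All function names are hypothetical and exist only for this sketch.

def clamp(index, size):
    """Clamp an index into [0, size - 1] (clamp-to-edge border handling)."""
    return max(0, min(index, size - 1))


def box_mean_2d(image, radius):
    """Direct 2D box mean: reads (2r+1)^2 pixels per output pixel."""
    height, width = len(image), len(image[0])
    count = (2 * radius + 1) ** 2
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    total += image[clamp(y + dy, height)][clamp(x + dx, width)]
            result[y][x] = total / count
    return result


def box_mean_along_x(image, radius):
    """1D box mean along each row: reads 2r+1 pixels per output pixel."""
    width = len(image[0])
    count = 2 * radius + 1
    return [
        [sum(row[clamp(x + dx, width)] for dx in range(-radius, radius + 1)) / count
         for x in range(width)]
        for row in image
    ]


def transpose(image):
    """Swap rows and columns so the X pass can be reused for Y."""
    return [list(column) for column in zip(*image)]


def box_mean_separable(image, radius):
    """Same filter as box_mean_2d, computed as an X pass followed by a Y pass."""
    after_x = box_mean_along_x(image, radius)
    after_y = box_mean_along_x(transpose(after_x), radius)
    return transpose(after_y)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
direct = box_mean_2d(image, 1)
separable = box_mean_separable(image, 1)
```

Both results agree up to floating-point rounding. A round ("Sphere") neighborhood cannot be split into two independent 1D passes in this way, which is one reason such filters tend to be slower.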
The simplest way for measuring the speedup of workflows is using time measurements before and after execution, e.g. in ImageJ macro:
time = getTime(); // gives current time in milliseconds
// ...
// my workflow
// ...
print("Processing the workflow took " + (getTime() - time) + " msec");
However, a few hints help to make these measurements reliable:
- Measure the timing of execution in a loop several times. The first execution(s) may be slower than subsequent executions because of so-called warmup effects.
- Exclude file input/output from the time measurements to exclude hard drive read/write speed from the performance benchmarking of your workflow.
- Also measure the similarity of the results of the ImageJ and CLIJ workflows. For example, some CLIJ_*Box filters are potentially much faster than CLIJ_*Sphere filters, which are more similar to ImageJ's filters. In this case, performance is gained at the price of reduced similarity of the workflow results.
ImageJ macros benchmarking CPU/GPU performance can be found here and here.
For more professional benchmarking, we recommend the OpenJDK Java Microbenchmark Harness (JMH). As the name suggests, this involves Java programming. You find more details here.
To give an overview, some of CLIJ's operations have been benchmarked with JMH.
With some limitations, yes. You find details and installation instructions here.
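The first two hints can be sketched as a small timing harness. The sketch below uses plain Python rather than ImageJ macro so that it runs standalone. The names `benchmark` and `example_workflow` are hypothetical, and the workflow is a placeholder for whatever is being measured. Warmup runs are executed but not timed, and any file loading is assumed to happen before `benchmark` is called.

```python
import time


def benchmark(workflow, warmup_runs=2, measured_runs=5):
    """Run a workflow several times and return per-run durations in ms.

    The first `warmup_runs` executions are discarded because they may
    include one-time costs such as compilation or cache filling.
    """
    for _ in range(warmup_runs):
        workflow()

    durations_ms = []
    for _ in range(measured_runs):
        start = time.perf_counter()
        workflow()
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def example_workflow():
    """Hypothetical placeholder; input data should be loaded beforehand."""
    return sum(i * i for i in range(10000))


durations = benchmark(example_workflow)
print("median: %.3f ms" % sorted(durations)[len(durations) // 2])
```

Reporting the median (or the full list) rather than a single run makes outliers from background activity easier to spot.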
If you use CLIJ from ImageJ macro, you cannot execute it in parallel from several threads. If you use CLIJ from any other programming language, please use one CLIJ instance per thread. By using multiple threads in combination with multiple CLIJ instances, you can also execute operations on multiple graphics cards at a time.
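The "one instance per thread" pattern can be sketched in plain Python as follows. `FakeGpuContext` is a hypothetical stand-in for a CLIJ instance bound to one device; it is not part of any real API, and the doubling operation is a placeholder. The point is only that each thread creates and uses its own context, and that work is split between devices.

```python
import threading


class FakeGpuContext:
    """Hypothetical stand-in for one CLIJ instance bound to one device."""

    def __init__(self, device_name):
        self.device_name = device_name
        self.owner = threading.get_ident()  # remember the creating thread

    def process(self, image):
        # Refuse calls from any thread other than the one that created us.
        if threading.get_ident() != self.owner:
            raise RuntimeError("context shared between threads")
        return [2 * pixel for pixel in image]  # placeholder operation


def worker(device_name, images, results, slot):
    context = FakeGpuContext(device_name)  # one instance per thread
    results[slot] = [context.process(image) for image in images]


def run_on_devices(device_names, images):
    """Distribute images round-robin over one thread per device."""
    count = len(device_names)
    results = [None] * count
    threads = [
        threading.Thread(target=worker,
                         args=(name, images[slot::count], results, slot))
        for slot, name in enumerate(device_names)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


print(run_on_devices(["gpu0", "gpu1"], [[1], [2], [3]]))
```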
Yes. When processing images of the same size and type, it is recommended to reuse memory instead of releasing and reallocating it in every iteration. An example macro demonstrating this can be found here.
No. While algorithms on the CPU can make use of double precision, common GPUs only support single precision for floating point numbers. Furthermore, the following priorities were set while developing CLIJ's filters:
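The reuse pattern can be illustrated with a minimal plain-Python sketch. The function names are hypothetical and a Python list stands in for a GPU buffer. The output buffer is allocated once before the loop and overwritten in every iteration instead of being released and reallocated.

```python
def add_scalar_into(source, destination, scalar):
    """Write source + scalar into an already existing destination buffer."""
    for i, value in enumerate(source):
        destination[i] = value + scalar


def process_series(images, scalar):
    """Process equally sized images while reusing one output buffer."""
    buffer = [0] * len(images[0])  # allocated once, outside the loop
    sums = []
    for image in images:
        add_scalar_into(image, buffer, scalar)  # overwrites the old content
        sums.append(sum(buffer))  # consume the result before the next pass
    return sums


print(process_series([[1, 2], [3, 4]], 1))  # [5, 9]
```

This only works when all images share the same size and type, which is the precondition stated above.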
- Mathematical correctness
- Consistency, e.g. results in 2D and 3D should be reasonably similar
- Simplicity of code to ease maintenance
- Performance
- Similarity of results generated with ImageJ
For example, the minimum filter of ImageJ takes different neighborhoods into account when being applied in 2D and 3D. CLIJ's filters are consistent in 2D and 3D. Thus, results may differ between ImageJ and CLIJ as shown in Figure 1.
Figure 1: Comparing CLIJ's mean filter (center) and ImageJ's mean filter (right) in 2D (top) and 3D (bottom). The result can be reproduced by running this example macro with radius = 1.
CLIJ in general uses the strategy "clamp to edge", assuming pixels outside the image have the same pixel value as the closest border pixel of the image. For transforms such as rotation, translation, scaling, and affine transforms, "zero-padding" is applied, assuming pixels outside the image have value 0.
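The difference between the two border strategies can be shown on a single image row. The sketch below is plain Python with hypothetical helper names; it only illustrates how out-of-range reads are resolved and is not how CLIJ implements it.

```python
def read_clamp_to_edge(row, index):
    """Out-of-range reads return the closest border pixel."""
    clamped = max(0, min(index, len(row) - 1))
    return row[clamped]


def read_zero_padded(row, index):
    """Out-of-range reads return 0."""
    if 0 <= index < len(row):
        return row[index]
    return 0


row = [5, 7, 9]
print([read_clamp_to_edge(row, i) for i in range(-2, 5)])  # [5, 5, 5, 7, 9, 9, 9]
print([read_zero_padded(row, i) for i in range(-2, 5)])    # [0, 0, 5, 7, 9, 0, 0]
```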
No. All numeric spatial parameters in CLIJ such as radius and sigma are always entered in pixels. There is no operation in CLIJ which makes use of any physical units.
Pixel coordinates in X, Y and Z are zero-based indices.
Not directly. CLIJ supports two- and three-dimensional images. If the third dimension represents channels or frames, these images can be processed using CLIJ's 3D filters. In order to process 4D or 5D images, it is recommended to split them into 3D blocks. There are functions like pushCurrentZStack and pushCurrentSlice to simplify this.
No. There are no in-place operations implemented in CLIJ. No built-in operation overwrites its input images. However, when implementing your own custom OpenCL-code and wrapping it into CLIJ plugins, in-place operations may be supported depending on used hardware, driver version and supported OpenCL version.
No. The currently active image window in ImageJ plays no role in CLIJ. Input and output images must be specified in macros by name explicitly.
If a specified output image does not exist in GPU memory, it will be generated automatically with a size defined by the executed operation with respect to input image and given parameters.
If a specified output image exists already in GPU memory, it will be overwritten. If the output image has the wrong size, it will not be changed.
CLIJ operations called from ImageJ macro have no return values. They either process pixels and save results to images or they save their results to ImageJ's results table.
Binary output images are filled with pixel values 0 and 1. Any input image can serve as binary image and will be interpreted by differentiating 0 and non-zero values. In order to pull a binary image back to ImageJ in a compatible form, use pullBinary(). This delivers a binary 8-bit image with 0 and 255 as pixel values.
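The value mapping described here is simple enough to write down. The following plain-Python sketch uses a hypothetical function name and is not the actual pullBinary implementation. It shows how any image is interpreted as binary (0 versus non-zero) and converted to the 0/255 convention.

```python
def to_imagej_binary(pixels):
    """Map 0 -> 0 and any non-zero value -> 255 (8-bit binary convention)."""
    return [255 if value != 0 else 0 for value in pixels]


print(to_imagej_binary([0, 1, 1, 0]))    # [0, 255, 255, 0]
print(to_imagej_binary([0, 3, -2, 0.5]))  # [0, 255, 255, 255]
```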
Are there performance benefits expected when calling OpenCL kernels directly via ClearCL instead of CLIJ?
Yes. CLIJ brings OpenCL-kernel caching and the possibility of image/pixel-type-independent OpenCL. These benefits come with a small performance loss. Calling an OpenCL kernel via ClearCL directly may be about a millisecond faster than calling it via CLIJ. Example code demonstrating this is available here.
The CLIJ Java API offers methods for processing ClearCLBuffers and ClearCLImages. What's the difference?
Images and buffers are defined in the OpenCL standard. We tried to make as many operations as possible compatible with both images and buffers. Differences are:
- When applying affine transforms and warping to images, linear interpolation is used. When using buffers, the nearest neighbor pixel delivers the resulting intensity.
- Images are not generally supported by GPU devices running OpenCL 1.1.
- For filters which access the local neighborhood of pixels, using images brings a performance gain.
We recommend using buffers in general for maximum device compatibility.
Yes. As operations executed on the GPU don't make use of user interface elements anyway, CLIJ's operations in general run headless and need no user interaction. Furthermore, CLIJ can be run from the command line and in cloud systems using docker.
CLIJ was the first official release of the project in 2019. CLIJ2 was released in 2020 and CLIJ2.5 in 2021. With the 2021 release, CLIJ became obsolete. If your project still uses CLIJ commands, you should upgrade to CLIJ2. CLIJ2 and CLIJ2.5 are backwards-compatible. Read more about the release cycle. CLIJx is the experimental sibling of CLIJ2 with functions that are still under development. It is in general recommended to avoid using these experimental methods. However, we're developing it in public, so you're welcome to try it, and feedback is very welcome.
The CLIJ2-assistant (a) is an officially supported user interface in Fiji that is installed when activating the clij and clij2 update sites.
The CLIJx-assistant (b) is its experimental sibling with additional functionality that is under development. It can be installed with the clijx-assistant update site in Fiji.
Visually, both can be differentiated by the headline in the right-click menu and through the additional menu entries (arrows):
You can pack your clij-dependent project together with clij and all its dependencies in an uber-jar. This is also how the clicy and clatlab projects work. It's basically just an entry in their pom.xml files.
Yes, there is a template-plugin available for clij2, where you can input your code. Reminder: It's OpenCL, not C ;-) Furthermore, CLIJ brings some convenience functions (actually defines) to make OpenCL easier to use. You find a full list online.
CLIJ projects are best managed, compiled and deployed with Maven. Just import the pom.xml as a project in Eclipse, IntelliJ or another IDE.
Yes, as CLIJ is built on OpenCL and OpenCL also runs on CPUs. One may have to install special drivers to make it work.
Yes, as images are basically managed in variables:
Ext.CLIJ2_push(input);
if (user_input) {
Ext.CLIJ2_gaussianBlur2D(input, output, 1, 1);
} else {
output = input;
}
Ext.CLIJ2_pull(output);
Errors may pop up when processing big images on NVidia cards on Windows (CL_INVALID_COMMAND_QUEUE, CL_INVALID_PROGRAM_EXECUTABLE, CL_MEM_OBJECT_ALLOCATION_FAILURE). The issue is related to a timeout of the operating system interrupting processing on the GPU. Add these keys to the Windows registry and restart the machine (warning: don't do this if you're not sure. Ask your IT department for support. Read the BSD3 license file for details on why we're not responsible for your actions on your computer):
[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
"TdrDelay"=dword:0000003c
"TdrDdiDelay"=dword:0000003c
Here is more information about what TDR is: https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys
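Note that the dword values in the registry snippet above are written in hexadecimal, so 0000003c is 60 in decimal. The Microsoft page linked above describes these delays as being given in seconds. A small plain-Python sketch (hypothetical helper names) converts between the two notations, which can help when choosing a different timeout:

```python
def dword_to_int(entry):
    """Convert a .reg style entry such as 'dword:0000003c' to a decimal int."""
    kind, _, digits = entry.partition(":")
    if kind != "dword":
        raise ValueError("not a dword entry: " + entry)
    return int(digits, 16)


def int_to_dword(value):
    """Convert a decimal int to the 8-digit hexadecimal .reg notation."""
    return "dword:%08x" % value


print(dword_to_int("dword:0000003c"))  # 60
print(int_to_dword(8))                 # dword:00000008
```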
Similar to NVidia drivers (see above), issues may appear when processing large images because a timeout of the operating system interrupts processing on the GPU. The solution is similar as well: add a key to the Windows registry and restart the machine (warning: don't do this if you're not sure. Ask your IT department for support. Read the BSD3 license file for details on why we're not responsible for your actions on your computer). Enter the new key in the Windows registry in this path:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers
The key should be called TdrDelay and have a value of 8.
Sources: https://community.amd.com/thread/180166 https://support.microsoft.com/en-us/help/2665946/display-driver-stopped-responding-and-has-recovered-error-in-windows-7
On an "Intel(R) HD Graphics Kabylake Desktop GT1.5" used from Ubuntu Linux 20.04 it was observed that some operations lead to empty images.
Furthermore, a warning is shown on stderr: Beignet: "unable to find good values for local_work_size[i], please provide local_work_size[] explicitly, you can find good values with trial-and-error method."
This issue can be solved using the device "Intel HD Graphics Gen 9 NEO". Also, please refer to the installation instructions for linux.
When creating CLIJ instances and closing them repeatedly, it crashes after about 40 attempts. This test allows reproducing the issue on specified hardware. Workaround: Don't close the CLIJ instance and keep working with the singleton instance.
On some macOS systems with modern AMD graphics cards, CLIJ causes a crash which leads to the operating system restarting the session and logging out the user. The reason is an energy-saving mode. To solve this problem, turn off "Automatic graphics switching" under "System Preferences" > "Energy Saver". Thanks to Tanner Fadero for reporting this bug and its solution.
If CLIJx_deconvolveRichardsonLucyFFT outputs a java.lang.UnsatisfiedLinkError as shown here, installation of the Visual Studio Redist package might help.
CLIJ doesn't start on Ubuntu Linux, with an error message that a class called ClearCLBackendJOCL cannot be initialized. This may be caused by a missing GPU driver. Please refer to the installation instructions for Linux.
CLIJ throws various exceptions, like CL_OUT_OF_HOST_MEMORY on Linux. Please refer to the installation instructions for linux. Furthermore, when exploring such issues in 2019, on Fedora 27 Linux, this command list helped:
sudo yum install ocl-icd-devel
sudo yum install cmake
sudo yum install llvm
sudo yum install llvm-devel
sudo yum install libdrm libdrm-devel
sudo yum install libXext-devel
sudo yum install libXfixes-devel
sudo yum install clang-devel
git clone https://github.com/intel/beignet.git
cd beignet/
mkdir build
cd build
cmake ../
make
sudo make install
More info can be found on the website of the beignet project.
In case you are trying to create a conda environment and you receive the warning message "WARNING: No ICDs were found.", or you wish to have several ICDs detected, you can also install ocl-icd-system via conda. It will make sure your system-wide ICDs are also visible in your conda environment (tested on Linux Mint 20.1). See here for more information.
conda install -c conda-forge ocl-icd-system
Yes, just delete the file clij2_assistant_autostart.ijm from the folder Fiji.app/plugins/Scripts/Plugins/AutoRun/.
The CLIJ2 assistant exposes CLIJ2 functions only and allows code-export to scripting languages supported by CLIJ2 (Macro, Javascript, Groovy, Jython, Matlab). The CLIJx assistant additionally offers experimental CLIJx functions and export to scripting languages such as QuPath-Groovy for cluPath and clesperanto-Python. While the CLIJ2-assistant gets delivered via the clij and clij2 update sites in Fiji, the installation of the CLIJx-assistant needs multiple update sites installed.
If you experience an error like the following, you may be using the 32-bit version of ImageJ/Fiji. Only the 64-bit version is supported.
Exception in thread "Run$_AWT-EventQueue-0" java.lang.UnsatisfiedLinkError: no SimpleITKJava in java.library.path
at java.lang.ClassLoader.loadLibrary(Unknown Source)
at java.lang.Runtime.loadLibrary0(Unknown Source)
at java.lang.System.loadLibrary(Unknown Source)
at org.itk.simple.SimpleITKJNI.<clinit>(SimpleITKJNI.java:257)
at org.itk.simple.PixelIDValueEnum.<clinit>(PixelIDValueEnum.java:12)
...