v1.1.0
🔥 Exciting News
Fortran API
I'm super excited to announce the Fortran API! This was single-handedly designed and built by @awehrfritz, so huge thanks!! The API is not finalized, but it is unlikely to change much since its design matches our other language APIs.
For more information, the initial PR can be found here: #341
Collaboration
For much of OCCA's development, most of the work was done by a very small group of people. Over the last few years, the project has grown from a research project into one used by a few organizations.
During this release, we added CMake support. While it doesn't directly add any development features, it opens the OCCA library up to a wider audience, which some might say is even more impactful than adding features. What makes this even more exciting is how many unrelated collaborators took part in this work!
Lots of PRs that made this happen: #310, #313, #319, #323, #329, #344, #345, #357
Many thanks to
⚠️ Breaking Changes
- [de598e6] OCCA now compiles with C++11. With most compilers, C++ projects will need to add the -std=c++11 flag when compiling.
- [f4fea62] Renamed occa::hash_t methods
    hash_t::toString() → hash_t::getString()
    hash_t::toFullString() → hash_t::getFullString()
- [#322] Updates occaFree to take its argument by reference rather than by value
    occaFree(value)
    ↓
    occaFree(&value)
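To illustrate the occaFree change, here is a minimal standalone sketch of why a free function might take its argument by reference. It does not use OCCA: ExampleHandle, exampleFreeByValue, and exampleFreeByReference are hypothetical names, and the motivation shown (clearing the caller's handle so it cannot be reused by accident) is an assumption, not something stated in #322.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for an opaque C handle; this is NOT OCCA's own type.
struct ExampleHandle {
  void *ptr;
};

// By-value free: the resource is released, but only the local copy is
// reset, so the caller's handle still holds a stale pointer afterwards.
inline void exampleFreeByValue(ExampleHandle h) {
  std::free(h.ptr);
  h.ptr = nullptr;  // affects the copy only
}

// By-reference free: the callee can reset the caller's handle, which
// makes an accidental second free a harmless no-op.
inline void exampleFreeByReference(ExampleHandle *h) {
  assert(h != nullptr);
  std::free(h->ptr);  // std::free(nullptr) does nothing
  h->ptr = nullptr;
}
```

With the by-reference form, calling exampleFreeByReference(&h) leaves h.ptr null, so calling it a second time on the same handle is safe.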
🃏 Experimental
- [#341] The Fortran API
- [08b3a68] Adds the OCCA_JIT and OCCA_JIT_WITH_SCOPE macros. Examples are available for C++ and C. For example:
    OCCA_JIT(
      (entries, a, b, ab),
      (
        for (int i = 0; i < entries; ++i; @tile(16, @outer, @inner)) {
          ab[i] = 100 * (a[i] + b[i]);
        }
      )
    );
- [0a77696] Adds okl-mode.el for editing OKL kernels in Emacs 🎉
⭐️ Features
- [f813c34] Adds templated malloc for easier use while keeping backwards compatibility
  Original malloc
    occa::memory mem = occa::malloc(10 * sizeof(float), src);
  ↓
  Initial dtype malloc
    occa::memory mem = occa::malloc(10, occa::dtype::float_, src);
  ↓
  New malloc
    occa::memory mem = occa::malloc<float>(10, src);
- [92ffb58] Adds templated umalloc for easier use while keeping backwards compatibility
    float *a = (float*) occa::umalloc(10, occa::dtype::float_, src);
    void *b = occa::umalloc(10 * sizeof(float), src);
  ↓
    float *a = occa::umalloc<float>(10, src);
    void *b = occa::umalloc(10 * sizeof(float), src);
- [c61d636] Adds templated ptr for easier use. The return type defaults to void* for backwards compatibility.
    occa::memory mem = occa::malloc(10, occa::dtype::float_, src);
    float *ptr = (float*) mem.ptr();
  ↓
    occa::memory mem = occa::malloc(10, occa::dtype::float_, src);
    float *ptr = mem.ptr<float>();
- [c61d636] Adds use_host_pointer to memory props to auto-wrap source pointers during malloc calls
    float *hostPtr = new float[10];
    occa::memory mem = occa::malloc<float>(10, hostPtr, "use_host_pointer: true");
    mem.ptr<float>() == hostPtr;
- Adds polyfills to test compilation of locally unsupported modes
- [284aff8] Adds methods to get the kernel hash
  C++: occa::kernel::hash(), which returns an occa::hash_t object
  C: occaKernelGetHash and occaKernelGetFullHash, which return the hash as a const char*
- [f2f21a3] Adds Metal backend for GPGPU on MacOS
  - Requires MacOS 10.14 (Mojave) or later
  - Requires Xcode 10.2.1 or later
  - Metal does not support double or long types
  - Issues with global typedef due to missing address space qualifiers
- [386bc4c] Adds occa translate --launcher to get the host code needed to launch the device kernels (CUDA, HIP, OpenCL, Metal modes)
- [#246] Adds the @directive preprocessor attribute to add directives inside macros, such as OCCA_JIT
    @directive("#pragma ivdep")
    ↓
    #pragma ivdep
- [#265] Adds OCCA_CONFIG config file to set defaults. There is a config.defaults.json file that explains the properties that can be set, including mode-specific properties.
- [#266] Allows HIP to compile CUDA kernels (Thanks @noelchalmers!)
- [#270] Adds occa::null for passing a NULL equivalent to occa::kernels (occaNull in C)
- [#284] Adds OCCA_LDFLAGS along with kernel/compiler_linker_flags (Thanks @stgeke!)
- [#308] Adds OCCA_SHARED_FLAGS along with kernel/compiler_shared_flags
- [#308] Adds support for building native C kernels by disabling OKL (okl/enabled set to false) and setting kernel/compiler_language to C (defaults to C++) (Thanks @amikstcyr!)
- [#346] Supports #include of standard C and C++ headers in OKL kernels. Note that this prints warnings, since including these headers is not portable across supported backends.
- [#347] Adds some standard defines to OKL kernels so users can check whether the source is being processed as an OKL kernel. This is useful when reusing source code for OCCA kernels and non-OCCA kernels.
- [#349] Keeps some comments around after applying OKL transformations, for cleaner generated kernels.
- [#354] Adds the OKL_KERNEL_HASH define to help debug which kernel is currently being run (Tip: printf and std::cout are available in Serial and OpenMP modes!)
- [#349][#355][#358][#364] Keeps comments around when transpiling kernels
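The templated malloc, umalloc, and ptr entries above all follow the same pattern: a typed overload is added next to the existing byte-based call, so old call sites keep compiling. The following is a minimal standalone sketch of that pattern, not OCCA's implementation. ExampleMemory and exampleMalloc are hypothetical names, and plain host memory stands in for device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a memory object; this is NOT occa::memory.
class ExampleMemory {
 public:
  ExampleMemory(std::size_t bytes, const void *src)
      : data_(std::malloc(bytes)), bytes_(bytes) {
    if (src != nullptr && data_ != nullptr) {
      std::memcpy(data_, src, bytes);  // copy the initial contents
    }
  }
  ~ExampleMemory() { std::free(data_); }

  ExampleMemory(const ExampleMemory &) = delete;
  ExampleMemory &operator=(const ExampleMemory &) = delete;
  ExampleMemory(ExampleMemory &&other) noexcept
      : data_(other.data_), bytes_(other.bytes_) {
    other.data_ = nullptr;
    other.bytes_ = 0;
  }

  // T defaults to void, so untyped ptr() calls still return void*.
  template <class T = void>
  T *ptr() {
    return static_cast<T *>(data_);
  }

  std::size_t size() const { return bytes_; }

 private:
  void *data_;
  std::size_t bytes_;
};

// Original-style allocation: the caller passes a size in bytes.
inline ExampleMemory exampleMalloc(std::size_t bytes,
                                   const void *src = nullptr) {
  return ExampleMemory(bytes, src);
}

// Typed overload: the caller passes an entry count and the element type.
template <class T>
ExampleMemory exampleMalloc(std::size_t entries, const void *src = nullptr) {
  return ExampleMemory(entries * sizeof(T), src);
}
```

In this sketch, exampleMalloc<float>(10, src) and exampleMalloc(10 * sizeof(float), src) allocate the same number of bytes, and mem.ptr<float>() replaces the (float*) mem.ptr() cast without breaking the untyped form.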
🐛 Bugs Fixed
- [ebdb659] Updates to HIP backend (Thanks @noelchalmers!)
- [ac117fb] Fixed caching bugs (Thanks Nigel Nunn!)
- [5420005] Use .dylib instead of .so on MacOS (Thanks @thilinarmtb!)
- [ce4df26] Properly copy over artifacts when building with PREFIX (Thanks @thilinarmtb!)
- [#243] Properly avoid overriding and duplicating compiler shared flags (Thanks @noelchalmers!)
- [f23ce88] Avoids writing lockfile when checking compiler vendor
- [3df3955] Properly fixed untyped umalloc in C
- [4d5d5bc] Kernels from strings were badly generating the launcher kernel
- [27a7420] OpenCL translation was converting the const qualifier of const pointer typedefs to __constant
- [#261] Invalid read in json -> properties unsafe cast (Thanks for pointing it out @stgeke!)
- [#265] Fixes object/mode specific properties not propagating
- [86dead2] OpenCL timing was done backwards, resulting in negative times. (Thanks @tcew!)
- [#293] Fixed some reference counting issues with the kernelBuilder
- [#400] CUDA context was not being set in a few places (Thanks @amikstcyr!)