v1.0.7
⚠️ Breaking Changes
- [cd68708] Updated wrapMemory to take in an occa::device and occa::properties
Before
occa::cpu::wrapMemory(void* ptr, const udim_t bytes)
After
occa::cpu::wrapMemory(occa::device device, void* ptr, const udim_t bytes, occa::properties props)
- [959ec4a] Renamed occaSetDeviceFromInfos to fit the rest of the methods
Before
occaSetDeviceFromInfos(const char *info)
After
occaSetDeviceFromString(const char *info)
- [7735c66] Removed some redundant stream methods
Before
occa::device::freeStream(occa::stream) // C++
occaDeviceFreeStream(occaStream) // C
After (Not new)
occa::stream::free() // C++
occaFree(occaStream) // C
- [f81054d] Removed occa::opencl::event() and moved it to occa::opencl::streamTag::clEvent
- [f81054d] Removed occa::cuda::event() and moved it to occa::cuda::streamTag::cuEvent
- [f81054d] Removed occa::streamTag::tagTime. Tags can only be used for:
  - Waiting for queued tasks to finish (e.g. launched kernels or memory copies)
  - Measuring the time gap between 2 tags
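The restriction on tags can be pictured with a small sketch. It is not OCCA code: `Tag`, `Tag::now`, and `Tag::secondsBetween` are hypothetical names, and a host-side `std::chrono::steady_clock` stands in for whatever the backend actually records. The point is only that a tag exposes no absolute time, so the one timing question it can answer is the gap between two tags.

```cpp
#include <chrono>

// Hypothetical stand-in for a stream tag; NOT the occa::streamTag class.
class Tag {
 public:
  // Record a tag at the current point in time.
  static Tag now() { return Tag(std::chrono::steady_clock::now()); }

  // The only timing query: seconds elapsed between two tags.
  // There is deliberately no accessor for an absolute time.
  static double secondsBetween(const Tag &start, const Tag &end) {
    return std::chrono::duration<double>(end.point_ - start.point_).count();
  }

 private:
  explicit Tag(std::chrono::steady_clock::time_point point) : point_(point) {}
  std::chrono::steady_clock::time_point point_;
};
```

Code that previously read a per-tag time would, under this model, record one tag before the work and one after, then ask for the difference.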
⭐️ Features
- [daf0300] Faster make build and added make info @v-dobrev
- [1024a62] Switched garbage collection strategy to NULL out existing device/kernel/memory objects when one is freed. This switches SEGFAULT issues to occa::exception errors that can be more easily debugged.
- [527494c] Linalg methods reuse device buffers for reductions
- [ce46013] Loading cached kernels is sped up by avoiding locks when possible
- [e27b29e] Added occaJson
- [fdd2d7c] Added occaCreateDeviceFromString
- [fdd2d7c] Added CLI to C exampleOpenCL mode
- [959ec4a] Added UVA methods to C API
- [7735c66] The occa::stream class can now be extended
- [f81054d] The occa::streamTag class can now be extended
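The garbage collection change in [1024a62] can be illustrated with a minimal sketch. It assumes a design in which every copy of a handle shares one slot that is cleared on free. All names here (`Buffer`, `Handle`, `isInitialized`) are hypothetical, and `std::runtime_error` stands in for occa::exception; this is not how OCCA implements it internally.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical underlying object that several handles refer to.
struct Buffer {
  explicit Buffer(std::size_t bytes) : data(bytes) {}
  std::vector<char> data;
};

// Hypothetical handle; copies share one slot that owns the Buffer.
class Handle {
 public:
  explicit Handle(std::size_t bytes)
      : slot_(std::make_shared<std::unique_ptr<Buffer>>(
            std::make_unique<Buffer>(bytes))) {}

  // Freeing through any copy destroys the Buffer and leaves the shared
  // slot null, so every other copy observes the object as freed.
  void free() { slot_->reset(); }

  bool isInitialized() const { return static_cast<bool>(*slot_); }

  // Access checks the slot first: use-after-free becomes an exception
  // instead of a dereference of a dangling pointer.
  std::size_t size() const {
    if (!*slot_) {
      throw std::runtime_error("Handle used after free()");
    }
    return (*slot_)->data.size();
  }

 private:
  std::shared_ptr<std::unique_ptr<Buffer>> slot_;
};
```

With this layout, calling `free()` on one copy and then `size()` on another throws rather than crashing, which is the debugging benefit the entry describes.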
🐛 Bugs Fixed
- [99ce6fb] Linalg properly deletes array allocations @jdahm
- [b7384bc] Kernel hashes are generated only from needed props (e.g. ignores verbose)
- [780a06a] OpenCL __global, __local, and __kernel are properly inserted in the beginning
- [dba0db9] memory::slice was improperly freeing UVA pointers
- [3260a05] The verbose property was being overwritten in CUDA mode
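The idea behind the kernel hash fix in [b7384bc] can be sketched as follows. This is an illustration only: `hashInput`, `kernelHash`, the ignore list, and the property names are hypothetical, and the properties are modeled as a flat string map rather than occa::properties. Properties that do not affect the compiled output are skipped when building the hash input, so toggling them does not produce a different cache entry.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical helper: build the canonical string that gets hashed.
// Properties in the ignore list do not contribute to it.
std::string hashInput(const std::string &source,
                      const std::map<std::string, std::string> &props) {
  static const std::set<std::string> ignored = {"verbose"};
  std::string key = source;
  // std::map iterates in sorted key order, so the result does not
  // depend on the order in which properties were set.
  for (const auto &kv : props) {
    if (ignored.count(kv.first)) continue;
    key += '\n' + kv.first + '=' + kv.second;
  }
  return key;
}

// Hypothetical helper: hash of the kernel source plus needed props.
std::size_t kernelHash(const std::string &source,
                       const std::map<std::string, std::string> &props) {
  return std::hash<std::string>{}(hashInput(source, props));
}
```

Two property sets that differ only in verbose map to the same hash here, while a change to any other property changes the hash input.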