GitHub - LaszloHars/CUDA.jl: Julia Programming interface for CUDA

CUDA.jl

Julia Programming interface for CUDA.

This package wraps key functions of the CUDA Driver API for Julia. It remains a work in progress, but it is ready for simple use.

Note: This package was tested on Ubuntu (13.04 or above) and Mac OS X (10.8+). It has not been tested on Windows.

Setup

Install the CUDA driver, and make sure libcuda is in your library loading path.

Note: libcuda is the shared library for the CUDA driver.

Install this package in Julia:
```
Pkg.add("CUDA")
```
Test whether it works by running the example script in examples/ex1.jl.
Enjoy.

Example

The following example shows how to use this package to add two matrices on the GPU. The kernel treats each matrix as a flat array of elements.

Write CUDA Kernel

First you have to write the computation kernel in CUDA C and save it in a .cu file. Here is a kernel for addition:

```
// filename: vadd.cu
// a simple CUDA kernel to add two vectors

extern "C"   // ensure function name to be exactly "vadd"
{
	__global__ void vadd(const float *a, const float *b, float *c)
	{
		// flat index of this thread across all blocks
		int i = threadIdx.x + blockIdx.x * blockDim.x;
		// no bounds check: launch exactly one thread per element
		c[i] = a[i] + b[i];
	}
}
```
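
The index expression in the kernel can be checked on the CPU. The following is a minimal sketch in plain C++ (not CUDA) that emulates the arithmetic of a one-dimensional launch. The names `global_index` and `emulate_vadd` are hypothetical helpers introduced here for illustration; they are not part of this package or of CUDA.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the flat index a thread computes in vadd,
// i.e. threadIdx.x + blockIdx.x * blockDim.x.
std::size_t global_index(std::size_t thread_idx, std::size_t block_idx,
                         std::size_t block_dim)
{
    return thread_idx + block_idx * block_dim;
}

// Hypothetical CPU emulation of a 1-D launch: visits every (block, thread)
// pair sequentially and applies the kernel body once per pair.
// Assumes grid_size * block_size == a.size() == b.size(), because the
// vadd kernel has no bounds check.
std::vector<float> emulate_vadd(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t grid_size, std::size_t block_size)
{
    std::vector<float> c(a.size());
    for (std::size_t block = 0; block < grid_size; ++block) {
        for (std::size_t thread = 0; thread < block_size; ++thread) {
            std::size_t i = global_index(thread, block, block_size);
            c[i] = a[i] + b[i];
        }
    }
    return c;
}
```

With a grid size of 12 and a block size of 1, each block holds a single thread, so the index reduces to the block number and the 12 elements are each visited exactly once.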

You can compile the kernel to a PTX file using nvcc:

```
nvcc -ptx vadd.cu
```

This command generates a PTX file named vadd.ptx.

Run the Kernel in Julia

The following script demonstrates how one can load the kernel and run it in Julia.

```
using CUDA

# select a CUDA device
dev = CuDevice(0)

# create a context (like a process in CPU) on the selected device
ctx = create_context(dev)

# load the PTX module (each module can contain multiple kernel functions)
md = CuModule("vadd.ptx")

# retrieve the kernel function "vadd" from the module
vadd = CuFunction(md, "vadd")

# generate random arrays and load them to GPU
a = round(rand(Float32, (3, 4)) * 100)
b = round(rand(Float32, (3, 4)) * 100)
ga = CuArray(a)
gb = CuArray(b)

# create an array on GPU to store results
gc = CuArray(Float32, (3, 4))

# run the kernel vadd
# syntax: launch(kernel, grid_size, block_size, arguments)
# here, grid_size and block_size can be an integer or a tuple of integers
# 12 blocks of 1 thread each: one thread per element of the 3x4 arrays
launch(vadd, 12, 1, (ga, gb, gc))

# download the results from GPU
c = to_host(gc)   # c is a Julia array on CPU (host)

# release GPU memory
free(ga)
free(gb)
free(gc)

# print the results
println("Results:")
println("a = \n$a")
println("b = \n$b")
println("c = \n$c")

# finalize: unload module and destroy context
unload(md)
destroy(ctx)
```
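
The script launches 12 blocks of 1 thread, which matches the 12 elements exactly. For other sizes, the grid size is usually derived from the element count and a chosen block size. The following is a minimal sketch of that arithmetic in plain C++; `blocks_for` and `surplus_threads` are hypothetical helper names used only for illustration and are not part of this package.

```cpp
#include <cstddef>

// Hypothetical helper: smallest number of blocks whose threads cover
// n elements (ceiling division). Assumes block_size > 0.
std::size_t blocks_for(std::size_t n, std::size_t block_size)
{
    return (n + block_size - 1) / block_size;
}

// Hypothetical helper: number of threads in the launch that have no
// element to work on because n is not a multiple of block_size.
std::size_t surplus_threads(std::size_t n, std::size_t block_size)
{
    return blocks_for(n, block_size) * block_size - n;
}
```

For 12 elements and a block size of 1 this gives 12 blocks, as in the script. With a block size of 8 it gives 2 blocks and 4 surplus threads. The vadd kernel has no bounds check, so a configuration with surplus threads would read and write past the end of the arrays; a kernel used that way would need an element count argument and a guard such as `if (i < n)`.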

This is a relatively low-level API, designed for people who have some understanding of CUDA programming and want to write or migrate CUDA code in Julia. Compared to CUDA C, the interface has been greatly simplified.
