CUDA.jl

Julia programming interface for CUDA.

This package wraps key functions of the CUDA Driver API for Julia. It remains a work in progress, but it is ready for simple use.

Note: This package was tested on Ubuntu (13.04 or above) and Mac OS X (10.8+). It has not been tested on Windows.

Setup

  1. Install the CUDA driver, and make sure libcuda is on your library loading path.

    Note: libcuda is the shared library for the CUDA driver.

  2. Install this package in Julia:

    Pkg.add("CUDA")

  3. Test whether it works by running the example script in examples/ex1.jl.

  4. Enjoy.

Example

The following example shows how to use this package to add two matrices on the GPU.

Write CUDA Kernel

First you have to write the computation kernel in CUDA C and save it in a .cu file. Here is a kernel for addition:

// filename: vadd.cu
// a simple CUDA kernel to add two vectors

extern "C"   // ensure function name to be exactly "vadd"
{
	__global__ void vadd(const float *a, const float *b, float *c)
	{
		int i = threadIdx.x + blockIdx.x * blockDim.x;
		c[i] = a[i] + b[i];
	}
}

Compile the kernel to a PTX file using nvcc:

nvcc -ptx vadd.cu

This command generates a PTX file named vadd.ptx.
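
The vadd kernel above has no bounds check, so it is only safe when the launch creates exactly one thread per array element. As a minimal sketch, a guarded variant could take the element count as an extra argument. The file name vadd_n.cu, the kernel name vadd_n, and the parameter n are hypothetical and are not part of this package.

```cuda
// filename: vadd_n.cu (hypothetical name, not part of this package)
// bounds-checked variant of vadd: safe when more threads are launched than elements

extern "C"   // ensure function name to be exactly "vadd_n"
{
	__global__ void vadd_n(const float *a, const float *b, float *c, int n)
	{
		int i = threadIdx.x + blockIdx.x * blockDim.x;
		if (i < n)   // threads past the end of the arrays do nothing
		{
			c[i] = a[i] + b[i];
		}
	}
}
```

With this guard, the total thread count can be rounded up to a whole number of blocks without writing out of bounds. Launching it from Julia would also require passing n as a kernel argument. This README does not show how scalar arguments are passed, so check the package source before relying on it.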

Run the Kernel in Julia

The following script demonstrates how one can load the kernel and run it in Julia.

using CUDA

# select a CUDA device
dev = CuDevice(0)

# create a context (like a process on the CPU) on the selected device
ctx = create_context(dev)

# load the PTX module (each module can contain multiple kernel functions)
md = CuModule("vadd.ptx")

# retrieve the kernel function "vadd" from the module
vadd = CuFunction(md, "vadd")

# generate random arrays and load them to GPU
a = round(rand(Float32, (3, 4)) * 100)
b = round(rand(Float32, (3, 4)) * 100)
ga = CuArray(a)
gb = CuArray(b)

# create an array on GPU to store results
gc = CuArray(Float32, (3, 4))

# run the kernel vadd
# syntax: launch(kernel, grid_size, block_size, arguments)
# here, grid_size and block_size can be an integer or a tuple of integers
launch(vadd, 12, 1, (ga, gb, gc))

# download the results from GPU
c = to_host(gc)   # c is a Julia array on CPU (host)

# release GPU memory
free(ga)
free(gb)
free(gc)

# print the results
println("Results:")
println("a = \n$a")
println("b = \n$b")
println("c = \n$c")

# finalize: unload module and destroy context
unload(md)
destroy(ctx)
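
The script above treats each 3x4 matrix as a flat array of 12 elements, which works because element-wise addition does not depend on the matrix shape. When the row and column positions matter, a kernel can index in two dimensions and be paired with tuple grid and block sizes. The sketch below makes two assumptions: the kernel name madd and its rows and cols parameters are hypothetical, and the device arrays keep Julia's column-major layout.

```cuda
// filename: madd.cu (hypothetical name, not part of this package)
// 2D-indexed matrix addition, assuming column-major storage as in Julia

extern "C"   // ensure function name to be exactly "madd"
{
	__global__ void madd(const float *a, const float *b, float *c, int rows, int cols)
	{
		int row = threadIdx.x + blockIdx.x * blockDim.x;
		int col = threadIdx.y + blockIdx.y * blockDim.y;
		if (row < rows && col < cols)   // skip threads outside the matrix
		{
			int i = row + col * rows;   // column-major offset of (row, col)
			c[i] = a[i] + b[i];
		}
	}
}
```

Mapping the x dimension to rows makes neighbouring threads touch neighbouring memory addresses in a column-major array, which generally helps memory coalescing.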

This is a relatively low-level API, designed for people with some understanding of CUDA programming who want to write or migrate CUDA code in Julia. Compared to CUDA C, the interface has been greatly simplified.
