cuGraph - GPU Graph Analytics

The RAPIDS cuGraph library is a collection of graph analytics algorithms that process data stored in GPU DataFrames (see cuDF). cuGraph aims to provide a NetworkX-like API that is familiar to data scientists, so they can build GPU-accelerated workflows more easily.

For more project details, see rapids.ai.

NOTE: For the latest stable README.md, ensure you are on the latest branch.

import cudf
import cugraph

# Assuming the edge list is stored in a CSV file, load it into a cuDF DataFrame
gdf = cudf.read_csv("graph_data.csv", names=["src", "dst"], dtype=["int32", "int32"])

# Create a Graph from the source (src) and destination (dst) vertex pairs in the GDF
G = cugraph.Graph()
G.add_edge_list(gdf, source='src', destination='dst')

# Call cugraph.pagerank to get the pagerank scores
gdf_page = cugraph.pagerank(G)

for i in range(len(gdf_page)):
	print("vertex " + str(gdf_page['vertex'][i]) + 
		" PageRank is " + str(gdf_page['pagerank'][i]))

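To make the meaning of the scores concrete, here is a minimal CPU-only sketch of PageRank by power iteration. It is not cuGraph's implementation and does not use the cuGraph API; the function name `pagerank_sketch`, the damping factor `alpha=0.85`, the tolerance, and the tiny edge list are all assumptions chosen for illustration.

```python
def pagerank_sketch(edges, num_vertices, alpha=0.85, max_iter=100, tol=1e-6):
    """Power-iteration PageRank on a directed edge list (illustration only).

    Vertex IDs are assumed to be contiguous integers starting from 0.
    """
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    # Start from a uniform distribution over all vertices.
    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(max_iter):
        # Rank held by vertices without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in range(num_vertices) if out_degree[v] == 0)
        base = (1.0 - alpha) / num_vertices + alpha * dangling / num_vertices
        new_rank = [base] * num_vertices
        # Each vertex passes its rank evenly along its out-edges.
        for src, dst in edges:
            new_rank[dst] += alpha * rank[src] / out_degree[src]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical three-vertex graph, given as (src, dst) pairs.
edges = [(0, 1), (1, 2), (2, 0), (2, 1)]
scores = pagerank_sketch(edges, num_vertices=3)
for vertex, score in enumerate(scores):
    print("vertex " + str(vertex) + " PageRank is " + str(round(score, 4)))
```

The scores sum to 1, and a vertex that receives more rank from its in-neighbors ends up with a higher score.
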
Supported Algorithms:

Algorithm	Scale	Notes
PageRank	Multi-GPU
Personal PageRank	Single-GPU
Katz Centrality	Single-GPU
Jaccard Similarity	Single-GPU
Weighted Jaccard	Single-GPU
Overlap Similarity	Single-GPU
SSSP	Single-GPU	Updated to provide path info
BFS	Single-GPU	Also BSP version
Triangle Counting	Single-GPU
K-Core	Single-GPU
Core Number	Single-GPU
Subgraph Extraction	Single-GPU
Spectral Clustering - Balanced-Cut	Single-GPU
Spectral Clustering - Modularity Maximization	Single-GPU
Louvain	Single-GPU
Ensemble Clustering for Graphs (ECG)	Single-GPU
Renumbering	Single-GPU
Basic Graph Statistics	Single-GPU
Weakly Connected Components	Single-GPU
Strongly Connected Components	Single-GPU

cuGraph Notice

The current version of cuGraph has some limitations:

Vertex IDs need to be 32-bit integers.
Vertex IDs are expected to be contiguous integers starting from 0.

cuGraph provides the renumber function to mitigate this problem. Input vertex IDs for the renumber function can be either 32-bit or 64-bit integers, can be non-contiguous, and can start from an arbitrary number. The renumber function maps the provided input vertex IDs to 32-bit contiguous integers starting from 0. cuGraph still requires the renumbered vertex IDs to be representable in 32-bit integers. These limitations are being addressed and will be fixed soon.
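
The mapping described above can be sketched in plain Python. This is a conceptual illustration under stated assumptions, not the cuGraph renumber function: the name `renumber_sketch`, its return values, and the choice to number vertices in sorted order are hypothetical, and the real function may assign IDs differently.

```python
def renumber_sketch(src, dst):
    """Map arbitrary integer vertex IDs to contiguous IDs starting from 0.

    Returns the renumbered source and destination lists plus a list that
    maps each new ID (its index) back to the original ID.
    """
    # Every distinct original ID, in sorted order; the index is the new ID.
    external_ids = sorted(set(src) | set(dst))

    # The largest new ID is len(external_ids) - 1 and must fit in int32.
    if len(external_ids) > 2**31:
        raise ValueError("too many vertices to renumber into 32-bit IDs")

    internal_id = {ext: i for i, ext in enumerate(external_ids)}
    new_src = [internal_id[v] for v in src]
    new_dst = [internal_id[v] for v in dst]
    return new_src, new_dst, external_ids


# Hypothetical edge list with 64-bit, non-contiguous vertex IDs.
src = [10_000_000_000, 42, 42]
dst = [42, 7, 10_000_000_000]
new_src, new_dst, numbering = renumber_sketch(src, dst)

# Translate a renumbered vertex back to its original ID.
original_first_source = numbering[new_src[0]]
```

Keeping the `numbering` list around is what lets results computed on the renumbered graph be reported in terms of the original vertex IDs.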

Release 0.11 includes a new 'Graph' class that could cause errors in existing code. Please see the Transition Guide.

Getting cuGraph

Intro

There are three ways to get cuGraph:

Quick start with Docker Demo Repo
Conda Installation
Build from Source

Quick Start

Please see the Demo Docker Repository, choosing a tag based on the NVIDIA CUDA version you're running. It provides a ready-to-run Docker container with example notebooks and data, showcasing how you can use all of the RAPIDS libraries: cuDF, cuML, and cuGraph.

Conda

It is easy to install cuGraph using conda. You can get a minimal conda installation with Miniconda or get the full installation with Anaconda.

Install and update cuGraph using the conda command:

# CUDA 10.0
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=10.0

# CUDA 10.1
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=10.1

# CUDA 10.2
conda install -c nvidia -c rapidsai -c numba -c conda-forge -c defaults cugraph cudatoolkit=10.2

Note: This conda installation only applies to Linux and Python versions 3.6/3.7.
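
After installing, one way to confirm that the packages are visible to the active Python environment is a short check like the one below. It is a sketch that only tests whether the modules can be located, using the standard library; the helper name `is_installed` is hypothetical, and the check does not verify that a GPU or a matching CUDA toolkit is present.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# cuGraph operates on cuDF DataFrames, so look for both packages.
for name in ("cudf", "cugraph"):
    print(name, "found" if is_installed(name) else "not found")
```

If either package is reported as not found, check that the conda environment used for the install is the one currently activated.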

Build from Source and Contributing

Please see our guide for building and contributing to cuGraph.

Documentation

Python API documentation can be generated from the docs directory.

Open GPU Data Science

The RAPIDS suite of open source software libraries aims to enable execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.

Apache Arrow on GPU

The GPU version of Apache Arrow is a common API that enables efficient interchange of tabular data between processes running on the GPU. End-to-end computation on the GPU avoids unnecessary copying and converting of data off the GPU, reducing compute time and cost for high-performance analytics common in artificial intelligence workloads. As the name implies, cuDF uses the Apache Arrow columnar data format on the GPU. Currently, a subset of the features in Apache Arrow is supported.
