NGramGraph implementation using OpenCL to speed up workload-heavy graph operations

panosfoto/NGramGraphParallel

NGramGraphParallel

A generic implementation of n-gram graphs, designed to handle not only text (or text strings in general) but any custom data format, without changes to the main structure of the code.
Support for OpenCL is planned, to speed up workload-heavy graph operations through parallelization.

Features

  • Flexibility: The basic n-gram graph representation (see George Giannakopoulos' thesis, Chapter 3, for additional information) is extended and defined generically, which results in data-agnostic and reusable code. Specifically, the "text" and "n-gram" are replaced by "payload" and "atom"; in other words, an entity and the smallest pieces that entity can be split into. This enables custom representations of other data types (e.g. DNA), which can lead to a clearer representation and possibly better performance than the equivalent string-based one.
  • Scalability: Through parallelization, workload-heavy graph operations can be sped up significantly. Real-life scenarios may involve large amounts of data, and parallelizing specific operations (e.g. graph comparisons) can bring execution time for big datasets down to reasonable levels. Parallelization is not implemented yet.
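
To illustrate the payload/atom idea, here is a minimal sketch that uses only the C++ standard library. The names make_ngrams, build_graph and EdgeWeights are hypothetical and are not the project's actual API. It also assumes a simplified neighborhood rule: each n-gram is linked to the n-grams that follow it within a fixed window, and the edge weight counts co-occurrences.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical sketch: these names are illustrative, not the project's API.

// Split a payload (any sequence of atoms) into overlapping n-grams of length n.
template <typename Atom>
std::vector<std::vector<Atom>> make_ngrams(const std::vector<Atom>& payload,
                                           std::size_t n) {
    std::vector<std::vector<Atom>> ngrams;
    if (n == 0 || payload.size() < n) return ngrams;
    for (std::size_t i = 0; i + n <= payload.size(); ++i) {
        ngrams.emplace_back(payload.begin() + i, payload.begin() + i + n);
    }
    return ngrams;
}

// Weighted edges: (earlier n-gram, later n-gram) -> co-occurrence count.
template <typename Atom>
using EdgeWeights =
    std::map<std::pair<std::vector<Atom>, std::vector<Atom>>, unsigned>;

// Assumed neighborhood rule: connect each n-gram to the next `window` n-grams.
template <typename Atom>
EdgeWeights<Atom> build_graph(const std::vector<Atom>& payload,
                              std::size_t n, std::size_t window) {
    EdgeWeights<Atom> edges;
    const auto ngrams = make_ngrams(payload, n);
    for (std::size_t i = 0; i < ngrams.size(); ++i) {
        for (std::size_t j = i + 1; j < ngrams.size() && j <= i + window; ++j) {
            ++edges[std::make_pair(ngrams[i], ngrams[j])];
        }
    }
    return edges;
}
```

Because the atom type is a template parameter, the same code accepts a std::vector<char> of DNA bases, a vector of word tokens, or any other comparable atom type, with no change to the graph-building logic.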

Dependencies

  • Boost: NGramGraphParallel uses Boost's Graph library for its graphs. To build and run the code, you must have Boost installed on your system. You can download the Boost libraries from the official Boost website.
  • OpenCL: The parallelization of the workload is going to use OpenCL, which requires compatible hardware and the appropriate driver(s). Since parallelization is not implemented yet, OpenCL is not currently required.

Test execution

Inside the project's top directory:
make all
./test

Version

v0.1: This version is the result of a 3-month internship. Only basic functionality is implemented. This version includes:

  • The generic n-gram graph representation.
  • Implementation for both generic and specialized (only for text) graphs.
  • Basic graph operations and functionality.
  • Support for DOT language representation of n-gram graphs.
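
As a rough idea of what a DOT representation of an n-gram graph can look like, the sketch below serializes a weighted edge map into DOT text. It uses only the C++ standard library. The LabeledEdges type and the to_dot function are hypothetical and do not reflect the project's own DOT export. It assumes directed edges and labels that contain no double quotes (no escaping is done).

```cpp
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical edge container: (source label, target label) -> weight.
using LabeledEdges = std::map<std::pair<std::string, std::string>, double>;

// Serialize the edges as a DOT digraph; the weight is emitted as the edge label.
// Assumption: labels contain no '"' characters, so no escaping is performed.
std::string to_dot(const LabeledEdges& edges,
                   const std::string& name = "ngram_graph") {
    std::ostringstream out;
    out << "digraph " << name << " {\n";
    for (const auto& entry : edges) {
        out << "  \"" << entry.first.first << "\" -> \"" << entry.first.second
            << "\" [label=\"" << entry.second << "\"];\n";
    }
    out << "}\n";
    return out.str();
}
```

The resulting string can be written to a file and rendered with Graphviz, for example with dot -Tpng graph.dot -o graph.png.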

License

NGramGraphParallel is licensed under the Apache License 2.0.
