SEA is a project to research and implement an open C++ library for Space-Efficient (Graph) Algorithms (SEA). Besides the running time of algorithms, their space requirements cause a problem if dealing with huge data sets or executing them on tiny devices where memory is heavily limited. Therefore, we want to provide algorithms and data structures to tackle this problem by treating space as a scarce resource.
- Motivation
- Algorithms and Data Structures
- Build
- Using the Library
- Project Structure
- Research
- License
- Acknowledgments
Space-efficient algorithms in practice are justified by two main arguments. The first one is that due to memory limitation of a real computer a space-efficient algorithm runs on instances where a non space-efficient algorithm would crash due to out of memory errors. This is especially interesting where all cores of a computer should be used and multiple algorithms compete for the available memory. The second reason is that space-efficient algorithms can run faster on huge instances because they use less memory and therefore cause less cache faults. (A cache fault occurs when a system must move some of the data stored in the memory to a slower one, i.e., from the L1/L2/L3 cache of the CPU to the RAM or from the RAM to the disc drive.)
We now show these two points on an example. Assume we need to compute an Eulerian tour in an outerplanar graph. Without going into detail, we use the Hierholzer algorithm where one key step of the algorithm is the creation of recursive graph instances. The first step to make the algorithm space-efficient is to handle the recursive graph instances with a subgraph stack. We have implemented a folklore algorithm and a space-efficient version of it and run tests to measure the running time and the space requirements of both algorithms.
The next image shows the space requirements of the creation of subgraph instances using a space-efficient subgraph stack (blue line) and a naive subgraph stack, which creates subgraphs instances by copying vertices and edges of the original graph that remain in the subgraph. We run the test on a machine with 4gb of RAM and a small swap memory. Note that the naive subgraph stack did not run on large instances because it crashed after running out of memory, while the space-efficient subgraph stack could still run.
Next we show the runtime results of the computation of Eulerian tours mentioned above. The green line shows the runtime of just the creation of the recursive graph instances without even running Hierholzer algorithm on them. The red line shows the runtime of a naive Hierholzer algorithm that uses a space-efficient subgraph stack, i.e., a space-efficient version to create subgraphs. The blue line shows the runtime of a space-efficient Hierholzer algorithm that uses a space-efficient subgraphs stack.
Note that space-efficient algorithms neither guarantee better time performance in general nor does over optimization of memory requirements leads to better running times. As often in life, a reasonable mix does the trick.
This section gives you a brief overview over the implemented algorithms and data structures. For a detailed documentation click on an algorithm. For some data structures and algorithms we also provide a folklore implementation that we use to compare the memory consumption and runtime efficiency.
Algorithm | Runtime | Memory (bits) | Details |
---|---|---|---|
Depth First Search | O(n+m) | O(n+m) | here |
Depth First Search | O(n+m) | O(n log(log n)) | here |
Depth First Search | O((n+m) log n) | O((log(3)+ε) n) | here |
Reverse DFS | O(n+m) | O(n log(log n)) | here |
Breadth First Search | O(n+m) | O(n) | here |
Cut-Vertex | O(n+m) | O(n+m) | here |
Biconnected-Component | O(n+m) | O(n+m) | here |
Outerplanar Detection | O(n log(log n)) | O(n) | here |
- InitializedArray: An array consisting of fields that in total can be initialized with an user defined value in constant time by using O(1) computer words. The array provides constant time access (read/write) to fields.
- Graph(G = {V, E}): An adjacency list graph representation that occupies O((n + m) log n) bits.
- Bitset: A bitset of n bits that supports access in O(1) time and occupies O(n) bits.
- AVL tree: A self-balancing binary tree with O(log(n)) time for search, insertion and removal of a node.
- Choice Dictionary: A bitset that supports a choice operation in O(1) time that returns the position of a bit set to 1. The choice dictionary occupies O(n) bits.
- Rank-Select: A bit sequence that supports the operations rank(k) and select(k) in O(1) time and occupies O(n) bits. rank(k) returns the number of set bits up to index k, and select(k) returns the index of the k-th set bit.
- Ragged Dictionary: A set of n/log(n) key-value tuples with O(log(log(n))) time for get, insert and remove operations. The ragged dictionary occupies O(n) bits.
- Static-Space Storage: A sequence of n bit packs of variable size that can be accessed in O(1) time and occupies O(n + N) bits. N is the total usable size of the static-space storage.
- Subraph Stack: Initialized with an n-vertex m-edge graph the stack allows to remove vertices and edges and provides a resulting graph using O(n + m) bits.
- Install CMake and a C++ compiler for your specific operating system.
- Build a make file for your system by using CMake ->
cmake .
- Build the artifacts by executing make ->
make
Now, the include
folder contains the necessary header files and the lib
folder contains the build library.
If you encounter any bugs, missing or misleading documentation, do not hesitate to create an issue ticket.
- Copy the
include/sealib
folder into your own project's include path. - Copy the
lib/libsealib.so
file into your own project's library path. Make sure that the environment variableLD_LIBRARY_PATH
also points there, or else the shared library won't be found. - Include the header files you want to use in your code.
- Compile with correct
-I
and-L
flags, and use-std=c++11
.
#include <vector>
#include <sealib/_types.h>
#include <sealib/iterator/dfs.h>
#include <sealib/graph/graphcreator.h>
using namespace Sealib;
bool reachable(DirectedGraph &g, uint64_t a, uint64_t b) {
bool ret = false;
std::vector<bool> started(100);
std::vector<bool> done(100);
DFS::nloglognBitDFS(g,
[&](uint64_t u) {
started[u]=1;
if (u == b && started[a] && !done[a]) {
ret = true;
}
},
DFS_NOP_EXPLORE, DFS_NOP_EXPLORE,
[&](uint64_t u) { done[u]=1; });
return ret;
}
int main(void) {
DirectedGraph g = GraphCreator::kOutdegree(100, 30);
return reachable(g, 10, 25);
}
Compile with:
clang++ -I<include-path> -L<libary-path> -std=c++11 -o reachable reachable.cpp -lsealib
Run the executable:
export LD_LIBRARY_PATH=<library-path>
./reachable
.
├── CMakeLists.txt # CMake build script
├── LICENSE # Licence description
├── README.md # You are reading this file now
├── third-party # Third party libraries
├── include # The library's header files (*.h)
├── src # The library's source files (*.cpp)
├── src-view # The source files for the visualization (*.cpp)
├── test # The test files
├── lib # The library files
└── bin # Executable files to test the project
We publish most of our research on arXiv.org.
- Sorting and Ranking of Self-Delimiting Numbers with Applications to Tree Isomorphism Full Version
- Space-Efficient Vertex Separators for Treewidth Full Version
- Extra Space during Initialization of Succinct Data Structures and Dynamical Initializable Arrays MFCS 2018: 65:1-65:16 | Full Version
- Linear-Time In-Place DFS and BFS on the Word RAM CIAC 2019 | Full Version
- A Space-Optimal c-Color Choice Dictionary ISAAC 2018: 66:1-66:12 | Full Version
- Space-Efficient Biconnected Components and Recognition of Outerplanar Graphs MFCS 2016: 56:1-56:14 | Full Version
- Space-Efficient Basic Graph Algorithms STACS 2015: 288-301 | Full Version
Licensed under the GNU General Public License version 3. For detailed license information look inside the LICENSE file.
Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 379157101.