Skip to content

rodrigo-pena/gssr

Repository files navigation

Graph Signal Sampling and Recovery (GSSR)

Python functionality for recovering graph signals from subsampled measurements.

Currently maintained by Rodrigo C. G. Pena.

The installation guide below contains instructions for setting up your environment to use the tools in the repository.

DOI

Description

A graph is a collection of vertices and edges. The latter can be weighted, often indicating the degree of similarity between the connected vertices.

Any numerical quantities attached to the vertices of a graph can be considered as a graph signal. For example, in a social network with $n$ vertices, attach a value of $+1$ to all the users who watched The Social Network, and $-1$ otherwise. The $n$-dimensional vector gathering the $\pm 1$ values for all the vertices is a graph signal.

Querying each vertex for its signal value can be expensive in large networks, meaning one can usually afford to measure these values only at a few selected locations. This is called subsampling and is, mathematically, a linear operation. Let $\mathbf{x} \in \mathbb{R}^{n}$ be a graph signal and construct a matrix $\mathbf{A} \in \mathbb{R}^{m \times n}$ as follows. If vertex $i$ is sampled, then add a corresponding row to $\mathbf{A}$ filled with zeros, except at position $i$, where it has a $1$. The matrix thus constructed is called the measurement matrix, and $\mathbf{x} \mapsto \mathbf{Ax}$ is the subsampling operation. Setting the likelihoods with which each vertex is sampled is called a sampling design.

To recover $\mathbf{x}$, the signal of interest, from its measurements $\mathbf{Ax}$, one needs a decoder. No decoder can recover every graph signal because for any $\mathbf{x}$ there are infinitely many vectors $\mathbf{z} \in \mathbb{R}^{n}$ satisfying the measurement constraints $\mathbf{Az} = \mathbf{Ax}$. However, for some sub-classes of signals within $\mathbb{R}^{n}$, one may find a decoder that almost always suceeds provided the number of measurements is large enough. This idea is fundamental in research areas such as Compressed Sensing.

This repository contains utilities to conveniently load graphs and signals, and implement sampling designs and decoders.

Content

  • graphs_signals.py: Functions related to loading and representing graphs and their signals. See the data/ folder in this repository for information on the relevant datasets. You need to follow the instructions therein to download the required files for working with each dataset.
  • sampling.py: Functions related to vertex sampling designs.
  • recovery.py: Functions related to recovery programs (decoders).
  • phase_transition.py: Functions related to the phase transition experiments commonly done in Compressed Sensing. They allow one to observe how the recovery error varies with a set of parameters (normally the number of measurements).
  • plotting.py: Functions related to plotting of graphs and signals.
  • utils.py: A miscellaneous collection of helper functions.

Installation guide

Remark: You can also click the binder badge below to run the included notebooks from your browser without installing anything.

Binder

For a local installation, you will need git, python >= 3.6, jupyter, and packages from the python scientific stack. I recommend using conda for creating and managing a separate environment for this repository. If unfamiliar with this process, you can follow the instructions below.

  1. Download the Python 3.x Miniconda installer and run it using default settings (another option is to download the bulkier Anaconda installer).
  2. Open a terminal window and install git with conda install git.
  3. Within the terminal, navigate to the directory where you want to keep the contents from this repository (e.g., run cd ~/Documents/github/).
  4. Download this repository to the current directory by typing git clone https://github.com/rodrigo-pena/gssr on the terminal.
  5. Create a new environment by typing conda create --name gssr on the terminal.
  6. Activate the environment with conda activate gssr (or activate gssr, or source activate gssr).
  7. You should notice gssr typed somewhere on the newest terminal line, indicating that you are within the environment. Install the required packages by running conda install -c conda-forge python=3.6 jupyterlab and then pip install -r requirements.txt.
  8. Run python test_install.py to check if all the requirements have been correctly installed.

Using the repository

After the (one-time) creation of the environment, you can do the following every time you want to work with this repository:

  1. Open a terminal and navigate to the directory where the repository has been downloaded.
  2. Activate the environment with conda activate gssr (or activate gssr, or source activate gssr).
  3. Start Jupyter with jupyter lab. The command should open a new tab in your web browser.
  4. Edit and run the scripts and notebooks from your browser.

You can check standard_pipeline.ipynb for a basic idea of how to use the function modules in the repository.

Acknowledgments

The contents of this repository benefitted from the following resources:

License

The content is released under the terms of the MIT License.

Attribution

If you are using this software for your research, for the sake of reproducibility, please cite the version you used as indexed by its Zenodo record. Alternatively, you may cite it generically as

@software{gssr,
  author       = {Rodrigo C. G. Pena},
  title        = {Graph Signal Sampling and Recovery (GSSR)},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.6566207},
  url          = {https://github.com/rodrigo-pena/gssr}
}