Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add gfss l2 cost #326

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
376 changes: 376 additions & 0 deletions docs/examples/graph-signal-cpd.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,376 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Graph signals change point detection with the Graph Fourier Scan Statistic: a low-pass band filter\n",
"\n",
"## Introduction\n",
"\n",
"Graph signal processing (GSP) is the study of multivariate signals $y_t \\in \\mathbb{R}^d$ lying on the nodes of a graph $\\mathcal{G} = (\\mathcal{V}, \\mathcal{E}, \\mathcal{W})$ (see for instance [[Stankovic2019](#Stankovic2019), [Shuman2013](#Shuman2013)]). As in standard signal processing, it is possible to define a Graph Fourier Transform and to generalize the notion of spectral filtering. The intuition behind the graph spectral frequencies is that a signal whose (graph) spectrum is located at low frequencies is \"smoother\" than a signal whose energy is concentrated on high frequencies. By \"smoother\", we refer to the notion of smoothness with respect to the structure of the graph [[Shuman2013](#Shuman2013)]: the smoother a signal, the closer its values on neighbor nodes. \n",
"\n",
"Thus, by applying a low-pass filter on the graph spectral domain, one is likely to remove from a graph signal the noise and/or uncorrelated information across the nodes. The authors of [[Ferrari2019](#Ferrari2019)] leverage this idea to define the Graph Fourier Scan Statistic (GFSS) algorithm (derived from the statistic introduced in [[Sharpnack2016](#Sharpnack2016)]). When a graph structure is available, this is one possible way of using notions coming from the field of GSP to enhance change point detection (CPD) for multivariate signals.\n",
"\n",
"In what follows, we focus on the above approach and we show how to apply it with `ruptures`. This example relies on the class [CostGFSSL2](../user-guide/costs/costgfssl2.md), which results from the combination of the GFSS and the least squared deviation.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Illustration\n",
"\n",
"First, we briefly illustrate the behavior of the GFSS and we justify the usage of the cost function `CostGFSSL2`. For a formal definition, please see [CostGFSSL2](../user-guide/costs/costgfssl2.md). \n",
"\n",
"The application of the GFSS amounts to a low-pass graph spectral filtering parametrized by the so-called cut-sparsity $\\rho$. The corresponding filter is displayed below:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"import numpy as np\n",
"\n",
"rho = 1\n",
"filter = lambda x, rho: np.minimum(1, np.sqrt(rho / x))\n",
"x = np.linspace(0, 10, 100)\n",
"filtered = [1] + list(filter(x[1:], rho))\n",
"\n",
"fig, ax = plt.subplots(1, 1, figsize=(6, 3))\n",
"ax.plot(x, filtered, label=\"GFSS filter\")\n",
"ax.axvline(x=rho, linestyle=\"--\", c=\"k\", label=\"$\\\\rho$\")\n",
"ax.set_xlabel(\"eigenvalues $\\lambda$\")\n",
"ax.legend()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Objectives\n",
"\n",
"As explained above, applying a low-pass filter to a graph signal shrinks the high-frequency components, thus attenuating the signal components that are not smooth with respect to the graph structure. Based on this statement, we deduce two potential (related) benefits of applying the cost `CostGFSSL2`:\n",
"\n",
"1. *Scenario 1*: as in [[Ferrari2019](#Ferrari2019)], detecting mean changes that are localized on clusters of the graph only. Formally, let $e_t \\sim \\mathcal{N}(0, \\sigma^2 I)$ a white noise process. If we denote $m_t(i)$ the mean of the process at node $i$ and $\\mathcal{C}$ a well connected subset of $\\mathcal{V}$ (a cluster), one may try to detect $t_r$ such that:\n",
"\n",
"$$\n",
"y_t = m_t + e_t \\quad \\text{ with } ~ m_t = \n",
"\\begin{cases}\n",
" m & \\forall t < t_r \\\\ \n",
" m + \\delta & \\forall t \\geq t_r \n",
"\\end{cases}\n",
"\\quad \\text{ and } ~ \\delta_i = \n",
"\\begin{cases}\n",
" c > 0 & \\forall i \\in \\mathcal{C} \\\\ \n",
" 0 & \\text{ otherwise } \n",
"\\end{cases}\n",
"$$\n",
"\n",
"2. *Scenario 2*: attenuating changes induced by spatially white noise (with high variance) or changes that may be due to individual dysfunctions of the observed system, for instance a geographical censors network, a social network... in graphs showing clusters."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"We generate a synthetic graph matching the above description, namely with a highly clusterized structure. The following graph contains $120$ nodes split between $6$ clusters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import ruptures as rpt # our package\n",
"import networkx as nx # for graph utils\n",
"\n",
"from ruptures.costs.costgfssl2 import CostGFSSL2 # the gfss based cost function"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Graph generation\n",
"\n",
"nb_nodes = 120\n",
"cluster_nb = 6\n",
"mean_cluster_size = 20\n",
"inter_density = 0.02 # density of inter-clusters edges\n",
"intra_density = 0.95 # density of intra-clusters edges\n",
"graph_seed = 9 # for reproducibility\n",
"G = nx.gaussian_random_partition_graph(\n",
" n=nb_nodes,\n",
" s=mean_cluster_size,\n",
" v=2 * mean_cluster_size,\n",
" p_in=intra_density,\n",
" p_out=inter_density,\n",
" seed=graph_seed,\n",
")\n",
"coord = nx.spring_layout(G, seed=graph_seed) # for plotting consistency"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Vizualization of the graph clusters\n",
"\n",
"clusters_seed = 20\n",
"clusters = nx.algorithms.community.louvain.louvain_communities(G, seed=clusters_seed)\n",
"colors_dct = {0: \"r\", 1: \"b\", 2: \"g\", 3: \"orange\", 4: \"purple\", 5: \"brown\"}\n",
"cluster_idx_arr = np.zeros((nb_nodes))\n",
"\n",
"for cl_ind in range(len(clusters)):\n",
" for node_ind in list(clusters[cl_ind]):\n",
" cluster_idx_arr[node_ind] = cl_ind\n",
"\n",
"colors_l = [colors_dct[cluster_idx_arr[node_ind]] for node_ind in range(nb_nodes)]\n",
"\n",
"fig, ax = plt.subplots(1, 1, figsize=(8, 5))\n",
"ax.set_title(\"Clusters vizualization\")\n",
"nx.draw_networkx(G, pos=coord, with_labels=True, node_color=colors_l, ax=ax)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Robustness with respect to noise in Scenario 1\n",
"\n",
"We aim at detecting mean changes localized over one cluster only, as described in the scenario 1 presented in [the objectives](#objectives). Therefore, the generated signal undergoes a single mean-change over only one of the graph clusters, while it remains unchanged (in mean) over the other nodes. In this experiment, we progressively increase the noise level of the generated signal and we compare the change points that are detected by the standard [`CostL2`](../../user-guide/costs/costl2) and those obtained with the [`CostGFSSL`](../user-guide/costs/costgfssl2.md). We expect the GFSS filtering to increase the robustness of the change-point detection.\n",
"\n",
"We assume to be in the most general case, namely as if we did not know the number of changes to be detected. In this framework, we use the [Pelt](../user-guide/detection/pelt.md) algorithm and we set the penalty coefficient $\\beta$ to an arbitrary constant such that the experiments we show are relevant. The rigorous calibration of this parameter would require more work but is not the topic of this example. \n",
"\n",
"The last parameter to be chosen is the cut-sparsity $\\rho$ used in the definition of the GFSS [[Sharpnack2016](#Sharpnack2016), [Ferrari2019](#Ferrari2019)]. Empirically, as this parameter acts as a frequency threshold, one should look at the eigenvalues (spatial frequencies) to get an idea for the magnitude of $\\rho$. For the relevancy of our two experiments we set: $\\beta = 500$ and $\\rho = \\lambda_1 / 2$ where $\\lambda_1$ is the first eigenvalue."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Some eignevalues\n",
"from scipy.linalg import eigh\n",
"\n",
"eigvals, eigvects = eigh(nx.laplacian_matrix(G).toarray())\n",
"print([eigvals[i] for i in [0, 1, 5, 10, 20, 50, 100, -1]])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## SIGNAL GENERATION AND HYPER-PARAMETERS SETUP\n",
"\n",
"n_dims = nb_nodes\n",
"dims_with_change = np.array([dim for dim in clusters[-1]])\n",
"n_dims_with_change = len(dims_with_change)\n",
"n_samples = 100\n",
"signal_seed = 10\n",
"noise_std_values = [0.1, 1, 2, 3, 4]\n",
"\n",
"# Algorithm hyper-parameters\n",
"pen = 500\n",
"rho = eigvals[1] / 2\n",
"\n",
"print(f\"Number of nodes: {nb_nodes}, \")\n",
"print(f\"The dimensions with changes are: {[dim for dim in dims_with_change]} \")\n",
"print(f\"We set the cut-sparsity rho = {rho}. \\n\")\n",
"\n",
"print(\"RESULTS: \\n-------- \\n\")\n",
"for noise_std in noise_std_values:\n",
"\n",
" signal_with_change, gt_bkps = rpt.pw_constant(\n",
" n_samples, n_dims_with_change, 1, noise_std=noise_std, seed=signal_seed\n",
" )\n",
" signal, _ = rpt.pw_constant(\n",
" n_samples, n_dims, 0, noise_std=noise_std, seed=signal_seed\n",
" )\n",
" signal[:, dims_with_change] = signal_with_change\n",
"\n",
" ## CHANGE POINT DETECTION\n",
"\n",
" # with graph and GFSS\n",
" laplacian_mat = nx.laplacian_matrix(G).toarray()\n",
" cost_rpt_pelt = CostGFSSL2(laplacian_mat, rho)\n",
" graph_algo = rpt.Pelt(custom_cost=cost_rpt_pelt, jump=1, min_size=1).fit(signal)\n",
" my_graph_bkps = graph_algo.predict(pen=pen)\n",
"\n",
" # without graph\n",
" algo = rpt.Pelt(model=\"l2\", jump=1, min_size=1).fit(signal)\n",
" my_bkps = algo.predict(pen=pen)\n",
"\n",
" print(\n",
" \"NOISE_STD=\",\n",
" noise_std,\n",
" \"\\tGROUNDTRUTH\",\n",
" gt_bkps,\n",
" \"\\tWITH GFSS: \",\n",
" my_graph_bkps,\n",
" \"\\tSTANDARD L2: \",\n",
" my_bkps,\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the current value of the penalty coefficient and the chosen value of $\\rho$, we observe that the [CostGFSSL](../user-guide/costs/costgfssl2.md) cost function only detects the targeted change points while the standard [CostL2](../../user-guide/costs/costl2) cost is very sensitive to high noise levels. This result is necessarily dependent on the values of the hyper-parameters (other parametrization would lead to good results with the standard L2 cost function). In the [conclusion](#conclusions), we eventually justify why the above result should be seen as satisfying."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Influence of $\\rho$ in Scenario 2\n",
"\n",
"In the scenario 2, we assume that a very small number of nodes undergo a mean-change during the observation of the signal. The location of such nodes is random among the different clusters of the graph and may account for the occasional breakdown of some censors in a geographical network or an industrial system. In this framework, one may want to detect a precise physical phenomena and to avoid detecting random events like censor breakdown. \n",
"\n",
"The purpose of the following experiment is to compare the outputs of the CPD algorithm when applied with [`CostL2`](../../user-guide/costs/costl2) and [`CostGFSSL`](../user-guide/costs/costgfssl2.md) with respect to the value of $\\rho$. We use the same value of $\\beta$ than above and we explore different values of $\\rho$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"## SIGNAL GENERATION AND HYPER-PARAMETERS SETUP\n",
"\n",
"rng = np.random.default_rng(seed=10)\n",
"\n",
"# signal hyper-parameters\n",
"n_dims = nb_nodes # number of dimensions equals the number of nodes\n",
"n_dims_with_change = n_dims // 25 # number of nodes undergoing a mean change\n",
"dims_with_change = np.arange(n_dims)\n",
"rng.shuffle(dims_with_change) # randomizing the nodes with change\n",
"dims_with_change = dims_with_change[:n_dims_with_change]\n",
"\n",
"# building the signal itself\n",
"n_samples = 100\n",
"signal_seed = 10\n",
"noise_std = 1\n",
"\n",
"signal_with_change, gt_bkps = rpt.pw_constant(\n",
" n_samples, n_dims_with_change, 1, noise_std=noise_std, seed=signal_seed\n",
")\n",
"signal, _ = rpt.pw_constant(n_samples, n_dims, 0, noise_std=noise_std, seed=signal_seed)\n",
"signal[:, dims_with_change] = signal_with_change\n",
"\n",
"print(f\"The dimensions with changes are: {[dim for dim in dims_with_change]}\")\n",
"\n",
"# algorithm hyper-parameters\n",
"pen = 500\n",
"\n",
"rho_values = [\n",
" eigvals[1] / 10,\n",
" eigvals[1] / 2,\n",
" eigvals[1],\n",
" 2 * eigvals[1],\n",
" eigvals[60],\n",
" eigvals[-1],\n",
"]\n",
"\n",
"for rho in rho_values:\n",
"\n",
" ## CHANGE POINT DETECTION\n",
"\n",
" # with graph and GFSS\n",
" laplacian_mat = nx.laplacian_matrix(\n",
" G\n",
" ).toarray() # extract the laplacian matrix as an numpy array\n",
" cost_rpt_pelt = CostGFSSL2(laplacian_mat, rho)\n",
" graph_algo = rpt.Pelt(custom_cost=cost_rpt_pelt, jump=1, min_size=1).fit(signal)\n",
" my_graph_bkps = graph_algo.predict(pen=pen)\n",
"\n",
" # without graph\n",
" algo = rpt.Pelt(model=\"l2\", jump=1, min_size=1).fit(signal)\n",
" my_bkps = algo.predict(pen=pen)\n",
"\n",
" print(\n",
" \"RHO=\",\n",
" round(rho, ndigits=3),\n",
" \"\\tGROUNDTRUTH\",\n",
" gt_bkps,\n",
" \"\\tWITH GFSS: \",\n",
" my_graph_bkps,\n",
" \"\\tSTANDARD L2: \",\n",
" my_bkps,\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When increasing the value of $\\rho$, the [CostGFSSL](../user-guide/costs/costgfssl2.md) cost function detects the \"unwanted\" mean-changes, but for low $\\rho$ the results are as expected. Similarly to above, this result is dependent on the parametrization, the proportion of nodes undergoing a mean change and the connectivity of the graph. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conclusions\n",
"\n",
"- to say that we found a common set of hyper-parameters values for which both experiments give the expected results, even though this set of parameters depends on the graph itsefl. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References\n",
"\n",
"<a id=\"Ferrari2019\">[Ferrari2019]</a>\n",
"Ferrari, A., Richard, C., and Verduci, L. (2019). Distributed Change Detection in Streaming Graph Signals. IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), pages 166–170.\n",
"\n",
"<a id=\"Sharpnack2016\">[Sharpnack2016]</a>\n",
"Sharpnack, J., Rinaldo, A., and Singh, A. (2016). Detecting Anomalous Activity on Networks With the Graph Fourier Scan Statistic. EEE Transactions on Signal Processing, 64(2):364–379.\n",
"\n",
"<a id=\"Shuman2013\">[Shuman2013]</a>\n",
"Shuman, D. I., Narang, S. K., Frossard, P., Ortega, A., and Vandergheynst, P. (2013). The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. EEE Signal Processing Magazine, 30(3):83–98.\n",
"\n",
"<a id=\"Stankovic2019\">[Stankovic2019]</a>\n",
"Ljubisa Stankovic, Danilo P. Mandic, Milos Dakovic, Ilia Kisil, Ervin Sejdic, and Anthony G. Constantinides (2019). Understanding the Basis of Graph Signal Processing via an Intuitive Example-Driven Approach [Lecture Notes]. IEEE Signal Processing Magazine, 36(6):133–145.\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading