diff --git a/docs/articles/Tutorials.html b/docs/articles/Tutorials.html
index e54b419..46c3929 100644
--- a/docs/articles/Tutorials.html
+++ b/docs/articles/Tutorials.html
@@ -83,7 +83,7 @@

Tutorials for k-means clustering inference

 library(CADET)
 library(ggplot2)
-We first generate data according to \(\mathbf{X} \sim {MN}_{n\times q}(\boldsymbol{\mu},
+We first generate data according to \(\mathbf{X} \sim MN_{n\times q}(\boldsymbol{\mu},
 \textbf{I}_n, \sigma^2 \textbf{I}_q)\) with \(n=150,q=2,\sigma=1,\) and
 \[\begin{align}
 \label{eq:power_model}
 \boldsymbol{\mu}_1 =\ldots = \boldsymbol{\mu}_{50} = \begin{bmatrix}
@@ -95,9 +95,9 @@

Tutorials for k-means clustering inference

 \boldsymbol{\mu}_{101}=\ldots = \boldsymbol{\mu}_{150} = \begin{bmatrix}
 \delta/2 \\ 0_{q-1} \end{bmatrix}.
-\end{align}\] Here, we can think of \({C}_1 = \{1,\ldots,50\},{C}_2 =
-\{51,\ldots,100\},{C}_3 = \{101,\ldots,150\}\) as the “true
-clusters”. In the figure below, we display one such simulation \(\mathbf{x}\in\mathbb{R}^{100\times 2}\)
+\end{align}\] Here, we can think of \(C_1 = \{1,\ldots,50\},C_2 = \{51,\ldots,100\},C_3
+= \{101,\ldots,150\}\) as the “true clusters”. In the figure
+below, we display one such simulation \(\mathbf{x}\in\mathbb{R}^{150\times 2}\)
 with \(\delta=10\).

 set.seed(2022)
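Although the vignette's simulation chunk is truncated in this diff, the model is easy to sketch. In the sketch below, only the third cluster mean \((\delta/2, 0_{q-1})\) is taken from the text; the first two cluster means are cut off above, so the values used for them are assumptions chosen to give three well-separated clusters.

```r
# Hedged sketch of X ~ MN_{n x q}(mu, I_n, sigma^2 I_q): independent rows
# x_i ~ N(mu_i, sigma^2 I_q). Means for C_1 and C_2 are ASSUMED values;
# only the mean for C_3 is the one shown in the hunk above.
n <- 150; q <- 2; sigma <- 1; delta <- 10
mu <- matrix(0, n, q)
mu[1:50, 1]    <- -delta / 2            # assumed mean, C_1
mu[51:100, 2]  <- sqrt(3) * delta / 2   # assumed mean, C_2
mu[101:150, 1] <- delta / 2             # mean for C_3, as in the text
X <- mu + sigma * matrix(rnorm(n * q), n, q)
```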
@@ -187,9 +187,8 @@ Inference
 #>   cluster_1 cluster_2 test_stat  p_selective       p_naive
 #> 1         2         3  4.464756 8.514513e-29 2.171388e-110
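For concreteness, the `test_stat` and `p_naive` columns in this output can be computed by hand. Below is a hedged sketch (not package code): the object name `km` for the k-means fit and the known noise level \(\sigma = 1\) are assumptions, and unlike `kmeans_inference_1f`, this sketch does not account for the fact that the clusters were estimated from the data.

```r
# Test statistic: difference in cluster means of feature 2; p_naive: a
# two-sided z-test that ignores that the clusters came from X itself.
feat <- 2
G  <- which(km$cluster == 2)                    # \hat{G}  (assumed fit `km`)
Gp <- which(km$cluster == 3)                    # \hat{G}'
test_stat <- mean(X[G, feat]) - mean(X[Gp, feat])
se <- sqrt(1 / length(G) + 1 / length(Gp))      # sigma = 1 assumed known
p_naive <- 2 * pnorm(abs(test_stat) / se, lower.tail = FALSE)
```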

 In the summary, we have the empirical difference in means of the
-second feature between the two clusters, i.e.,\(\sum_{i\in
-{\hat{{G}}}}\mathbf{x}_{i,2}/|\hat{{G}}| - \sum_{i\in
-\hat{G}'}\mathbf{x}_{i,2}/|\hat{G}'|\)
+second feature between the two clusters, i.e., \(\sum_{i\in \hat{G}}\mathbf{x}_{i,2}/|\hat{G}| -
+\sum_{i\in \hat{G}'}\mathbf{x}_{i,2}/|\hat{G}'|\)
 (test_stats), the naive p-value based on a z-test
 (p_naive), and the selective \(p\)-value
 (p_selective). In this case, the test based on \(p_{\text{selective}}\) can reject this null
@@ -221,13 +220,12 @@ Inferen

 #> 2  0 0 50  0
 #> 3 25 0  0 25

 By inspection, we see that the blue clusters (labeled as cluster 1)
-and the grey clusters (labeled as cluster 4) have the same mean. Now
-\(p_{\text{selective}}\) yields a much
+and the grey clusters (labeled as cluster 4) have the same mean. Now the
+selective \(p\)-value yields a much
 more moderate \(p\)-value, and the test
-based on \(p_{2,\text{selective}}\)
-cannot reject the null hypothesis when it holds. By contrast, the naive
-\(p\)-value is tiny and leads to an
-anti-conservative test.
+based on it cannot reject the null hypothesis when it holds. By
+contrast, the naive \(p\)-value is tiny
+and leads to an anti-conservative test.

 cluster_1 <- 1
 cluster_2 <- 4
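The anti-conservativeness of the naive test is easy to see in a toy experiment, independent of the vignette's code: under a global null with no cluster structure at all, naive z-test p-values computed after k-means clustering concentrate near zero rather than following Uniform(0, 1).

```r
# Cluster first, then test naively: the rejection rate far exceeds the
# nominal 5% level even though no true clusters exist.
set.seed(1)
p_naive_null <- replicate(2000, {
  x  <- rnorm(40)                        # one feature, no cluster structure
  cl <- kmeans(x, centers = 2)$cluster
  d  <- mean(x[cl == 1]) - mean(x[cl == 2])
  se <- sqrt(1 / sum(cl == 1) + 1 / sum(cl == 2))
  2 * pnorm(abs(d) / se, lower.tail = FALSE)
})
mean(p_naive_null < 0.05)                # far above the nominal 0.05
```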
diff --git a/docs/articles/Tutorials_hier.html b/docs/articles/Tutorials_hier.html
index 59677d3..4ac46b2 100644
--- a/docs/articles/Tutorials_hier.html
+++ b/docs/articles/Tutorials_hier.html
@@ -156,10 +156,10 @@ Infer
 cluster_1 <- 1
 cluster_2 <- 3
-cl_1_2_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average", hcl=hcl, K=3, k1=1, k2=2, feat=1)
-summary(cl_1_2_inference_demo)
+cl_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average", hcl=hcl, K=3, k1=cluster_1, k2=cluster_2, feat=1)
+summary(cl_inference_demo)
 #>   cluster_1 cluster_2 test_stat  p_selective      p_naive
-#> 1         1         2  4.464756 8.870985e-08 1.774197e-07
+#> 1         1         3  9.910708 2.868783e-26 4.596766e-31

 In the summary, we have the empirical difference in means of the
 first feature between the two clusters, i.e.,\(\sum_{i\in
 {\hat{{G}}}}\mathbf{x}_{i,2}/|\hat{{G}}| - \sum_{i\in
@@ -179,15 +179,17 @@

-\(p_{2,\text{selective}}\)
-yields a much more moderate \(p\)-value, and the test based on \(p_{2,\text{selective}}\) cannot reject the
-null hypothesis when it holds. By contrast, the naive \(p\)-value is tiny and leads to an
-anti-conservative test.
+
+Now the selective \(p\)-value yields
+a much more moderate \(p\)-value, and
+the test based on it cannot reject the null hypothesis when it holds. By
+contrast, the naive \(p\)-value is tiny
+and leads to an anti-conservative test.

 cluster_1 <- 1
-cluster_2 <- 4
-cl_1_2_inference_demo <-  test_hier_clusters_exact_1f(X=X, link="average", hcl=hcl, K=3, k1=1, k2=3, feat=2)
-summary(cl_1_2_inference_demo)
+cluster_2 <- 3
+cl_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average",
+                                                 hcl=hcl, K=3, k1=cluster_1, k2=cluster_2, feat=2)
+summary(cl_inference_demo)
 #>   cluster_1 cluster_2  test_stat p_selective   p_naive
 #> 1         1         3 -0.1766818   0.8362984 0.8362984
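The calls to `test_hier_clusters_exact_1f` above take a precomputed dendrogram `hcl` cut into `K = 3` clusters. A minimal sketch of that preprocessing follows; the use of plain Euclidean distance is an assumption here (see the package documentation for the exact dissimilarity the test expects).

```r
# Average-linkage hierarchical clustering of the rows of X, cut into K = 3
# clusters; cutree's labels are the k1/k2 indices used in the tests above.
hcl <- hclust(dist(X, method = "euclidean"), method = "average")
clusters <- cutree(hcl, k = 3)
table(clusters)
```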
diff --git a/docs/articles/Tutorials_hier_files/figure-html/unnamed-chunk-6-1.png b/docs/articles/Tutorials_hier_files/figure-html/unnamed-chunk-6-1.png
deleted file mode 100644
index e8a089a..0000000
Binary files a/docs/articles/Tutorials_hier_files/figure-html/unnamed-chunk-6-1.png and /dev/null differ
diff --git a/docs/articles/technical_details.html b/docs/articles/technical_details.html
index db8c5e5..629707c 100644
--- a/docs/articles/technical_details.html
+++ b/docs/articles/technical_details.html
@@ -78,7 +78,7 @@

Technical details

-Figure 1: We simulated one dataset according to \({MN}_{100\times 10}(\mu, \textbf{I}_{100},
+Figure 1: We simulated one dataset according to \(MN_{100\times 10}(\mu, \textbf{I}_{100},
 \Sigma)\), where \(\mu_i = (1,0_9)^T\)
 for \(i=1,\ldots, 50\) and \(\mu_i = (0_9,1)^T\)
@@ -160,8 +160,8 @@

 \(p_{j, \text{selective}}\), we have
-conditioned on (i) the estimated clusters \(\mathcal{C}(x)\) to account for the
-data-driven nature of the null hypothesis; and (ii) \(U(x)\) to eliminate the unknown nuisance
+conditioned on (i) the estimated clusters \({C}(x)\) to account for the data-driven
+nature of the null hypothesis; and (ii) \(U(x)\) to eliminate the unknown nuisance
 parameters under the null.

 We show that this \(p\)-value for testing \(\hat{H}_{0j}\) can be written
@@ -169,21 +169,22 @@

 \(\mathbb{F}(t;
-\mu, \sigma, \mathcal{S})\) denotes the cumulative distribution
-function (CDF) of a \(N(\mu,
-\sigma^2)\) random variable truncated to the set \(\mathcal{S}\), \(x'(\phi,j) = x + (\phi - ( \bar{x}_{\hat{G}j}
-- \bar{x}_{\hat{G'}j})) \left ( \frac{ \hat{\nu} }{
-\|\hat{\nu}\|_2^2 } \right ) \left ( \frac{\Sigma_j}{\Sigma_{jj}} \right
-)^T,\) and {\[\begin{equation}
+\mu, \sigma, {S})\) denotes the cumulative distribution function
+(CDF) of a \(N(\mu, \sigma^2)\) random
+variable truncated to the set \({S}\),
+\(x'(\phi,j) = x + (\phi - (
+\bar{x}_{\hat{G}j} - \bar{x}_{\hat{G'}j})) \left ( \frac{ \hat{\nu}
+}{ \|\hat{\nu}\|_2^2 } \right ) \left ( \frac{\Sigma_j}{\Sigma_{jj}}
+\right )^T,\) and \[\begin{equation}
 \hat S_j =
-\left \{ \phi \in \mathbb{R}: C(x) =
-\mathcal{C}\left(x'(\phi,j)\right ) \right \}. \label{eq:defS}
-\end{equation}\] }.
+\left \{ \phi \in \mathbb{R}: C(x) = {C}\left(x'(\phi,j)\right )
+\right \}. \label{eq:defS}
+\end{equation}\]
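Numerically, the selective \(p\)-value reduces to evaluating the CDF of a normal distribution truncated to \(\hat{S}_j\). A toy sketch of that evaluation when the truncation set is a union of disjoint intervals (the set `S`, the standard deviation, and the one-tailed final line are placeholders; the package derives \(\hat{S}_j\) analytically and combines both tails):

```r
# CDF at t of a N(0, sd^2) random variable truncated to S, where S is a
# two-column (lo, hi) matrix of disjoint intervals.
ptrunc_set <- function(t, sd, S) {
  num <- sum(pmax(0, pnorm(pmin(t, S[, 2]), sd = sd) - pnorm(S[, 1], sd = sd)))
  den <- sum(pnorm(S[, 2], sd = sd) - pnorm(S[, 1], sd = sd))
  num / den
}
S <- rbind(c(-8, -2), c(2, 8))       # toy truncation set, NOT a real S-hat_j
1 - ptrunc_set(4.46, sd = 1, S)      # one tail only, for illustration
```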

 While the notation in the last paragraph might seem daunting, the
 intuition is simple: since \(p_{j, \text{selective}}\) can be rewritten into sums of CDFs of
diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml
index 7ba3443..dc3b4ec 100644
--- a/docs/pkgdown.yml
+++ b/docs/pkgdown.yml
@@ -5,7 +5,7 @@ articles:
   Tutorials: Tutorials.html
   Tutorials_hier: Tutorials_hier.html
   technical_details: technical_details.html
-last_built: 2023-11-26T01:22Z
+last_built: 2023-11-26T01:39Z
 urls:
   reference: https://yiqunchen.github.io/CADET/reference
   article: https://yiqunchen.github.io/CADET/articles
diff --git a/vignettes/Tutorials.Rmd b/vignettes/Tutorials.Rmd
index 273aaca..52b9c6c 100644
--- a/vignettes/Tutorials.Rmd
+++ b/vignettes/Tutorials.Rmd
@@ -23,7 +23,7 @@
 library(CADET)
 library(ggplot2)
 ```
-We first generate data according to $\mathbf{X} \sim {MN}_{n\times q}(\boldsymbol{\mu}, \textbf{I}_n, \sigma^2 \textbf{I}_q)$ with $n=150,q=2,\sigma=1,$ and
+We first generate data according to $\mathbf{X} \sim MN_{n\times q}(\boldsymbol{\mu}, \textbf{I}_n, \sigma^2 \textbf{I}_q)$ with $n=150,q=2,\sigma=1,$ and
 \begin{align}
 \label{eq:power_model}
 \boldsymbol{\mu}_1 =\ldots = \boldsymbol{\mu}_{50} = \begin{bmatrix}
@@ -35,7 +35,7 @@ We first generate data according to $\mathbf{X} \sim {MN}_{n\times q}(\boldsymbo
 \delta/2 \\ 0_{q-1} \end{bmatrix}.
 \end{align}
-Here, we can think of ${C}_1 = \{1,\ldots,50\},{C}_2 = \{51,\ldots,100\},{C}_3 = \{101,\ldots,150\}$ as the "true clusters".
+Here, we can think of $C_1 = \{1,\ldots,50\},C_2 = \{51,\ldots,100\},C_3 = \{101,\ldots,150\}$ as the "true clusters".
 In the figure below, we display one such simulation $\mathbf{x}\in\mathbb{R}^{150\times 2}$ with $\delta=10$.
 ```{r fig.align="center", fig.height = 5, fig.width = 5}
@@ -98,7 +98,7 @@
 cl_inference_demo <- kmeans_inference_1f(X, k=3, cluster_1, cluster_2,
 summary(cl_inference_demo)
 ```
-In the summary, we have the empirical difference in means of the second feature between the two clusters, i.e.,$\sum_{i\in {\hat{{G}}}}\mathbf{x}_{i,2}/|\hat{{G}}| - \sum_{i\in \hat{G}'}\mathbf{x}_{i,2}/|\hat{G}'|$ (`test_stats`), the naive p-value based on a z-test (`p_naive`), and the selective $p$-value (`p_selective`). In this case, the test based on $p_{\text{selective}}$ can reject this null hypothesis that the blue and pink clusters have the same mean in the first feature ($p_{2,\text{selective}}<0.001$).
+In the summary, we have the empirical difference in means of the second feature between the two clusters, i.e., $\sum_{i\in \hat{G}}\mathbf{x}_{i,2}/|\hat{G}| - \sum_{i\in \hat{G}'}\mathbf{x}_{i,2}/|\hat{G}'|$ (`test_stat`), the naive $p$-value based on a z-test (`p_naive`), and the selective $p$-value (`p_selective`). In this case, the test based on $p_{\text{selective}}$ can reject the null hypothesis that the blue and pink clusters have the same mean in the second feature ($p_{2,\text{selective}}<0.001$).
 ### Inference for k-means clustering when the null hypothesis holds
diff --git a/vignettes/technical_details.Rmd b/vignettes/technical_details.Rmd
index c73ae4d..cf1a0dd 100644
--- a/vignettes/technical_details.Rmd
+++ b/vignettes/technical_details.Rmd
@@ -19,7 +19,7 @@ knitr::opts_chunk$set(

 ![](../man/figures/fig_1.png){width=90%}
-Figure 1: We simulated one dataset according to ${MN}_{100\times 10}(\mu, \textbf{I}_{100}, \Sigma)$, where $\mu_i = (1,0_9)^T$ for $i=1,\ldots, 50$ and $\mu_i = (0_9,1)^T$ for $i=51,\ldots, 100$, and $\Sigma_{ij} = 1\{i=j\}+0.4\cdot 1\{i\neq j\}$. *(a)*: Empirical distribution of feature 2 based on the simulated data set. In this case, all observations have the same mean for feature 2. *(b)*: We apply k-means clustering to obtain two clusters and plot the empirical distribution of feature 2 stratified by the clusters. *(c)*: Quantile-quantile plot of naive z-test (black) our proposed p-values (orange) applied to the simulated data sets for testing the null hypotheses for a difference in means for features 2--8.
+Figure 1: We simulated one dataset according to $MN_{100\times 10}(\mu, \textbf{I}_{100}, \Sigma)$, where $\mu_i = (1,0_9)^T$ for $i=1,\ldots, 50$ and $\mu_i = (0_9,1)^T$ for $i=51,\ldots, 100$, and $\Sigma_{ij} = 1\{i=j\}+0.4\cdot 1\{i\neq j\}$. *(a)*: Empirical distribution of feature 2 based on the simulated data set. In this case, all observations have the same mean for feature 2. *(b)*: We apply k-means clustering to obtain two clusters and plot the empirical distribution of feature 2 stratified by the clusters. *(c)*: Quantile-quantile plot of the naive z-test $p$-values (black) and our proposed $p$-values (orange) applied to the simulated data sets for testing the null hypotheses of no difference in means for features 2--8.
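For completeness, a hedged sketch of the Figure 1 generative model (not the code used to produce the figure): independent rows with a compound-symmetric feature covariance.

```r
# MN_{100 x 10}(mu, I_100, Sigma): independent rows, feature covariance
# Sigma_ij = 1{i = j} + 0.4 * 1{i != j}.
n <- 100; q <- 10; rho <- 0.4
Sigma <- matrix(rho, q, q); diag(Sigma) <- 1
mu <- matrix(0, n, q)
mu[1:50, 1]   <- 1     # mu_i = (1, 0_9)^T for i = 1, ..., 50
mu[51:100, q] <- 1     # mu_i = (0_9, 1)^T for i = 51, ..., 100
X <- mu + matrix(rnorm(n * q), n, q) %*% chol(Sigma)
```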