From 6295a37abf051d670cb441896bd12187ec13662c Mon Sep 17 00:00:00 2001
From: yiqunchen
- We first generate data according to \(\mathbf{X} \sim {MN}_{n\times q}(\boldsymbol{\mu},
+ We first generate data according to \(\mathbf{X} \sim MN_{n\times q}(\boldsymbol{\mu},
\textbf{I}_n, \sigma^2 \textbf{I}_q)\) with \(n=150,q=2,\sigma=1,\) and \[\begin{align}
\label{eq:power_model}
\boldsymbol{\mu}_1 =\ldots = \boldsymbol{\mu}_{50} = \begin{bmatrix}
@@ -95,9 +95,9 @@ In the summary, we have the empirical difference in means of the
-second feature between the two clusters, i.e.,\(\sum_{i\in
-{\hat{{G}}}}\mathbf{x}_{i,2}/|\hat{{G}}| - \sum_{i\in
-\hat{G}'}\mathbf{x}_{i,2}/|\hat{G}'|\)
+second feature between the two clusters, i.e.,\(\sum_{i\in \hat{G}}\mathbf{x}_{i,2}/|\hat{{G}}| -
+\sum_{i\in \hat{G}'}\mathbf{x}_{i,2}/|\hat{G}'|\)
By inspection, we see that the blue clusters (labeled as cluster 1)
-and the grey clusters (labeled as cluster 4) have the same mean. Now
-\(p_{\text{selective}}\) yields a much
+and the grey clusters (labeled as cluster 4) have the same mean. Now the
+selective \(p\)-value yields a much
more moderate \(p\)-value, and the test
-based on \(p_{2,\text{selective}}\)
-cannot reject the null hypothesis when it holds. By contrast, the naive
-\(p\)-value is tiny and leads to an
-anti-conservative test.
-Tutorials for k-means clustering inference
\boldsymbol{\mu}_{101}=\ldots = \boldsymbol{\mu}_{150} = \begin{bmatrix}
\delta/2 \\ 0_{q-1}
\end{bmatrix}.
-\end{align}\] Here, we can think of \({C}_1 = \{1,\ldots,50\},{C}_2 =
-\{51,\ldots,100\},{C}_3 = \{101,\ldots,150\}\) as the “true
-clusters”. In the figure below, we display one such simulation \(\mathbf{x}\in\mathbb{R}^{100\times 2}\)
+\end{align}\]
set.seed(2022)
@@ -187,9 +187,8 @@ Inference
#> cluster_1 cluster_2 test_stat p_selective p_naive
#> 1 2 3 4.464756 8.514513e-29 2.171388e-110
test_stats), the naive p-value based on a z-test (p_naive), and the selective \(p\)-value (p_selective). In
this case, the test based on \(p_{\text{selective}}\) can reject this null
@@ -221,13 +220,12 @@ Inferen
#> 2 0 0 50 0
#> 3 25 0 0 25
cluster_1 <- 1
cluster_2 <- 4
diff --git a/docs/articles/Tutorials_hier.html b/docs/articles/Tutorials_hier.html
index 59677d3..4ac46b2 100644
--- a/docs/articles/Tutorials_hier.html
+++ b/docs/articles/Tutorials_hier.html
@@ -156,10 +156,10 @@ Infer
cluster_1 <- 1
cluster_2 <- 3
-cl_1_2_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average", hcl=hcl, K=3, k1=1, k2=2, feat=1)
-summary(cl_1_2_inference_demo)
+cl_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average", hcl=hcl, K=3, k1=cluster_1, k2=cluster_2, feat=1)
+summary(cl_inference_demo)
#> cluster_1 cluster_2 test_stat p_selective p_naive
-#> 1 1 2 4.464756 8.870985e-08 1.774197e-07
+#> 1 1 3 9.910708 2.868783e-26 4.596766e-31
In the summary, we have the empirical difference in means of the
first feature between the two clusters, i.e.,\(\sum_{i\in
{\hat{{G}}}}\mathbf{x}_{i,1}/|\hat{{G}}| - \sum_{i\in
@@ -179,15 +179,17 @@ \(p_{2,\text{selective}}\)
-yields a much more moderate \(p\)-value, and the test based on \(p_{2,\text{selective}}\) cannot reject the
-null hypothesis when it holds. By contrast, the naive \(p\)-value is tiny and leads to an
-anti-conservative test.
Now the selective \(p\)-value yields
+a much more moderate \(p\)-value, and
+the test based on it cannot reject the null hypothesis when it holds. By
+contrast, the naive \(p\)-value is tiny
+and leads to an anti-conservative test.
cluster_1 <- 1
-cluster_2 <- 4
-cl_1_2_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average", hcl=hcl, K=3, k1=1, k2=3, feat=2)
-summary(cl_1_2_inference_demo)
+cluster_2 <- 3
+cl_inference_demo <- test_hier_clusters_exact_1f(X=X, link="average",
+ hcl=hcl, K=3, k1=cluster_1, k2=cluster_2, feat=2)
+summary(cl_inference_demo)
#> cluster_1 cluster_2 test_stat p_selective p_naive
#> 1 1 3 -0.1766818 0.8362984 0.8362984