A Simple Illustration of the Most Probable Point, and an example use case to calculate the quantile of an output function.
The source files to knit the R-markdown are in the RMD folder. The knitted html is here.
The following is a trimmed-down version of the RMD for proper display on GitHub. Please see this for the complete version.
Despite criticisms and being called by different names, the Most Probable Point (MPP) is a fascinating concept that arose from the field of structural reliability. Other names include Most Probable Failure Point, Most Central Limit-State Point, Most Central Failure Point, and Design Point. The last one is deemed the most satisfactory. Nevertheless, we call it MPP in this markdown.
In structural reliability, a performance function $g(X)$ characterizes the response of the system: a negative value ($g(X) < 0$) indicates failure, and a positive value ($g(X) > 0$) implies that the system is safe. The border between the negative and positive domains is called the limit state: $g(X) = 0$. Reliability $R$ concerns the probability of not failing, i.e. the complement of the probability of failure $P_f$: $R = 1 - P_f$.
The most common method to calculate the probability of failure is Monte Carlo simulation. However, the required number of samples increases drastically as $P_f$ gets small. MPP provides an approximation to $P_f$ with a computational cost that does not depend on the magnitude of $P_f$, because finding the MPP is an optimization problem.
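As a quick illustration of that scaling, here is a minimal sketch; the 100/P_f rule of thumb assumes we target roughly 10% relative standard error of the estimator:
# For ~10% relative standard error, a Monte Carlo estimate of P_f needs
# roughly N = 100/P_f samples, since the coefficient of variation of the
# estimator is about 1/sqrt(N*P_f) for small P_f.
p_f <- c(1e-2, 1e-4, pnorm(-4))
data.frame(P_f = p_f, N_required = ceiling(100/p_f))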
Hasofer-Lind prescribed a procedure to obtain the MPP:
- Transform all of the input random variables from the physical space into uncorrelated standardized normal random variables.
- Find the MPP. The distance from the origin to the MPP in the standardized uncorrelated space is called the Hasofer-Lind index $\beta$, also known as the reliability index.
With a First Order Reliability Method (FORM) approximation of the limit state, the probability of failure is given by $P_f \approx \Phi(-\beta)$, where $\Phi$ is the cumulative distribution function (CDF) of the standard normal distribution.
In a very restrictive case, if the performance function is linear, and therefore the limit state is a hyper-plane, then the probability of failure is given exactly by $P_f = \Phi(-\beta)$, where $\beta$ is the distance from the origin to the MPP on the hyper-plane of the linear limit-state function in the uncorrelated, standardized, normally distributed input space. In more relaxed cases, when the performance function is non-linear, or even when the performance function is linear but the transformation of input variables results in a non-linear limit-state function, $\Phi(-\beta)$ becomes only an approximation to $P_f$.
Since $\Phi(-\beta)$ is simply a CDF value, which is (approximately) $P_f = P(g(X) \le 0)$, and since $\beta$ is the radius of the hyper-sphere of the standardized input random variables $U$, another subtle use case of MPP is that we can find a quantile of the output function by calculating $g$ at the MPP. For example, if we want to find the output quantile at 4-sigma (i.e. $\beta = 4$), we can find the MPP point where the limit-state hyper-plane touches the 4-sigma hyper-sphere of the standardized input random variables; the output quantile at 4-sigma is then $g(\mathrm{MPP})$. The example given in the last section solves for the output quantile instead of calculating $P_f$.
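For reference, the probability levels corresponding to $\beta = 4$, which reappear later in the Monte Carlo quantile calls:
# probability levels of the -/+4-sigma quantiles of g
pnorm(-4)  # ~3.167e-05, i.e. the 0.003167% quantile
pnorm(4)   # ~0.9999683, i.e. the 99.99683% quantile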
We picked a 2D-input example for easy visualization. Consider a non-linear performance function with 2D inputs, $g(X_1, X_2) = (X_1 + X_2)^2 + 3(-X_1 + X_2)^2$, with $X_1$ log-normally distributed with log-scale parameters $\mu_1 = 1.5$ and $\sigma_1 = 0.5$, $X_2$ normally distributed with $\mu_2 = 7$ and $\sigma_2 = 1.5$, and a known correlation coefficient between the two, $\rho = 0.65$. Find the $\pm 4\sigma$ quantiles of $g$ using MPP and verify them with Monte Carlo simulations.
The first step of the Hasofer-Lind procedure is to express $g(X_1, X_2)$ as $g(U_1, U_2)$, with $U_1$, $U_2$ being uncorrelated, standard normally distributed inputs.
- Start with uncorrelated standard normal inputs $U_1, U_2 \sim N(0, 1)$.
- Correlate them to get $Z_1 = U_1$ and $Z_2 = \rho_0 U_1 + \sqrt{1 - \rho_0^2}\, U_2$, with $Z_1$, $Z_2$ still being standard normally distributed, using the yet-to-be-determined $\rho_0$.
- Scale and shift to get $V_1 = \sigma_1 Z_1 + \mu_1$ and $V_2 = \sigma_2 Z_2 + \mu_2$.
- Transform the normal $V_1$ into the lognormal $X_1 = e^{V_1}$, and take $X_2 = V_2$. (A sketch of these four steps as a single R function is given below.)
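Here is a sketch of the four steps bundled into a single R function; u_to_x is a name chosen here for illustration, and $\rho_0$ is derived in the next step:
# Steps 1-4 as one map from u-space to x-space; rho_0 must be supplied,
# e.g. u_to_x(0, 0, rho_0=0.6928) once rho_0 is known
u_to_x <- function(u1, u2, m1=1.5, s1=0.5, m2=7, s2=1.5, rho_0) {
  z1 = u1                              # step 2: correlate
  z2 = rho_0*u1 + sqrt(1-rho_0^2)*u2
  v1 = s1*z1 + m1                      # step 3: scale and shift
  v2 = s2*z2 + m2
  c(x1 = exp(v1), x2 = v2)             # step 4: normal v1 -> lognormal x1
}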
Given $X_1 = e^{V_1}$ with $V_1 \sim N(\mu_1, \sigma_1)$, we can calculate its mean and variance:

$$\mu_{X_1} = e^{\mu_1 + \sigma_1^2/2}, \qquad \sigma_{X_1}^2 = \left(e^{\sigma_1^2} - 1\right) e^{2\mu_1 + \sigma_1^2}$$

Using the semi-empirical equations provided in Kiureghian and Liu 1986, the correction factor for a lognormal-normal pair is

$$F = \frac{\delta_{X_1}}{\sqrt{\ln\left(1 + \delta_{X_1}^2\right)}}, \qquad \delta_{X_1} = \frac{\sigma_{X_1}}{\mu_{X_1}}$$

Then, we can calculate $\rho_0 = F \rho$ to be substituted into step 2 above:
Following steps 1 to 4 outlined above, $g$ becomes:

$$g(U_1, U_2) = \left(e^{0.5 U_1 + 1.5} + 1.5\left(\rho_0 U_1 + \sqrt{1-\rho_0^2}\, U_2\right) + 7\right)^2 + 3\left(-e^{0.5 U_1 + 1.5} + 1.5\left(\rho_0 U_1 + \sqrt{1-\rho_0^2}\, U_2\right) + 7\right)^2$$
The second step of the Hasofer-Lind procedure is to find the MPP. The following code shows a crude iterative method to find the MPP point on the 4-sigma hyper-sphere of the inputs, such that the vector from the origin to the MPP is parallel (for the maximum of $g$, i.e. the $+4\sigma$ quantile) or anti-parallel (for the minimum, i.e. the $-4\sigma$ quantile) to the gradient $\nabla g$ at the MPP.
Let us calculate $\rho_0$ for the log-normal variable:
#lognormal: m1=1.5, s1=0.5
m1= 1.5; s1= 0.5
m_x1= exp(m1 + 0.5*s1^2)
var_x1= (exp(s1^2)-1)*exp(2*m1+s1^2)
delta_x1= sqrt(var_x1)/m_x1
f_x1= delta_x1/sqrt(log(1+delta_x1^2))
rho= 0.65
rho_0= f_x1* rho
print (rho_0)
## [1] 0.6928225
We will set up the functions to calculate the L2-norm, $g(U_1, U_2)$, and its gradient $\nabla g$.
# calculate L2-norm
f_norm <- function(x) {
return ( ((x %*% x)[1, 1])^0.5 )
}
# f_g: the performance function g expressed in u-space
f_g <- function(u) {
u1= u[1]
u2= u[2]
return ( (exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))^2 + 3*(-exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))^2 )
}
# f_dg returns the u1 and u2 components of the gradient of g
f_dg <- function(u) {
u1= u[1]
u2= u[2]
return (c(
(2*(exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(0.5*exp(0.5*u1+1.5)+1.5*rho_0) + 6*(-exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(-0.5*exp(0.5*u1+1.5)+1.5*rho_0)) ,
(2*(exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(1.5*sqrt(1-rho_0^2)) + 6*(-exp(0.5*u1+1.5) + (1.5*(rho_0*u1+sqrt(1-rho_0^2)*u2)+7))*(1.5*sqrt(1-rho_0^2)) )) )
}
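Before iterating, we can sanity-check the hand-derived gradient against a central finite difference (an added check, not in the original RMD; fd_check and u_test are names chosen here):
# compare the analytic gradient with a central finite difference at a test point
fd_check <- function(u, h=1e-6) {
  c( (f_g(u+c(h,0)) - f_g(u-c(h,0)))/(2*h),
     (f_g(u+c(0,h)) - f_g(u-c(0,h)))/(2*h) )
}
u_test = c(1, -0.5)
rbind(analytic = f_dg(u_test), numeric = fd_check(u_test))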
r= 4
# u1 and u2 are the initial guess of a point on the 4-sigma circle
u1= 4
u2= 0
u_ = c(u1,u2)
for (i in 1:10) {
g= f_g(u_)
dg_ = f_dg(u_)
dot= ((u_/r) %*% (dg_/f_norm(dg_) ) )[1,1]
print (c(u_, g, dot))
#un is the new point
#There are two MPPs, one for each of the +/-4-sigma quantiles; we run this
#algorithm with + or - the normalized gradient: un = u_/r +/- dg_/f_norm(dg_)
#Here, + converges to the maximum of g (the +4-sigma quantile)
un = u_/r + dg_/f_norm(dg_)
u_ = un/f_norm(un) * r
}
## [1] 4.0000000 0.0000000 3406.5736598 0.9999159
## [1] 3.99991587 -0.02594262 3407.48601719 0.99997857
## [1] 3.99980952 -0.03903605 3407.71848632 0.99999454
## [1] 3.99973957 -0.04564397 3407.77770205 0.99999861
## [1] 3.99970012 -0.04897873 3407.79278408 0.99999965
## [1] 3.99967916 -0.05066164 3407.79662523 0.99999991
## [1] 3.99966831 -0.05151093 3407.79760350 0.99999998
## [1] 3.99966277 -0.05193952 3407.79785264 0.99999999
## [1] 3.99965996 -0.05215581 3407.79791608 1.00000000
## [1] 3.99965853 -0.05226496 3407.79793224 1.00000000
From the code chunk above, the MPP point for the $+4\sigma$ quantile (the maximum of $g$ on the circle) is $u = (3.99965853, -0.05226496)$, and $g(\mathrm{MPP}) = 3407.79793224$.
r= 4
# u1 and u2 are the initial guess of a point on the 4-sigma circle
u1= 4
u2= 0
u_ = c(u1,u2)
for (i in 1:10) {
g= f_g(u_)
dg_ = f_dg(u_)
dot= ((u_/r) %*% (dg_/f_norm(dg_) ) )[1,1]
print (c(u_, g, dot))
#un is the new point
#As before, there are two MPPs; here, - the normalized gradient converges to
#the minimum of g (the -4-sigma quantile)
un = u_/r - dg_/f_norm(dg_)
u_ = un/f_norm(un) * r
}
## [1] 4.0000000 0.0000000 3406.5736598 0.9999159
## [1] 0.02594262 3.99991587 391.87421829 0.82363021
## [1] -3.8117762 1.2125850 65.8802208 -0.3835513
## [1] -3.8435730 -1.1076763 10.0469124 -0.8515261
## [1] -3.3963520 -2.1130057 4.4191063 -0.9812408
## [1] -3.1757451 -2.4320039 4.0103315 -0.9995167
## [1] -3.1375572 -2.4810753 4.0016311 -0.9999968
## [1] -3.134428 -2.485027 4.001576 -1.000000
## [1] -3.134211 -2.485301 4.001575 -1.000000
## [1] -3.134196 -2.485320 4.001575 -1.000000
From the code chunk above, the MPP point for the $-4\sigma$ quantile (the minimum of $g$ on the circle) is $u = (-3.134196, -2.485320)$, and $g(\mathrm{MPP}) = 4.001575$.
To recap, the $-4\sigma$ and $+4\sigma$ quantiles of $g$ approximated by the MPP method are 4.001575 and 3407.79793224, respectively.
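As an independent cross-check (an added sketch; it works only in this 2-D case), we can parameterize the 4-sigma circle by an angle and let base R's optimize() locate the extrema of g directly. The search intervals below are narrowed so that g is unimodal within each, an assumption based on the contour plot shown later:
# u = 4*(cos t, sin t); search each extremum on an interval where g is unimodal
g_angle <- function(t) f_g(c(4*cos(t), 4*sin(t)))
opt_min = optimize(g_angle, interval=c(-pi, -1.5))                 # near u = (-3.13, -2.49)
opt_max = optimize(g_angle, interval=c(-1.5, 1.5), maximum=TRUE)   # near u = (4.00, -0.05)
c(min_g = opt_min$objective, max_g = opt_max$objective)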
Let us now turn to Monte Carlo simulation (MCS) to verify them.
We pick the number of samples $N = 3 \times 10^7$ to have enough points to calculate the quantiles, and follow the transformation of variables in steps 1-4 of the Hasofer-Lind procedure outlined before. Please be aware that the results may change when you re-run the code chunk.
# number of samples
N= 3e7
# u-space: standardized, uncorrelated
u1= rnorm(N)
u2= rnorm(N)
#lognormal: m1=1.5, s1=0.5
#normal: m2=7, s2=1.5
m1= 1.5; s1= 0.5
m2= 7; s2= 1.5
m_x1= exp(m1 + 0.5*s1^2)
var_x1= (exp(s1^2)-1)*exp(2*m1+s1^2)
delta_x1= sqrt(var_x1)/m_x1
f_x1= delta_x1/sqrt(log(1+delta_x1^2))
rho= 0.65
rho_0= f_x1* rho
#z-space: standardized, correlated
z1= u1
z2= rho_0 * u1 + sqrt(1-rho_0^2) * u2
#v-space: scaled and shifted
v1= s1*z1 + m1
v2= s2*z2 + m2
#x-space:
x1= exp(v1)
x2= v2
Let us verify that we have $X_1 \sim \mathrm{Lognormal}(\mu_1 = 1.5, \sigma_1 = 0.5)$, $X_2 \sim N(\mu_2 = 7, \sigma_2 = 1.5)$, and $\rho = 0.65$ after the variable transformations.
We plot the histograms of $X_1$ and $X_2$ below, overlaid with the corresponding densities (red curves), i.e. $\mathrm{Lognormal}(1.5, 0.5)$ and $N(7, 1.5)$:
library(ggplot2)
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.1.2
plothistx1 <- ggplot(data= data.frame(X1 = x1)) +
geom_histogram(mapping=aes(x=X1, y=..density..), fill="steelblue", color="black", bins=50) +
stat_function(fun=dlnorm, args=list(meanlog=m1, sdlog=s1), color="red", size=1) +
xlim(0, 20)
plothistx2 <- ggplot(data= data.frame(X2 = x2)) +
geom_histogram(mapping=aes(x=X2, y=..density..), fill="steelblue", colour="black", bins=50) +
stat_function(fun=dnorm, args=list(mean=m2, sd=s2), color="red", size=1) +
xlim(m2-6*s2, m2+6*s2)
ggarrange(plothistx1, plothistx2, ncol=2, nrow=1, labels=c("X1", "X2"))
## Warning: Removed 41703 rows containing non-finite values (stat_bin).
## Warning: Removed 2 rows containing missing values (geom_bar).
## Warning: Removed 2 rows containing missing values (geom_bar).
cor(x1,x2)
## [1] 0.6499332
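We can also compare the sample moments of $X_1$ against the theoretical lognormal moments computed earlier (a small added check; m_x1 and var_x1 come from the chunk above):
# sample vs. theoretical moments of the lognormal x1
c(sample_mean = mean(x1), theory_mean = m_x1)
c(sample_sd = sd(x1), theory_sd = sqrt(var_x1))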
Let us calculate the quantiles:
# calculate g(x)
g= (x1+x2)^2 + 3*(-x1+x2)^2
quantile(g, pnorm(-4))
## 0.003167124%
## 4.76026
quantile(g, pnorm(4))
## 99.99683%
## 3410.564
I am getting the $-4\sigma$ quantile of $g$ to be 4.631767 and the $+4\sigma$ quantile of $g$ to be 3402.384 in one of the runs. And of course, the results may change as you re-run the code chunk.
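To put numbers on the comparison (a small added calculation; the MPP values are taken from the recap above, and the MCS values will vary from run to run):
# relative difference between the MCS quantiles and the MPP approximations
mcs_q = quantile(g, pnorm(c(-4, 4)))
mpp_q = c(4.001575, 3407.798)
(mcs_q - mpp_q) / mcs_q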
gdf= data.frame(G=g)
qqall <- ggplot(data= gdf, aes(sample=G)) + stat_qq() + stat_qq_line(color="red")
qqn4 <- ggplot(data= gdf, aes(sample=G)) + stat_qq() + stat_qq_line() + xlim(-4.1, -3.9) + ylim(3.5, 5)
qqp4 <- ggplot(data= gdf, aes(sample=G)) + stat_qq() + stat_qq_line() + xlim(3.9, 4.1) + ylim(3300, 3500)
ggarrange(qqall, qqn4, qqp4, ncol=3, nrow=1)
## Warning: Removed 29999543 rows containing missing values (geom_point).
## Warning: Removed 2 row(s) containing missing values (geom_path).
## Warning: Removed 29999736 rows containing missing values (geom_point).
## Warning: Removed 2 row(s) containing missing values (geom_path).
Notice that the $+4\sigma$ quantile approximated from MPP is relatively close to that from the (ideal) MCS, whereas the $-4\sigma$ quantile from the MPP approximation is not that accurate. We will see why now.
# contour of g over the u-space grid; g_grid avoids shadowing the MCS vector g
zxy = expand.grid(seq(-6,6,0.1), seq(-6,6,0.1))
g_grid = matrix( (exp(s1*zxy$Var1+m1)+(s2*rho_0*zxy$Var1+s2*sqrt(1-rho_0^2)*zxy$Var2+m2))^2 + 3*(- exp(s1*zxy$Var1+m1) + (s2*rho_0*zxy$Var1+s2*sqrt(1-rho_0^2)*zxy$Var2+m2))^2 , length(seq(-6,6,0.1)), length(seq(-6,6,0.1)) )
contour(seq(-6,6,0.1), seq(-6,6,0.1), g_grid, levels=c(2, 4.001575, 30, 100, 300, 1000, 3407.79793224, 5000), asp=1, xlim=c(-6,6), ylim=c(-6,6))
# plotting the 4 sigma circle
r= 4
t= seq(-pi,pi,2*pi/360)
xr= r*cos(t)
yr= r*sin(t)
lines(xr, yr, col= "red")
# MPP points
points(c(-3.134196, 3.99965853),c(-2.485320, -0.05226496), pch=19, col="blue")
abline(v=seq(-6,6), h=seq(-6,6), lty=2, col='lightgray')
From the contour plot above, and since this is a simple 2-D input example, we can tell that the contour (i.e. the limit state) through the MPP point at $G = 3407.798$ is close to a linear hyper-plane, whereas the contour through the MPP point at $G = 4.001575$ is much curvier and won't be approximated well by a linear hyper-plane.
In summary, we illustrated that the Hasofer-Lind MPP procedure can be used to approximate output quantiles, as long as the limit state can be approximated by a hyper-plane.