The NextOpt team builds reliable defense systems using failure prediction, system risk measurement and optimization, and mechanism design techniques.
The target quantity (phi) resembles a hyperparameter in that it affects data only through the model parameter (theta), and both usually await the modeler's decision. Think of a decision variable (e.g. portfolio weight) and hyperparameter tuning (e.g. a parameter of the hyperprior). Retracing the data generation process phi -> theta -> (x, y) (e.g. decision -> parameter -> data) provides tools for the reverse direction (X, Y) -> Theta -> Phi and therefore for the posterior of Phi. Capitalized symbols (X, Y, Theta, Phi) represent population random variables (with identified distributions), while lowercase symbols (x, y, theta, phi) denote sample values. Two approaches exist for the phi posterior:
- Deductive: obtain the Phi|X,Y density via analytic calculation, i.e. get Phi|Theta and Theta|X,Y, then marginalize out Theta. Integration is replaced with a sum for discrete theta, e.g. scenarios.
- Inductive: obtain phi|x,y samples via computation. Passing targeted simulation-based calibration is a necessary condition; it determines sample credibility by inspecting the consistency of the sampling mechanism, which consists of three simulators: prior, data, and posterior.
- System risk management, built on top of the system's vertical and horizontal interactions, is our interest, and we use hierarchical and mixture models as our main framework.
- The simulation approach is being updated in Moon S.'s system dynamics blog.
- The contract approach is being updated in Moon S.'s supply contract blog.
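For discrete theta, the deductive route described above reduces to a weighted sum, p(phi | x, y) = sum over theta of p(phi | theta) p(theta | x, y). The following is a minimal sketch of that sum only; the scenario names, the two phi values, and every probability are hypothetical and chosen purely for illustration.

```python
# Hypothetical posterior over discrete scenarios, p(theta | x, y).
theta_posterior = {"mild": 0.5, "moderate": 0.3, "severe": 0.2}

# Hypothetical conditional distribution p(phi | theta) for each scenario.
phi_given_theta = {
    "mild": {"low": 0.9, "high": 0.1},
    "moderate": {"low": 0.5, "high": 0.5},
    "severe": {"low": 0.2, "high": 0.8},
}


def marginalize(phi_given_theta, theta_posterior):
    """Return p(phi | x, y) by summing p(phi | theta) * p(theta | x, y) over theta."""
    phi_posterior = {}
    for theta, p_theta in theta_posterior.items():
        for phi, p_phi in phi_given_theta[theta].items():
            phi_posterior[phi] = phi_posterior.get(phi, 0.0) + p_phi * p_theta
    return phi_posterior


phi_posterior = marginalize(phi_given_theta, theta_posterior)
print(phi_posterior)
```

With a continuous theta the same structure holds, with the sum replaced by an integral.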
In the R folder, topics are sorted by the following keywords. Notice the change of background space: from data = Y, to parameter = Theta, then to quantity of interest := f(Theta) | X. X is an (n x p) predictor matrix and y is a data vector of length n. A decision is the best example of a quantity of interest (QI), whose posterior distribution is obtained with the help of the parameter.
Y|y in data space

In the presence of limited data, impute raw data under certified assumptions and construct the generative process from Theta to E[Y].
- transform scale and distribution.
- explore data via conditioning predictor: E[theta|X=a] vs E[theta|X=b].
- set resolution. Bin the time axis (e.g. age of the product) by considering the amount of data in each period interval and the granularity of the required forecast.
- identify outliers. Drop noise that obscures the main relationship between E[Y] and theta while preserving the main component of their relationship. Impute two types of data. Data exploration and prior knowledge of the signal-to-noise ratio are helpful. Quantile-based dropping (or replacement) is the most common approach.
- incorporate expert knowledge on distribution and range of data.
- construct a test set for each scenario or layer. For a K-layer hierarchical model (HM), K possible cases exist that need separate testing; e.g. a two-layer HM with 1-5-99 engine_archetype(phi)-engine(theta[1..5])-ship(mu[1..99]). Two test sets need construction: the first with a known engine and an unknown ship, and the second with both engine and ship unknown.
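Two of the data-space steps above, setting the resolution and quantile-based outlier replacement, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the bin width, the quantile cut-offs, and the sample records are all hypothetical, and the quantiles use a simple floor-index rule rather than any particular interpolation scheme.

```python
import statistics


def bin_by_age(records, bin_width):
    """Group (age, value) records into fixed-width age bins and average each bin."""
    bins = {}
    for age, value in records:
        bins.setdefault(int(age // bin_width), []).append(value)
    return {b: statistics.mean(values) for b, values in sorted(bins.items())}


def clip_to_quantiles(values, lower=0.05, upper=0.95):
    """Replace values outside the empirical quantile range with the boundary values."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]  # floor-index empirical quantile
    hi = ordered[int(upper * (n - 1))]
    return [min(max(v, lo), hi) for v in values]


# Hypothetical failure counts observed at product ages (in years).
records = [(0.5, 2.0), (1.5, 4.0), (2.5, 6.0), (3.5, 8.0)]
binned = bin_by_age(records, bin_width=2)

# Hypothetical series with one extreme value.
raw = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
clipped = clip_to_quantiles(raw, lower=0.1, upper=0.9)
print(binned, clipped)
```

A wider bin gives more data per interval at the cost of a coarser forecast, which is the trade-off named in the resolution step.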
X|x in parameter space
- generate scaled timeseries features: trend, seasonality, event, self-lag etc.
- generate hierarchical feature i.e. group index.
- select features, e.g. with black-box forward and backward selection algorithms, while more adaptive spike-and-slab or more transparent causal-effect-based selection is also possible.
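The feature-generation steps above can be sketched for a single series. This is a minimal illustration, assuming a known seasonal period and a single lag; the function names and the feature names (`trend`, `season_sin`, `season_cos`, `lag`) are hypothetical, and event features are omitted because they depend on an external calendar.

```python
import math


def make_features(series, period, lag=1):
    """Build one feature row per time step: scaled trend, seasonality, self-lag."""
    n = len(series)
    rows = []
    for t in range(lag, n):
        rows.append({
            "trend": t / (n - 1),  # time index scaled to [0, 1]
            "season_sin": math.sin(2 * math.pi * t / period),
            "season_cos": math.cos(2 * math.pi * t / period),
            "lag": series[t - lag],  # self-lag feature
            "y": series[t],
        })
    return rows


def group_index(labels):
    """Map group labels to integer indices in order of first appearance."""
    index = {}
    return [index.setdefault(label, len(index)) for label in labels]


rows = make_features([10, 12, 14, 16], period=4)
groups = group_index(["engine_a", "engine_b", "engine_a"])
print(rows[0], groups)
```

The integer group index is the form a hierarchical model typically needs to look up group-level parameters.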
Theta|XY is from data to parameter space
- infer parameter values given data for each predictor.
- design pooling structure between different predictors with the assumption Theta|X = a is similar to Theta|X = b.
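One simple pooling structure is partial pooling, where each group's estimate is shrunk toward a shared mean. The sketch below assumes a normal-normal model with known within-group standard deviation `sigma` and between-group standard deviation `tau`, and uses the empirical grand mean as the shared mean; these are simplifying assumptions for illustration, and the group names and values are hypothetical.

```python
import statistics


def partial_pool(groups, sigma, tau):
    """Shrink each group's sample mean toward the grand mean.

    groups: dict mapping group name -> list of observations
    sigma:  assumed within-group standard deviation
    tau:    assumed between-group standard deviation
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = statistics.mean(all_values)
    pooled = {}
    for name, values in groups.items():
        precision_data = len(values) / sigma ** 2
        precision_prior = 1 / tau ** 2
        # Weight on the group's own data grows with its sample size.
        weight = precision_data / (precision_data + precision_prior)
        pooled[name] = weight * statistics.mean(values) + (1 - weight) * grand_mean
    return pooled


groups = {"a": [10.0, 12.0, 11.0, 13.0], "b": [20.0]}
pooled = partial_pool(groups, sigma=2.0, tau=2.0)
print(pooled)
```

Group `b`, with a single observation, is pulled further toward the grand mean than group `a`, which reflects the assumption that parameters for different predictors are similar.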
extreme: increase estimation and simulation efficiency using splitting, exploitation of regenerative structure**, and verification techniques to model extreme events, where inferring Theta|XY is highly inefficient.
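The splitting idea can be illustrated on a toy problem: estimating the probability that a downward-drifting random walk started at 1 reaches a high level before hitting 0. The sketch below is a fixed-effort multilevel splitting estimator under strong simplifying assumptions. The walk, its parameters, and all function names are hypothetical, and because the walk's state on reaching a level is the level itself, no resampling of entrance states is needed, which a general model would require.

```python
import random


def reaches_before_zero(start, target, p_up, rng):
    """Run a +/-1 random walk from `start`; True if it hits `target` before 0."""
    state = start
    while 0 < state < target:
        state += 1 if rng.random() < p_up else -1
    return state == target


def splitting_estimate(top_level, p_up, trials_per_level, seed=0):
    """Estimate P(hit top_level before 0 | start at 1) as a product of
    per-level conditional probabilities, each estimated by simulation."""
    rng = random.Random(seed)
    estimate = 1.0
    for level in range(2, top_level + 1):
        hits = sum(
            reaches_before_zero(level - 1, level, p_up, rng)
            for _ in range(trials_per_level)
        )
        estimate *= hits / trials_per_level
    return estimate


def exact_probability(top_level, p_up):
    """Closed-form gambler's-ruin probability for p_up != 0.5, for comparison."""
    r = (1 - p_up) / p_up
    return (1 - r) / (1 - r ** top_level)


estimate = splitting_estimate(top_level=10, p_up=0.3, trials_per_level=2000)
exact = exact_probability(top_level=10, p_up=0.3)
print(estimate, exact)
```

Each per-level probability is moderate, so it can be estimated with few trials, whereas naive Monte Carlo would need thousands of runs per observed hit of the top level.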
Phi|Theta is from parameter to quantity of interest space
- design the pooling structure in data space (determine the model weights for Bayesian model averaging and stacking, especially in HM) and in parameter space (aggregate parameter distributions from different models on a joint parameter space). Bayesian aggregation provides an overview.
- identify target quantity components e.g. maintenance cost (target quantity in preventive maintenance) consists of inspection and corrective cost.
- directly address the decision problem (Bayesian optimization, Smart predict then optimize).
- To verify theta|x,y or qi|x,y, simulation-based calibration, which is maintained in another repo, can be applied.
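The simulation-based calibration check named in the last bullet can be sketched with the three simulators (prior, data, posterior) on a conjugate Beta-Binomial model, where the exact posterior is available. This is not the code from the separate repo; the model, the function names, and all settings are assumptions made for illustration. If the posterior simulator is consistent with the prior and data simulators, the rank of the prior draw among the posterior draws is uniform.

```python
import random


def sbc_ranks(n_sims, n_draws, n_obs, a=2.0, b=2.0, seed=0):
    """Return the rank of each prior draw among its posterior draws."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        theta = rng.betavariate(a, b)  # prior simulator
        k = sum(rng.random() < theta for _ in range(n_obs))  # data simulator
        # Posterior simulator: exact conjugate posterior Beta(a + k, b + n_obs - k).
        draws = [rng.betavariate(a + k, b + n_obs - k) for _ in range(n_draws)]
        ranks.append(sum(d < theta for d in draws))
    return ranks


def chi_square_uniform(ranks, n_draws):
    """Chi-square statistic of the rank histogram against a uniform distribution."""
    n_bins = n_draws + 1
    expected = len(ranks) / n_bins
    counts = [0] * n_bins
    for r in ranks:
        counts[r] += 1
    return sum((c - expected) ** 2 / expected for c in counts)


ranks = sbc_ranks(n_sims=1000, n_draws=9, n_obs=20)
statistic = chi_square_uniform(ranks, n_draws=9)
print(statistic)
```

A large statistic, or a visibly skewed or U-shaped rank histogram, would indicate that the posterior sampler is inconsistent with the prior and data simulators.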
- *Mixed pooling of seasonality for time series forecasting: An application to pallet transport data. Moon, H., Song, B., & Lee, H. (2020), under revision.
- **Exploiting regenerative structure to estimate finite time averages via simulation. Kang, W., Shahabuddin, P., & Whitt, W. (2006).
Rare event estimation and simulation techniques
- Modelling Extremal Events (estimation) explains how to estimate the tails of distributions; Ch. 6 is the most illustrative.
- Introduction to Rare Event Simulation shows efficient Monte Carlo computation for estimating the occurrence probability of rare events. Rare-Event Simulation Techniques: An Introduction and Recent Advances (survey) gives a shorter overview.
An Axiomatic Approach to Systemic Risk
- Naval Vessel Spare Parts Demand Forecasting Using Data Mining, Yoon17
- Forecasting Spare Parts Demand of Military Aircraft: Comparisons of Data Mining Techniques and Managerial Features from the Case of South Korea
- An Aircraft-Mission Assignment Model Considering Life Management of Modular Engines (모듈형 엔진의 수명관리를 고려한 항공기-임무 할당 모형)
tbc
- For the techniques we (NextOpt team) developed, refer to the writeup folder.
- Inventory management is within our research scope, but not in the near future.