docs: add loss function details

godatadriven · Jan 5, 2021 · 70f705d
1 parent 2ba6334
Showing 2 changed files with 25 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -148,3 +148,24 @@ Note that the inputs to `MyPivenModel` must match the inputs to the `piven_model
 You can now call all methods defined as in the PivenBaseModel class. Check the 
 [PivenMlpModel class](https://gitlab.com/jasperginn/piven.py/-/blob/dev/src/piven/Models/mlp_regressor.py)
 for a more detailed example.
+
+## Details: loss function
+
+The piven loss function is more complicated than regular loss functions in that it combines three objectives:
+
+1. The coverage (fraction of observations within the lower and upper PI) should be approximately 1-*alpha*, where
+*alpha* is the desired significance level.
+2. The PI should not be too wide.
+3. The point-prediction should be as accurate as possible.
+
+These objectives are combined into a single loss, which takes three arguments:
+
+1. *alpha*: the desired significance level. Given this value, we aim for PIs such that, if we re-ran our experiments
+many times, the PIs would contain the true outcome values in a fraction (1 - *alpha*) of cases.
+2. *lambda*: a hyperparameter controlling the relative importance of PI width versus PI coverage. As lambda
+shrinks towards 0, you will observe narrower PIs at the cost of lower coverage.
+3. *soften*: a technical parameter that sets the steepness of the sigmoid used to approximate the hard
+inside/outside-the-PI indicator, so that the loss can be optimized using a gradient-based solver.
+
+The default settings are those used by the paper's authors. You should probably leave them as they are unless you
+know what you are doing. For further details, see pp. 4-5 of the paper cited above. 
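
As an illustration of how *alpha*, *lambda* and *soften* interact, here is a minimal NumPy sketch of the width and coverage terms, modelled on the lines visible in `piven_loss` below. It is not the library's implementation: the helper name `sketch_pi_terms`, the example arrays and the parameter values are assumptions chosen for illustration, and the point-prediction objective is left out.

```python
import numpy as np


def _sigmoid(x):
    # Logistic function; maps any real number into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))


def sketch_pi_terms(y_true, pi_low, pi_high, alpha, lambda_in, soften):
    """Hypothetical helper: returns (mean captured width, soft coverage, coverage penalty)."""
    n = y_true.shape[0]
    # Soft (differentiable) capture indicator; `soften` sets the sigmoid steepness.
    k_soft = _sigmoid((pi_high - y_true) * soften) * _sigmoid((y_true - pi_low) * soften)
    # Hard capture indicator: 1 if the observation lies strictly inside the PI, else 0.
    k_hard = np.maximum(0.0, np.sign(pi_high - y_true)) * np.maximum(
        0.0, np.sign(y_true - pi_low)
    )
    # Mean PI width over captured observations; 0.001 guards against division by zero.
    mpiw_capt = np.sum(np.abs(pi_high - pi_low) * k_hard) / (np.sum(k_hard) + 0.001)
    # Soft coverage: approximate fraction of observations inside the PI.
    picp_soft = np.mean(k_soft)
    # Penalty is non-zero only when coverage falls short of 1 - alpha.
    penalty = lambda_in * np.sqrt(n) * np.square(np.maximum(0.0, (1.0 - alpha) - picp_soft))
    return mpiw_capt, picp_soft, penalty


y = np.array([0.0, 1.0, 2.0, 3.0])
# Intervals that contain every observation: no coverage penalty, only width matters.
wide = sketch_pi_terms(y, y - 1.0, y + 1.0, alpha=0.05, lambda_in=15.0, soften=160.0)
# Intervals that miss every observation: zero captured width, but a large penalty.
missed = sketch_pi_terms(y, y + 1.0, y + 2.0, alpha=0.05, lambda_in=15.0, soften=160.0)
print(wide, missed)
```

Because the penalty is scaled by `lambda_in`, lowering it makes missed coverage cheaper relative to width, which is why small values of *lambda* tend to give narrower PIs with lower coverage.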
diff --git a/src/piven/metrics/numpy.py b/src/piven/metrics/numpy.py
@@ -31,16 +31,20 @@ def piven_loss(
     k_soft = _sigmoid((y_pred_pi_high - y_true) * soften) * _sigmoid(
         (y_true - y_pred_pi_low) * soften
     )
+    # 1 if obs between lower & upper PI, else 0
     k_hard = np.maximum(0.0, np.sign(y_pred_pi_high - y_true)) * np.maximum(
         0.0, np.sign(y_true - y_pred_pi_low)
     )
+    # Mean PI width over captured points (those between lower & upper PI)
     mpiw_capt = (
         np.divide(
             np.sum(np.abs(y_pred_pi_high - y_pred_pi_low) * k_hard),
             np.sum(k_hard) + 0.001,
         ),
     )
+    # Coverage
     picp_soft = np.mean(k_soft)
+    # Interior point method --> lambda controls relative importance of width vs. coverage.
     qd_rhs_soft = (
         lambda_in * np.sqrt(n) * np.square(np.maximum(0.0, (1.0 - alpha) - picp_soft))
     )