
Federated Linear Regression

Linear Regression (LinR) is a simple statistical model widely used for predicting continuous values. FATE provides Heterogeneous Linear Regression (CoordinatedLinR).

The table below lists the features of the Coordinated LinR model:

| Linear Model | Multi-Host | Cross Validation | Warm-Start |
| ------------ | ---------- | ---------------- | ---------- |
| Hetero LinR  |            |                  |            |
| SSHE LinR    |            |                  |            |

Coordinated LinR

CoordinatedLinR also supports multi-Host training.

Here we simplify the participants of the federation process into three parties. Party A represents the Guest and party B represents the Host. Party C, also known as the "Arbiter," is a third party that acts as the coordinator and is responsible for generating the private and public keys. (The security of this algorithm is lower than that of SSHE-LinR; use SSHE-LinR if possible.)
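As a rough illustration of the Arbiter's key-generation role, the sketch below uses the third-party `phe` (python-paillier) package. FATE ships its own homomorphic-encryption implementation, so everything here is illustrative only, not FATE's API.

```python
from phe import paillier  # third-party python-paillier package, used only for illustration

# Party C (Arbiter) generates an additively homomorphic key pair.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# The public key is shared with Guest (party A) and Host (party B), who encrypt
# their intermediate values with it; only the Arbiter holds the private key.
ciphertext = public_key.encrypt(3.14)        # e.g. an encrypted intermediate term
plaintext = private_key.decrypt(ciphertext)  # decryption happens only at the Arbiter
```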

The process of HeteroLinR training is shown below:

Figure 1 (Federated HeteroLinR Principle)

A sample alignment process is conducted before training. This process identifies the overlapping samples in the databases of all parties, and the federated model is then built on these overlapping samples. The entire sample alignment process is conducted under encryption, so confidential information (e.g., sample IDs) is not leaked.
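The plaintext snippet below only illustrates what alignment computes, namely the intersection of sample IDs, on made-up IDs; in FATE the same intersection is obtained with an encrypted private-set-intersection protocol, so raw IDs are never exchanged.

```python
# Hypothetical plaintext ID sets; real FATE computes this intersection under encryption.
guest_ids = {"id_001", "id_002", "id_005", "id_009"}   # held by Guest (party A)
host_ids = {"id_002", "id_003", "id_005", "id_010"}    # held by Host (party B)

# Only the overlapping samples participate in federated training.
overlap = sorted(guest_ids & host_ids)
print(overlap)  # ['id_002', 'id_005']
```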

During training, party A and party B each compute the elements needed for the final gradients. The Arbiter aggregates them, computes the final gradients, and sends them back to the corresponding parties. For more details on the secure model-building process, please refer to this paper.
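To make the gradient split concrete, here is a minimal plaintext sketch of one gradient step for vertically partitioned linear regression. All names are illustrative; in the actual protocol the residual and gradients are exchanged in encrypted or masked form, and the Arbiter only handles aggregated quantities.

```python
import numpy as np

# Guest (party A) holds features X_a and the label y; Host (party B) holds X_b.
rng = np.random.default_rng(0)
n = 8
X_a, X_b = rng.normal(size=(n, 3)), rng.normal(size=(n, 2))  # vertically split features
y = rng.normal(size=n)
w_a, w_b = np.zeros(3), np.zeros(2)

# Each party computes its partial linear predictor locally.
z_a, z_b = X_a @ w_a, X_b @ w_b

# The residual combines both partial predictors with the Guest-held label.
d = (z_a + z_b) - y

# Each party's gradient needs only the residual and its own features.
grad_a = X_a.T @ d / n
grad_b = X_b.T @ d / n

# After aggregation at the Arbiter, each party updates its own weights.
lr = 0.1
w_a -= lr * grad_a
w_b -= lr * grad_b
```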

Features

  1. L1 & L2 regularization

  2. Mini-batch mechanism

  3. Weighted training

  4. Torch optimization methods:

    • rmsprop: RMSProp
    • adadelta: AdaDelta
    • adagrad: AdaGrad
    • adam: Adam
    • adamw: AdamW
    • adamax: Adamax
    • asgd: ASGD
    • nadam: NAdam
    • radam: RAdam
    • rprop: RProp
    • sgd: gradient descent with arbitrary batch size

    Algorithm details can be found in this paper.

  5. Torch learning rate scheduler methods:

    • constant
    • step
    • linear
  6. Three convergence criteria (see the sketch after this list):

    • diff
      > Use the difference of loss between two consecutive iterations; not available for multi-host training

    • abs
      > Use the absolute value of the loss

    • weight_diff
      > Use the difference of the model weights

  7. Support multi-host modeling tasks.
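The following sketch spells out the three convergence checks listed above. The function name and the `eps` threshold are assumptions for illustration, not FATE's actual implementation.

```python
import numpy as np

def has_converged(criterion, loss, prev_loss, weights, prev_weights, eps=1e-4):
    """Illustrative versions of the three convergence criteria."""
    if criterion == "diff":         # change of loss between two consecutive iterations
        return abs(loss - prev_loss) < eps
    if criterion == "abs":          # absolute value of the loss itself
        return abs(loss) < eps
    if criterion == "weight_diff":  # change of the model weights
        return np.linalg.norm(np.asarray(weights) - np.asarray(prev_weights)) < eps
    raise ValueError(f"unknown convergence criterion: {criterion}")
```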

Hetero-SSHE-LinR features:

  1. Mini-batch mechanism

  2. Early-stopping mechanism

  3. Support for setting an arbitrary frequency for revealing the loss (see the sketch below)
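As a rough sketch of how the last two items could interact, the loop below reveals the loss only every `reveal_every` epochs and stops early when the revealed loss stops improving. `run_secure_epoch`, `reveal_loss`, and all parameter names are hypothetical placeholders, not FATE's API.

```python
def train(run_secure_epoch, reveal_loss, max_epochs=100, reveal_every=5, patience=2, tol=1e-4):
    """Illustrative training loop with hypothetical callbacks, not FATE's API."""
    best_loss, bad_rounds = float("inf"), 0
    for epoch in range(max_epochs):
        run_secure_epoch()                    # hypothetical: one epoch under secret sharing
        if (epoch + 1) % reveal_every != 0:
            continue                          # the loss stays secret on most epochs
        loss = reveal_loss()                  # hypothetical: jointly reveal the current loss
        if best_loss - loss > tol:            # loss improved enough: reset the counter
            best_loss, bad_rounds = loss, 0
        else:
            bad_rounds += 1
            if bad_rounds >= patience:        # early stopping on a stalled loss
                break
    return best_loss
```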