Hetero Feature Binning

Feature binning, or data binning, is a data pre-processing technique. It can be used to reduce the effects of minor observation errors, to calculate information values, and so on.

Currently, we provide quantile binning and bucket binning methods. To implement quantile binning, we use a special data structure described in this paper. Feel free to check out the detailed algorithm in the paper.
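To make the difference between the two methods concrete, here is a minimal sketch in plain Python. It is not the module's implementation: it computes exact quantiles with the standard library instead of the quantile summary structure from the paper, and the helper names (`bucket_split_points`, `quantile_split_points`, `count_per_bin`) are hypothetical. It also assumes right-closed bins, that is, a value equal to a split point falls into the lower bin.

```python
from bisect import bisect_left
from statistics import quantiles

def bucket_split_points(values, bin_num):
    """Equal-width binning: interior edges evenly spaced between min and max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_num
    return [lo + width * i for i in range(1, bin_num)]

def quantile_split_points(values, bin_num):
    """Equal-frequency binning: interior edges at exact quantiles
    (an illustrative stand-in for an approximate quantile summary)."""
    return quantiles(values, n=bin_num, method="inclusive")

def count_per_bin(values, split_points):
    """Count samples per bin, assuming right-closed bins (assumption)."""
    counts = [0] * (len(split_points) + 1)
    for v in values:
        counts[bisect_left(split_points, v)] += 1
    return counts

# A skewed toy column: one outlier stretches the value range.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0]

bucket_edges = bucket_split_points(values, 4)
quantile_edges = quantile_split_points(values, 4)
bucket_counts = count_per_bin(values, bucket_edges)
quantile_counts = count_per_bin(values, quantile_edges)
```

On this skewed column, bucket binning leaves almost every sample in the first bin, while quantile binning spreads the samples roughly evenly. That is the usual reason to prefer quantile binning when a feature has outliers.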

The following figure illustrates how the federated iv and woe values are calculated.

As the figure shows, party B, which holds the data labels, encrypts its labels with additive homomorphic encryption and sends them to A. A computes the sum of the encrypted labels in each bin and sends the results back. B can then calculate woe and iv based on the returned information.
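The exchange can be sketched end to end in a few lines of Python. This is an illustration under stated assumptions, not the module's code: it uses a toy Paillier scheme as the additive homomorphic encryption (the key is far too small for real use), all function and variable names are hypothetical, and it assumes the common convention woe = ln(event rate / non-event rate) and iv = sum of (event rate - non-event rate) * woe over bins. It also assumes every bin contains at least one event and one non-event, so no smoothing is applied.

```python
import math
import secrets

# Toy Paillier key from two Mersenne primes (illustrative only, not secure).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # valid because the generator is g = n + 1

def encrypt(m):
    """Enc(m) = (1 + m*n) * r^n mod n^2 for a random r coprime to n."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2

def decrypt(c):
    """Dec(c) = L(c^lambda mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N2

def woe_iv(events, non_events):
    """Per-bin woe and total iv from event / non-event counts."""
    total_e, total_ne = sum(events), sum(non_events)
    woe, iv = [], 0.0
    for e, ne in zip(events, non_events):
        e_rate, ne_rate = e / total_e, ne / total_ne
        w = math.log(e_rate / ne_rate)
        woe.append(w)
        iv += (e_rate - ne_rate) * w
    return woe, iv

# Party B holds the labels: it encrypts them and sends the ciphertexts to A.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
encrypted_labels = [encrypt(y) for y in labels]

# Party A knows each sample's bin for its own feature, but not the labels.
bin_of_sample = [0, 0, 1, 1, 1, 2, 2, 2]
bin_num = 3
encrypted_sums = [encrypt(0) for _ in range(bin_num)]
counts = [0] * bin_num
for c, b in zip(encrypted_labels, bin_of_sample):
    encrypted_sums[b] = add_encrypted(encrypted_sums[b], c)
    counts[b] += 1

# A sends the encrypted sums and bin counts back; B decrypts and finishes.
event_counts = [decrypt(c) for c in encrypted_sums]
non_event_counts = [n - e for n, e in zip(counts, event_counts)]
woe, iv = woe_iv(event_counts, non_event_counts)
```

Party A only ever handles ciphertexts, so it learns nothing about individual labels, and party B receives only per-bin aggregates of A's feature.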

The multiple-host case is similar to the single-host case. The guest (the label-holding party, B above) sends its encrypted label information to all hosts, and each host calculates and sends back its own statistics.

Features

Support quantile binning based on the quantile summary algorithm.
Support bucket binning.
Support calculating woe and iv values.
Support transforming data into bin indexes or woe values (guest only).
Support multiple-host binning.
Support asymmetric binning methods on the host and guest sides.
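As an illustration of the transform step listed above, the sketch below maps raw feature values to bin indexes and then to woe values. The split points and per-bin woe values are made-up numbers, the function name `to_bin_index` is hypothetical, and right-closed bins are assumed. The module's actual parameters and output format may differ.

```python
from bisect import bisect_left

def to_bin_index(value, split_points):
    """Index of the bin containing value, assuming right-closed bins."""
    return bisect_left(split_points, value)

# Hypothetical binning result for one feature: 3 split points, 4 bins.
split_points = [3.25, 5.5, 7.75]
woe_per_bin = [-0.8, -0.1, 0.3, 0.9]

column = [1.0, 4.0, 7.75, 100.0]
bin_indexes = [to_bin_index(v, split_points) for v in column]
woe_values = [woe_per_bin[i] for i in bin_indexes]
```

The woe transform is restricted to the guest because woe values are derived from the labels, which only the guest holds.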

The table below lists the supported features with links to examples:

Cases	Scenario
Input Data with Categorical Features	bucket binning, quantile binning
Output Data Transformed	bin index, woe value (guest-only)
Skip Metrics Calculation	multi_host
