Background
The Aequitas library is used for auditing bias and fairness in machine learning models. One key metric it computes is the total number of predicted positives (k), which feeds into the downstream fairness metric calculations.
Issue
Currently, the computation of k on line 130 of group.py assumes there are no missing values in the predictions, which leads to inaccurate counts when the data contains missing values. Additionally, on line 164, k is calculated across all groups together, which can mask disparities in predicted positives across demographic groups.
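The failure mode can be sketched with a toy frame. (The exact expression on line 130 of group.py is not reproduced here; the df, score, and sample values below are illustrative.)

```python
import numpy as np
import pandas as pd

# Hypothetical predictions table: 'score' holds binary predictions
# with one missing value (NaN), as might reach the audit unvalidated.
df = pd.DataFrame({
    "group": ["A", "A", "B", "B"],
    "score": [1, 1, 0, np.nan],
})

# A naive global count of predicted positives silently skips the NaN:
# the comparison (NaN == 1) is False, so k is computed over only 3 of
# the 4 rows with no warning to the caller.
k_naive = (df["score"] == 1).sum()
print(k_naive)  # 2

# The silent gap: rows actually counted vs. rows present.
print(df["score"].notna().sum(), "of", len(df), "predictions usable")
```

Nothing signals that one prediction was unusable, so downstream rates computed against the full row count are off.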
Suggested Improvement
It would be beneficial to handle missing values explicitly, either by excluding them with a warning or by offering an option to impute them based on user preference. Furthermore, calculating k separately for each group and then summing these values can provide a clearer view of model behavior across different groups. This approach would enhance the transparency and utility of the fairness assessment.
Below is a proposed change in the calculation method:
```python
# Proposed method: calculate k group-wise and handle missing values.
# Drop rows with missing scores explicitly (ideally with a warning),
# then count predicted positives within each group before summing.
grouped = df.dropna(subset=[score]).groupby("group")
k_per_group = grouped.apply(lambda x: (x[score] == 1).sum())
total_k = k_per_group.sum()
```
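One way to package this is a small helper that drops incomplete rows and warns about how many were excluded. This is a sketch, not part of the Aequitas API; the function name, the `score`/`group` column names, and the sample data are all assumptions for illustration.

```python
import warnings

import numpy as np
import pandas as pd

def total_predicted_positives(df, score="score", group="group"):
    """Count predicted positives (k) per group, then sum.

    Hypothetical helper, not part of Aequitas: rows with a missing
    score are excluded, and a warning reports how many were dropped.
    """
    n_missing = int(df[score].isna().sum())
    if n_missing:
        warnings.warn(f"dropping {n_missing} rows with missing scores")
    clean = df.dropna(subset=[score])
    k_per_group = clean.groupby(group)[score].apply(lambda s: (s == 1).sum())
    return k_per_group.sum(), k_per_group

df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B"],
    "score": [1, 0, 1, 1, np.nan],
})
total_k, k_per_group = total_predicted_positives(df)
print(total_k)  # 3 (A contributes 1, B contributes 2; the NaN row is dropped)
```

Returning the per-group counts alongside the total keeps the group-level view available for the disparity checks the total alone would hide.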