lale.lib.aif360.util module¶
- class lale.lib.aif360.util.FairStratifiedKFold(*, favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None, n_splits: int = 5, n_repeats: int = 1, shuffle: bool = False, random_state=None)[source]¶
Bases: object
Stratified k-folds cross-validator by labels and protected attributes.
Behaves similarly to the StratifiedKFold and RepeatedStratifiedKFold cross-validation iterators from scikit-learn. This cross-validation object can be passed to the cv argument of the auto_configure method; see the usage sketch after this class entry.
- Parameters
favorable_labels (array) – Label values which are considered favorable (i.e. “positive”).
protected_attributes (array) – Features for which fairness is desired.
unfavorable_labels (array or None, default None) – Label values which are considered unfavorable (i.e. “negative”).
n_splits (integer, optional, default 5) – Number of folds. Must be at least 2.
n_repeats (integer, optional, default 1) – Number of times the cross-validator needs to be repeated. When >1, this behaves like RepeatedStratifiedKFold.
shuffle (boolean, optional, default False) – Whether to shuffle each class’s samples before splitting into batches. Ignored when n_repeats>1.
random_state (union type, not for optimizer, default None) –
When shuffle is True, random_state affects the ordering of the indices.
None
RandomState used by np.random
numpy.random.RandomState
Use the provided random state, only affecting other users of that same random state instance.
integer
Explicit seed.
- get_n_splits(X=None, y=None, groups=None) int [source]¶
The number of splitting iterations in the cross-validator.
- Parameters
X (Any) – Always ignored, exists for compatibility.
y (Any) – Always ignored, exists for compatibility.
groups (Any) – Always ignored, exists for compatibility.
- Returns
The number of splits.
- Return type
integer
- split(X, y, groups=None)[source]¶
Generate indices to split data into training and test set.
- Parameters
X (array of arrays) – Training data, including columns with the protected attributes.
y (array of floats or array of strings) – Target class labels; the array is over samples.
groups (Any) – Always ignored, exists for compatibility.
- Returns
result –
train: The training set indices for that split.
test: The testing set indices for that split.
- Return type
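The following is a minimal usage sketch with a small synthetic pandas dataset; the column names, data values, and the choice of LogisticRegression are illustrative assumptions rather than part of the API. It stratifies folds jointly by label and by the protected attribute and passes the cross-validator to scikit-learn's cross_val_score; the same object can be passed as cv to auto_configure.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from lale.lib.aif360.util import FairStratifiedKFold

# Toy data: "sex" is the protected attribute, labels are 0/1.
X = pd.DataFrame({
    "sex":    [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "credit": [3, 5, 2, 7, 6, 4, 8, 1, 9, 5, 2, 7],
})
y = pd.Series([0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1])

# Folds are stratified jointly by label and by "sex".
fair_cv = FairStratifiedKFold(
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
    n_splits=3,
)
print(fair_cv.get_n_splits())  # 3
print(cross_val_score(LogisticRegression(), X, y, cv=fair_cv))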
- lale.lib.aif360.util.accuracy_and_disparate_impact(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None, fairness_weight: float = 0.5) _AccuracyAndDisparateImpact [source]¶
Create a scikit-learn compatible blended scorer for accuracy and symmetric disparate impact given the fairness info. The scorer is suitable for classification problems, with higher resulting scores indicating better outcomes. The result is a linear combination of accuracy and symmetric disparate impact, and is between 0 and 1. This metric can be used as the scoring argument of an optimizer such as Hyperopt, as shown in this demo.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
fairness_weight (number, >=0, <=1, default=0.5) – At the default weight of 0.5, the two metrics contribute equally to the blended result. Above 0.5, fairness influences the combination more, and below 0.5, fairness influences the combination less. In the extreme, at 1, the outcome is only determined by fairness, and at 0, the outcome ignores fairness.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
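A hedged sketch of creating and using the blended scorer; the fairness info, the toy data, and the commented-out auto_configure call are illustrative assumptions, not prescribed usage.

import pandas as pd
from lale.lib.aif360.util import accuracy_and_disparate_impact

fairness_info = {
    "favorable_labels": [1],
    "protected_attributes": [{"feature": "sex", "reference_group": [1]}],
}
combined_scorer = accuracy_and_disparate_impact(**fairness_info, fairness_weight=0.5)

# Score a set of predictions directly via the documented score_data method.
X = pd.DataFrame({"sex": [0, 0, 1, 1], "f0": [1.0, 2.0, 3.0, 4.0]})
y_true = pd.Series([0, 1, 0, 1])
y_pred = pd.Series([0, 1, 1, 1])
print(combined_scorer.score_data(y_true=y_true, y_pred=y_pred, X=X))

# As the scoring argument of auto_configure with Hyperopt (the pipeline and
# the training data are assumed to exist in the caller's scope):
#   trained = planned_pipeline.auto_configure(
#       train_X, train_y, optimizer=Hyperopt, cv=3,
#       scoring=combined_scorer, max_evals=10)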
- lale.lib.aif360.util.average_odds_difference(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) _AverageOddsDifference [source]¶
Create a scikit-learn compatible average odds difference scorer given the fairness info: the average of the difference in false positive rates and the difference in true positive rates between the unprivileged and privileged groups.
The ideal value of this metric is 0. A value < 0 implies higher benefit for the privileged group and a value > 0 implies higher benefit for the unprivileged group. Fairness for this metric is between -0.1 and 0.1.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
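A minimal sketch, with made-up data, of scoring a fitted classifier through the scorer's score_estimator method; the column names and the DecisionTreeClassifier are assumptions for illustration.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from lale.lib.aif360.util import average_odds_difference

X = pd.DataFrame({"sex": [0, 0, 0, 1, 1, 1], "f0": [1, 2, 3, 4, 5, 6]})
y = pd.Series([0, 1, 1, 0, 0, 1])
estimator = DecisionTreeClassifier(random_state=0).fit(X, y)

scorer = average_odds_difference(
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
)
print(scorer.score_estimator(estimator, X, y))  # ideal value is 0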
- lale.lib.aif360.util.balanced_accuracy_and_disparate_impact(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None, fairness_weight: float = 0.5) _BalancedAccuracyAndDisparateImpact [source]¶
Create a scikit-learn compatible blended scorer for balanced accuracy and symmetric disparate impact given the fairness info. The scorer is suitable for classification problems, with higher resulting scores indicating better outcomes. The result is a linear combination of balanced accuracy and symmetric disparate impact, and is between 0 and 1. This metric can be used as the scoring argument of an optimizer such as Hyperopt, as shown in this demo.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
fairness_weight (number, >=0, <=1, default=0.5) – At the default weight of 0.5, the two metrics contribute equally to the blended result. Above 0.5, fairness influences the combination more, and below 0.5, fairness influences the combination less. In the extreme, at 1, the outcome is only determined by fairness, and at 0, the outcome ignores fairness.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
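A brief sketch of how fairness_weight shifts the blend; the fairness info below is an assumed example.

from lale.lib.aif360.util import balanced_accuracy_and_disparate_impact

fairness_info = {
    "favorable_labels": [1],
    "protected_attributes": [{"feature": "sex", "reference_group": [1]}],
}
metric_only = balanced_accuracy_and_disparate_impact(**fairness_info, fairness_weight=0.0)   # ignores fairness
fairness_only = balanced_accuracy_and_disparate_impact(**fairness_info, fairness_weight=1.0)  # fairness only
blended = balanced_accuracy_and_disparate_impact(**fairness_info)                             # default weight 0.5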
- lale.lib.aif360.util.count_fairness_groups(X: Union[DataFrame, ndarray], y: Union[Series, ndarray], favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) DataFrame [source]¶
Count size of each intersection of groups induced by the fairness info.
- Parameters
X (array) – Features including protected attributes as numpy ndarray or pandas dataframe.
y (array) – Labels as numpy ndarray or pandas series.
favorable_labels (array) – Label values which are considered favorable (i.e. “positive”).
protected_attributes (array) – Features for which fairness is desired.
unfavorable_labels (array or None, default None) – Label values which are considered unfavorable (i.e. “negative”).
- Returns
result – DataFrame with a multi-level index on the rows, where the first level indicates the binarized outcome, and the remaining levels indicate the binarized group membership according to the protected attributes. Column “count” specifies the number of instances for each group. Column “ratio” gives the ratio of the given outcome relative to the total number of instances with any outcome but the same encoded protected attributes.
- Return type
pd.DataFrame
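A minimal sketch with synthetic data; the column names and values are made up for illustration.

import pandas as pd
from lale.lib.aif360.util import count_fairness_groups

X = pd.DataFrame({"sex": [0, 0, 0, 1, 1, 1, 1], "age": [22, 35, 41, 28, 52, 33, 47]})
y = pd.Series([0, 1, 0, 1, 1, 1, 0])
groups = count_fairness_groups(
    X, y,
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
)
print(groups)  # rows indexed by binarized outcome and group, with "count" and "ratio" columns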
- lale.lib.aif360.util.dataset_to_pandas(dataset, return_only: Literal['X', 'y', 'Xy'] = 'Xy') Tuple[Optional[Series], Optional[Series]] [source]¶
Return pandas representation of the AIF360 dataset.
- Parameters
dataset (aif360.datasets.BinaryLabelDataset) – AIF360 dataset to convert to a pandas representation.
return_only ('Xy', 'X', or 'y') – Which part of features X or labels y to convert and return.
- Returns
result –
item 0: pandas DataFrame or None, features X
item 1: pandas Series or None, labels y
- Return type
tuple
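A hedged sketch of the round trip; the tiny BinaryLabelDataset built here is an assumption for illustration only.

import pandas as pd
from aif360.datasets import BinaryLabelDataset
from lale.lib.aif360.util import dataset_to_pandas

df = pd.DataFrame({"sex": [0, 1, 1, 0], "f0": [0.5, 1.5, 2.5, 3.5], "y": [0, 1, 1, 0]})
dataset = BinaryLabelDataset(
    df=df,
    label_names=["y"],
    protected_attribute_names=["sex"],
    favorable_label=1,
    unfavorable_label=0,
)
X, y = dataset_to_pandas(dataset)                        # both parts (default "Xy")
_, y_only = dataset_to_pandas(dataset, return_only="y")  # labels only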
- lale.lib.aif360.util.disparate_impact(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) _DisparateImpact [source]¶
Create a scikit-learn compatible disparate_impact scorer given the fairness info (Feldman et al. 2015). Ratio of rate of favorable outcome for the unprivileged group to that of the privileged group.
In the case of multiple protected attributes, D=privileged means all protected attributes of the sample have corresponding privileged values in the reference group, and D=unprivileged means all protected attributes of the sample have corresponding unprivileged values in the monitored group. The ideal value of this metric is 1. A value <1 implies a higher benefit for the privileged group and a value >1 implies a higher benefit for the unprivileged group. Fairness for this metric is between 0.8 and 1.25.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
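A minimal sketch with made-up predictions, checking the result against the 0.8-1.25 fairness band mentioned above; data and column names are illustrative assumptions.

import pandas as pd
from lale.lib.aif360.util import disparate_impact

X = pd.DataFrame({"sex": [0, 0, 0, 0, 1, 1, 1, 1]})
y_true = pd.Series([1, 0, 0, 0, 1, 1, 1, 0])
y_pred = pd.Series([1, 0, 0, 0, 1, 1, 1, 0])  # favorable rates: 1/4 (unprivileged) vs 3/4 (privileged)

scorer = disparate_impact(
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
)
di = scorer.score_data(y_true=y_true, y_pred=y_pred, X=X)
print(di, 0.8 <= di <= 1.25)  # about 0.33, outside the 0.8-1.25 fairness band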
- lale.lib.aif360.util.equal_opportunity_difference(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) _EqualOpportunityDifference [source]¶
Create a scikit-learn compatible equal opportunity difference scorer given the fairness info. Difference of true positive rates between the unprivileged and the privileged groups. The true positive rate is the ratio of true positives to the total number of actual positives for a given group.
The ideal value is 0. A value < 0 implies disparate benefit for the privileged group and a value > 0 implies disparate benefit for the unprivileged group. Fairness for this metric is between -0.1 and 0.1.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
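A minimal sketch showing the callable form (estimator, X, y); the DummyClassifier and the toy data are assumptions for illustration.

import pandas as pd
from sklearn.dummy import DummyClassifier
from lale.lib.aif360.util import equal_opportunity_difference

X = pd.DataFrame({"sex": [0, 0, 0, 1, 1, 1]})
y = pd.Series([1, 0, 1, 1, 0, 1])
clf = DummyClassifier(strategy="constant", constant=1).fit(X, y)

scorer = equal_opportunity_difference(
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
)
print(scorer(clf, X, y))  # 0.0 here: both groups get the same true positive rate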
- lale.lib.aif360.util.f1_and_disparate_impact(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None, fairness_weight: float = 0.5) _F1AndDisparateImpact [source]¶
Create a scikit-learn compatible blended scorer for f1 and symmetric disparate impact given the fairness info. The scorer is suitable for classification problems, with higher resulting scores indicating better outcomes. The result is a linear combination of F1 and symmetric disparate impact, and is between 0 and 1. This metric can be used as the scoring argument of an optimizer such as Hyperopt, as shown in this demo.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
fairness_weight (number, >=0, <=1, default=0.5) – At the default weight of 0.5, the two metrics contribute equally to the blended result. Above 0.5, fairness influences the combination more, and below 0.5, fairness influences the combination less. In the extreme, at 1, the outcome is only determined by fairness, and at 0, the outcome ignores fairness.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
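A hedged sketch: since the returned scorer is callable as (estimator, X, y), it can plausibly be plugged into scikit-learn model selection utilities as well; this pairing with GridSearchCV is an assumption that follows from the calling convention, not something documented here, and train_X/train_y are assumed to exist in the caller's scope.

from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from lale.lib.aif360.util import f1_and_disparate_impact

scorer = f1_and_disparate_impact(
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
)
search = GridSearchCV(DecisionTreeClassifier(), {"max_depth": [1, 2, 3]}, scoring=scorer)
# search.fit(train_X, train_y)  # train_X must be a DataFrame that includes the "sex" column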
- lale.lib.aif360.util.fair_stratified_train_test_split(X, y, *arrays, favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None, test_size: float = 0.25, random_state: Optional[Union[RandomState, int]] = None) Tuple [source]¶
Splits X and y into random train and test subsets stratified by labels and protected attributes.
Behaves similarly to the train_test_split function from scikit-learn.
- Parameters
X (array) – Features including protected attributes as numpy ndarray or pandas dataframe.
y (array) – Labels as numpy ndarray or pandas series.
*arrays (array) – Sequence of additional arrays with same length as X and y.
favorable_labels (array) – Label values which are considered favorable (i.e. “positive”).
protected_attributes (array) – Features for which fairness is desired.
unfavorable_labels (array or None, default None) – Label values which are considered unfavorable (i.e. “negative”).
test_size (float or int, default=0.25) – If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples.
random_state (int, RandomState instance or None, default=None) –
Controls the shuffling applied to the data before applying the split. Pass an integer for reproducible output across multiple function calls.
None
RandomState used by numpy.random
numpy.random.RandomState
Use the provided random state, only affecting other users of that same random state instance.
integer
Explicit seed.
- Returns
result –
item 0: train_X
item 1: test_X
item 2: train_y
item 3: test_y
item 4+: Each argument in *arrays, if any, yields two items in the result, for the two splits of that array.
- Return type
tuple
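A minimal sketch with synthetic data; column names, values, and the 16-row size (chosen so each label-by-group stratum is large enough to split) are illustrative assumptions.

import pandas as pd
from lale.lib.aif360.util import fair_stratified_train_test_split

X = pd.DataFrame({
    "sex": [0] * 8 + [1] * 8,
    "f0":  list(range(16)),
})
y = pd.Series([0, 1] * 8)

train_X, test_X, train_y, test_y = fair_stratified_train_test_split(
    X, y,
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
    test_size=0.25,
    random_state=42,
)
print(len(train_X), len(test_X))  # expected 12 and 4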
- lale.lib.aif360.util.r2_and_disparate_impact(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None, fairness_weight: float = 0.5) _R2AndDisparateImpact [source]¶
Create a scikit-learn compatible blended scorer for R2 score and symmetric disparate impact given the fairness info. The scorer is suitable for regression problems, with higher resulting scores indicating better outcomes. It first scales R2, which might be negative, to be between 0 and 1. Then, the result is a linear combination of the scaled R2 and symmetric disparate impact, and is also between 0 and 1. This metric can be used as the scoring argument of an optimizer such as Hyperopt.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
fairness_weight (number, >=0, <=1, default=0.5) – At the default weight of 0.5, the two metrics contribute equally to the blended result. Above 0.5, fairness influences the combination more, and below 0.5, fairness influences the combination less. In the extreme, at 1, the outcome is only determined by fairness, and at 0, the outcome ignores fairness.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
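A hedged sketch for a regression estimator; the synthetic data, the LinearRegression model, and the use of a numeric favorable range to binarize the continuous outcome are assumptions made for illustration.

import pandas as pd
from sklearn.linear_model import LinearRegression
from lale.lib.aif360.util import r2_and_disparate_impact

X = pd.DataFrame({
    "sex": [0, 0, 0, 0, 1, 1, 1, 1],
    "f0":  [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})
y = pd.Series([1.1, 1.9, 3.2, 3.8, 5.1, 5.9, 7.2, 7.8])
reg = LinearRegression().fit(X, y)

scorer = r2_and_disparate_impact(
    favorable_labels=[[3.0, 8.0]],  # assumed: outcomes in the range [3, 8] count as favorable
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
    fairness_weight=0.5,
)
print(scorer.score_estimator(reg, X, y))  # between 0 and 1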
- lale.lib.aif360.util.statistical_parity_difference(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) _StatisticalParityDifference [source]¶
Create a scikit-learn compatible statistical parity difference scorer given the fairness info: the difference between the rate of favorable outcomes received by the unprivileged group and that received by the privileged group.
The ideal value of this metric is 0. A value < 0 implies higher benefit for the privileged group and a value > 0 implies higher benefit for the unprivileged group. Fairness for this metric is between -0.1 and 0.1. For a discussion of potential issues with this metric see (Dwork et al. 2012).
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
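A minimal sketch with made-up data that also spells out an explicit monitored_group instead of relying on the default complement behavior.

import pandas as pd
from lale.lib.aif360.util import statistical_parity_difference

X = pd.DataFrame({"race": ["white", "white", "white", "black", "black", "black"]})
y = pd.Series([1, 1, 0, 1, 0, 0])

scorer = statistical_parity_difference(
    favorable_labels=[1],
    protected_attributes=[{
        "feature": "race",
        "reference_group": ["white"],
        "monitored_group": ["black"],  # stated explicitly rather than inferred
    }],
)
print(scorer.score_data(y_true=y, y_pred=y, X=X))  # about -0.33: the privileged group is favored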
- lale.lib.aif360.util.symmetric_disparate_impact(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) _SymmetricDisparateImpact [source]¶
Create a scikit-learn compatible scorer for symmetric disparate impact given the fairness info. For disparate impact <= 1.0, return that value, otherwise return its inverse. The result is between 0 and 1. The higher this metric, the better, and the ideal value is 1. A value <1 implies that either the privileged group or the unprivileged group is receiving a disparate benefit.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
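A minimal sketch with made-up predictions comparing the plain and symmetric variants; the data and fairness info are illustrative assumptions, and the same labels are passed as y_true for simplicity.

import pandas as pd
from lale.lib.aif360.util import disparate_impact, symmetric_disparate_impact

fairness_info = {
    "favorable_labels": [1],
    "protected_attributes": [{"feature": "sex", "reference_group": [1]}],
}
X = pd.DataFrame({"sex": [0, 0, 0, 0, 1, 1, 1, 1]})
y = pd.Series([1, 1, 1, 0, 1, 0, 0, 0])  # here the unprivileged group is favored

di = disparate_impact(**fairness_info).score_data(y_true=y, y_pred=y, X=X)
sym = symmetric_disparate_impact(**fairness_info).score_data(y_true=y, y_pred=y, X=X)
print(di, sym)  # about 3.0 and 0.33: values above 1 are flipped to their inverse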
- lale.lib.aif360.util.theil_index(favorable_labels: List[Union[float, str, bool, List[float]]], protected_attributes: List[Dict[str, Any]], unfavorable_labels: Optional[List[Union[float, str, bool, List[float]]]] = None) _AIF360ScorerFactory [source]¶
Create a scikit-learn compatible Theil index scorer given the fairness info (Speicher et al. 2018). Generalized entropy of benefit for all individuals in the dataset, with alpha=1. Measures the inequality in benefit allocation for individuals. With per-individual benefit $b_i = \hat{y}_i - y_i + 1$ and mean benefit $\mu$, the index is $\frac{1}{n}\sum_{i=1}^{n}\frac{b_i}{\mu}\ln\frac{b_i}{\mu}$.
A value of 0 implies perfect fairness. Fairness is indicated by lower scores; higher scores are problematic.
- Parameters
favorable_labels (array of union) –
Label values which are considered favorable (i.e. “positive”).
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
protected_attributes (array of dict) –
Features for which fairness is desired.
feature : string or integer
Column name or column index.
reference_group : array of union
Values or ranges that indicate being a member of the privileged group.
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array of union
string
Literal value
or number
Numerical value
or array of numbers, >= 2 items, <= 2 items
Numeric range [a,b] from a to b inclusive.
- Returns
result – Scorer that takes three arguments (estimator, X, y) and returns a scalar number. Furthermore, besides being callable, the returned object also has two methods, score_data(y_true, y_pred, X) for evaluating datasets and score_estimator(estimator, X, y) for evaluating estimators.
- Return type
callable
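A minimal sketch with made-up labels: perfect predictions give a Theil index of 0, and less equal benefit across individuals gives a larger value.

import pandas as pd
from lale.lib.aif360.util import theil_index

X = pd.DataFrame({"sex": [0, 0, 1, 1]})
y_true = pd.Series([0, 1, 0, 1])

scorer = theil_index(
    favorable_labels=[1],
    protected_attributes=[{"feature": "sex", "reference_group": [1]}],
)
print(scorer.score_data(y_true=y_true, y_pred=y_true, X=X))                    # 0.0, perfectly fair
print(scorer.score_data(y_true=y_true, y_pred=pd.Series([1, 1, 0, 0]), X=X))   # > 0, unequal benefit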