lale.lib.aif360.orbis module
- class lale.lib.aif360.orbis.Orbis(*, favorable_labels, protected_attributes, unfavorable_labels=None, estimator, redact=True, imbalance_repair_level=0.8, bias_repair_level=0.8, combine='keep_separate', sampling_strategy='mixed', replacement=False, n_jobs=1, random_state=None, k_neighbors=5)
Bases: PlannedIndividualOp
Experimental Orbis (Oversampling to Repair Bias and Imbalance Simultaneously) pre-estimator fairness mitigator.
This documentation is auto-generated from JSON schemas.
Work in progress and subject to change; only pandas DataFrame input is supported so far. Uses SMOTE and RandomUnderSampler to resample the data, repairing not only class imbalance but also group bias. Internally, it replaces the class labels with the cross product of classes and groups, then changes the sizes of the resulting intersections to achieve the desired repair levels. Unlike other mitigators in lale.lib.aif360, this mitigator does not come from AIF360.
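The relabeling step described above can be illustrated with a minimal standard-library sketch. This is not the Orbis implementation; the helper name intersection_labels and the toy data are hypothetical, and the sketch only shows how pairing each class label with a group yields the intersections whose sizes are later adjusted.

```python
from collections import Counter


def intersection_labels(y, groups):
    """Pair each class label with its group: one combined label per sample.

    Hypothetical helper for illustration only, not part of lale.
    """
    return [(label, group) for label, group in zip(y, groups)]


# Hypothetical toy data: 1 is the favorable label, 0 the unfavorable one;
# "priv" and "unpriv" stand for the privileged and unprivileged groups.
y = [1, 1, 1, 1, 0, 0, 1, 0]
groups = ["priv", "priv", "priv", "priv", "priv", "unpriv", "unpriv", "unpriv"]

combined = intersection_labels(y, groups)

# Size of each (class, group) intersection; a resampler would then grow or
# shrink these counts to reach the desired repair levels.
sizes = Counter(combined)
for intersection, count in sorted(sizes.items()):
    print(intersection, count)
```

In this toy data the four intersections have unequal sizes, which reflects both class imbalance and group bias at once; treating the intersections as the resampling targets is what lets a single resampling pass address both.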
- Parameters
favorable_labels (array, >=1 items, not for optimizer) –
Label values which are considered favorable (i.e. “positive”).
items : union type
float
Numerical value.
or string
Literal string value.
or boolean
Boolean value.
or array, >=2 items, <=2 items of items : float
Numeric range [a,b] from a to b inclusive.
protected_attributes (array, >=1 items, not for optimizer) –
Features for which fairness is desired.
items : dict
feature : union type
Column name or column index.
string
or integer
reference_group : array, >=1 items
Values or ranges that indicate being a member of the privileged group.
items : union type
string
Literal value.
or float
Numerical value.
or array, >=2 items, <=2 items of items : float
Numeric range [a,b] from a to b inclusive.
monitored_group : union type, default None
Values or ranges that indicate being a member of the unprivileged group.
None
If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.
or array, >=1 items
items : union type
string
Literal value.
or float
Numerical value.
or array, >=2 items, <=2 items of items : float
Numeric range [a,b] from a to b inclusive.
unfavorable_labels (union type, not for optimizer, default None) –
Label values which are considered unfavorable (i.e. “negative”).
None
If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.
or array, >=1 items
items : union type
float
Numerical value.
or string
Literal string value.
or boolean
Boolean value.
or array, >=2 items, <=2 items of items : float
Numeric range [a,b] from a to b inclusive.
estimator (operator, not for optimizer) – Nested classifier.
redact (boolean, optional, not for optimizer, default True) – Whether to redact protected attributes before data preparation (recommended) or not.
imbalance_repair_level (float, >=0.0, <=1.0, optional, default 0.8) –
How much to repair for class imbalance (0 means original imbalance, 1 means perfect balance).
See also constraint-1.
bias_repair_level (float, >=0.0, <=1.0, optional, default 0.8) –
How much to repair for group bias (0 means original bias, 1 means perfect fairness).
See also constraint-1.
combine (‘keep_separate’, ‘and’, ‘or’, or ‘error’, optional, not for optimizer, default ‘keep_separate’) – How to handle the case when there is more than one protected attribute.
sampling_strategy (‘under’, ‘over’, ‘mixed’, ‘minimum’, or ‘maximum’, optional, not for optimizer, default ‘mixed’) –
How to change the intersection sizes.
Possible choices are:
'under': under-sample large intersections to desired repair levels;
'over': over-sample small intersections to desired repair levels;
'mixed': mix under- with over-sampling while keeping sizes similar to original;
'minimum': under-sample everything to the size of the smallest intersection;
'maximum': over-sample everything to the size of the largest intersection.
See also constraint-1.
replacement (boolean, optional, not for optimizer, default False) – Whether under-sampling is with or without replacement.
n_jobs (integer, optional, not for optimizer, default 1) – The number of threads to open if possible.
random_state (union type, optional, not for optimizer, default None) –
Control the randomization of the algorithm.
None
RandomState used by np.random
or integer
The seed used by the random number generator
or numpy.random.RandomState
Random number generator instance.
k_neighbors (union type, optional, not for optimizer, default 5) –
Number of nearest neighbours to use to construct synthetic samples.
integer
Number of nearest neighbours to use to construct synthetic samples.
or Any
An estimator that inherits from sklearn.neighbors.base.KNeighborsMixin that will be used to find the n_neighbors.
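As a sketch of how the favorable_labels and protected_attributes schemas above translate into plain Python values, the following example writes out one possible specification and shows how literal values and inclusive [a,b] ranges could be matched. The column names, label values, and the helpers matches and is_monitored are hypothetical illustrations, not functions provided by lale.

```python
def matches(value, items):
    """True if value equals a literal item or lies in an inclusive [a, b] range."""
    for item in items:
        if isinstance(item, (list, tuple)):
            low, high = item
            # Ranges only apply to numeric values.
            if isinstance(value, (int, float)) and low <= value <= high:
                return True
        elif value == item:
            return True
    return False


def is_monitored(value, attribute):
    """Apply the monitored_group default: anything outside reference_group."""
    monitored = attribute.get("monitored_group")
    if monitored is None:
        return not matches(value, attribute["reference_group"])
    return matches(value, monitored)


# Hypothetical specification for a dataset with "age" and "sex" columns.
favorable_labels = ["good"]
protected_attributes = [
    # Numeric range: ages 26 through 1000 inclusive form the privileged group.
    {"feature": "age", "reference_group": [[26, 1000]]},
    # Literal values, with the monitored group given explicitly.
    {"feature": "sex", "reference_group": ["male"], "monitored_group": ["female"]},
]

print(is_monitored(25, protected_attributes[0]))
print(is_monitored("female", protected_attributes[1]))
```

Leaving monitored_group unset is the more compact form; specifying it explicitly matters only when some values should belong to neither group.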
Notes
constraint-1 : union type
When sampling_strategy is minimum or maximum, both repair levels must be 1.
sampling_strategy : negated type of ‘minimum’ or ‘maximum’
or dict
imbalance_repair_level : 1
bias_repair_level : 1
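Constraint-1 can be read as a simple check on a hyperparameter combination. The sketch below restates it in plain Python, together with the uniform target size implied by the 'minimum' and 'maximum' strategies. Both function names are hypothetical and the code is an illustration of the documented rule, not the validation logic lale actually runs.

```python
def satisfies_constraint_1(sampling_strategy,
                           imbalance_repair_level=0.8,
                           bias_repair_level=0.8):
    """Check constraint-1 for one hyperparameter combination.

    Defaults mirror the documented defaults of Orbis.
    """
    if sampling_strategy not in ("minimum", "maximum"):
        return True  # the constraint only restricts 'minimum' and 'maximum'
    return imbalance_repair_level == 1 and bias_repair_level == 1


def uniform_target_size(sizes, sampling_strategy):
    """Target size shared by all intersections under 'minimum' or 'maximum'.

    sizes maps each (class, group) intersection to its sample count.
    """
    if sampling_strategy == "minimum":
        return min(sizes.values())
    if sampling_strategy == "maximum":
        return max(sizes.values())
    raise ValueError("only 'minimum' and 'maximum' imply a uniform target size")


# Hypothetical intersection sizes.
sizes = {(1, "priv"): 4, (0, "priv"): 1, (1, "unpriv"): 1, (0, "unpriv"): 2}

print(satisfies_constraint_1("mixed"))          # defaults are allowed here
print(satisfies_constraint_1("minimum"))        # defaults of 0.8 violate it
print(satisfies_constraint_1("minimum", 1, 1))  # both levels set to 1
print(uniform_target_size(sizes, "maximum"))
```

The constraint follows from the strategies themselves: resampling every intersection to the same size leaves no room for a partial repair, so both repair levels have to be 1.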
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : array
items : union type
float
or string
y (union type) –
Target class labels; the array is over samples.
array of items : float
or array of items : string
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : array
items : union type
float
or string
- Returns
result – Predicted class label per sample.
array of items : float
or array of items : string
- Return type
union type
- predict_proba(X)
Probability estimates for all classes.
Note: The predict_proba method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : array
items : union type
float
or string
- Returns
result – The class probabilities of the input samples.
array of items : Any
or array of items : array of items : Any
- Return type
union type