lale.lib.aif360.bagging_orbis_classifier module

class lale.lib.aif360.bagging_orbis_classifier.BaggingOrbisClassifier(*, favorable_labels, protected_attributes, unfavorable_labels=None, redact=True, preparation=None, estimator=None, n_estimators=10, imbalance_repair_level=0.8, bias_repair_level=0.8, combine='keep_separate', sampling_strategy='mixed', replacement=False, n_jobs=1, random_state=None)

Bases: PlannedIndividualOp

Experimental BaggingOrbisClassifier in-estimator fairness mitigator.

This documentation is auto-generated from JSON schemas.

Work in progress and subject to change; so far, only pandas DataFrame input is supported. This is a bagging ensemble classifier in which each inner classifier is trained on a subset of the data that has been balanced with Orbis. Unlike other mitigators in lale.lib.aif360, this mitigator does not come from AIF360.

Parameters
  • favorable_labels (array, >=1 items, not for optimizer) –

    Label values which are considered favorable (i.e. “positive”).

    • items : union type

      • float

        Numerical value.

      • or string

        Literal string value.

      • or boolean

        Boolean value.

      • or array, >=2 items, <=2 items of items : float

        Numeric range [a,b] from a to b inclusive.

  • protected_attributes (array, >=1 items, not for optimizer) –

    Features for which fairness is desired.

    • items : dict

      • feature : union type

        Column name or column index.

        • string

        • or integer

      • reference_group : array, >=1 items

        Values or ranges that indicate being a member of the privileged group.

        • items : union type

          • string

            Literal value.

          • or float

            Numerical value.

          • or array, >=2 items, <=2 items of items : float

            Numeric range [a,b] from a to b inclusive.

      • monitored_group : union type, default None

        Values or ranges that indicate being a member of the unprivileged group.

        • None

          If monitored_group is not explicitly specified, consider any values not captured by reference_group as monitored.

        • or array, >=1 items

          • items : union type

            • string

              Literal value.

            • or float

              Numerical value.

            • or array, >=2 items, <=2 items of items : float

              Numeric range [a,b] from a to b inclusive.

  • unfavorable_labels (union type, not for optimizer, default None) –

    Label values which are considered unfavorable (i.e. “negative”).

    • None

      If unfavorable_labels is not explicitly specified, consider any labels not captured by favorable_labels as unfavorable.

    • or array, >=1 items

      • items : union type

        • float

          Numerical value.

        • or string

          Literal string value.

        • or boolean

          Boolean value.

        • or array, >=2 items, <=2 items of items : float

          Numeric range [a,b] from a to b inclusive.

  • redact (boolean, optional, not for optimizer, default True) – Whether to redact protected attributes before data preparation (recommended) or not.

  • preparation (union type, optional, not for optimizer, default None) –

    Transformer, which may be an individual operator or a sub-pipeline.

    • operator

    • or None

      NoOp

  • estimator (union type, optional, not for optimizer, default None) –

    The nested classifier to fit on balanced subsets of the data.

    • operator

    • or None

      DecisionTreeClassifier

  • n_estimators (integer, >=10 for optimizer, <=100 for optimizer, uniform distribution, optional, default 10) – The number of base estimators in the ensemble.

  • imbalance_repair_level (float, >=0.0, <=1.0, optional, default 0.8) –

    How much to repair for class imbalance (0 means original imbalance, 1 means perfect balance).

    See also constraint-1.

  • bias_repair_level (float, >=0.0, <=1.0, optional, default 0.8) –

    How much to repair for group bias (0 means original bias, 1 means perfect fairness).

    See also constraint-1.

  • combine (‘keep_separate’, ‘and’, ‘or’, or ‘error’, optional, not for optimizer, default ‘keep_separate’) – How to handle the case when there is more than one protected attribute.

  • sampling_strategy (‘under’, ‘over’, ‘mixed’, ‘minimum’, or ‘maximum’, optional, not for optimizer, default ‘mixed’) –

    How to change the intersection sizes.

    Possible choices are:

    • 'under': under-sample large intersections to desired repair levels;

    • 'over': over-sample small intersections to desired repair levels;

    • 'mixed': mix under- with over-sampling while keeping sizes similar to original;

    • 'minimum': under-sample everything to the size of the smallest intersection;

    • 'maximum': over-sample everything to the size of the largest intersection.

    See also constraint-1.

  • replacement (boolean, optional, not for optimizer, default False) – Whether under-sampling is with or without replacement.

  • n_jobs (integer, optional, not for optimizer, default 1) – The number of threads to use, if possible.

  • random_state (union type, optional, not for optimizer, default None) –

    Control the randomization of the algorithm.

    • None

      Use the RandomState instance used by np.random.

    • or integer

      The seed used by the random number generator.

    • or numpy.random.RandomState

      Random number generator instance.
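
As an illustration of the schemas above, the following is a minimal sketch of how the label and protected-attribute hyperparameters might be written down and interpreted. The column names ('age', 'sex') and label values are made up, and the helpers matches and group_of are hypothetical; they are not part of lale and only mirror the documented rules (literal values, inclusive numeric ranges [a,b], and a monitored_group of None meaning "everything not in reference_group").

```python
# Hypothetical hyperparameter values; column names and labels are invented.
hyperparams = {
    "favorable_labels": ["good"],
    "protected_attributes": [
        # Numeric range [a, b], inclusive on both ends.
        {"feature": "age", "reference_group": [[26, 1000]]},
        # Explicit monitored_group; other values belong to neither group.
        {"feature": "sex", "reference_group": ["male"], "monitored_group": ["female"]},
    ],
}


def matches(value, candidates):
    """Return True if value equals a literal or lies in an inclusive [a, b] range."""
    for cand in candidates:
        if isinstance(cand, (list, tuple)):
            low, high = cand
            if isinstance(value, (int, float)) and low <= value <= high:
                return True
        elif value == cand:
            return True
    return False


def group_of(value, attribute):
    """Classify a value as 'reference', 'monitored', or None for one attribute spec."""
    if matches(value, attribute["reference_group"]):
        return "reference"
    monitored = attribute.get("monitored_group")
    # With no explicit monitored_group, anything outside reference_group is monitored.
    if monitored is None or matches(value, monitored):
        return "monitored"
    return None


age_spec, sex_spec = hyperparams["protected_attributes"]
```

Because the constructor takes keyword-only arguments, a dictionary like hyperparams could presumably be unpacked as BaggingOrbisClassifier(**hyperparams); that call is not exercised here.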

Notes

constraint-1 : union type

When sampling_strategy is ‘minimum’ or ‘maximum’, both imbalance_repair_level and bias_repair_level must be 1.

  • sampling_strategy : negated type of ‘minimum’ or ‘maximum’

  • or dict

    • imbalance_repair_level : 1

    • bias_repair_level : 1
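
The constraint above can be restated as a small predicate. This is only a sketch of the documented rule, not the schema validation that lale performs; the function name satisfies_constraint_1 is hypothetical, and its defaults are copied from the class signature.

```python
def satisfies_constraint_1(
    sampling_strategy="mixed",
    imbalance_repair_level=0.8,
    bias_repair_level=0.8,
):
    """Return True if the hyperparameter combination satisfies constraint-1."""
    # First alternative of the union: sampling_strategy is neither
    # 'minimum' nor 'maximum', so the repair levels are unconstrained.
    if sampling_strategy not in ("minimum", "maximum"):
        return True
    # Second alternative: both repair levels are exactly 1.
    return imbalance_repair_level == 1 and bias_repair_level == 1
```

One consequence is that switching sampling_strategy to 'minimum' or 'maximum' while leaving the repair levels at their default of 0.8 violates the constraint; both levels need to be set to 1 as well.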

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

  • y (union type) –

    Target class labels; the array is over samples.

    • array of items : float

    • or array of items : string

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

Returns

result – Predicted class label per sample.

  • array of items : float

  • or array of items : string

Return type

union type

predict_proba(X)

Probability estimates for all classes.

Note: The predict_proba method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

Returns

result – The class probabilities of the input samples.

  • array of items : Any

  • or array of items : array of items : Any

Return type

union type