lale.lib.sklearn.isolation_forest module

class lale.lib.sklearn.isolation_forest.IsolationForest(*, n_estimators=100, max_samples='auto', contamination='auto', max_features=1.0, bootstrap=False, n_jobs=None, random_state=None, verbose=0, warm_start=False)

Bases: PlannedIndividualOp

Isolation forest from scikit-learn, which computes an anomaly score for each sample using the IsolationForest algorithm.

This documentation is auto-generated from JSON schemas.

Parameters
  • n_estimators (integer, >=10 for optimizer, <=100 for optimizer, uniform distribution, default 100) – The number of base estimators in the ensemble.

  • max_samples (union type, default 'auto') –

    The number of samples to draw from X to train each base estimator.

    • integer, >=2, <='X/maxItems', not for optimizer

      Draw max_samples samples.

    • or float, >0.0, >=0.2 for optimizer, <=1.0, <=1.0 for optimizer

      Draw max_samples * X.shape[0] samples.

    • or 'auto'

      Draw max_samples=min(256, n_samples) samples.

  • contamination (union type, not for optimizer, default 'auto') –

    The amount of contamination of the data set, i.e. the proportion of outliers in the data set. Used when fitting to define the threshold on the scores of the samples.

    • float, >=0.0, <=0.5

    • or 'auto'

  • max_features (union type, default 1.0) –

    The number of features to draw from X to train each base estimator.

    • integer, >=2, <='X/items/maxItems', not for optimizer

      Draw max_features features.

    • or float, >0.0, >=0.01 for optimizer, <=1.0, <=1.0 for optimizer

      Draw max_features * X.shape[1] features.

  • bootstrap (boolean, default False) – Whether samples are drawn with (True) or without (False) replacement.

  • n_jobs (union type, not for optimizer, default None) –

    The number of jobs to run in parallel for both fit and predict.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of CPU cores.

  • random_state (union type, not for optimizer, default None) –

    Controls the pseudo-randomness of the selection of the feature and split values for each branching step and each tree in the forest. If int, random_state is the seed used by the random number generator.

    • integer

    • or numpy.random.RandomState

    • or None

  • verbose (integer, not for optimizer, default 0) – Controls the verbosity of the tree building process.

  • warm_start (boolean, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new ensemble.
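
The union types for max_samples and max_features above can be read as a small resolution rule. The following is a minimal sketch in plain Python, not the library's code: the helper names resolve_max_samples and resolve_max_features are hypothetical, and truncating the float case with int() and clipping an oversized integer to the data size are assumptions about how scikit-learn handles these values.

```python
from typing import Union


def resolve_max_samples(max_samples: Union[int, float, str], n_samples: int) -> int:
    """Turn the max_samples hyperparameter into a per-estimator sample count."""
    if isinstance(max_samples, str):
        if max_samples != "auto":
            raise ValueError(f"unsupported max_samples: {max_samples!r}")
        # 'auto' case: min(256, n_samples)
        return min(256, n_samples)
    if isinstance(max_samples, int):
        # Integer case: draw max_samples samples (assumed to be clipped to n_samples).
        return min(max_samples, n_samples)
    if not 0.0 < max_samples <= 1.0:
        raise ValueError("float max_samples must be in (0.0, 1.0]")
    # Float case: draw max_samples * X.shape[0] samples (assumed truncation).
    return int(max_samples * n_samples)


def resolve_max_features(max_features: Union[int, float], n_features: int) -> int:
    """Turn the max_features hyperparameter into a per-estimator feature count."""
    if isinstance(max_features, int):
        # Integer case: draw max_features features.
        return max_features
    if not 0.0 < max_features <= 1.0:
        raise ValueError("float max_features must be in (0.0, 1.0]")
    # Float case: draw max_features * X.shape[1] features, at least one (assumed).
    return max(1, int(max_features * n_features))


print(resolve_max_samples("auto", 1000))  # 256
print(resolve_max_samples(0.5, 1000))     # 500
print(resolve_max_features(0.5, 8))       # 4
```

The 'auto' setting caps the subsample at 256 rows, so on large data sets each tree sees only a small fraction of X regardless of n_estimators.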

decision_function(X)

Anomaly scores for the input samples; lower scores are more abnormal.

Note: The decision_function method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result

Return type

array of items : float
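
As a reading aid for the returned scores: in scikit-learn's IsolationForest, negative decision_function values correspond to outliers and non-negative values to inliers, which predict encodes as -1 and +1. The sketch below uses a hypothetical helper, labels_from_scores, with made-up score values, and assumes the lale operator passes scikit-learn's scores through unchanged.

```python
from typing import List, Sequence


def labels_from_scores(scores: Sequence[float]) -> List[int]:
    """Map decision_function scores to -1 (outlier) or +1 (inlier)."""
    return [-1 if s < 0 else 1 for s in scores]


# Made-up scores for illustration; real values come from decision_function(X).
example_scores = [0.12, -0.03, 0.0, -0.25]
example_labels = labels_from_scores(example_scores)
print(example_labels)  # [1, -1, 1, -1]
```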

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array of items : array of items : float) – The training input samples. Sparse matrices are accepted only if

  • y (union type, optional) –

    • array of items : float

      The target values (class labels in classification, real numbers in

    • or None

  • sample_weight (union type, optional) –

    Sample weights. If None, then samples are equally weighted.

    • array of items : float

    • or None

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result

Return type

array of items : float
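
To show how fit, decision_function, and predict work together, here is a minimal sketch that calls scikit-learn's IsolationForest directly with hyperparameters from the list above. It assumes scikit-learn and NumPy are installed and that the lale operator forwards these hyperparameters to the scikit-learn estimator; the synthetic data and variable names are illustrative only.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)

# Synthetic data: 200 points clustered near the origin plus 5 far-away points.
X_inliers = 0.3 * rng.randn(200, 2)
X_outliers = rng.uniform(low=6.0, high=8.0, size=(5, 2))
X = np.vstack([X_inliers, X_outliers])

# Hyperparameter names match the parameter list above.
model = IsolationForest(
    n_estimators=50,
    max_samples="auto",
    contamination="auto",
    random_state=0,
)
model.fit(X)  # unsupervised: no y is needed

scores = model.decision_function(X)  # one anomaly score per sample
labels = model.predict(X)            # -1 for outliers, +1 for inliers

print(scores.shape, sorted(set(labels.tolist())))
```

With lale, the operator is imported from lale.lib.sklearn instead, and fit returns a trained operator on which decision_function and predict become available, as the notes above describe.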