lale.lib.sklearn.isolation_forest module

class lale.lib.sklearn.isolation_forest.IsolationForest(*, n_estimators=100, max_samples='auto', contamination='auto', max_features=1.0, bootstrap=False, n_jobs=None, random_state=None, verbose=0, warm_start=False)

Bases: PlannedIndividualOp

Isolation forest from scikit-learn for getting the anomaly score of each sample using the IsolationForest algorithm.

This documentation is auto-generated from JSON schemas.

Parameters

n_estimators (integer, >=10 for optimizer, <=100 for optimizer, uniform distribution, default 100) – The number of base estimators in the ensemble.
max_samples (union type, default 'auto') –
The number of samples to draw from X to train each base estimator.
- integer, >=2, <='X/maxItems', not for optimizer
  
  Draw max_samples samples.
- or float, >0.0, >=0.2 for optimizer, <=1.0, <=1.0 for optimizer
  
  Draw max_samples * X.shape[0] samples.
- or 'auto'
  
  Draw max_samples=min(256, n_samples) samples.
contamination (union type, not for optimizer, default 'auto') –
The amount of contamination of the data set, i.e. the proportion of outliers in the data set. Used when fitting to define the threshold on the scores of the samples.
- float, >=0.0, <=0.5
- or 'auto'
max_features (union type, default 1.0) –
The number of features to draw from X to train each base estimator.
- integer, >=2, <='X/items/maxItems', not for optimizer
  
  Draw max_features features.
- or float, >0.0, >=0.01 for optimizer, <=1.0, <=1.0 for optimizer
  
  Draw max_features * X.shape[1] features.
bootstrap (boolean, default False) – Whether samples are drawn with (True) or without (False) replacement.
n_jobs (union type, not for optimizer, default None) –
The number of jobs to run in parallel for both fit and predict.
- None
  
  1 unless in joblib.parallel_backend context.
- or -1
  
  Use all processors.
- or integer, >=1
  
  Number of CPU cores.
random_state (union type, not for optimizer, default None) –
Controls the pseudo-randomness of the selection of the feature and split values for each branching step and each tree in the forest. If an integer, random_state is the seed used by the random number generator.
- integer
- or numpy.random.RandomState
- or None
verbose (integer, not for optimizer, default 0) – Controls the verbosity of the tree building process.
warm_start (boolean, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new ensemble.
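The max_samples and max_features rules listed above can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: the helper names resolve_max_samples and resolve_max_features are hypothetical, and truncating fractional counts with int() while keeping at least one feature is an assumption.

```python
def resolve_max_samples(max_samples, n_samples):
    """Return how many rows each base estimator draws (illustrative helper)."""
    if max_samples == "auto":
        # 'auto' draws min(256, n_samples) samples.
        return min(256, n_samples)
    if isinstance(max_samples, int):
        # An integer is taken as an absolute count.
        return max_samples
    # A float is a fraction of the rows; truncation is an assumption.
    return int(max_samples * n_samples)


def resolve_max_features(max_features, n_features):
    """Return how many columns each base estimator draws (illustrative helper)."""
    if isinstance(max_features, int):
        # An integer is taken as an absolute count.
        return max_features
    # A float is a fraction of the columns; the floor of 1 is an assumption.
    return max(1, int(max_features * n_features))


print(resolve_max_samples("auto", 1000))  # 256
print(resolve_max_samples(0.5, 1000))     # 500
print(resolve_max_features(0.25, 8))      # 2
```

With the defaults (max_samples='auto', max_features=1.0), each base estimator would therefore see at most 256 rows and every column.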

decision_function(X)

Confidence scores for all classes.

Note: The decision_function method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters: X (array of items : array of items : float) – Features; the outer array is over samples.
Returns: result
Return type: array of items : float

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – The training input samples. Sparse matrices are accepted only if
y (union type, optional) –
- array of items : float
  
  The target values (class labels in classification, real numbers in
- or None
sample_weight (union type, optional) –
Sample weights. If None, then samples are equally weighted.
- array of items : float
- or None

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters: X (array of items : array of items : float) –
Returns: result
Return type: array of items : float
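To relate decision_function to predict, the sketch below assumes scikit-learn's convention for isolation forests: negative decision scores mark outliers (label -1) and non-negative scores mark inliers (label +1). This page does not state that convention, so treat it as an assumption about the wrapped estimator. The helper name labels_from_scores is hypothetical and the scores are made-up illustration values.

```python
def labels_from_scores(scores):
    """Map decision scores to +1 (inlier) or -1 (outlier) labels.

    Assumes the scikit-learn convention that a negative score means outlier.
    """
    return [-1 if s < 0 else 1 for s in scores]


# Made-up scores, as decision_function might return for four samples.
scores = [0.12, -0.03, 0.0, -0.4]
print(labels_from_scores(scores))  # [1, -1, 1, -1]
```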