lale.lib.sklearn.random_forest_regressor module

class lale.lib.sklearn.random_forest_regressor.RandomForestRegressor(*, n_estimators=100, criterion='squared_error', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, ccp_alpha=0.0, max_samples=None, monotonic_cst=None)

Bases: PlannedIndividualOp

Random forest regressor from scikit-learn.

This documentation is auto-generated from JSON schemas.
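Since this operator wraps scikit-learn's RandomForestRegressor, its hyperparameters behave identically to the underlying estimator's. A minimal usage sketch, assuming scikit-learn is installed; the synthetic data and hyperparameter values are illustrative only:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Illustrative synthetic regression data: 200 samples, 4 features.
rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 4))
y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=200)

# n_estimators controls the number of trees in the forest.
model = RandomForestRegressor(n_estimators=10, random_state=0)
model.fit(X, y)
preds = model.predict(X)  # 1-D array of predicted values, shape (200,)
```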

Parameters
  • n_estimators (integer, >=1, >=10 for optimizer, <=100 for optimizer, default 100) – The number of trees in the forest.

  • criterion (union type, default 'squared_error') –

    The function to measure the quality of a split. Supported criteria are “squared_error” for the mean squared error, which is equal to variance reduction as feature selection criterion, “absolute_error” for the mean absolute error, and “poisson” which uses reduction in Poisson deviance to find splits. Training using “absolute_error” is significantly slower than when using “squared_error”.

    • ‘squared_error’, ‘absolute_error’, or ‘poisson’

    • or ‘mse’ or ‘mae’, not for optimizer

  • max_depth (union type, default None) –

    The maximum depth of the tree.

    • integer, >=1, >=3 for optimizer, <=5 for optimizer

    • or None

      Nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples.

  • min_samples_split (union type, default 2) –

    The minimum number of samples required to split an internal node.

    • integer, >=2, >=2 for optimizer, <=’X/maxItems’, <=5 for optimizer, not for optimizer

      Consider min_samples_split as the minimum number.

    • or float, >0.0, >=0.01 for optimizer, <=1.0, <=0.5 for optimizer, default 0.05

      min_samples_split is a fraction and ceil(min_samples_split * n_samples) is the minimum number of samples for each split.

  • min_samples_leaf (union type, default 1) –

    The minimum number of samples required to be at a leaf node.

    • integer, >=1, >=1 for optimizer, <=’X/maxItems’, <=5 for optimizer, not for optimizer

      Consider min_samples_leaf as the minimum number.

    • or float, >0.0, >=0.01 for optimizer, <=0.5, default 0.05

      min_samples_leaf is a fraction and ceil(min_samples_leaf * n_samples) is the minimum number of samples for each node.

  • min_weight_fraction_leaf (float, >=0.0, <=0.5, optional, not for optimizer, default 0.0) – The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Samples have equal weight when sample_weight is not provided.

  • max_features (union type, default None) –

    The number of features to consider when looking for the best split.

    • integer, >=2, <=’X/items/maxItems’, not for optimizer

      Consider max_features features at each split.

    • or float, >0.0, >=0.01 for optimizer, <=1.0, uniform distribution, default 0.5

      max_features is a fraction and int(max_features * n_features) features are considered at each split.

    • or ‘sqrt’, ‘log2’, or None

  • max_leaf_nodes (union type, optional, not for optimizer, default None) –

    Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined as relative reduction in impurity.

    • integer, >=1, >=3 for optimizer, <=1000 for optimizer

    • or None

      Unlimited number of leaf nodes.

  • min_impurity_decrease (float, >=0.0, <=10.0 for optimizer, optional, not for optimizer, default 0.0) – A node will be split if this split induces a decrease of the impurity greater than or equal to this value.

  • bootstrap (boolean, default True) –

    Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree.

    See also constraint-2.

  • oob_score (union type, optional, not for optimizer, default False) –

    Whether to use out-of-bag samples to estimate the generalization score.

    • callable, not for optimizer

      A callable with signature metric(y_true, y_pred).

    • or boolean

    See also constraint-2.

  • n_jobs (union type, optional, not for optimizer, default None) –

    The number of jobs to run in parallel for both fit and predict.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of CPU cores.

  • random_state (union type, optional, not for optimizer, default None) –

    Seed of pseudo-random number generator.

    • numpy.random.RandomState

    • or None

      RandomState used by np.random

    • or integer

      Explicit seed.

  • verbose (integer, optional, not for optimizer, default 0) – Controls the verbosity when fitting and predicting.

  • warm_start (boolean, optional, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.

  • ccp_alpha (float, >=0.0, <=0.1 for optimizer, optional, not for optimizer, default 0.0) – Complexity parameter used for Minimal Cost-Complexity Pruning. The subtree with the largest cost complexity that is smaller than ccp_alpha will be chosen. By default, no pruning is performed.

  • max_samples (union type, optional, not for optimizer, default None) –

    If bootstrap is True, the number of samples to draw from X to train each base estimator.

    • None

      Draw X.shape[0] samples.

    • or integer, >=1

      Draw max_samples samples.

    • or float, >0.0, <1.0

      Draw max_samples * X.shape[0] samples.

  • monotonic_cst (union type, optional, not for optimizer, default None) –

    Indicates the monotonicity constraint to enforce on each feature. Monotonicity constraints are not supported for multioutput regressions (i.e. when n_outputs > 1) or for regressions trained on data with missing values.

    • array of items : -1, 0, or 1

      array-like of int of shape (n_features)

    • or None

      No constraints are applied.
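The fractional forms of min_samples_split, min_samples_leaf, and max_features above all resolve to absolute counts as described in their bullets; a small sketch of the arithmetic, with illustrative values for n_samples and n_features:

```python
import math

n_samples = 200
n_features = 4

# Fractional min_samples_split: ceil(fraction * n_samples) samples
# are required to split an internal node.
min_samples_split = 0.05
split_count = math.ceil(min_samples_split * n_samples)  # ceil of 10.0 -> 10

# Fractional min_samples_leaf resolves the same way.
min_samples_leaf = 0.01
leaf_count = math.ceil(min_samples_leaf * n_samples)  # ceil of 2.0 -> 2

# Fractional max_features truncates: int(fraction * n_features)
# features are considered at each split.
max_features = 0.5
feature_count = int(max_features * n_features)  # int of 2.0 -> 2
```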

Notes

constraint-1 : negated type of ‘y/isSparse’

This regressor does not support sparse labels.

constraint-2 : union type

Out of bag estimation only available if bootstrap=True.

  • bootstrap : True

  • or oob_score : False
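Constraint-2 mirrors a runtime check in scikit-learn itself: out-of-bag samples only exist when bootstrapping is enabled. A sketch of what violating it looks like with the underlying estimator (the exact error message may vary across scikit-learn versions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.random.RandomState(0).uniform(size=(30, 3))
y = X.sum(axis=1)

# bootstrap=False together with oob_score=True violates constraint-2.
try:
    RandomForestRegressor(bootstrap=False, oob_score=True).fit(X, y)
    raised = False
except ValueError:
    raised = True  # scikit-learn rejects this combination at fit time
```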

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    The outer array is over samples aka rows.

    • items : array of items : float

      The inner array is over features aka columns.

  • y (union type) –

    The target values (real numbers).

    • array of items : array of items : float

    • or array of items : float

  • sample_weight (union type, optional) –

    Sample weights.

    • array of items : float

    • or None

      Samples are equally weighted.
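The fit signature above, including the optional sample_weight, can be exercised directly on the underlying scikit-learn estimator; the data and weights here are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(42)
X = rng.uniform(size=(100, 3))
y = X[:, 0] - X[:, 1]

# Per-sample weights: the first half of the data is down-weighted.
sample_weight = np.r_[np.full(50, 0.1), np.full(50, 1.0)]

model = RandomForestRegressor(n_estimators=5, random_state=0)
model.fit(X, y, sample_weight=sample_weight)
```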

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array, optional) –

The outer array is over samples aka rows.

  • items : array of items : float

    The inner array is over features aka columns.

Returns

result – The predicted values.

  • array of items : array of items : float

  • or array of items : float

Return type

union type
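The union return type corresponds to single-output versus multi-output regression: a 1-D array of floats for a single target, a 2-D array when the forest was fit on multiple target columns. A sketch with the underlying scikit-learn estimator, using illustrative synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(60, 3))

# Single-output target: predict returns a 1-D array of floats.
y1 = X[:, 0]
p1 = RandomForestRegressor(n_estimators=5, random_state=0).fit(X, y1).predict(X)

# Multi-output target (two columns): predict returns a 2-D array.
y2 = np.c_[X[:, 0], X[:, 1]]
p2 = RandomForestRegressor(n_estimators=5, random_state=0).fit(X, y2).predict(X)
```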