lale.lib.sklearn.random_forest_classifier module
- class lale.lib.sklearn.random_forest_classifier.RandomForestClassifier(*, n_estimators=100, criterion='gini', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, max_leaf_nodes=None, min_impurity_decrease=0.0, bootstrap=True, oob_score=False, n_jobs=None, random_state=None, verbose=0, warm_start=False, class_weight=None, ccp_alpha=0.0, max_samples=None, monotonic_cst=None)
Bases: PlannedIndividualOp
Random forest classifier from scikit-learn.
This documentation is auto-generated from JSON schemas.
- Parameters
n_estimators (integer, >=1, >=10 for optimizer, <=100 for optimizer, optional, default 100) – The number of trees in the forest.
criterion (‘gini’ or ‘entropy’, optional, default ‘gini’) – The function to measure the quality of a split.
max_depth (union type, optional, default None) –
The maximum depth of the tree.
integer, >=1, >=3 for optimizer, <=5 for optimizer
or None
Nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples.
min_samples_split (union type, optional, default 2) –
The minimum number of samples required to split an internal node.
integer, >=2, >=2 for optimizer, <=’X/maxItems’, <=5 for optimizer, default 2
Consider min_samples_split as the minimum number.
or float, >0.0, >=0.01 for optimizer, <=1.0, <=0.5 for optimizer, default 0.05
min_samples_split is a fraction, and ceil(min_samples_split * n_samples) is the minimum number of samples for each split.
min_samples_leaf (union type, optional, default 1) –
The minimum number of samples required to be at a leaf node.
integer, >=1, >=1 for optimizer, <=’X/maxItems’, <=5 for optimizer, default 1
Consider min_samples_leaf as the minimum number.
or float, >0.0, >=0.01 for optimizer, <=0.5, default 0.05
min_samples_leaf is a fraction, and ceil(min_samples_leaf * n_samples) is the minimum number of samples for each node.
min_weight_fraction_leaf (float, >=0.0, <=0.5, optional, not for optimizer, default 0.0) – The minimum weighted fraction of the sum total of weights (of all the input samples) required to be at a leaf node. Samples have equal weight when sample_weight is not provided.
max_features (union type, optional, default None) –
The number of features to consider when looking for the best split.
integer, >=2, <=’X/items/maxItems’, not for optimizer
Consider max_features features at each split.
or float, >0.0, >=0.01 for optimizer, <=1.0, uniform distribution, default 0.5
max_features is a fraction and int(max_features * n_features) features are considered at each split.
or ‘sqrt’, ‘log2’, or None
max_leaf_nodes (union type, optional, not for optimizer, default None) –
Grow trees with max_leaf_nodes in best-first fashion. Best nodes are defined by their relative reduction in impurity.
integer, >=1, >=3 for optimizer, <=1000 for optimizer
or None
Unlimited number of leaf nodes.
min_impurity_decrease (float, >=0.0, <=10.0 for optimizer, optional, not for optimizer, default 0.0) – A node will be split if this split induces a decrease of the impurity greater than or equal to this value.
bootstrap (boolean, optional, not for optimizer, default True) –
Whether bootstrap samples are used when building trees. If False, the whole dataset is used to build each tree.
See also constraint-2.
oob_score (union type, optional, not for optimizer, default False) –
Whether to use out-of-bag samples to estimate the generalization accuracy.
callable, not for optimizer
A callable with signature metric(y_true, y_pred).
or boolean
See also constraint-2.
n_jobs (union type, optional, not for optimizer, default None) –
The number of jobs to run in parallel for both fit and predict.
None
1 unless in joblib.parallel_backend context.
or -1
Use all processors.
or integer, >=1
Number of CPU cores.
random_state (union type, optional, not for optimizer, default None) –
Seed of pseudo-random number generator.
numpy.random.RandomState
or None
RandomState used by np.random
or integer
Explicit seed.
verbose (integer, optional, not for optimizer, default 0) – Controls the verbosity when fitting and predicting.
warm_start (boolean, optional, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit and add more estimators to the ensemble, otherwise, just fit a whole new forest.
class_weight (union type, not for optimizer, default None) –
Weights associated with classes in the form {class_label: weight}.
dict
or array of items : dict
or ‘balanced’, ‘balanced_subsample’, or None
ccp_alpha (float, >=0.0, <=0.1 for optimizer, optional, not for optimizer, default 0.0) – Complexity parameter used for Minimal Cost-Complexity Pruning. The subtree with the largest cost complexity that is smaller than ccp_alpha will be chosen. By default, no pruning is performed.
max_samples (union type, optional, not for optimizer, default None) –
If bootstrap is True, the number of samples to draw from X to train each base estimator.
None
Draw X.shape[0] samples.
or integer, >=1
Draw max_samples samples.
or float, >0.0, <1.0
Draw max_samples * X.shape[0] samples.
monotonic_cst (union type, optional, not for optimizer, default None) –
Indicates the monotonicity constraint to enforce on each feature. Monotonicity constraints are not supported for multioutput regressions (i.e. when n_outputs > 1) or for regressions trained on data with missing values.
array of items : -1, 0, or 1
array-like of int of shape (n_features)
or None
No constraints are applied.
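The fractional forms of min_samples_split, min_samples_leaf, and max_features resolve to absolute counts. The following is a minimal sketch of that arithmetic, using only the formulas quoted in the parameter descriptions above. The helper names are hypothetical and are not part of lale or scikit-learn.

```python
import math


def resolve_min_samples(value, n_samples):
    """Resolve min_samples_split or min_samples_leaf to a sample count.

    Integers are taken as the minimum number directly. Floats are treated
    as a fraction of n_samples and rounded up: ceil(value * n_samples).
    """
    if isinstance(value, int):
        return value
    return math.ceil(value * n_samples)


def resolve_max_features(value, n_features):
    """Resolve an integer or fractional max_features to a feature count.

    Integers are used as-is. Floats are treated as a fraction of
    n_features and truncated: int(value * n_features).
    """
    if isinstance(value, int):
        return value
    return int(value * n_features)


# Illustrative values only (150 samples, 9 features).
print(resolve_min_samples(2, 150))     # integer form: used unchanged
print(resolve_min_samples(0.05, 150))  # fraction form: ceil(7.5)
print(resolve_max_features(0.5, 9))    # fraction form: int(4.5)
```

With 150 samples, a min_samples_split of 0.05 therefore behaves like an integer setting of 8, and with 9 features a max_features of 0.5 considers 4 features per split.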
Notes
constraint-1 : negated type of ‘y/isSparse’
This classifier does not support sparse labels.
constraint-2 : union type
Out of bag estimation only available if bootstrap=True.
bootstrap : True
or oob_score : False
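As a rough illustration of constraint-2, the check below accepts a hyperparameter combination when bootstrap is True or oob_score is False, which is the union listed above. The function name is hypothetical; lale performs its own schema validation, and this sketch only mirrors the stated rule.

```python
def satisfies_constraint_2(bootstrap=True, oob_score=False):
    """Return True when the bootstrap / oob_score combination is allowed.

    Mirrors the union in constraint-2: bootstrap is True, or oob_score
    is False. A callable oob_score counts as "not False", so it is only
    accepted together with bootstrap=True.
    """
    return bootstrap is True or oob_score is False


print(satisfies_constraint_2())                                  # defaults
print(satisfies_constraint_2(bootstrap=True, oob_score=True))    # OOB with bootstrap
print(satisfies_constraint_2(bootstrap=False, oob_score=True))   # OOB without bootstrap
```

Only the last combination violates the constraint, because out-of-bag estimation needs the samples that bootstrapping leaves out of each tree.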
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
The outer array is over samples aka rows.
items : array of items : float
The inner array is over features aka columns.
y (union type) –
The target classes, one label per sample in X.
array of items : float
or array of items : string
or array of items : boolean
sample_weight (union type, optional) –
Sample weights.
array of items : float
or None
Samples are equally weighted.
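The sample_weight argument interacts with the min_weight_fraction_leaf hyperparameter, which is expressed as a fraction of the total sample weight. The sketch below shows that relationship under the assumption that a missing sample_weight means a weight of 1.0 per sample. The helper name is hypothetical and does not exist in lale or scikit-learn.

```python
def min_leaf_weight(min_weight_fraction_leaf, n_samples, sample_weight=None):
    """Return the minimum total weight a leaf must hold.

    When sample_weight is None, every sample is assumed to weigh 1.0,
    so the total weight equals n_samples.
    """
    if sample_weight is None:
        sample_weight = [1.0] * n_samples
    return min_weight_fraction_leaf * sum(sample_weight)


# Four equally weighted samples: total weight 4.0.
print(min_leaf_weight(0.25, 4))

# Four samples with explicit weights: total weight 8.0.
print(min_leaf_weight(0.25, 4, sample_weight=[1.0, 1.0, 2.0, 4.0]))
```

The same fraction yields a larger threshold in the weighted case because the total weight, not the sample count, sets the scale.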
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array, optional) –
The outer array is over samples aka rows.
items : array of items : float
The inner array is over features aka columns.
- Returns
result – The predicted classes.
array of items : float
or array of items : string
or array of items : boolean
- Return type
union type
- predict_proba(X)
Probability estimates for all classes.
Note: The predict_proba method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array, optional) –
The outer array is over samples aka rows.
items : array of items : float
The inner array is over features aka columns.
- Returns
result – The outer array is over samples aka rows.
items : array of items : float
The inner array has items corresponding to each class.
- Return type
array
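To make the shape of the predict_proba result concrete, the sketch below picks the most probable class for each row of a probability matrix. It assumes the columns follow the order of a known class list, as they do for the classes_ attribute of a fitted scikit-learn classifier. The helper name, the probabilities, and the labels are made up for illustration.

```python
def classes_from_probabilities(probabilities, classes):
    """Pick the most probable class for each sample.

    probabilities: outer list over samples, inner list over classes.
    classes: labels in the same order as the inner list's columns.
    Ties resolve to the first class with the highest probability.
    """
    predictions = []
    for row in probabilities:
        best = max(range(len(row)), key=row.__getitem__)
        predictions.append(classes[best])
    return predictions


# Hypothetical output for two samples and three classes.
proba = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
labels = ["setosa", "versicolor", "virginica"]
print(classes_from_probabilities(proba, labels))
```

Each inner list sums to 1.0 in this example, matching the one-entry-per-class layout described above.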