lale.lib.snapml.snap_random_forest_classifier module
- class lale.lib.snapml.snap_random_forest_classifier.SnapRandomForestClassifier(*, n_estimators=10, criterion='gini', max_depth=None, min_samples_leaf=1, max_features='auto', bootstrap=True, n_jobs=1, random_state=None, verbose=False, use_histograms=False, hist_nbins=256, use_gpu=False, gpu_ids=None)
Bases: PlannedIndividualOp
Random forest classifier from Snap ML. It can be used for binary classification problems.
This documentation is auto-generated from JSON schemas.
- Parameters
n_estimators (integer, >=1, >=10 for optimizer, <=100 for optimizer, optional, default 10) – The number of trees in the forest.
criterion ('gini', optional, not for optimizer, default 'gini') – Function to measure the quality of a split.
max_depth (union type, optional, default None) –
The maximum depth of the tree.
integer, >=1, >=3 for optimizer, <=5 for optimizer
or None
Nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_leaf samples.
min_samples_leaf (union type, optional, not for optimizer, default 1) –
The minimum number of samples required to be at a leaf node.
integer, >=1, <='X/maxItems', not for optimizer
Consider min_samples_leaf as the minimum number.
or float, >0.0, <=0.5
min_samples_leaf is a fraction, and ceil(min_samples_leaf * n_samples) is the minimum number of samples for each node.
max_features (union type, optional, default 'auto') –
The number of features to consider when looking for the best split.
integer, >=1, <='X/items/maxItems', not for optimizer
Consider max_features features at each split.
or float, >0.0, >=0.1 for optimizer, <=1.0, <=0.9 for optimizer, uniform distribution
max_features is a fraction, and int(max_features * n_features) features are considered at each split.
or 'auto', 'sqrt', 'log2', or None
bootstrap (boolean, optional, not for optimizer, default True) – Whether bootstrap samples are used when building trees.
n_jobs (integer, >=1, optional, not for optimizer, default 1) – Number of CPU threads to use.
random_state (union type, optional, not for optimizer, default None) –
Seed of pseudo-random number generator.
None
RandomState used by np.random
or integer
Explicit seed.
verbose (boolean, optional, not for optimizer, default False) – If True, print debugging information while training. Warning: this increases training time. For performance evaluation, use verbose=False.
use_histograms (boolean, optional, not for optimizer, default False) –
Use histogram-based splits rather than exact splits.
See also constraint-1.
hist_nbins (integer, optional, not for optimizer, default 256) – Number of histogram bins.
use_gpu (boolean, optional, not for optimizer, default False) –
Use GPU acceleration (only supported for histogram-based splits).
See also constraint-1.
gpu_ids (union type, optional, not for optimizer, default None) –
Device IDs of the GPUs which will be used when GPU acceleration is enabled.
None
Use [0].
or array of items : integer
Notes
constraint-1 : union type
GPU only supported for histogram-based splits.
use_gpu : False
or use_histograms : True
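The numeric rules in the parameter list and constraint-1 can be illustrated with a short sketch. It does not call Snap ML or lale; the helper names (`check_constraint_1`, `resolve_min_samples_leaf`, `resolve_max_features`) are hypothetical and only restate the formulas given above. The string values of max_features and None are not interpreted here, because this page does not define them.

```python
import math


def check_constraint_1(use_gpu=False, use_histograms=False):
    """Return True when the pair of settings satisfies constraint-1."""
    # constraint-1 is a union: use_gpu is False, or use_histograms is True.
    return (not use_gpu) or use_histograms


def resolve_min_samples_leaf(min_samples_leaf, n_samples):
    """Turn min_samples_leaf into an absolute number of samples."""
    if isinstance(min_samples_leaf, float):
        # Float form: a fraction of the training set, rounded up.
        return math.ceil(min_samples_leaf * n_samples)
    # Integer form: already the minimum number.
    return min_samples_leaf


def resolve_max_features(max_features, n_features):
    """Turn an integer or float max_features into a feature count."""
    if isinstance(max_features, float):
        # Float form: a fraction of the features, truncated.
        return int(max_features * n_features)
    # Integer form is used as given; other values are returned unchanged.
    return max_features


if __name__ == "__main__":
    print(check_constraint_1(use_gpu=True, use_histograms=False))
    print(resolve_min_samples_leaf(0.1, n_samples=95))
    print(resolve_max_features(0.5, n_features=9))
```

Under these rules, use_gpu=True with use_histograms=False violates constraint-1, a min_samples_leaf of 0.1 on 95 samples rounds up to 10, and a max_features of 0.5 on 9 features truncates to 4.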
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
The outer array is over samples aka rows.
items : array of items : float
The inner array is over features aka columns.
y (union type) –
The classes.
array of items : float
or array of items : string
or array of items : boolean
sample_weight (union type, optional, default None) –
Sample weights.
array of items : float
or None
Samples are equally weighted.
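As a rough illustration of the input layout that fit expects, the sketch below checks a nested-list X (outer array over samples, inner array over features), a label array y, and an optional sample_weight. The function `describe_fit_inputs` is hypothetical and is not part of lale or Snap ML; using a weight of 1.0 per row to stand for "equally weighted" is an assumption made for the example.

```python
def describe_fit_inputs(X, y, sample_weight=None):
    """Check the shapes described for fit and return a summary tuple."""
    n_samples = len(X)                      # outer array: rows
    n_features = len(X[0]) if X else 0      # inner array: columns
    if any(len(row) != n_features for row in X):
        raise ValueError("every row of X must have the same length")
    if len(y) != n_samples:
        raise ValueError("y must have one label per row of X")
    if sample_weight is None:
        # Assumption: equal weighting is represented as 1.0 per sample.
        sample_weight = [1.0] * n_samples
    elif len(sample_weight) != n_samples:
        raise ValueError("sample_weight must have one entry per row of X")
    return n_samples, n_features, list(sample_weight)


# Three samples with two float features each, and string class labels.
X = [[0.0, 1.5], [2.0, 0.5], [1.0, 1.0]]
y = ["no", "yes", "no"]
summary = describe_fit_inputs(X, y)
```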
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
The outer array is over samples aka rows.
items : array of items : float
The inner array is over features aka columns.
n_jobs (integer, >=0, optional, default 0) – Number of threads used to run inference. By default, inference runs with the maximum number of available threads.
- Returns
result – The predicted classes.
array of items : float
or array of items : string
or array of items : boolean
- Return type
union type
- predict_proba(X)
Probability estimates for all classes.
Note: The predict_proba method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array, optional) –
The outer array is over samples aka rows.
items : array of items : float
The inner array is over features aka columns.
n_jobs (integer, >=0, optional, default 0) – Number of threads used to run inference. By default, inference runs with the maximum number of available threads.
- Returns
result – The outer array is over samples aka rows.
items : array of items : float
The inner array contains probabilities corresponding to each class.
- Return type
array
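To make the shape of the predict_proba result concrete, here is a minimal sketch that turns a nested array of per-class probabilities into labels by picking the most probable column in each row. The helper `labels_from_proba` and the `classes` list are hypothetical; the sketch assumes that column i corresponds to `classes[i]`, which this page does not state, and it is not necessarily how predict derives its output.

```python
def labels_from_proba(proba, classes):
    """Pick, for each row, the class whose probability is largest."""
    labels = []
    for row in proba:
        # Index of the largest probability in this row (first one on ties).
        best = max(range(len(row)), key=row.__getitem__)
        # Assumption: column order matches the order of `classes`.
        labels.append(classes[best])
    return labels


# Outer array over samples, inner array over the two classes.
proba = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]
predicted = labels_from_proba(proba, classes=["no", "yes"])
```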