lale.lib.snapml.snap_boosting_machine_classifier module

class lale.lib.snapml.snap_boosting_machine_classifier.SnapBoostingMachineClassifier(*, num_round=100, learning_rate=0.1, random_state=0, colsample_bytree=1.0, subsample=1.0, verbose=False, lambda_l2=0.0, early_stopping_rounds=10, compress_trees=False, base_score=None, class_weight=None, max_depth=None, min_max_depth=1, max_max_depth=5, n_jobs=1, use_histograms=True, hist_nbins=256, use_gpu=False, gpu_id=0, tree_select_probability=1.0, regularizer=1.0, fit_intercept=False, gamma=1.0, n_components=10)

Bases: PlannedIndividualOp

Boosting machine classifier from Snap ML. It can be used for binary classification problems.

This documentation is auto-generated from JSON schemas.

Parameters

num_round (integer, >=1, >=100 for optimizer, <=1000 for optimizer, default 100) – Number of boosting iterations.
learning_rate (float, >0.0, >=0.01 for optimizer, <=0.3 for optimizer, uniform distribution, default 0.1) – Learning rate / shrinkage factor.
random_state (integer, not for optimizer, default 0) – Random seed.
colsample_bytree (float, >0.0, <=1.0, not for optimizer, default 1.0) – Fraction of feature columns used at each boosting iteration.
subsample (float, >0.0, <=1.0, not for optimizer, default 1.0) – Fraction of training examples used at each boosting iteration.
verbose (boolean, not for optimizer, default False) – Print information during training.
lambda_l2 (float, >=0.0, not for optimizer, default 0.0) – L2-regularization penalty used during tree-building.
early_stopping_rounds (integer, >=1, not for optimizer, default 10) – When a validation set is provided, training will stop if the validation loss does not improve for this number of rounds.
compress_trees (boolean, not for optimizer, default False) – Compress trees after training for fast inference.
base_score (union type, not for optimizer, default None) –
Base score used to initialize the boosting algorithm. If None, the algorithm initializes the base score to the logit of the probability of the positive class.
- float
- or None
class_weight (‘balanced’ or None, not for optimizer, default None) – If set to ‘balanced’, sample weights will be applied to account for class imbalance; otherwise no sample weights will be used.
max_depth (union type, not for optimizer, default None) –
If set, overrides both depth bounds: min_max_depth = max_depth = max_max_depth.
- integer, >=1
- or None
min_max_depth (integer, >=1, >=1 for optimizer, <=5 for optimizer, default 1) – Minimum max_depth of trees in the ensemble.
max_max_depth (integer, >=1, >=5 for optimizer, <=10 for optimizer, default 5) – Maximum max_depth of trees in the ensemble.
n_jobs (integer, >=1, not for optimizer, default 1) – Number of threads to use during training.
use_histograms (boolean, not for optimizer, default True) –
Use histograms to accelerate tree-building.

See also constraint-1.
hist_nbins (integer, not for optimizer, default 256) – Number of histogram bins.
use_gpu (boolean, not for optimizer, default False) –
Use GPU for tree-building.

See also constraint-1.
gpu_id (integer, not for optimizer, default 0) – Device ID for GPU to use during training.
tree_select_probability (float, >=0.0, <=1.0, not for optimizer, default 1.0) – Probability of selecting a tree (rather than a kernel ridge regressor) at each boosting iteration.
regularizer (float, >=0.0, not for optimizer, default 1.0) – L2-regularization penalty for the kernel ridge regressor.
fit_intercept (boolean, not for optimizer, default False) – Include intercept term in the kernel ridge regressor.
gamma (float, >=0.0, not for optimizer, default 1.0) – Gaussian kernel parameter.
n_components (integer, >=1, not for optimizer, default 10) – Number of components in the random projection.
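The interaction between max_depth and the two depth bounds, and the default for base_score, can be illustrated with a minimal sketch. The helper names resolve_depth_range and default_base_score are hypothetical and not part of lale or Snap ML. The sketch also assumes that "the probability of the positive class" means the empirical fraction of positive labels in the training data, which the documentation does not state explicitly.

```python
import math


def resolve_depth_range(max_depth=None, min_max_depth=1, max_max_depth=5):
    """Return the (min, max) tree-depth bounds after applying max_depth.

    Hypothetical helper mirroring the documented rule that a non-None
    max_depth sets min_max_depth = max_depth = max_max_depth.
    """
    if max_depth is not None:
        return max_depth, max_depth
    return min_max_depth, max_max_depth


def default_base_score(y, positive_label=1):
    """Logit of the positive-class fraction in y.

    Assumption: the class probability is estimated as the empirical
    fraction of positive labels; the library may estimate it differently.
    """
    p = sum(1 for label in y if label == positive_label) / len(y)
    if not 0.0 < p < 1.0:
        raise ValueError("logit is undefined when only one class is present")
    return math.log(p / (1.0 - p))
```

Under these assumptions, a balanced label vector gives a base score of 0.0, and a training set with more positives than negatives gives a positive base score.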

Notes

constraint-1 : union type

GPU is only supported for histogram-based splits.

use_gpu : False

or use_histograms : True
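Constraint-1 is a disjunction: a configuration is valid if use_gpu is False or use_histograms is True. A minimal sketch of that check follows; the function names are hypothetical, and lale performs its own schema validation, which this does not replace.

```python
def satisfies_constraint_1(use_gpu=False, use_histograms=True):
    """True when the (use_gpu, use_histograms) pair satisfies constraint-1."""
    return (not use_gpu) or use_histograms


def check_constraint_1(use_gpu=False, use_histograms=True):
    """Raise ValueError for the single invalid combination (hypothetical helper)."""
    if not satisfies_constraint_1(use_gpu, use_histograms):
        raise ValueError("use_gpu=True requires use_histograms=True")
```

Of the four possible combinations, only use_gpu=True with use_histograms=False is rejected.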

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters

X (array) –
The outer array is over samples aka rows.
- items : array of items : float
  
  The inner array is over features aka columns.
y (union type) –
The classes.
- array of items : float
- or array of items : string
- or array of items : boolean
sample_weight (union type, optional, default None) –
Sample weights.
- array of items : float
- or None
  
  Samples are equally weighted.
X_val (union type, optional, default None) –
- array
  The outer array is over validation samples aka rows.
  - items : array of items : float
    
    The inner array is over features aka columns.
- or None
  
  No validation set provided.
y_val (union type, optional, default None) –
The validation classes.
- array of items : float
- or array of items : string
- or array of items : boolean
- or None
  
  No validation set provided.
sample_weight_val (union type, optional, default None) –
Validation sample weights.
- array of items : float
- or None
  
  Validation samples are equally weighted.
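A validation set is passed to fit through the X_val and y_val keyword arguments listed above. The following sketch holds out part of the data and forwards it in that way. Both helpers are hypothetical, the split uses plain Python lists to stay self-contained, and the operator argument is assumed to be a trainable SnapBoostingMachineClassifier (in practice the data would typically be NumPy arrays).

```python
import random


def split_train_validation(X, y, val_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a validation subset (hypothetical helper)."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(len(X) * val_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_val = [X[i] for i in val_idx]
    y_val = [y[i] for i in val_idx]
    return X_train, y_train, X_val, y_val


def fit_with_validation(operator, X, y, val_fraction=0.2, seed=0):
    """Call operator.fit with a held-out validation set.

    Assumption: operator exposes fit(X, y, X_val=..., y_val=...) as
    documented for SnapBoostingMachineClassifier.
    """
    X_train, y_train, X_val, y_val = split_train_validation(X, y, val_fraction, seed)
    return operator.fit(X_train, y_train, X_val=X_val, y_val=y_val)


# Intended use (requires lale and snapml to be installed):
#   from lale.lib.snapml import SnapBoostingMachineClassifier
#   trained = fit_with_validation(
#       SnapBoostingMachineClassifier(early_stopping_rounds=5), X, y)
```

With a validation set supplied, early_stopping_rounds takes effect; without one, training runs for the full num_round iterations.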

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array) –
The outer array is over samples aka rows.
- items : array of items : float
  
  The inner array is over features aka columns.
n_jobs (integer, >=1, optional, default 1) – Number of threads used to run inference.

Returns

result – The predicted classes.

- array of items : float
- or array of items : string
- or array of items : boolean

Return type

union type

predict_proba(X)

Probability estimates for all classes.

Note: The predict_proba method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array, optional) –
The outer array is over samples aka rows.
- items : array of items : float
  
  The inner array is over features aka columns.
n_jobs (integer, >=1, optional, default 1) – Number of threads used to run inference.

Returns

result – The outer array is over samples aka rows.

- items : array of items : float
  
  The inner array contains probabilities corresponding to each class.

Return type

array
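The nested-array result can be turned into class labels by taking the most probable column in each row. The sketch below uses a hypothetical helper, labels_from_proba, and assumes that column i of each row corresponds to classes[i]; the documentation above does not specify the column order, so check it against the trained operator before relying on it.

```python
def labels_from_proba(proba, classes):
    """Map each probability row to the label of its largest entry.

    Assumption: proba[r][i] is the probability of classes[i].
    Ties resolve to the first (lowest-index) class.
    """
    labels = []
    for row in proba:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best])
    return labels
```

For example, labels_from_proba([[0.9, 0.1], [0.2, 0.8]], ["neg", "pos"]) selects "neg" for the first row and "pos" for the second.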