lale.lib.snapml.snap_boosting_machine_classifier module

class lale.lib.snapml.snap_boosting_machine_classifier.SnapBoostingMachineClassifier(*, num_round=100, learning_rate=0.1, random_state=0, colsample_bytree=1.0, subsample=1.0, verbose=False, lambda_l2=0.0, early_stopping_rounds=10, compress_trees=False, base_score=None, class_weight=None, max_depth=None, min_max_depth=1, max_max_depth=5, n_jobs=1, use_histograms=True, hist_nbins=256, use_gpu=False, gpu_id=0, tree_select_probability=1.0, regularizer=1.0, fit_intercept=False, gamma=1.0, n_components=10)

Bases: PlannedIndividualOp

Boosting machine classifier from Snap ML. It can be used for binary classification problems.

This documentation is auto-generated from JSON schemas.

Parameters
  • num_round (integer, >=1, >=100 for optimizer, <=1000 for optimizer, default 100) – Number of boosting iterations.

  • learning_rate (float, >0.0, >=0.01 for optimizer, <=0.3 for optimizer, uniform distribution, default 0.1) – Learning rate / shrinkage factor.

  • random_state (integer, not for optimizer, default 0) – Random seed.

  • colsample_bytree (float, >0.0, <=1.0, not for optimizer, default 1.0) – Fraction of feature columns used at each boosting iteration.

  • subsample (float, >0.0, <=1.0, not for optimizer, default 1.0) – Fraction of training examples used at each boosting iteration.

  • verbose (boolean, not for optimizer, default False) – Print information during training.

  • lambda_l2 (float, >=0.0, not for optimizer, default 0.0) – L2-regularization penalty used during tree-building.

  • early_stopping_rounds (integer, >=1, not for optimizer, default 10) – When a validation set is provided, training will stop if the validation loss does not improve after this number of rounds.

  • compress_trees (boolean, not for optimizer, default False) – Compress trees after training for fast inference.

  • base_score (union type, not for optimizer, default None) –

    Base score used to initialize the boosting algorithm. If None, the algorithm initializes the base score to the logit of the probability of the positive class.

    • float

    • or None

  • class_weight (‘balanced’ or None, not for optimizer, default None) – If set to ‘balanced’, sample weights will be applied to account for class imbalance; otherwise no sample weights will be used.

  • max_depth (union type, not for optimizer, default None) –

    If set, overrides both depth bounds so that min_max_depth = max_depth = max_max_depth.

    • integer, >=1

    • or None

  • min_max_depth (integer, >=1, >=1 for optimizer, <=5 for optimizer, default 1) – Minimum max_depth of trees in the ensemble.

  • max_max_depth (integer, >=1, >=5 for optimizer, <=10 for optimizer, default 5) – Maximum max_depth of trees in the ensemble.

  • n_jobs (integer, >=1, not for optimizer, default 1) – Number of threads to use during training.

  • use_histograms (boolean, not for optimizer, default True) –

    Use histograms to accelerate tree-building.

    See also constraint-1.

  • hist_nbins (integer, not for optimizer, default 256) – Number of histogram bins.

  • use_gpu (boolean, not for optimizer, default False) –

    Use GPU for tree-building.

    See also constraint-1.

  • gpu_id (integer, not for optimizer, default 0) – Device ID for GPU to use during training.

  • tree_select_probability (float, >=0.0, <=1.0, not for optimizer, default 1.0) – Probability of selecting a tree (rather than a kernel ridge regressor) at each boosting iteration.

  • regularizer (float, >=0.0, not for optimizer, default 1.0) – L2-regularization penalty for the kernel ridge regressor.

  • fit_intercept (boolean, not for optimizer, default False) – Include intercept term in the kernel ridge regressor.

  • gamma (float, >=0.0, not for optimizer, default 1.0) – Gaussian kernel parameter.

  • n_components (integer, >=1, not for optimizer, default 10) – Number of components in the random projection.
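
Two of the defaults described above may be easier to follow in code. The following is a minimal sketch, not Snap ML's implementation. `default_base_score` assumes that the probability of the positive class is estimated as the fraction of positive training labels, and `resolve_depth_range` mirrors the documented rule that a non-None max_depth overrides both depth bounds. Both helper names and the `positive_label` argument are hypothetical.

```python
import math


def default_base_score(labels, positive_label=1):
    """Logit of the positive-class fraction (assumed estimate of its probability)."""
    positives = sum(1 for label in labels if label == positive_label)
    p = positives / len(labels)
    # The logit is undefined for p == 0 or p == 1; this sketch does not handle that.
    return math.log(p / (1.0 - p))


def resolve_depth_range(max_depth=None, min_max_depth=1, max_max_depth=5):
    """Return the (minimum, maximum) tree depth bounds after applying max_depth."""
    if max_depth is not None:
        # Documented rule: min_max_depth = max_depth = max_max_depth.
        return max_depth, max_depth
    return min_max_depth, max_max_depth
```

With the documented defaults, `resolve_depth_range()` yields the bounds 1 and 5, while passing `max_depth=4` collapses both bounds to 4.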

Notes

constraint-1 : union type

GPU is only supported for histogram-based splits.

  • use_gpu : False

  • or use_histograms : True
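
The constraint above can be read as a boolean condition on the two hyperparameters. As a small illustration (the function name is hypothetical and is not part of lale or Snap ML), a combination is valid when use_gpu is False or use_histograms is True:

```python
def satisfies_constraint_1(use_gpu=False, use_histograms=True):
    """Return True when the combination is allowed by constraint-1."""
    # Either the GPU is not used, or histogram-based splits are enabled.
    return (not use_gpu) or use_histograms
```

The only rejected combination is `use_gpu=True` together with `use_histograms=False`.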

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    The outer array is over samples aka rows.

    • items : array of items : float

      The inner array is over features aka columns.

  • y (union type) –

    The classes.

    • array of items : float

    • or array of items : string

    • or array of items : boolean

  • sample_weight (union type, optional, default None) –

    Sample weights.

    • array of items : float

    • or None

      Samples are equally weighted.

  • X_val (union type, optional, default None) –

    • array

      The outer array is over validation samples aka rows.

      • items : array of items : float

        The inner array is over features aka columns.

    • or None

      No validation set provided.

  • y_val (union type, optional, default None) –

    The validation classes.

    • array of items : float

    • or array of items : string

    • or array of items : boolean

    • or None

      No validation set provided.

  • sample_weight_val (union type, optional, default None) –

    Validation sample weights.

    • array of items : float

    • or None

      Validation samples are equally weighted.
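
The validation arguments above work together with the early_stopping_rounds hyperparameter. The sketch below is illustrative only. `stopping_round` assumes a common patience rule (stop once the validation loss has failed to improve for early_stopping_rounds consecutive rounds); Snap ML's exact criterion may differ. `fit_with_validation` shows how the documented keyword arguments might be passed; it is an untested sketch that requires lale and snapml to be installed, so the import is deferred until the function is called. Both function names are hypothetical.

```python
def stopping_round(val_losses, early_stopping_rounds=10):
    """Return the 1-based round at which training would stop (assumed patience rule)."""
    best = float("inf")
    rounds_without_improvement = 0
    for round_number, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= early_stopping_rounds:
                return round_number
    # No early stop was triggered; all rounds were run.
    return len(val_losses)


def fit_with_validation(X, y, X_val, y_val):
    """Train with a validation set, using the parameter names documented above."""
    # Deferred import: only needed when the function is actually called.
    from lale.lib.snapml.snap_boosting_machine_classifier import (
        SnapBoostingMachineClassifier,
    )

    trainable = SnapBoostingMachineClassifier(num_round=200, early_stopping_rounds=5)
    return trainable.fit(X, y, X_val=X_val, y_val=y_val)
```

Under the assumed rule, a larger early_stopping_rounds value tolerates longer plateaus in the validation loss before training halts.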

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    The outer array is over samples aka rows.

    • items : array of items : float

      The inner array is over features aka columns.

  • n_jobs (integer, >=1, optional, default 1) – Number of threads used to run inference.

Returns

result – The predicted classes.

  • array of items : float

  • or array of items : string

  • or array of items : boolean

Return type

union type

predict_proba(X)

Probability estimates for all classes.

Note: The predict_proba method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array, optional) –

    The outer array is over samples aka rows.

    • items : array of items : float

      The inner array is over features aka columns.

  • n_jobs (integer, >=1, optional, default 1) – Number of threads used to run inference.

Returns

result – The outer array is over samples aka rows.

  • items : array of items : float

    The inner array contains probabilities corresponding to each class.

Return type

array
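
The nested array returned by predict_proba can be turned into class labels by picking the most probable class for each sample. The helper below is a minimal sketch with a hypothetical name. It assumes that the caller supplies `classes` in the same order as the columns of the inner arrays, which this page does not specify.

```python
def labels_from_proba(proba, classes):
    """For each sample, return the class with the highest probability."""
    labels = []
    for row in proba:
        # Index of the largest probability; ties resolve to the first index.
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best_index])
    return labels
```

For a binary problem, each inner array holds two probabilities, so `classes` would be a two-element sequence.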