lale.lib.snapml.snap_decision_tree_regressor module

class lale.lib.snapml.snap_decision_tree_regressor.SnapDecisionTreeRegressor(*, criterion='mse', splitter='best', max_depth=None, min_samples_leaf=1, max_features=None, random_state=None, n_jobs=1, use_histograms=True, hist_nbins=256, use_gpu=False, gpu_id=0, verbose=False)

Bases: PlannedIndividualOp

Decision tree regressor from Snap ML.

This documentation is auto-generated from JSON schemas.

Parameters
  • criterion ('mse', optional, not for optimizer, default 'mse') – Function to measure the quality of a split.

  • splitter ('best', optional, not for optimizer, default 'best') – The strategy used to choose the split at each node.

  • max_depth (union type, optional, default None) –

    The maximum depth of the tree.

    • integer, >=1, >=3 for optimizer, <=5 for optimizer

    • or None

      Nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_leaf samples.

  • min_samples_leaf (union type, optional, not for optimizer, default 1) –

    The minimum number of samples required to be at a leaf node.

    • integer, >=1, <='X/maxItems', not for optimizer

      Consider min_samples_leaf as the minimum number.

    • or float, >0.0, <=0.5

      min_samples_leaf is a fraction, and ceil(min_samples_leaf * n_samples) is the minimum number of samples for each node.

  • max_features (union type, optional, default None) –

    The number of features to consider when looking for the best split.

    • integer, >=1, <='X/items/maxItems', not for optimizer

      Consider max_features features at each split.

    • or float, >0.0, >=0.1 for optimizer, <=1.0, <=0.9 for optimizer, uniform distribution

      max_features is a fraction and int(max_features * n_features) features are considered at each split.

    • or 'auto', 'sqrt', 'log2', or None

  • random_state (union type, optional, not for optimizer, default None) –

    Seed of pseudo-random number generator.

    • None

      RandomState used by np.random

    • or integer

      Explicit seed.

  • n_jobs (integer, >=1, optional, not for optimizer, default 1) – Number of CPU threads to use.

  • use_histograms (boolean, optional, not for optimizer, default True) –

    Use histogram-based splits rather than exact splits.

    See also constraint-1.

  • hist_nbins (integer, >=1, >=16 for optimizer, <=256, <=256 for optimizer, optional, default 256) – Number of histogram bins.

  • use_gpu (boolean, optional, not for optimizer, default False) –

    Use GPU acceleration (only supported for histogram-based splits).

    See also constraint-1.

  • gpu_id (integer, optional, not for optimizer, default 0) – Device ID of the GPU which will be used when GPU acceleration is enabled.

  • verbose (boolean, optional, not for optimizer, default False) – If True, it prints debugging information while training. Warning: this will increase the training time. For performance evaluation, use verbose=False.
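
The fractional forms of min_samples_leaf and max_features described above can be illustrated with a short sketch. The helper names min_leaf_count and features_per_split are hypothetical and are not part of lale or Snap ML; they only mirror the ceil(...) and int(...) formulas and the numeric bounds stated in the parameter list. The 'auto', 'sqrt', 'log2' and None forms of max_features are deliberately not covered, because this page does not define them.

```python
import math


def min_leaf_count(min_samples_leaf, n_samples):
    """Resolve min_samples_leaf to an absolute sample count (hypothetical helper)."""
    if isinstance(min_samples_leaf, int):
        # Integer form: the value is already the minimum number of samples.
        if min_samples_leaf < 1:
            raise ValueError("integer min_samples_leaf must be >= 1")
        return min_samples_leaf
    # Float form: a fraction of the training set, rounded up.
    if not 0.0 < min_samples_leaf <= 0.5:
        raise ValueError("float min_samples_leaf must be in (0.0, 0.5]")
    return math.ceil(min_samples_leaf * n_samples)


def features_per_split(max_features, n_features):
    """Resolve max_features to an absolute feature count (hypothetical helper)."""
    if isinstance(max_features, int):
        # Integer form: consider exactly this many features at each split.
        if max_features < 1:
            raise ValueError("integer max_features must be >= 1")
        return max_features
    if isinstance(max_features, float):
        # Float form: a fraction of the feature count, truncated toward zero.
        if not 0.0 < max_features <= 1.0:
            raise ValueError("float max_features must be in (0.0, 1.0]")
        return int(max_features * n_features)
    raise ValueError("only the integer and float forms are covered in this sketch")


print(min_leaf_count(0.25, 10))     # ceil(2.5) -> 3
print(features_per_split(0.5, 7))   # int(3.5)  -> 3
```

Note that the two fractional forms round in opposite directions: min_samples_leaf rounds up, while max_features truncates.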

Notes

constraint-1 : union type

GPU only supported for histogram-based splits.

  • use_gpu : False

  • or use_histograms : True
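
As a minimal sketch, constraint-1 can be read as a boolean condition over the two hyperparameters: a configuration is valid when use_gpu is False or use_histograms is True. The function name satisfies_constraint_1 is hypothetical and only illustrates the rule; it is not how lale evaluates its schemas.

```python
from itertools import product


def satisfies_constraint_1(use_gpu=False, use_histograms=True):
    """Return True when the (use_gpu, use_histograms) pair is allowed.

    Hypothetical helper: mirrors "use_gpu : False or use_histograms : True".
    Defaults match the documented hyperparameter defaults.
    """
    return (not use_gpu) or use_histograms


# Enumerate all four combinations; only GPU with exact splits is rejected.
for use_gpu, use_histograms in product([False, True], repeat=2):
    ok = satisfies_constraint_1(use_gpu, use_histograms)
    print(f"use_gpu={use_gpu}, use_histograms={use_histograms}: {ok}")
```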

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    The outer array is over samples aka rows.

    • items : array of items : float

      The inner array is over features aka columns.

  • y (union type of array of items : float) – The regression target.

  • sample_weight (union type, optional, default None) –

    Sample weights.

    • array of items : float

    • or None

      Samples are equally weighted.
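
The input layout expected by fit can be sketched with plain Python lists: X is an array of rows, each row an array of floats, and y holds one float per row. The helper check_fit_inputs is hypothetical and not part of lale. The requirements that rows have equal length and that y and sample_weight match the number of rows are assumptions of this sketch rather than statements from the schema above.

```python
from numbers import Real


def check_fit_inputs(X, y, sample_weight=None):
    """Check the nested-array layout described above (hypothetical helper).

    Returns (n_samples, n_features) when the inputs look consistent.
    """
    n_samples = len(X)
    if n_samples == 0:
        raise ValueError("X must contain at least one sample")
    n_features = len(X[0])
    for row in X:
        # Inner arrays are over features, so every row needs the same width.
        if len(row) != n_features:
            raise ValueError("all rows of X must have the same number of features")
        if not all(isinstance(value, Real) for value in row):
            raise TypeError("X must contain only numeric values")
    if len(y) != n_samples:
        raise ValueError("y must have one target per sample")
    # sample_weight=None means samples are equally weighted.
    if sample_weight is not None and len(sample_weight) != n_samples:
        raise ValueError("sample_weight must have one weight per sample")
    return n_samples, n_features


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 samples (rows), 2 features (columns)
y = [0.5, 1.5, 2.5]                       # one regression target per row
print(check_fit_inputs(X, y))             # (3, 2)
```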

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    The outer array is over samples aka rows.

    • items : array of items : float

      The inner array is over features aka columns.

  • n_jobs (integer, >=0, optional, default 0) – Number of threads used to run inference. By default, inference runs with the maximum number of available threads.

Returns

result – The predicted values.

Return type

union type of array of items : float
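
The n_jobs argument of predict differs from the constructor hyperparameter of the same name: it accepts 0 (the default), which selects the maximum number of available threads, whereas the constructor requires a value of at least 1. The sketch below illustrates that rule. The helper resolve_predict_threads is hypothetical, and using os.cpu_count() as a stand-in for "available threads" is an assumption; Snap ML may determine the thread count differently.

```python
import os


def resolve_predict_threads(n_jobs=0, available=None):
    """Map predict's n_jobs to a concrete thread count (hypothetical helper)."""
    if n_jobs < 0:
        raise ValueError("n_jobs must be >= 0")
    if n_jobs == 0:
        # Assumption: approximate "maximum available threads" with the CPU count.
        if available is None:
            available = os.cpu_count() or 1
        return available
    return n_jobs


print(resolve_predict_threads(4))               # explicit request -> 4
print(resolve_predict_threads(0, available=8))  # default on an 8-thread machine -> 8
```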