lale.lib.xgboost.xgb_regressor module
- class lale.lib.xgboost.xgb_regressor.XGBRegressor(*, max_depth=None, learning_rate=None, n_estimators, verbosity=None, silent=None, objective='reg:linear', booster=None, tree_method=None, n_jobs=1, nthread=None, gamma=None, min_child_weight=None, max_delta_step=None, subsample=None, colsample_bytree=None, colsample_bylevel=None, colsample_bynode=None, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, base_score=None, random_state=0, missing=nan, importance_type='gain', seed=None, monotone_constraints=None, interaction_constraints=None, num_parallel_tree=None, validate_parameters=None, gpu_id=None, enable_categorical=False, predictor=None, max_leaves=None, max_bin=None, grow_policy=None, sampling_method=None, max_cat_to_onehot=None, eval_metric=None, early_stopping_rounds=None, callbacks=None, feature_types, max_cat_threshold=None, device=None, multi_strategy=None)
Bases: PlannedIndividualOp
XGBRegressor gradient boosted decision trees.
This documentation is auto-generated from JSON schemas.
- Parameters
max_depth (union type, default None) –
Maximum tree depth for base learners.
integer, >=0, >=1 for optimizer, <=7 for optimizer, uniform distribution
or None, not for optimizer
learning_rate (union type, default None) –
Boosting learning rate (xgb’s “eta”)
float, >=0.02 for optimizer, <=1 for optimizer, loguniform distribution
or None, not for optimizer
n_estimators (union type) –
Number of trees to fit.
integer, >=50 for optimizer, <=1000 for optimizer, default 200
or None
verbosity (union type, >=0, <=3, not for optimizer, default None) –
The degree of verbosity.
integer
or None
silent (union type, optional, not for optimizer, default None) –
Deprecated and replaced by verbosity; kept for backward compatibility.
boolean
or None
objective (union type, not for optimizer, default 'reg:linear') –
Specify the learning task and the corresponding learning objective or a custom objective function to be used.
‘reg:linear’, ‘reg:logistic’, ‘reg:gamma’, ‘reg:tweedie’, or ‘reg:squarederror’
or callable, not for optimizer
booster (‘gbtree’, ‘gblinear’, ‘dart’, or None, not for optimizer, default None) – Specify which booster to use.
tree_method (‘auto’, ‘exact’, ‘approx’, ‘hist’, ‘gpu_hist’, or None, not for optimizer, default None) – Specify which tree method to use. Defaults to auto. If this parameter is left at its default, XGBoost will choose the most conservative option available. Refer to https://xgboost.readthedocs.io/en/latest/parameter.html.
n_jobs (union type, not for optimizer, default 1) –
Number of parallel threads used to run xgboost (replaces nthread).
integer
or None
nthread (union type, optional, not for optimizer, default None) –
Number of parallel threads used to run xgboost. Deprecated; use n_jobs instead.
integer
or None
gamma (union type, default None) –
Minimum loss reduction required to make a further partition on a leaf node of the tree.
float, >=0, <=1.0 for optimizer
or None, not for optimizer
min_child_weight (union type, default None) –
Minimum sum of instance weight (hessian) needed in a child.
integer, >=2 for optimizer, <=20 for optimizer, uniform distribution
or None, not for optimizer
max_delta_step (union type, not for optimizer, default None) –
Maximum delta step we allow each tree’s weight estimation to be.
None
or integer
subsample (union type, default None) –
Subsample ratio of the training instance.
float, >0, >=0.01 for optimizer, <=1.0 for optimizer, uniform distribution
or None, not for optimizer
colsample_bytree (union type, not for optimizer, default None) –
Subsample ratio of columns when constructing each tree.
float, >0, >=0.1 for optimizer, <=1, <=1.0 for optimizer, uniform distribution
or None, not for optimizer
colsample_bylevel (union type, not for optimizer, default None) –
Subsample ratio of columns for each split, in each level.
float, >0, >=0.1 for optimizer, <=1, <=1.0 for optimizer, uniform distribution
or None, not for optimizer
colsample_bynode (union type, not for optimizer, default None) –
Subsample ratio of columns for each split.
float, >0, <=1
or None, not for optimizer
reg_alpha (union type, default None) –
L1 regularization term on weights
float, >=0 for optimizer, <=1 for optimizer, uniform distribution
or None, not for optimizer
reg_lambda (union type, default None) –
L2 regularization term on weights
float, >=0.1 for optimizer, <=1 for optimizer, uniform distribution
or None, not for optimizer
scale_pos_weight (union type, not for optimizer, default None) –
Balancing of positive and negative weights.
float
or None, not for optimizer
base_score (union type, not for optimizer, default None) –
The initial prediction score of all instances, global bias.
float
or None, not for optimizer
random_state (union type, not for optimizer, default 0) –
Random number seed. (replaces seed)
None
RandomState used by np.random
or integer
The seed used by the random number generator
or numpy.random.RandomState
Random number generator instance.
missing (union type, not for optimizer, default nan) –
Value in the data to be treated as missing. If None, defaults to np.nan.
float
or None or nan
importance_type (‘gain’, ‘weight’, ‘cover’, ‘total_gain’, ‘total_cover’, or None, optional, not for optimizer, default ‘gain’) – The feature importance type for the feature_importances_ property.
seed (any type, optional, not for optimizer, default None) – Deprecated and replaced by random_state; kept for backward compatibility.
monotone_constraints (union type, optional, not for optimizer, default None) –
Constraint of variable monotonicity.
None
or string
interaction_constraints (union type, optional, not for optimizer, default None) –
Constraints for interaction representing permitted interactions. The constraints must be specified in the form of a nested list, e.g. [[0, 1], [2, 3, 4]], where each inner list is a group of indices of features that are allowed to interact with each other.
None
or string
num_parallel_tree (union type, optional, not for optimizer, default None) –
Used for boosting random forest.
None
or integer
validate_parameters (union type, optional, not for optimizer, default None) –
Give warnings for unknown parameters.
None
or boolean
or integer
gpu_id (union type, optional, not for optimizer, default None) –
Device ordinal.
integer
or None
enable_categorical (boolean, optional, not for optimizer, default False) – Experimental support for categorical data. Do not set to true unless you are interested in development. Only valid when gpu_hist and dataframe are used.
predictor (union type, optional, not for optimizer, default None) –
Force XGBoost to use specific predictor, available choices are [cpu_predictor, gpu_predictor].
string
or None
max_leaves (union type, optional, not for optimizer, default None) –
Maximum number of leaves; 0 indicates no limit.
integer
or None, not for optimizer
max_bin (union type, optional, not for optimizer, default None) –
If using histogram-based algorithm, maximum number of bins per feature.
integer
or None, not for optimizer
grow_policy (0, 1, ‘depthwise’, ‘lossguide’, or None, optional, not for optimizer, default None) –
Tree growing policy.
0 or depthwise: favor splitting at nodes closest to the root, i.e. grow depth-wise.
1 or lossguide: favor splitting at nodes with highest loss change.
sampling_method (‘uniform’, ‘gradient_based’, or None, optional, not for optimizer, default None) –
Sampling method. Used only by gpu_hist tree method.
uniform: select random training instances uniformly.
gradient_based: select random training instances with higher probability when the gradient and hessian are larger. (cf. CatBoost)
max_cat_to_onehot (union type, optional, not for optimizer, default None) –
A threshold for deciding whether XGBoost should use one-hot encoding based split for categorical data.
integer
or None
eval_metric (union type, optional, not for optimizer, default None) –
Metric used for monitoring the training result and early stopping.
string
or array of items : string
or array, not for optimizer of items : callable
or None
early_stopping_rounds (union type, optional, not for optimizer, default None) –
Activates early stopping. Validation metric needs to improve at least once in every early_stopping_rounds round(s) to continue training.
integer
or None
callbacks (union type, optional, not for optimizer, default None) –
List of callback functions that are applied at the end of each iteration. It is possible to use predefined callbacks by using the Callback API.
array, not for optimizer of items : callable
or None
feature_types (Any, optional, not for optimizer) – Used for specifying feature types without constructing a dataframe. See DMatrix for details.
max_cat_threshold (union type, optional, not for optimizer, default None) –
Maximum number of categories considered for each split. Used only by partition-based splits for preventing over-fitting. Also, enable_categorical needs to be set to have categorical feature support. See Categorical Data and Parameters for Categorical Feature for details.
integer, >=0, >=1 for optimizer, <=10 for optimizer, uniform distribution
or None
device (union type, optional, not for optimizer, default None) –
Device ordinal
‘cpu’, ‘cuda’, or ‘gpu’
or None
multi_strategy (union type, optional, not for optimizer, default None) –
The strategy used for training multi-target models, including multi-target regression and multi-class classification. See Multiple Outputs for more information.
‘one_output_per_tree’
One model for each target.
or ‘multi_output_tree’
Use multi-target trees.
or None
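The "for optimizer" ranges and distributions listed above describe the search space an optimizer draws from. The following is a minimal sketch of sampling one configuration from those ranges using only the Python standard library. SEARCH_SPACE and sample_hyperparams are hypothetical names, not part of lale, and parameters whose distribution is not stated above (n_estimators, gamma) are assumed to be uniform.

```python
import math
import random

# Optimizer ranges copied from the parameter list above.
# Each entry is (low, high, kind); kind is "int", "uniform", or "loguniform".
# Assumption: n_estimators and gamma are sampled uniformly, since the
# list above does not name a distribution for them.
SEARCH_SPACE = {
    "max_depth": (1, 7, "int"),
    "learning_rate": (0.02, 1.0, "loguniform"),
    "n_estimators": (50, 1000, "int"),
    "gamma": (0.0, 1.0, "uniform"),
    "min_child_weight": (2, 20, "int"),
    "subsample": (0.01, 1.0, "uniform"),
    "reg_alpha": (0.0, 1.0, "uniform"),
    "reg_lambda": (0.1, 1.0, "uniform"),
}


def sample_hyperparams(rng):
    """Draw one configuration from SEARCH_SPACE (hypothetical helper)."""
    params = {}
    for name, (low, high, kind) in SEARCH_SPACE.items():
        if kind == "int":
            # randint is inclusive on both ends.
            params[name] = rng.randint(low, high)
        elif kind == "loguniform":
            # Uniform in log space, then mapped back; clamped to guard
            # against floating-point rounding at the boundaries.
            value = math.exp(rng.uniform(math.log(low), math.log(high)))
            params[name] = min(max(value, low), high)
        else:
            params[name] = rng.uniform(low, high)
    return params


params = sample_hyperparams(random.Random(0))
```

With lale and xgboost installed, a dictionary like params could be passed on as XGBRegressor(**params). That step is left out here so the sketch runs without either package.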
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – Feature matrix
y (array of items : float) – Labels
sample_weight (union type, optional, default None) –
Weight for each instance
array of items : float
or None
eval_set (union type, optional, default None) –
A list of (X, y) pairs to use as a validation set for
array
or None
sample_weight_eval_set (union type, optional, default None) –
A list of the form [L_1, L_2, …, L_n], where each L_i is a list of
array
or None
eval_metric (union type, optional, default None) –
If a str, should be a built-in evaluation metric to use. See
array of items : string
or string
or None
or dict
early_stopping_rounds (union type, optional, default None) –
Activates early stopping. Validation error needs to decrease at
integer
or None
verbose (boolean, optional, default True) – If verbose and an evaluation set is used, writes the evaluation
xgb_model (union type, optional, default None) –
file name of stored xgb model or ‘Booster’ instance Xgb model to be
string
or None
callbacks (union type, optional, default None) –
List of callback functions that are applied at each iteration.
array of items : dict
or None
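The early_stopping_rounds rule can be illustrated without training a model. The sketch below is a simplified model of the rule, not XGBoost's implementation: given a list of per-round validation errors, it stops once the error has failed to improve for early_stopping_rounds consecutive rounds. The function name best_iteration_with_early_stopping and the sample error values are hypothetical.

```python
def best_iteration_with_early_stopping(val_errors, early_stopping_rounds):
    """Return (best_iteration, last_iteration_run) for a list of validation errors.

    Training is considered stopped once early_stopping_rounds rounds have
    passed without a new lowest validation error.
    """
    best_error = float("inf")
    best_iter = -1
    for i, err in enumerate(val_errors):
        if err < best_error:
            # New best score: remember it and keep training.
            best_error, best_iter = err, i
        elif i - best_iter >= early_stopping_rounds:
            # No improvement for early_stopping_rounds rounds: stop here.
            return best_iter, i
    # Ran through every round without triggering the stopping rule.
    return best_iter, len(val_errors) - 1


# Hypothetical validation errors, one per boosting round.
errors = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stopped = best_iteration_with_early_stopping(errors, early_stopping_rounds=3)
patient = best_iteration_with_early_stopping(errors, early_stopping_rounds=10)
```

In this made-up sequence, a patience of 3 stops before the late improvement in the final round is seen, while a patience of 10 runs all rounds. This is the usual trade-off when choosing early_stopping_rounds.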
- partial_fit(X, y=None, **fit_params)
Incremental fit to train the operator on a batch of samples.
Note: The partial_fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – Feature matrix
y (array of items : float) – Labels
sample_weight (union type, optional, default None) –
Weight for each instance
array of items : float
or None
eval_set (union type, optional, default None) –
A list of (X, y) pairs to use as a validation set for
array
or None
sample_weight_eval_set (union type, optional, default None) –
A list of the form [L_1, L_2, …, L_n], where each L_i is a list of
array
or None
eval_metric (union type, optional, default None) –
If a str, should be a built-in evaluation metric to use. See
array of items : string
or string
or None
or dict
early_stopping_rounds (union type, optional, default None) –
Activates early stopping. Validation error needs to decrease at
integer
or None
verbose (boolean, optional, default True) – If verbose and an evaluation set is used, writes the evaluation
xgb_model (union type, optional, default None) –
file name of stored xgb model or ‘Booster’ instance Xgb model to be
string
or None
callbacks (union type, optional, default None) –
List of callback functions that are applied at each iteration.
array of items : dict
or None
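A minimal sketch of the batch-wise calling pattern that partial_fit is meant for, using only the standard library. iter_batches is a hypothetical helper, and MeanRegressor is a hypothetical stand-in that only mimics the partial_fit(X, y) call shape from the signature above; it is not the lale operator and does no boosting.

```python
def iter_batches(X, y, batch_size):
    """Yield consecutive (X_batch, y_batch) slices of at most batch_size rows."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]


class MeanRegressor:
    """Hypothetical stand-in estimator: predicts the running mean of y."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def partial_fit(self, X, y=None, **fit_params):
        # Update running statistics from this batch only.
        self.count += len(y)
        self.total += sum(y)
        return self

    def predict(self, X):
        mean = self.total / self.count
        return [mean for _ in X]


X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1.0, 2.0, 3.0, 4.0, 5.0]

model = MeanRegressor()
n_batches = 0
for X_batch, y_batch in iter_batches(X, y, batch_size=2):
    model = model.partial_fit(X_batch, y_batch)
    n_batches += 1

predictions = model.predict([[9.0]])
```

The loop reassigns model on every call so that the same pattern still works if partial_fit returns a new object instead of updating in place.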
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – The dmatrix storing the input.
output_margin (boolean, optional, default False) – Whether to output the raw untransformed margin value.
ntree_limit (union type, optional) –
Limit number of trees in the prediction; defaults to best_ntree_limit if defined
integer
or None
validate_features (boolean, optional, default True) – When this is True, validate that the Booster’s and data’s feature_names are identical.
- Returns
result – Output data schema for predictions (predicted target values).
- Return type
array of items : float
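To illustrate what ntree_limit controls, the sketch below models a boosted ensemble as a base score plus the sum of per-tree outputs, and restricts the sum to the first ntree_limit trees. It is a conceptual model under stated assumptions, not XGBoost's prediction code: predict_one is a hypothetical name, the trees are hand-written stumps, base_score=0.5 is an arbitrary value chosen for the sketch, and None is assumed to mean "use all trees".

```python
def predict_one(x, trees, base_score=0.5, ntree_limit=None):
    """Additive prediction for one row: base score plus the outputs of the used trees."""
    used = trees if ntree_limit is None else trees[:ntree_limit]
    return base_score + sum(tree(x) for tree in used)


# Three hand-written decision stumps standing in for fitted trees.
trees = [
    lambda x: 1.0 if x[0] > 0.5 else -1.0,
    lambda x: 0.5 if x[1] > 2.0 else -0.5,
    lambda x: 0.25 if x[0] > 0.8 else -0.25,
]

# Input follows the "array of items : array of items : float" shape.
X = [[0.9, 3.0], [0.1, 1.0]]

# Output follows the "array of items : float" shape.
full = [predict_one(x, trees) for x in X]
limited = [predict_one(x, trees, ntree_limit=1) for x in X]
```

Here full uses all three stumps and limited uses only the first, so the two lists differ by the contribution of the later trees.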