lale.lib.xgboost.xgb_regressor module
- class lale.lib.xgboost.xgb_regressor.XGBRegressor(*, max_depth=None, learning_rate=None, n_estimators, verbosity=None, silent=None, objective='reg:linear', booster=None, tree_method=None, n_jobs=1, nthread=None, gamma=None, min_child_weight=None, max_delta_step=None, subsample=None, colsample_bytree=None, colsample_bylevel=None, colsample_bynode=None, reg_alpha=None, reg_lambda=None, scale_pos_weight=None, base_score=None, random_state=0, missing=nan, importance_type='gain', seed=None, monotone_constraints=None, interaction_constraints=None, num_parallel_tree=None, validate_parameters=None, gpu_id=None, enable_categorical=False, predictor=None, max_leaves=None, max_bin=None, grow_policy=None, sampling_method=None, max_cat_to_onehot=None, eval_metric=None, early_stopping_rounds=None, callbacks=None, feature_types, max_cat_threshold=None, device=None, multi_strategy=None)
Bases: PlannedIndividualOp
XGBRegressor gradient boosted decision trees.
This documentation is auto-generated from JSON schemas.
- Parameters
max_depth (union type, default None) –
Maximum tree depth for base learners.
integer, >=0, >=1 for optimizer, <=7 for optimizer, uniform distribution
or None, not for optimizer
learning_rate (union type, default None) –
Boosting learning rate (xgb’s “eta”)
float, >=0.02 for optimizer, <=1 for optimizer, loguniform distribution
or None, not for optimizer
n_estimators (union type) –
Number of trees to fit.
integer, >=50 for optimizer, <=1000 for optimizer, default 200
or None
verbosity (union type, >=0, <=3, not for optimizer, default None) –
The degree of verbosity.
integer
or None
silent (union type, optional, not for optimizer, default None) –
Deprecated and replaced with verbosity; kept for backward compatibility.
boolean
or None
objective (union type, not for optimizer, default 'reg:linear') –
Specify the learning task and the corresponding learning objective or a custom objective function to be used.
‘reg:linear’, ‘reg:logistic’, ‘reg:gamma’, ‘reg:tweedie’, or ‘reg:squarederror’
or callable
booster (‘gbtree’, ‘gblinear’, ‘dart’, or None, not for optimizer, default None) – Specify which booster to use.
tree_method (‘auto’, ‘exact’, ‘approx’, ‘hist’, ‘gpu_hist’, or None, not for optimizer, default None) – Specify which tree method to use. Defaults to auto. If this parameter is left at its default, XGBoost will choose the most conservative option available. Refer to https://xgboost.readthedocs.io/en/latest/parameter.html.
n_jobs (union type, not for optimizer, default 1) –
Number of parallel threads used to run xgboost (replaces nthread).
integer
or None
nthread (union type, optional, not for optimizer, default None) –
Number of parallel threads used to run xgboost. Deprecated; use n_jobs instead.
integer
or None
gamma (union type, default None) –
Minimum loss reduction required to make a further partition on a leaf node of the tree.
float, >=0, <=1.0 for optimizer
or None, not for optimizer
min_child_weight (union type, default None) –
Minimum sum of instance weight (hessian) needed in a child.
integer, >=2 for optimizer, <=20 for optimizer, uniform distribution
or None, not for optimizer
max_delta_step (union type, not for optimizer, default None) –
Maximum delta step we allow each tree’s weight estimation to be.
None
or integer
subsample (union type, default None) –
Subsample ratio of the training instance.
float, >0, >=0.01 for optimizer, <=1.0 for optimizer, uniform distribution
or None, not for optimizer
colsample_bytree (union type, not for optimizer, default None) –
Subsample ratio of columns when constructing each tree.
float, >0, >=0.1 for optimizer, <=1, <=1.0 for optimizer, uniform distribution
or None, not for optimizer
colsample_bylevel (union type, not for optimizer, default None) –
Subsample ratio of columns for each split, in each level.
float, >0, >=0.1 for optimizer, <=1, <=1.0 for optimizer, uniform distribution
or None, not for optimizer
colsample_bynode (union type, not for optimizer, default None) –
Subsample ratio of columns for each split.
float, >0, <=1
or None, not for optimizer
reg_alpha (union type, default None) –
L1 regularization term on weights
float, >=0 for optimizer, <=1 for optimizer, uniform distribution
or None, not for optimizer
reg_lambda (union type, default None) –
L2 regularization term on weights
float, >=0.1 for optimizer, <=1 for optimizer, uniform distribution
or None, not for optimizer
scale_pos_weight (union type, not for optimizer, default None) –
Balancing of positive and negative weights.
float
or None, not for optimizer
base_score (union type, not for optimizer, default None) –
The initial prediction score of all instances, global bias.
float
or None, not for optimizer
random_state (union type, not for optimizer, default 0) –
Random number seed. (replaces seed)
integer
or None
missing (union type, not for optimizer, default nan) –
Value in the data that is to be treated as a missing value. If None, defaults to np.nan.
float
or None or nan
importance_type (‘gain’, ‘weight’, ‘cover’, ‘total_gain’, ‘total_cover’, or None, optional, not for optimizer, default ‘gain’) – The feature importance type for the feature_importances_ property.
seed (any type, optional, not for optimizer, default None) – Deprecated and replaced with random_state; kept for backward compatibility.
monotone_constraints (union type, optional, not for optimizer, default None) –
Constraint of variable monotonicity.
None
or string
interaction_constraints (union type, optional, not for optimizer, default None) –
Constraints for interaction representing permitted interactions. The constraints must be specified in the form of a nested list, e.g. [[0, 1], [2, 3, 4]], where each inner list is a group of indices of features that are allowed to interact with each other.
None
or string
num_parallel_tree (union type, optional, not for optimizer, default None) –
Used for boosting random forest.
None
or integer
validate_parameters (union type, optional, not for optimizer, default None) –
Give warnings for unknown parameter.
None
or boolean
or integer
gpu_id (union type, optional, not for optimizer, default None) –
Device ordinal.
integer
or None
enable_categorical (boolean, optional, not for optimizer, default False) – Experimental support for categorical data. Do not set to true unless you are interested in development. Only valid when gpu_hist and dataframe are used.
predictor (union type, optional, not for optimizer, default None) –
Force XGBoost to use specific predictor, available choices are [cpu_predictor, gpu_predictor].
string
or None
max_leaves (union type, optional, not for optimizer, default None) –
Maximum number of leaves; 0 indicates no limit.
integer
or None, not for optimizer
max_bin (union type, optional, not for optimizer, default None) –
If using histogram-based algorithm, maximum number of bins per feature.
integer
or None, not for optimizer
grow_policy (0, 1, ‘depthwise’, ‘lossguide’, or None, optional, not for optimizer, default None) –
Tree growing policy.
0 or depthwise: favor splitting at nodes closest to the root, i.e. grow depth-wise.
1 or lossguide: favor splitting at nodes with highest loss change.
sampling_method (‘uniform’, ‘gradient_based’, or None, optional, not for optimizer, default None) –
Sampling method. Used only by gpu_hist tree method.
uniform: select random training instances uniformly.
gradient_based: select random training instances with higher probability when the gradient and hessian are larger. (cf. CatBoost)
max_cat_to_onehot (union type, optional, not for optimizer, default None) –
A threshold for deciding whether XGBoost should use one-hot encoding based split for categorical data.
integer
or None
eval_metric (union type, optional, not for optimizer, default None) –
Metric used for monitoring the training result and early stopping.
string
or array of items : string
or array of items : callable
or None
early_stopping_rounds (union type, optional, not for optimizer, default None) –
Activates early stopping. Validation metric needs to improve at least once in every early_stopping_rounds round(s) to continue training.
integer
or None
callbacks (union type, optional, not for optimizer, default None) –
List of callback functions that are applied at end of each iteration. It is possible to use predefined callbacks by using Callback API.
array of items : callable
or None
feature_types (Any, optional, not for optimizer) – Used for specifying feature types without constructing a dataframe. See DMatrix for details.
max_cat_threshold (union type, optional, not for optimizer, default None) –
Maximum number of categories considered for each split. Used only by partition-based splits for preventing over-fitting. Also, enable_categorical needs to be set to have categorical feature support. See Categorical Data and Parameters for Categorical Feature for details.
integer, >=0, >=1 for optimizer, <=10 for optimizer, uniform distribution
or None
device (union type, optional, not for optimizer, default None) –
Device ordinal
‘cpu’, ‘cuda’, or ‘gpu’
or None
multi_strategy (union type, optional, not for optimizer, default None) –
The strategy used for training multi-target models, including multi-target regression and multi-class classification. See Multiple Outputs for more information.
‘one_output_per_tree’
One model for each target.
or ‘multi_output_tree’
Use multi-target trees.
or None
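The "for optimizer" bounds and distributions listed above describe the search space an optimizer may explore. As a rough illustration only (this is not how lale or any particular optimizer samples internally), the following sketch draws one configuration from those ranges using the standard library. The names SEARCH_SPACE and sample_hyperparams are hypothetical, and parameters whose distribution is not stated above (n_estimators, gamma) are assumed to be uniform.

```python
import math
import random

# Ranges copied from the "for optimizer" bounds in the parameter list.
# Each entry: (kind, low, high, distribution).
SEARCH_SPACE = {
    "max_depth": ("int", 1, 7, "uniform"),
    "learning_rate": ("float", 0.02, 1.0, "loguniform"),
    "n_estimators": ("int", 50, 1000, "uniform"),  # distribution not stated; uniform assumed
    "gamma": ("float", 0.0, 1.0, "uniform"),  # distribution not stated; uniform assumed
    "min_child_weight": ("int", 2, 20, "uniform"),
    "subsample": ("float", 0.01, 1.0, "uniform"),
    "reg_alpha": ("float", 0.0, 1.0, "uniform"),
    "reg_lambda": ("float", 0.1, 1.0, "uniform"),
}


def sample_hyperparams(rng):
    """Draw one configuration from SEARCH_SPACE using a random.Random instance."""
    config = {}
    for name, (kind, low, high, dist) in SEARCH_SPACE.items():
        if kind == "int":
            # randint is inclusive on both ends.
            config[name] = rng.randint(low, high)
        elif dist == "loguniform":
            # Sample uniformly in log space, map back, and clamp against rounding.
            value = math.exp(rng.uniform(math.log(low), math.log(high)))
            config[name] = min(max(value, low), high)
        else:
            config[name] = rng.uniform(low, high)
    return config


rng = random.Random(0)
config = sample_hyperparams(rng)
```

Parameters marked "not for optimizer" are left out of the sketch, since the documentation excludes them from tuning.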
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – Feature matrix
y (array of items : float) – Labels
sample_weight (union type, optional, default None) –
Weight for each instance
array of items : float
or None
eval_set (union type, optional, default None) –
A list of (X, y) pairs to use as a validation set for
array
or None
sample_weight_eval_set (union type, optional, default None) –
A list of the form [L_1, L_2, …, L_n], where each L_i is a list of
array
or None
eval_metric (union type, optional, default None) –
If a str, should be a built-in evaluation metric to use. See
array of items : string
or string
or None
or dict
early_stopping_rounds (union type, optional, default None) –
Activates early stopping. Validation error needs to decrease at
integer
or None
verbose (boolean, optional, default True) – If verbose and an evaluation set is used, writes the evaluation
xgb_model (union type, optional, default None) –
file name of stored xgb model or ‘Booster’ instance Xgb model to be
string
or None
callbacks (union type, optional, default None) –
List of callback functions that are applied at each iteration.
array of items : dict
or None
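The early_stopping_rounds rule can be pictured with a small standalone sketch. It is not XGBoost's implementation; it only mirrors the stated rule that the validation metric must improve at least once in every early_stopping_rounds rounds. The function name is hypothetical, and a lower-is-better metric (an error) is assumed.

```python
def best_round_with_early_stopping(val_errors, early_stopping_rounds):
    """Return (stop_round, best_round) for per-round validation errors.

    Training stops at the first round where the error has not improved
    for early_stopping_rounds consecutive rounds. Lower error is better.
    """
    best_error = float("inf")
    best_round = -1
    for i, err in enumerate(val_errors):
        if err < best_error:
            # New best: remember it and keep training.
            best_error, best_round = err, i
        elif i - best_round >= early_stopping_rounds:
            # No improvement within the allowed window: stop here.
            return i, best_round
    # Ran through all rounds without triggering early stopping.
    return len(val_errors) - 1, best_round


errors = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stopped_at, best = best_round_with_early_stopping(errors, early_stopping_rounds=3)
```

With a window of 3, the sketch stops before reaching the late improvement in the final round; a larger window would let training continue long enough to find it.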
- partial_fit(X, y=None, **fit_params)
Incremental fit to train the operator on a batch of samples.
Note: The partial_fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – Feature matrix
y (array of items : float) – Labels
sample_weight (union type, optional, default None) –
Weight for each instance
array of items : float
or None
eval_set (union type, optional, default None) –
A list of (X, y) pairs to use as a validation set for
array
or None
sample_weight_eval_set (union type, optional, default None) –
A list of the form [L_1, L_2, …, L_n], where each L_i is a list of
array
or None
eval_metric (union type, optional, default None) –
If a str, should be a built-in evaluation metric to use. See
array of items : string
or string
or None
or dict
early_stopping_rounds (union type, optional, default None) –
Activates early stopping. Validation error needs to decrease at
integer
or None
verbose (boolean, optional, default True) – If verbose and an evaluation set is used, writes the evaluation
xgb_model (union type, optional, default None) –
file name of stored xgb model or ‘Booster’ instance Xgb model to be
string
or None
callbacks (union type, optional, default None) –
List of callback functions that are applied at each iteration.
array of items : dict
or None
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – The dmatrix storing the input.
output_margin (boolean, optional, default False) – Whether to output the raw untransformed margin value.
ntree_limit (union type, optional) –
Limit number of trees in the prediction; defaults to best_ntree_limit if defined
integer
or None
validate_features (boolean, optional, default True) – When this is True, validate that the Booster’s and data’s feature_names are identical.
- Returns
result – Output data schema for predictions (predicted target values).
- Return type
array of items : float
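Putting the pieces together, a typical workflow is to construct the operator with hyperparameters, call fit to obtain a trained operator, and then call predict. The sketch below assumes lale, xgboost, and numpy are installed; the helper names make_toy_data and fit_and_predict are hypothetical, and the hyperparameter values are arbitrary. The objective is set explicitly to ‘reg:squarederror’, one of the values listed above, rather than relying on the default.

```python
def make_toy_data(n_rows=20):
    """Build a small synthetic regression set as nested lists of floats."""
    X = [[float(i), float(i % 4)] for i in range(n_rows)]
    y = [3.0 * a - 2.0 * b for a, b in X]
    return X, y


def fit_and_predict(X, y):
    """Train an XGBRegressor operator on (X, y) and return predictions on X.

    Imports are local so this module loads even when lale, xgboost,
    or numpy are not installed.
    """
    import numpy as np
    from lale.lib.xgboost import XGBRegressor

    # Constructing the operator with hyperparameters yields a trainable operator.
    trainable = XGBRegressor(
        n_estimators=50,
        max_depth=3,
        learning_rate=0.1,
        objective="reg:squarederror",
    )
    # fit returns a trained operator; predict is available only on that result.
    trained = trainable.fit(np.array(X), np.array(y))
    return trained.predict(np.array(X))


X, y = make_toy_data()
# predictions = fit_and_predict(X, y)  # requires lale, xgboost, and numpy
```

The call is left commented out so the snippet can be loaded without the optional dependencies; uncomment it in an environment where they are available.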