lale.lib.lightgbm.lgbm_regressor module
- class lale.lib.lightgbm.lgbm_regressor.LGBMRegressor(*, boosting_type='gbdt', num_leaves=31, max_depth=-1, learning_rate=0.1, n_estimators=200, subsample_for_bin=200000, objective=None, class_weight=None, min_split_gain=0.0, min_child_weight=0.001, min_child_samples=20, subsample=1.0, subsample_freq=0, colsample_bytree=1.0, reg_alpha=0.0, reg_lambda=0.0, random_state=None, n_jobs=-1, silent='warn', importance_type='split', n_job=None)
Bases:
PlannedIndividualOp
Combined schema for expected data and hyperparameters.
This documentation is auto-generated from JSON schemas.
- Parameters
boosting_type (union type, optional, default 'gbdt') –
‘gbdt’
Traditional Gradient Boosting Decision Tree.
or ‘dart’
Dropouts meet Multiple Additive Regression Trees.
or ‘goss’, not for optimizer
Gradient-based One-Side Sampling.
or ‘rf’, not for optimizer
Random Forest.
See also constraint-1, constraint-2.
num_leaves (union type, optional, default 31) –
Maximum tree leaves for base learners
integer, not for optimizer
or 2, 4, 8, 32, 64, 128, or 16
max_depth (union type, optional, not for optimizer, default -1 of integer, >=3 for optimizer, <=5 for optimizer) – Maximum tree depth for base learners, <=0 means no limit
learning_rate (float, >=0.02 for optimizer, <=1.0 for optimizer, loguniform distribution, optional, default 0.1) – Boosting learning rate.
n_estimators (integer, >=50 for optimizer, <=1000 for optimizer, uniform distribution, optional, default 200) – Number of boosted trees to fit.
subsample_for_bin (integer, optional, not for optimizer, default 200000) – Number of samples for constructing bins.
objective (union type, optional, not for optimizer, default None) –
Specify the learning task and the corresponding learning objective or a custom objective function to be used
dict
or ‘regression’ or None
class_weight (union type, optional, not for optimizer, default None) –
Weights associated with classes
dict
or ‘balanced’ or None
min_split_gain (float, optional, not for optimizer, default 0.0) – Minimum loss reduction required to make a further partition on a leaf node of the tree.
min_child_weight (float, >=0.0001 for optimizer, <=0.01 for optimizer, optional, default 0.001) – Minimum sum of instance weight (hessian) needed in a child (leaf).
min_child_samples (integer, >=5 for optimizer, <=30 for optimizer, uniform distribution, optional, default 20) – Minimum number of data points needed in a child (leaf).
subsample (float, >=0.01 for optimizer, <=1.0 for optimizer, uniform distribution, optional, default 1.0) –
Subsample ratio of the training instance.
See also constraint-2.
subsample_freq (integer, >=0 for optimizer, <=5 for optimizer, uniform distribution, optional, default 0) –
Frequency of subsampling; <=0 disables it.
See also constraint-2.
colsample_bytree (float, >=0.01 for optimizer, <=1.0 for optimizer, optional, default 1.0) – Subsample ratio of columns when constructing each tree.
reg_alpha (float, >=0.0 for optimizer, <=1.0 for optimizer, optional, default 0.0) – L1 regularization term on weights.
reg_lambda (float, >=0.0 for optimizer, <=1.0 for optimizer, optional, default 0.0) – L2 regularization term on weights.
random_state (union type, optional, not for optimizer, default None) –
Random number seed. If None, default seeds in C++ code will be used.
integer
or numpy.random.RandomState
or None
n_jobs (integer, optional, not for optimizer, default -1) – Number of parallel threads.
silent (union type, optional, not for optimizer, default 'warn') –
Whether to print messages while running boosting.
‘warn’
or boolean
importance_type (‘split’ or ‘gain’, optional, not for optimizer, default ‘split’) – The type of feature importance to be filled into feature_importances_.
n_job (union type, optional, not for optimizer, default None) –
Number of parallel threads to use for training (can be changed at prediction time by passing it as an extra keyword argument). For better performance, it is recommended to set this to the number of physical cores in the CPU. Negative integers are interpreted following joblib’s formula (n_cpus + 1 + n_jobs), just like scikit-learn (so, for example, -1 means using all threads). A value of zero corresponds to the default number of threads configured for OpenMP in the system.
integer
Number of parallel threads.
or None
Use the number of physical cores in the system (its correct detection requires either the joblib or the psutil util libraries to be installed).
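The joblib-style formula quoted for negative thread counts can be worked through by hand. The following is a minimal sketch, not the library's code: the helper name `effective_threads` is hypothetical, the CPU count is passed in explicitly rather than detected, and clamping the result to at least 1 is an assumption made here for illustration.

```python
def effective_threads(n_jobs, n_cpus):
    """Resolve a thread-count setting to a concrete number (illustrative only).

    n_cpus is supplied by the caller; a real library would detect it.
    Returns None when the choice is deferred to the library or OpenMP default.
    """
    if n_jobs is None or n_jobs == 0:
        # None and 0 both defer to a default chosen elsewhere.
        return None
    if n_jobs < 0:
        # joblib-style formula from the description: n_cpus + 1 + n_jobs.
        # Clamping to 1 is an assumption of this sketch.
        return max(1, n_cpus + 1 + n_jobs)
    return n_jobs


print(effective_threads(-1, 8))  # 8: all threads
print(effective_threads(-2, 8))  # 7: all but one
print(effective_threads(4, 8))   # 4: positive values are used as given
```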
Notes
constraint-1 : union type
boosting_type rf needs bagging (which means subsample_freq > 0 and subsample < 1.0)
boosting_type : negated type of ‘rf’
or intersection type
dict of subsample_freq : negated type of 0
and dict of subsample : negated type of 1.0
constraint-2 : union type
boosting_type goss cannot use bagging (which means subsample_freq = 0 and subsample = 1.0)
boosting_type : negated type of ‘goss’
or subsample_freq : 0
or subsample : 1.0
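The two constraints can be read as boolean conditions over three hyperparameters. The sketch below encodes the schema clauses exactly as listed (not the parenthesized prose, which for constraint-2 is stricter than the listed "or" clauses). The function name `violated_constraints` is hypothetical and is not part of lale.

```python
def violated_constraints(boosting_type="gbdt", subsample=1.0, subsample_freq=0):
    """Return the names of the constraints a configuration violates."""
    violations = []
    # constraint-1: boosting_type is not 'rf',
    # or (subsample_freq is not 0 and subsample is not 1.0).
    if not (boosting_type != "rf" or (subsample_freq != 0 and subsample != 1.0)):
        violations.append("constraint-1")
    # constraint-2: boosting_type is not 'goss',
    # or subsample_freq is 0, or subsample is 1.0.
    if not (boosting_type != "goss" or subsample_freq == 0 or subsample == 1.0):
        violations.append("constraint-2")
    return violations


print(violated_constraints("rf"))                                     # ['constraint-1']
print(violated_constraints("rf", subsample=0.8, subsample_freq=1))    # []
print(violated_constraints("goss", subsample=0.8, subsample_freq=1))  # ['constraint-2']
```

With the defaults (subsample=1.0, subsample_freq=0), ‘rf’ fails constraint-1 because bagging is off, while ‘goss’ passes constraint-2 for the same reason.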
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – The input samples. Internally, it will be converted to
y (array of items : float) – Target values (real numbers).
sample_weight (union type, optional, default None) –
Weights of training data.
array of items : float
or None
init_score (union type, optional, default None) –
Init score of training data.
array of items : float
or None
group (any type, optional, default None) – Group data of training data.
eval_set (any type, optional, default None) – A list of (X, y) tuple pairs to use as validation sets.
eval_names (any type, optional, default None) – Names of eval_set.
eval_sample_weight (any type, optional, default None) – Weights of eval data.
eval_class_weight (union type, optional, default None) –
Class weights of eval data.
array of items : float
or None
eval_init_score (any type, optional, default None) – Init score of eval data.
eval_group (any type, optional, default None) – Group data of eval data.
eval_metric (union type, optional, default None) –
string, list of strings, callable or None, optional (default=None).
array of items : string
or ‘l2’ or None
or callable
early_stopping_rounds (union type, optional, default None) –
Activates early stopping. The model will train until the validation score stops improving.
integer
or None
verbose (union type, optional, default True) –
Requires at least one evaluation dataset.
boolean
or integer
feature_name (union type, optional, default 'auto') –
Feature names. If ‘auto’ and data is a pandas DataFrame, the data column names are used.
array of items : string
or ‘auto’
categorical_feature (union type, optional, default 'auto') –
Categorical features. If list of int, interpreted as indices. If list of strings, interpreted as feature names.
array
items : union type
string
or integer
or ‘auto’
callbacks (union type, optional, default None) –
List of callback functions that are applied at each iteration.
array of items : dict
or None
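The early_stopping_rounds description ("train until the validation score stops improving") follows the usual patience pattern. The sketch below illustrates that general idea on a plain list of per-iteration validation scores; it is not LightGBM's implementation, and the function name, the zero-based indices, and the `higher_is_better` flag are assumptions introduced here.

```python
def early_stopping_iteration(scores, rounds, higher_is_better=False):
    """Return (best_iteration, last_iteration), both zero-based.

    Training is considered stopped once `rounds` consecutive iterations
    have passed without a new best validation score.
    """
    best_score = None
    best_iter = -1
    for i, score in enumerate(scores):
        if best_score is None:
            improved = True
        elif higher_is_better:
            improved = score > best_score
        else:
            improved = score < best_score
        if improved:
            best_score, best_iter = score, i
        elif i - best_iter >= rounds:
            return best_iter, i  # patience exhausted at iteration i
    return best_iter, len(scores) - 1  # ran through all iterations


# Hypothetical l2 validation losses, one per boosting iteration.
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.73, 0.6]
print(early_stopping_iteration(losses, rounds=3))  # (2, 5)
print(early_stopping_iteration(losses, rounds=5))  # (6, 6)
```

With rounds=3, the later improvement at index 6 is never seen because training stops at index 5; a larger patience value reaches it.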
- partial_fit(X, y=None, **fit_params)
Incremental fit to train the operator on a batch of samples.
Note: The partial_fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – The input samples. Internally, it will be converted to
y (array of items : float) – Target values (real numbers).
sample_weight (union type, optional, default None) –
Weights of training data.
array of items : float
or None
init_score (union type, optional, default None) –
Init score of training data.
array of items : float
or None
group (any type, optional, default None) – Group data of training data.
eval_set (any type, optional, default None) – A list of (X, y) tuple pairs to use as validation sets.
eval_names (any type, optional, default None) – Names of eval_set.
eval_sample_weight (any type, optional, default None) – Weights of eval data.
eval_class_weight (union type, optional, default None) –
Class weights of eval data.
array of items : float
or None
eval_init_score (any type, optional, default None) – Init score of eval data.
eval_group (any type, optional, default None) – Group data of eval data.
eval_metric (union type, optional, default None) –
string, list of strings, callable or None, optional (default=None).
array of items : string
or ‘l2’ or None
or callable
early_stopping_rounds (union type, optional, default None) –
Activates early stopping. The model will train until the validation score stops improving.
integer
or None
verbose (union type, optional, default True) –
Requires at least one evaluation dataset.
boolean
or integer
feature_name (union type, optional, default 'auto') –
Feature names. If ‘auto’ and data is a pandas DataFrame, the data column names are used.
array of items : string
or ‘auto’
categorical_feature (union type, optional, default 'auto') –
Categorical features. If list of int, interpreted as indices. If list of strings, interpreted as feature names.
array
items : union type
string
or integer
or ‘auto’
callbacks (union type, optional, default None) –
List of callback functions that are applied at each iteration.
array of items : dict
or None
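The categorical_feature rule (integers are indices, strings are feature names) can be illustrated with a small resolver. This is a sketch under stated assumptions: the helper `resolve_categorical` is hypothetical, it accepts mixed lists because the schema allows items of either type, and returning None for ‘auto’ simply means the decision is left to the library.

```python
def resolve_categorical(categorical_feature, feature_names):
    """Map a categorical_feature value to sorted column indices.

    Integers are taken as column indices, strings are looked up in
    feature_names. Returns None for 'auto' (decision left to the library).
    """
    if categorical_feature == "auto":
        return None
    indices = set()
    for item in categorical_feature:
        if isinstance(item, str):
            # list.index raises ValueError for an unknown feature name.
            indices.add(feature_names.index(item))
        else:
            indices.add(int(item))
    return sorted(indices)


names = ["size", "color", "shape"]
print(resolve_categorical(["color", 2], names))  # [1, 2]
print(resolve_categorical("auto", names))        # None
```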
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array, optional of items : array of items : float) – Input features matrix.
raw_score (boolean, optional, default False) – Whether to predict raw scores.
num_iteration (union type, optional, default None) –
Limit number of iterations in the prediction.
integer
or None
pred_leaf (boolean, optional, default False) – Whether to predict leaf index.
pred_contrib (boolean, optional, default False) – Whether to predict feature contributions.
- Returns
result – Return the predicted value for each sample.
- Return type
array of items : float
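The effect of num_iteration can be pictured with a toy additive model, in which a boosted prediction is the sum of the outputs of successive trees and limiting the iterations means summing only the first few. This is a conceptual sketch, not LightGBM's predictor: the "trees" are plain Python callables, and treating None as "use all trees" is an assumption of this example.

```python
def predict_additive(trees, x, num_iteration=None, init_score=0.0):
    """Sum the outputs of the first num_iteration trees for one sample x."""
    used = trees if num_iteration is None else trees[:num_iteration]
    return init_score + sum(tree(x) for tree in used)


# Three hypothetical "trees", each contributing a smaller correction.
trees = [lambda x: 0.5 * x, lambda x: 0.25 * x, lambda x: 0.125 * x]

print(predict_additive(trees, 4.0))                   # 3.5: all three trees
print(predict_additive(trees, 4.0, num_iteration=2))  # 3.0: first two trees only
```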