lale.lib.sklearn.target_encoder module
- class lale.lib.sklearn.target_encoder.TargetEncoder(*, categories='auto', target_type='auto', smooth='auto', cv=5, shuffle=True, random_state=None)
Bases: PlannedIndividualOp
Target encoder for regression and classification targets.
This documentation is auto-generated from JSON schemas.
- Parameters
categories (union type, not for optimizer, default 'auto') –
Categories (unique values) per feature.
'auto'
Determine categories automatically from training data.
or array
The ith list element holds the categories expected in the ith column.
items : union type
array of items : string
or array of items : float
Should be sorted.
target_type (union type, not for optimizer, default 'auto') –
Type of target.
'auto'
Type of target is inferred with type_of_target.
or 'continuous'
Continuous target.
or 'binary'
Binary target.
or 'multiclass'
Multiclass target.
smooth (union type, optional, not for optimizer, default 'auto') –
How strongly the target mean of each category is mixed with the global target mean.
'auto'
Set to an empirical Bayes estimate.
or float, >=0.0, <=1.0
A larger smooth value puts more weight on the global target mean.
cv (integer, >=1, optional, not for optimizer, default 5) – Determines the number of folds in the cross fitting strategy used in fit_transform. For classification targets, StratifiedKFold is used and for continuous targets, KFold is used.
shuffle (boolean, optional, not for optimizer, default True) – Whether to shuffle the data in fit_transform before splitting into folds. Note that the samples within each split will not be shuffled.
random_state (union type, optional, not for optimizer, default None) –
When shuffle is True, random_state affects the ordering of the indices, which controls the randomness of each fold. Otherwise, this parameter has no effect. Pass an int for reproducible output across multiple function calls.
None
or numpy.random.RandomState
Use the provided random state, only affecting other users of that same random state instance.
or integer
Explicit seed.
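The roles of smooth and cv can be illustrated with a small standard-library sketch. It is not the code used by this operator: the shrinkage form (sum_i + smooth * global_mean) / (n_i + smooth), the interleaved fold assignment, and the helper names smoothed_encodings and cross_fit_encode are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import fmean


def smoothed_encodings(categories, targets, smooth):
    """Blend each category's target mean with the global target mean.

    Assumed shrinkage form: (sum_i + smooth * global_mean) / (n_i + smooth).
    Returns the per-category encodings and the global mean.
    """
    global_mean = fmean(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    encodings = {
        cat: (sums[cat] + smooth * global_mean) / (counts[cat] + smooth)
        for cat in sums
    }
    return encodings, global_mean


def cross_fit_encode(categories, targets, smooth, n_folds):
    """Encode each row using statistics from the other folds only.

    Folds are assigned by index modulo n_folds (a simplification;
    no shuffling or stratification is attempted here).
    """
    n = len(categories)
    encoded = [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        encodings, global_mean = smoothed_encodings(
            [categories[i] for i in train],
            [targets[i] for i in train],
            smooth,
        )
        for i in held_out:
            # A category absent from the training folds falls back
            # to the global mean of those folds.
            encoded[i] = encodings.get(categories[i], global_mean)
    return encoded


cats = ["a", "a", "a", "b"]
ys = [1.0, 2.0, 3.0, 10.0]
no_smoothing, _ = smoothed_encodings(cats, ys, smooth=0.0)
with_smoothing, _ = smoothed_encodings(cats, ys, smooth=1.0)
print(no_smoothing)    # {'a': 2.0, 'b': 10.0}
print(with_smoothing)  # {'a': 2.5, 'b': 7.0}
```

In this sketch the rare category "b" moves further toward the global mean (4.0) than the frequent category "a", which is the intended effect of smoothing. Cross fitting keeps a row's own target value out of the statistics used to encode that row.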
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : union type
array of items : float
or array of items : string
y (array, optional) – The target data used to encode the categories.
- transform(X, y=None)
Transform the data.
Note: The transform method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : union type
array of items : float
or array of items : string
- Returns
result – Transformed input; the outer array is over samples.
items : union type
array of items : float
or array of items : string
- Return type
array