lale.lib.category_encoders.target_encoder module
- class lale.lib.category_encoders.target_encoder.TargetEncoder(*, verbose=0, cols=None, drop_invariant=False, return_df=True, handle_missing='value', handle_unknown='value', min_samples_leaf=1, smoothing=1.0)
Bases:
PlannedIndividualOp
Target encoder transformer from the scikit-learn-contrib category_encoders package, which encodes categorical features as numbers.
This documentation is auto-generated from JSON schemas.
- Parameters
verbose (integer, not for optimizer, default 0) – Verbosity of the output, 0 for none.
cols (union type, not for optimizer, default None) –
Columns to encode.
None
All string columns will be encoded.
or array of items : string
drop_invariant (boolean, not for optimizer, default False) – Whether to drop columns with 0 variance.
return_df (boolean, not for optimizer, default True) – Whether to return a pandas DataFrame from transform (otherwise it will be a numpy array).
handle_missing (‘error’, ‘return_nan’, or ‘value’, not for optimizer, default ‘value’) – Given ‘value’, return the target mean.
handle_unknown (‘error’, ‘return_nan’, or ‘value’, not for optimizer, default ‘value’) – Given ‘value’, return the target mean.
min_samples_leaf (integer, >=1, <=10 for optimizer, not for optimizer, default 1) – For regularization, the encoder takes a weighted average of the category mean and the global mean. The weight follows an S-shaped curve between 0 and 1, with the number of samples for a category on the x-axis. The curve reaches 0.5 at min_samples_leaf (parameter k in the original paper).
smoothing (float, >0.0, <=10.0 for optimizer, not for optimizer, default 1.0) – Smoothing effect that balances the categorical average against the prior. The value must be strictly greater than 0. Higher values give a flatter S-curve (see min_samples_leaf) and therefore stronger regularization.
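The interaction between min_samples_leaf and smoothing can be illustrated with a small sketch. It assumes the S-shaped curve is a logistic function of the category sample count, which matches the description above (0.5 at min_samples_leaf, flatter for larger smoothing); the exact upstream formula may differ. The function names smoothing_weight and blended_encoding are hypothetical and not part of lale or category_encoders.

```python
import math


def smoothing_weight(n, min_samples_leaf=1, smoothing=1.0):
    """Weight given to the category mean for a category seen n times.

    Assumed logistic form: equals 0.5 when n == min_samples_leaf and
    approaches 1 as n grows; a larger smoothing value flattens the curve.
    """
    return 1.0 / (1.0 + math.exp(-(n - min_samples_leaf) / smoothing))


def blended_encoding(category_mean, global_mean, n,
                     min_samples_leaf=1, smoothing=1.0):
    """Weighted average of the category mean and the global mean (prior)."""
    w = smoothing_weight(n, min_samples_leaf, smoothing)
    return w * category_mean + (1.0 - w) * global_mean
```

Under this assumption, a category observed exactly min_samples_leaf times is encoded halfway between its own mean and the global mean, and rarer categories are pulled further toward the global mean.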
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : array
items : union type
float
or string
y (union type) –
Target class labels; the array is over samples.
array of items : float
or array of items : string
- transform(X, y=None)
Transform the data.
Note: The transform method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array) –
Features; the outer array is over samples.
items : array
items : union type
float
or string
- Returns
result
- Return type
array of items : array of items : float
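The input and output shapes of fit and transform can be illustrated with a simplified, pure-Python stand-in. This is only a sketch: TinyTargetEncoder is a hypothetical class, not the lale operator or the category_encoders implementation. It assumes numeric targets, treats any column containing a string as categorical (mirroring cols=None), uses an assumed logistic weight for the blend, and maps unseen categories to the global target mean (mirroring handle_unknown='value').

```python
import math


class TinyTargetEncoder:
    """Hypothetical, simplified stand-in for illustration only."""

    def __init__(self, min_samples_leaf=1, smoothing=1.0):
        self.min_samples_leaf = min_samples_leaf
        self.smoothing = smoothing

    def fit(self, X, y):
        # Global target mean, used as the prior and for unknown categories.
        self.prior_ = sum(y) / len(y)
        self.maps_ = {}
        for j in range(len(X[0])):
            # Only encode columns that contain at least one string value.
            if not any(isinstance(row[j], str) for row in X):
                continue
            groups = {}
            for row, target in zip(X, y):
                groups.setdefault(row[j], []).append(target)
            mapping = {}
            for cat, targets in groups.items():
                n = len(targets)
                # Assumed logistic weight: 0.5 at n == min_samples_leaf.
                w = 1.0 / (1.0 + math.exp(
                    -(n - self.min_samples_leaf) / self.smoothing))
                mapping[cat] = w * (sum(targets) / n) + (1.0 - w) * self.prior_
            self.maps_[j] = mapping
        return self

    def transform(self, X):
        out = []
        for row in X:
            new_row = []
            for j, value in enumerate(row):
                if j in self.maps_:
                    # Unseen categories fall back to the global mean.
                    new_row.append(self.maps_[j].get(value, self.prior_))
                else:
                    new_row.append(float(value))
            out.append(new_row)
        return out


# Features: outer list over samples, items are float or string.
X = [["a", 1.0], ["a", 2.0], ["b", 3.0], ["b", 4.0]]
y = [1.0, 1.0, 0.0, 1.0]

enc = TinyTargetEncoder().fit(X, y)
# "c" was never seen during fit, so it is encoded as the global mean.
encoded = enc.transform(X + [["c", 5.0]])
```

The result is an array of arrays of floats, matching the return type listed above; numeric columns pass through unchanged.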