lale.lib.autogen.k_bins_discretizer module
- class lale.lib.autogen.k_bins_discretizer.KBinsDiscretizer(*, n_bins=5, encode='onehot', strategy='quantile', dtype=None, subsample='warn')
Bases: PlannedIndividualOp
Combined schema for expected data and hyperparameters.
This documentation is auto-generated from JSON schemas.
- Parameters
n_bins (union type, not for optimizer, default 5) –
The number of bins to produce
integer
or array of items : float
encode (‘onehot’, ‘onehot-dense’, or ‘ordinal’, default ‘onehot’) – Method used to encode the transformed result
strategy (‘uniform’, ‘quantile’, or ‘kmeans’, default ‘quantile’) – Strategy used to define the widths of the bins
dtype (Any, optional, not for optimizer, default None) –
subsample (union type, optional, not for optimizer, default 'warn') –
Maximum number of samples used to fit the model, for computational efficiency. Defaults to 200_000 when strategy='quantile' and to None when strategy='uniform' or strategy='kmeans'. subsample=None means that all training samples are used when computing the quantiles that determine the binning thresholds. Because quantile computation sorts each column of X, and sorting has O(n log n) time complexity, subsampling is recommended for datasets with a very large number of samples.
'warn' or None
or integer, >=0
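To make the difference between the 'uniform' and 'quantile' strategies concrete, the following is a minimal pure-Python sketch of how bin edges could be derived for a single column. The helper names uniform_edges and quantile_edges are hypothetical and are not part of lale or of the wrapped estimator; the quantile variant assumes linear interpolation between sorted samples, which may differ in detail from what the underlying library does.

```python
def uniform_edges(values, n_bins):
    """Equal-width edges between min and max (sketch of strategy='uniform')."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def quantile_edges(values, n_bins):
    """Edges at evenly spaced quantiles (sketch of strategy='quantile').

    Assumption: linear interpolation between neighbouring sorted samples.
    """
    ordered = sorted(values)  # the O(n log n) step mentioned for subsample
    last = len(ordered) - 1
    edges = []
    for i in range(n_bins + 1):
        pos = last * i / n_bins          # fractional index into the sorted data
        below = int(pos)
        above = min(below + 1, last)
        frac = pos - below
        edges.append(ordered[below] + frac * (ordered[above] - ordered[below]))
    return edges


# One column with an outlier: uniform bins are stretched by it,
# quantile bins keep roughly the same number of samples per bin.
column = [0.0, 1.0, 2.0, 3.0, 4.0, 100.0]
print(uniform_edges(column, 2))   # [0.0, 50.0, 100.0]
print(quantile_edges(column, 2))  # [0.0, 2.5, 100.0]
```

With the outlier at 100.0, the uniform split places five of the six samples in the first bin, while the quantile split places three samples on each side of 2.5.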
Notes
constraint-1 : negated type of ‘X/isSparse’
A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (Any) – Data to be discretized.
y (Any) –
sample_weight (union type, optional, default None) –
Contains weight values to be associated with each sample. Only possible when strategy is set to “quantile”.
array of items : float
or None
- transform(X, y=None)
Transform the data.
Note: The transform method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (Any) – Data to be discretized.
- Returns
result – Data in the binned space.
- Return type
Any
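As an illustration of what "data in the binned space" can look like, here is a small pure-Python sketch that maps values to bin indices given a list of edges and then expands them to a dense one-hot layout, loosely mirroring encode='ordinal' and encode='onehot-dense'. The helpers to_ordinal and to_onehot_dense and the example edges are hypothetical; the convention that bins are closed on the left, with the last bin also including the maximum, is an assumption of this sketch rather than a documented guarantee.

```python
from bisect import bisect_right


def to_ordinal(values, edges):
    """Map each value to the index of the bin that contains it.

    Assumption: bins are [edge_k, edge_k+1), and the last bin also
    includes the right-most edge.
    """
    inner = edges[1:-1]  # only interior edges separate neighbouring bins
    return [bisect_right(inner, v) for v in values]


def to_onehot_dense(bin_ids, n_bins):
    """Expand bin indices into one row of 0/1 indicators per sample."""
    return [[1 if b == k else 0 for k in range(n_bins)] for b in bin_ids]


edges = [0.0, 2.5, 100.0]            # two bins, illustrative values only
samples = [0.0, 2.0, 2.5, 3.0, 100.0]

bins = to_ordinal(samples, edges)
print(bins)                          # [0, 0, 1, 1, 1]
print(to_onehot_dense(bins, 2))      # [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
```

The ordinal form keeps one column per input feature, whereas the one-hot form produces n_bins columns per feature, which is why a sparse output is the default for encode='onehot'.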