lale.lib.autogen.k_bins_discretizer module

class lale.lib.autogen.k_bins_discretizer.KBinsDiscretizer(*, n_bins=5, encode='onehot', strategy='quantile', dtype=None, subsample='warn')

Bases: PlannedIndividualOp

Combined schema for expected data and hyperparameters.

This documentation is auto-generated from JSON schemas.

Parameters
  • n_bins (union type, not for optimizer, default 5) –

    The number of bins to produce

    • integer

    • or array of items : float

  • encode (‘onehot’, ‘onehot-dense’, or ‘ordinal’, default ‘onehot’) – Method used to encode the transformed result

  • strategy (‘uniform’, ‘quantile’, or ‘kmeans’, default ‘quantile’) – Strategy used to define the widths of the bins

  • dtype (Any, optional, not for optimizer, default None) –

  • subsample (union type, optional, not for optimizer, default 'warn') –

    Maximum number of samples used to fit the model, for computational efficiency. Defaults to 200_000 when strategy='quantile' and to None when strategy='uniform' or strategy='kmeans'. subsample=None means that all training samples are used when computing the quantiles that determine the binning thresholds. Because quantile computation sorts each column of X, which takes O(n log n) time, subsampling is recommended for datasets with a very large number of samples.

    • 'warn' or None

    • or integer, >=0
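
To illustrate how the strategy parameter changes the bin widths, the following is a minimal pure-Python sketch for a single feature column. It is not the library's implementation: the helper names uniform_edges, quantile_edges, and to_bin_index are hypothetical, the quantile interpolation rule is an assumption, and the 'kmeans' strategy is omitted.

```python
from bisect import bisect_right


def uniform_edges(values, n_bins):
    """Equal-width edges between the minimum and maximum value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def quantile_edges(values, n_bins):
    """Edges at evenly spaced quantiles (linear interpolation assumed)."""
    s = sorted(values)
    edges = []
    for i in range(n_bins + 1):
        pos = i * (len(s) - 1) / n_bins  # fractional rank of the i-th quantile
        k = int(pos)
        nxt = s[min(k + 1, len(s) - 1)]
        edges.append(s[k] + (pos - k) * (nxt - s[k]))
    return edges


def to_bin_index(values, edges):
    """Ordinal bin index per value; only interior edges separate bins."""
    interior = edges[1:-1]
    return [bisect_right(interior, v) for v in values]


data = [0, 1, 2, 3, 4, 100]
uniform_bins = to_bin_index(data, uniform_edges(data, 2))
quantile_bins = to_bin_index(data, quantile_edges(data, 2))
```

With this data, the outlier 100 stretches the uniform bins so that five of the six values share bin 0, while the quantile edges split the values three and three.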

Notes

constraint-1 : negated type of ‘X/isSparse’

A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (Any) – Data to be discretized.

  • y (Any) –

  • sample_weight (union type, optional, default None) –

    Weight values to associate with each sample. Only supported when strategy is set to “quantile”.

    • array of items : float

    • or None
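
As a rough illustration of why sample weights matter for quantile-based edges, the sketch below computes a weighted quantile for one column in plain Python. The function weighted_quantile is hypothetical, and the rule it uses (the smallest value whose cumulative weight reaches the target fraction) is an assumed definition, not necessarily the one the wrapped estimator applies.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight reaches fraction q of the total."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    target = q * total
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= target:
            return value
    return pairs[-1][0]  # fallback for rounding at q close to 1


values = [1, 2, 3, 4]
median_equal = weighted_quantile(values, [1, 1, 1, 1], 0.5)
median_skewed = weighted_quantile(values, [1, 1, 1, 5], 0.5)
```

Under these assumptions, putting most of the weight on the largest sample moves the median from 2 to 4, so a bin edge placed at that quantile would shift accordingly.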

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (Any) – Data to be discretized.

Returns

result – Data in the binned space.

Return type

Any
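
The shape of the transformed result depends on the encode hyperparameter. The sketch below, in plain Python, contrasts an 'ordinal' result (one bin index per value) with a dense one-hot expansion of the same indices. The helper one_hot_dense is hypothetical and only mimics the layout for a single feature; it does not call the operator.

```python
def one_hot_dense(bin_indices, n_bins):
    """Expand ordinal bin indices into one indicator column per bin."""
    return [
        [1.0 if j == b else 0.0 for j in range(n_bins)]
        for b in bin_indices
    ]


ordinal = [0, 2, 1]                    # 'ordinal'-style output for one feature
dense = one_hot_dense(ordinal, 3)      # 'onehot-dense'-style layout
```

Each row of dense contains exactly one 1.0, in the column matching that sample's bin index.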