lale.lib.rasl.ordinal_encoder module

class lale.lib.rasl.ordinal_encoder.OrdinalEncoder(*, categories='auto', dtype='float64', handle_unknown='use_encoded_value', unknown_value)

Bases: PlannedIndividualOp

Relational algebra reimplementation of scikit-learn’s OrdinalEncoder transformer that encodes categorical features as numbers.

This documentation is auto-generated from JSON schemas.

Works on both pandas and Spark dataframes by using Aggregate for fit and Map for transform, which in turn use the appropriate backend.
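The split between an aggregate-style fit and a map-style transform can be illustrated in plain Python. This is a minimal sketch, not lale's implementation: the helper names `fit_categories` and `transform_rows` are hypothetical, and rows are plain lists rather than pandas or Spark dataframes.

```python
from typing import List, Sequence, Union

Value = Union[str, float]


def fit_categories(rows: Sequence[Sequence[Value]]) -> List[List[Value]]:
    """Aggregate step: collect the sorted distinct values of each column."""
    n_cols = len(rows[0])
    return [sorted({row[j] for row in rows}) for j in range(n_cols)]


def transform_rows(
    rows: Sequence[Sequence[Value]],
    categories: Sequence[Sequence[Value]],
    unknown_value: float,
) -> List[List[float]]:
    """Map step: replace each value by its index in its column's categories."""
    lookups = [{cat: float(i) for i, cat in enumerate(cats)} for cats in categories]
    return [
        [lookups[j].get(value, float(unknown_value)) for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical training data: two categorical columns.
train = [["red", "small"], ["green", "large"], ["red", "large"]]
categories = fit_categories(train)

# "blue" was not seen during fit, so it receives unknown_value.
codes = transform_rows(
    [["green", "small"], ["blue", "large"]], categories, unknown_value=-1
)
```

Because fit only needs the distinct values of each column and transform only needs a per-row lookup, both steps can be phrased as generic dataframe operations and delegated to whichever backend holds the data.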

Parameters
  • categories (union type, not for optimizer, default 'auto') –

    • 'auto' or None

      Determine categories automatically from training data.

    • or array

      The ith list element holds the categories expected in the ith column.

      • items : union type

        • array of items : string

        • or array of items : float

          Should be sorted.

  • dtype ('float64', not for optimizer, default 'float64') – This implementation only supports dtype='float64'.

  • handle_unknown ('use_encoded_value', optional, not for optimizer, default 'use_encoded_value') – This implementation only supports handle_unknown='use_encoded_value'.

  • unknown_value (union type, optional, not for optimizer) –

    The encoded value of unknown categories to use when handle_unknown='use_encoded_value'. It has to be distinct from the values used to encode any of the categories in fit. If set to np.nan, the dtype hyperparameter must be a float dtype.

    • integer

    • or nan or None
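The requirement that unknown_value be distinct from every code assigned in fit can be checked as sketched below. The sketch assumes that the categories of a column are encoded as 0 through n-1, as in the scikit-learn transformer this operator mirrors. The function `check_unknown_value` is a hypothetical helper, not part of lale, and it does not cover unknown_value=None.

```python
import math
from typing import Sequence


def check_unknown_value(
    unknown_value: float, categories: Sequence[Sequence[object]]
) -> None:
    """Raise ValueError if unknown_value equals a code used for a known category."""
    if math.isnan(unknown_value):
        return  # nan never equals an integer code
    for j, cats in enumerate(categories):
        # Assumption: column j uses the codes 0 .. len(cats) - 1.
        if 0 <= unknown_value < len(cats):
            raise ValueError(
                f"unknown_value={unknown_value} collides with a code in column {j}"
            )


categories = [["green", "red"], ["large", "medium", "small"]]

check_unknown_value(-1, categories)            # accepted: below every code
check_unknown_value(float("nan"), categories)  # accepted: needs a float dtype

try:
    check_unknown_value(2, categories)  # 2 encodes "small" in column 1
    collided = False
except ValueError:
    collided = True
```

A negative integer such as -1 is a common choice because it can never coincide with a category index.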

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : union type

      • array of items : float

      • or array of items : string

  • y (any type, optional) – Target class labels; the array is over samples.

partial_fit(X, y=None, **fit_params)

Incremental fit to train the operator on a batch of samples.
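This documentation does not describe how batches are combined. One plausible approach, shown here purely as an illustration, is to merge the distinct values seen in each batch into a running set per column. The class `IncrementalCategories` is hypothetical and is not lale's implementation.

```python
from typing import List, Optional, Sequence, Set


class IncrementalCategories:
    """Hypothetical helper that accumulates per-column categories over batches."""

    def __init__(self) -> None:
        self._seen: Optional[List[Set[object]]] = None

    def partial_fit(self, batch: Sequence[Sequence[object]]) -> "IncrementalCategories":
        if not batch:
            return self  # nothing to learn from an empty batch
        if self._seen is None:
            self._seen = [set() for _ in batch[0]]
        for row in batch:
            for j, value in enumerate(row):
                self._seen[j].add(value)
        return self

    def categories(self) -> List[List[object]]:
        """Sorted categories per column, merged over all batches seen so far."""
        return [sorted(col) for col in (self._seen or [])]


acc = IncrementalCategories()
acc.partial_fit([["red"], ["green"]]).partial_fit([["blue"], ["red"]])
merged = acc.categories()
```

With this scheme the result after the last batch matches what a single fit over the concatenated batches would collect, since set union does not depend on batch boundaries.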

Note: The partial_fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : union type

      • array of items : float

      • or array of items : string

Returns

result – Ordinal codes.

Return type

array of items : array of items : float