lale.lib.rasl.one_hot_encoder module

class lale.lib.rasl.one_hot_encoder.OneHotEncoder(*, categories='auto', sparse=False, dtype='float64', handle_unknown='ignore', drop=None)

Bases: PlannedIndividualOp

Relational algebra reimplementation of scikit-learn’s OneHotEncoder transformer that encodes categorical features as numbers.

This documentation is auto-generated from JSON schemas.

Works on both pandas and Spark dataframes by using Aggregate for fit and Map for transform, which in turn use the appropriate backend.
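The split between an aggregation for fit and a row-wise map for transform can be illustrated with a minimal pure-Python sketch. This is not the Lale implementation: the helpers `aggregate_categories` and `map_row` are hypothetical names, and plain lists stand in for pandas or Spark dataframes.

```python
from functools import reduce


def aggregate_categories(rows, n_columns):
    """Fit as an aggregation: fold all rows into one set of distinct values per column."""
    def step(acc, row):
        for i in range(n_columns):
            acc[i].add(row[i])
        return acc

    seen = reduce(step, rows, [set() for _ in range(n_columns)])
    # Sort so that the output column order is deterministic.
    return [sorted(values) for values in seen]


def map_row(row, categories):
    """Transform as a map: each row is encoded independently of all other rows."""
    codes = []
    for value, cats in zip(row, categories):
        codes.extend(1.0 if value == c else 0.0 for c in cats)
    return codes


train = [["red", "S"], ["blue", "M"], ["red", "L"]]
cats = aggregate_categories(train, n_columns=2)
encoded = [map_row(row, cats) for row in train]
```

Because the map step needs only the fitted categories and a single row, it can in principle run on any backend that supports per-row expressions.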

Parameters
  • categories (union type, not for optimizer, default 'auto') –

    • ‘auto’ or None

      Determine categories automatically from training data.

    • or array

      The ith list element holds the categories expected in the ith column.

      • items : union type

        • array of items : string

        • or array of items : float

          Should be sorted.

  • sparse (False, optional, not for optimizer, default False) – This implementation only supports sparse=False.

  • dtype ('float64', not for optimizer, default 'float64') – This implementation only supports dtype='float64'.

  • handle_unknown ('ignore', not for optimizer, default 'ignore') – This implementation only supports handle_unknown='ignore'.

  • drop (None, optional, not for optimizer, default None) – This implementation only supports drop=None.
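As a rough illustration of how these parameters interact, the sketch below mimics the documented behavior in plain Python. The function `one_hot` is a hypothetical stand-in, not part of Lale. It assumes the scikit-learn convention that, with handle_unknown='ignore', a value absent from the categories is encoded as all zeros for that column.

```python
def one_hot(X, categories="auto"):
    """Encode rows of X as float one-hot codes (dense output, unknowns ignored)."""
    n_columns = len(X[0])
    if categories == "auto" or categories is None:
        # Determine the categories from the data, sorted per column.
        categories = [sorted({row[i] for row in X}) for i in range(n_columns)]
    result = []
    for row in X:
        codes = []
        for value, cats in zip(row, categories):
            # A value missing from cats yields only zeros for this column.
            codes.extend(float(value == c) for c in cats)
        result.append(codes)
    return result


X = [[1.0, "a"], [3.0, "c"]]
# The ith list holds the (sorted) categories expected in the ith column.
fixed = one_hot(X, categories=[[1.0, 2.0, 3.0], ["a", "b"]])
auto = one_hot(X)
```

In `fixed`, the value "c" is not among the listed categories for the second column, so the second row carries zeros in both of that column's positions.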

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

  • y (any type, optional) – Target class labels; the array is over samples.

partial_fit(X, y=None, **fit_params)

Incremental fit to train the operator on a batch of samples.
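The idea of incremental fitting can be sketched as accumulating the distinct values seen so far, one batch at a time. The class `IncrementalCategories` below is a hypothetical illustration in plain Python, not the Lale operator, and it assumes that fitting over batches should arrive at the same categories as a single fit over all the data.

```python
class IncrementalCategories:
    """Hypothetical helper that accumulates per-column categories across batches."""

    def __init__(self):
        self.seen = None  # one set of observed values per column

    def partial_fit(self, batch):
        if self.seen is None:
            self.seen = [set() for _ in batch[0]]
        for row in batch:
            for i, value in enumerate(row):
                self.seen[i].add(value)
        return self

    @property
    def categories_(self):
        # Sorted for a deterministic column order in the encoded output.
        return [sorted(values) for values in self.seen]


batch_1 = [["red", "S"], ["blue", "M"]]
batch_2 = [["green", "S"], ["red", "L"]]

incremental = IncrementalCategories().partial_fit(batch_1).partial_fit(batch_2)
full = IncrementalCategories().partial_fit(batch_1 + batch_2)
```

Since set union is associative and commutative, the order and grouping of the batches does not affect the final categories in this sketch.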

Note: The partial_fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

Returns

result – One-hot codes.

Return type

array of items : array of items : float