lale.lib.rasl.simple_imputer module

class lale.lib.rasl.simple_imputer.SimpleImputer(*, missing_values=nan, strategy='mean', fill_value=None, verbose=0, copy=True, add_indicator=False)

Bases: PlannedIndividualOp

Relational algebra reimplementation of scikit-learn’s SimpleImputer.

This documentation is auto-generated from JSON schemas.

Works on both pandas and Spark dataframes by using Aggregate for fit and Map for transform, which in turn use the appropriate backend.
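The split between an aggregate step for fit and a map step for transform can be pictured with a small, backend-free sketch. This is not lale's implementation: it uses plain Python lists instead of dataframes, and the helper names (is_missing, fit_column_means, transform_rows) are hypothetical, chosen only for illustration.

```python
import math

def is_missing(value):
    # Assumption for this sketch: None and float NaN both count as missing.
    return value is None or (isinstance(value, float) and math.isnan(value))

def fit_column_means(rows):
    """Aggregate step: one pass over all rows yields one mean per column."""
    n_cols = len(rows[0])
    sums = [0.0] * n_cols
    counts = [0] * n_cols
    for row in rows:
        for j, value in enumerate(row):
            if not is_missing(value):
                sums[j] += value
                counts[j] += 1
    return [s / c if c else math.nan for s, c in zip(sums, counts)]

def transform_rows(rows, fill_values):
    """Map step: every row is rewritten independently of the others."""
    return [
        [fill_values[j] if is_missing(v) else v for j, v in enumerate(row)]
        for row in rows
    ]

rows = [[1.0, math.nan], [3.0, 4.0], [math.nan, 8.0]]
means = fit_column_means(rows)         # per-column statistics learned in "fit"
imputed = transform_rows(rows, means)  # missing cells replaced in "transform"
```

Because the map step only needs the fitted per-column values, it can be applied row by row, which is the property that lets the same logic run on different dataframe backends.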

Parameters
  • missing_values (union type, not for optimizer, default nan) –

    The placeholder for the missing values.

    • float

    • or string

    • or nan

    • or None

  • strategy (union type, default 'mean') –

    The imputation strategy.

    • ‘constant’, not for optimizer

    • or ‘mean’, ‘median’, or ‘most_frequent’

  • fill_value (union type, not for optimizer, default None) –

    When strategy == “constant”, fill_value is used to replace all occurrences of missing_values.

    • float

    • or string

    • or None

  • verbose (integer, not for optimizer, default 0) – Controls the verbosity of the imputer.

  • copy (True, not for optimizer, default True) – copy=True is the only value currently supported by this implementation.

  • add_indicator (False, not for optimizer, default False) – add_indicator=False is the only value currently supported by this implementation.
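As a rough illustration of how missing_values, strategy, and fill_value interact, the sketch below computes the replacement value for a single column in plain Python. The function names are hypothetical, and the tie-breaking rule for ‘most_frequent’ (smallest value wins) is an assumption of this sketch, not something stated on this page.

```python
import math
import statistics
from collections import Counter

def matches_placeholder(value, missing_values):
    # NaN never compares equal to itself, so it needs an explicit check.
    if isinstance(missing_values, float) and math.isnan(missing_values):
        return isinstance(value, float) and math.isnan(value)
    if missing_values is None:
        return value is None
    return value == missing_values

def column_fill_value(column, strategy="mean", missing_values=math.nan, fill_value=None):
    """Return the value that would replace the placeholder in this column."""
    if strategy == "constant":
        return fill_value
    observed = [v for v in column if not matches_placeholder(v, missing_values)]
    if strategy == "mean":
        return statistics.mean(observed)
    if strategy == "median":
        return statistics.median(observed)
    if strategy == "most_frequent":
        counts = Counter(observed)
        top = max(counts.values())
        # Assumption: ties are resolved in favor of the smallest value.
        return min(v for v, c in counts.items() if c == top)
    raise ValueError(f"unknown strategy: {strategy!r}")

numeric = [1.0, math.nan, 2.0, 2.0, 7.0]
labels = ["a", "?", "b", "a"]
```

For example, column_fill_value(numeric, "median") ignores the NaN entry, while column_fill_value(labels, "most_frequent", missing_values="?") treats the string "?" as the placeholder.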

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Input data, where n_samples is the number of samples and n_features is the number of features.

    • items : array

      • items : union type

        • float

        • or string

  • y (any type, optional) –
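The nested "items" entries above describe X as an array of rows, where each cell is a float or a string. The sketch below spells that out as a JSON Schema dictionary and a hand-written check for nested Python lists. Both X_SCHEMA and conforms are illustrative names, and mapping "float" to the JSON Schema type "number" is an assumption of this sketch.

```python
# Assumed JSON Schema reading of the parameter description above.
X_SCHEMA = {
    "type": "array",
    "items": {
        "type": "array",
        "items": {"anyOf": [{"type": "number"}, {"type": "string"}]},
    },
}

def conforms(X):
    """Hand-written check mirroring X_SCHEMA for nested Python lists."""
    def is_cell(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, str) or (
            isinstance(value, (int, float)) and not isinstance(value, bool)
        )
    return isinstance(X, list) and all(
        isinstance(row, list) and all(is_cell(v) for v in row) for row in X
    )

ok = conforms([[1.0, "a"], [float("nan"), "b"]])  # NaN is still a float
bad_cell = conforms([[1.0, None]])                # None is neither float nor string
bad_shape = conforms([1.0, 2.0])                  # rows must themselves be arrays
```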

partial_fit(X, y=None, **fit_params)

Incremental fit to train the operator on a batch of samples.
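One way to picture batch-wise fitting for the ‘mean’ strategy is to keep per-column running sums and counts, which can be updated one batch at a time. The class below is a hypothetical helper written for this sketch, not lale's implementation, and it assumes NaN is the missing-value placeholder.

```python
import math

class RunningColumnMeans:
    """Hypothetical helper: accumulates per-column sums and counts."""

    def __init__(self):
        self.sums = []
        self.counts = []

    def partial_fit(self, batch):
        for row in batch:
            if not self.sums:
                # Size the accumulators from the first row seen.
                self.sums = [0.0] * len(row)
                self.counts = [0] * len(row)
            for j, value in enumerate(row):
                if not (isinstance(value, float) and math.isnan(value)):
                    self.sums[j] += value
                    self.counts[j] += 1
        return self

    def means(self):
        return [s / c if c else math.nan for s, c in zip(self.sums, self.counts)]

state = RunningColumnMeans()
state.partial_fit([[1.0, math.nan], [3.0, 4.0]])  # first batch
state.partial_fit([[math.nan, 8.0], [5.0, 6.0]])  # second batch
running_means = state.means()
```

Sums and counts combine cleanly across batches, which is why the mean suits this pattern. A statistic such as the median cannot be recovered from two numbers per column and would need more state.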

Note: The partial_fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array) –

The input data to complete.

  • items : array

    • items : union type

      • float

      • or string

Returns

result – The input data with missing values imputed.

  • items : array

    • items : union type

      • float

      • or string

Return type

array