lale.lib.rasl.project module

class lale.lib.rasl.project.Project(*, columns=None, drop_columns=None)

Bases: PlannedIndividualOp

Projection keeps a subset of the columns, like in relational algebra.

This documentation is auto-generated from JSON schemas.

Examples

>>> df = pd.DataFrame(data={'A': [1,2], 'B': ['x','y'], 'C': [3,4]})
>>> keep_numbers = Project(columns={'type': 'number'})
>>> keep_numbers.fit(df).transform(df)
NDArrayWithSchema([[1, 3],
                   [2, 4]])

Parameters
  • columns (union type, not for optimizer, default None) –

    The subset of columns to retain.

    Supported column specification formats include some of those accepted by scikit-learn’s ColumnTransformer, plus filtering by a JSON subschema check.

    • None

      If not specified, keep all columns.

    • or array of items : integer

      Multiple columns by index.

    • or array of items : string

      Multiple DataFrame columns by name.

    • or callable

      Callable that is passed the input data X and can return a list of column names or indices.

    • or dict

      Keep columns whose schema is a subschema of this JSON schema.

  • drop_columns (union type, not for optimizer, default None) –

    The subset of columns to remove.

    The drop_columns argument supports the same formats as columns. If both are specified, keep everything from columns that is not also in drop_columns.

    • None

      If not specified, drop no further columns.

    • or array of items : integer

      Multiple columns by index.

    • or array of items : string

      Multiple DataFrame columns by name.

    • or callable

      Callable that is passed the input data X and can return a list of column names or indices.

    • or dict

      Remove columns whose schema is a subschema of this JSON schema.
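
The way columns and drop_columns combine can be illustrated with a small pure-Python sketch. This is not the lale implementation: the helpers resolve and project are hypothetical, a table is modeled as a dict from column name to a list of values, and the dict format is reduced to a bare 'type' check rather than a full JSON subschema check.

```python
from numbers import Number


def resolve(spec, table):
    """Turn a column spec into a list of column names (hypothetical helper)."""
    names = list(table)
    if spec is None:
        return None
    if callable(spec):
        # A callable receives the input data and returns names or indices.
        spec = spec(table)
    if isinstance(spec, dict):
        # Simplified stand-in for the subschema check: only 'type' is examined.
        wanted = {"number": Number, "string": str}[spec["type"]]
        return [n for n in names if all(isinstance(v, wanted) for v in table[n])]
    # Lists may hold integer positions or column names.
    return [names[c] if isinstance(c, int) else c for c in spec]


def project(table, columns=None, drop_columns=None):
    """Keep everything selected by columns that is not also in drop_columns."""
    keep = resolve(columns, table)
    if keep is None:
        keep = list(table)  # columns=None keeps all columns
    drop = resolve(drop_columns, table) or []  # drop_columns=None drops nothing
    return {n: table[n] for n in keep if n not in drop}


table = {"A": [1, 2], "B": ["x", "y"], "C": [3, 4]}
by_name = project(table, columns=["A", "C"])
by_index = project(table, columns=[0, 2])
by_type = project(table, columns={"type": "number"})
both = project(table, columns=["A", "B"], drop_columns=["B"])
by_callable = project(table, drop_columns=lambda t: [n for n in t if n != "A"])
```

In this sketch, by_name, by_index, and by_type all select columns A and C, while both and by_callable leave only column A.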

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

  • y (any type, optional) – Target for supervised learning (ignored).

partial_fit(X, y=None, **fit_params)

Incremental fit to train the operator on a batch of samples.

Note: The partial_fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

Returns

result – Features; the outer array is over samples.

  • items : array

    • items : union type

      • float

      • or string

Return type

array
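
The availability notes above (fit and partial_fit once the operator is trainable, transform once it is trained) describe a two-state lifecycle. The following toy model sketches that pattern in plain Python. The classes TrainableKeep and TrainedKeep are hypothetical and do not reflect how lale represents operator states internally. The sketch also assumes that training only fixes the column choice from the first row seen.

```python
class TrainedKeep:
    """Toy 'trained' state: the column choice is fixed, so transform exists."""

    def __init__(self, names):
        self.names = names

    def transform(self, rows):
        # Keep only the columns chosen during training.
        return [{n: row[n] for n in self.names} for row in rows]

    def partial_fit(self, rows):
        # Nothing new to learn in this toy model; the choice is already fixed.
        return self


class TrainableKeep:
    """Toy 'trainable' state: offers fit and partial_fit but no transform."""

    def __init__(self, predicate):
        self.predicate = predicate

    def fit(self, rows):
        # Decide which columns to keep by inspecting the first row.
        first = rows[0]
        return TrainedKeep([n for n, v in first.items() if self.predicate(v)])

    def partial_fit(self, rows):
        # The first batch is enough to fix the column choice here.
        return self.fit(rows)


rows = [{"A": 1, "B": "x", "C": 3}, {"A": 2, "B": "y", "C": 4}]
trainable = TrainableKeep(lambda v: isinstance(v, (int, float)))
has_transform_before = hasattr(trainable, "transform")
trained = trainable.fit(rows)
result = trained.transform(rows)
batched = trainable.partial_fit(rows[:1]).partial_fit(rows[1:])
```

Here transform is reachable only through the object returned by fit or partial_fit, which mirrors the notes that the method is not available until the operator is trained.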

lale.lib.rasl.project.get_column_factory(columns, kind) → MonoidFactory