lale.lib.sklearn.column_transformer module

class lale.lib.sklearn.column_transformer.ColumnTransformer(*, transformers, remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True)

Bases: PlannedIndividualOp

ColumnTransformer from scikit-learn applies transformers to columns of an array or pandas DataFrame.

This documentation is auto-generated from JSON schemas.

Parameters
  • transformers (array, not for optimizer) –

    Operators or pipelines to be applied to subsets of the data.

    • items : tuple, >=3 items, <=3 items

      Tuple of (name, transformer, column(s)).

      • item 0 : string

        Name.

      • item 1 : union type

        Transformer.

        • operator

          Transformer supporting fit and transform.

        • or ‘passthrough’ or ‘drop’

      • item 2 : union type

        Column(s).

        • integer

          One column by index.

        • or array of items : integer

          Multiple columns by index.

        • or string

          One DataFrame column by name.

        • or array of items : string

          Multiple DataFrame columns by name.

        • or array of items : boolean

          Boolean mask.

        • or callable of integer or array or string

          Callable that is passed the input data X and returns any of the above column specifiers.

  • remainder (union type, optional, not for optimizer, default 'drop') –

    Transformation for columns that were not specified in transformers.

    • operator

      Transformer supporting fit and transform.

    • or ‘passthrough’ or ‘drop’

  • sparse_threshold (float, >=0.0, <=1.0, optional, not for optimizer, default 0.3) – If the output of the different transformers contains sparse matrices, these will be stacked as a sparse matrix if the overall density is lower than this value. Use sparse_threshold=0 to always return dense.

  • n_jobs (union type, optional, not for optimizer, default None) –

    Number of jobs to run in parallel.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of CPU cores.

  • transformer_weights (union type, optional, not for optimizer, default None) –

    Multiplicative weights for features per transformer. The output of the transformer is multiplied by these weights.

    • dict

      Keys are transformer names, values the weights.

    • or None

  • verbose (boolean, optional, not for optimizer, default False) – If True, the time elapsed while fitting each transformer will be printed as it is completed.

  • verbose_feature_names_out (boolean, optional, not for optimizer, default True) – If True, get_feature_names_out will prefix all feature names with the name of the transformer that generated that feature. If False, get_feature_names_out will not prefix any feature names and will error if feature names are not unique.

Notes

constraint-1 : negated type of ‘X/isSparse’

A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
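The conversion the constraint suggests can be done with scipy before passing the data in (a minimal sketch; the sparse input is illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix

# A sparse matrix triggers the constraint above ...
X_sparse = csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))

# ... so convert it to a dense numpy array first.
X_dense = X_sparse.toarray()
```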

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

  • y (any type, optional) – Target for supervised learning (ignored).

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array) –

Features; the outer array is over samples.

  • items : array

    • items : union type

      • float

      • or string

Returns

result – Features; the outer array is over samples.

Return type

array of items : array of items : float