lale.lib.sklearn.column_transformer module

class lale.lib.sklearn.column_transformer.ColumnTransformer(*, transformers, remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True, force_int_remainder_cols=True)

Bases: PlannedIndividualOp

ColumnTransformer from scikit-learn applies transformers to columns of an array or pandas DataFrame.

This documentation is auto-generated from JSON schemas.

Parameters
  • transformers (array, not for optimizer) –

    Operators or pipelines to be applied to subsets of the data.

    • items : tuple, >=3 items, <=3 items

      Tuple of (name, transformer, column(s)).

      • item 0 : string

        Name.

      • item 1 : union type

        Transformer.

        • operator

          Transformer supporting fit and transform.

        • or ‘passthrough’ or ‘drop’

      • item 2 : union type

        Column(s).

        • integer

          One column by index.

        • or array of items : integer

          Multiple columns by index.

        • or string

          One DataFrame column by name.

        • or array of items : string

          Multiple DataFrame columns by name.

        • or array of items : boolean

          Boolean mask.

        • or callable, not for optimizer, returning integer or array or string

          Callable that is passed the input data X and can return any of the above.
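The tuple format above can be sketched as follows. This minimal example calls scikit-learn's `sklearn.compose.ColumnTransformer` directly, on the assumption that the lale operator forwards `transformers` unchanged. The step names (`std`, `minmax`, `skip`) and the toy array are illustrative only.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy data: 3 samples, 3 numeric columns (illustrative values).
X = np.array([[0.0, 10.0, 100.0],
              [1.0, 20.0, 200.0],
              [2.0, 30.0, 300.0]])

ct = ColumnTransformer(
    transformers=[
        # (name, transformer, columns) with columns as a list of indices.
        ("std", StandardScaler(), [0]),
        # Columns given as a callable that receives X and returns indices.
        ("minmax", MinMaxScaler(), lambda data: [1]),
        # The string 'drop' discards the columns selected by a boolean mask.
        ("skip", "drop", [False, False, True]),
    ]
)

out = ct.fit_transform(X)
print(out.shape)  # two output columns: one per non-dropped transformer
```

Lists are used for the column selections because most scikit-learn transformers expect 2D input; a scalar index would hand the transformer a 1D vector instead.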

  • remainder (union type, optional, not for optimizer, default 'drop') –

    Transformation for columns that were not specified in transformers.

    • operator

      Transformer supporting fit and transform.

    • or ‘passthrough’ or ‘drop’
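To illustrate the difference between the two string options for `remainder`, here is a small sketch using scikit-learn's `ColumnTransformer` directly (assuming the lale wrapper behaves the same way). The step name `scale` and the data are made up for the example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler

# Two columns; only column 0 is listed in `transformers`.
X = np.array([[0.0, 7.0],
              [5.0, 8.0],
              [10.0, 9.0]])

dropped = ColumnTransformer([("scale", MinMaxScaler(), [0])], remainder="drop")
kept = ColumnTransformer([("scale", MinMaxScaler(), [0])], remainder="passthrough")

out_dropped = dropped.fit_transform(X)  # unlisted column 1 is discarded
out_kept = kept.fit_transform(X)        # unlisted column 1 is appended unchanged

print(out_dropped.shape, out_kept.shape)
```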

  • sparse_threshold (float, >=0.0, <=1.0, optional, not for optimizer, default 0.3) – If the output of the different transformers contains sparse matrices, these will be stacked as a sparse matrix if the overall density is lower than this value. Use sparse_threshold=0 to always return dense.
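The effect of the threshold can be sketched with a one-hot encoding whose density is easy to compute by hand. The example uses scikit-learn's `ColumnTransformer` directly, assuming the lale operator passes `sparse_threshold` through. The step name `onehot` and the data are illustrative.

```python
import numpy as np
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Four distinct categories in one column: the one-hot output is 4x4
# with 4 non-zero entries, so its density is 4 / 16 = 0.25.
X = np.array([[0], [1], [2], [3]])

default_ct = ColumnTransformer([("onehot", OneHotEncoder(), [0])],
                               sparse_threshold=0.3)
dense_ct = ColumnTransformer([("onehot", OneHotEncoder(), [0])],
                             sparse_threshold=0)

out_default = default_ct.fit_transform(X)  # density 0.25 < 0.3: stays sparse
out_dense = dense_ct.fit_transform(X)      # threshold 0: always dense

print(sparse.issparse(out_default), sparse.issparse(out_dense))
```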

  • n_jobs (union type, optional, not for optimizer, default None) –

    Number of jobs to run in parallel.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of CPU cores.

  • transformer_weights (union type, optional, not for optimizer, default None) –

    Multiplicative weights for features per transformer. The output of the transformer is multiplied by these weights.

    • dict

      Keys are transformer names, values the weights.

    • or None
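As a small illustration of the weighting, the sketch below scales two columns to the range [0, 1] and then applies a different weight to each transformer's output. It uses scikit-learn's `ColumnTransformer` directly, assuming the lale wrapper forwards `transformer_weights`. The names `a` and `b` and the weights are hypothetical.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler

X = np.array([[0.0, 10.0],
              [1.0, 20.0],
              [2.0, 30.0]])

ct = ColumnTransformer(
    [("a", MinMaxScaler(), [0]), ("b", MinMaxScaler(), [1])],
    # Keys match the transformer names; values multiply that transformer's output.
    transformer_weights={"a": 2.0, "b": 0.5},
)

out = ct.fit_transform(X)
print(out)  # each MinMaxScaler output, multiplied by its weight
```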

  • verbose (boolean, optional, not for optimizer, default False) – If True, the time elapsed while fitting each transformer will be printed as it is completed.

  • verbose_feature_names_out (union type, optional, not for optimizer, default True) –

    • boolean

      If True, get_feature_names_out will prefix all feature names with the name of the transformer that generated that feature.

      If False, get_feature_names_out will not prefix any feature names and will error if feature names are not unique.

    • or string

      A string ready for formatting. The given string will be formatted using two field names: transformer_name and feature_name, for example “{feature_name}__{transformer_name}”.

    • or callable, not for optimizer

      A Callable[[str, str], str]. ColumnTransformer.get_feature_names_out will rename all the features using the name of the transformer. The first argument of the callable is the transformer name and the second argument is the feature name. The returned string will be the new feature name.

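  • force_int_remainder_cols (boolean, optional, not for optimizer, default True) – Force the columns of the last entry of transformers_, which corresponds to the remainder transformer, to always be stored as indices (int) rather than column names (str).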

Notes

constraint-1 : negated type of ‘X/isSparse’

A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
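The conversion suggested in the note can be sketched as follows, assuming the input is a SciPy sparse matrix. The variable names are illustrative.

```python
import numpy as np
from scipy import sparse

# A small sparse matrix standing in for the rejected input.
X_sparse = sparse.csr_matrix([[0.0, 1.0],
                              [2.0, 0.0]])

# toarray() materializes every entry, including the zeros, as a dense ndarray.
X_dense = X_sparse.toarray()

print(type(X_dense).__name__, X_dense.shape)
```

Note that a dense copy stores all zeros explicitly, so memory use can grow considerably for large, very sparse inputs.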

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

  • y (any type, optional) – Target for supervised learning (ignored).

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

Returns

result – Features; the outer array is over samples.

Return type

array of items : array of items : float
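A rough end-to-end sketch of fitting and then transforming rows that mix floats and strings, as the schemas above allow. It uses scikit-learn's `ColumnTransformer` directly; the lale operator is assumed to behave the same way once trained. The step names `num` and `cat` and the data are illustrative.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Each row holds one float feature and one string feature.
X_train = np.array([[1.0, "a"], [2.0, "b"], [3.0, "a"]], dtype=object)
X_new = np.array([[4.0, "b"]], dtype=object)

ct = ColumnTransformer(
    [("num", MinMaxScaler(), [0]), ("cat", OneHotEncoder(), [1])],
    sparse_threshold=0,  # always return a dense array of floats
)

ct.fit(X_train)                     # learn min/max and categories
out_train = ct.transform(X_train)   # apply to the training rows
out_new = ct.transform(X_new)       # reuse the learned statistics on new rows

print(out_train.shape, out_new.shape)
```

The new row is scaled with the minimum and maximum learned during `fit`, so values outside the training range fall outside [0, 1].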