lale.lib.sklearn.column_transformer module

class lale.lib.sklearn.column_transformer.ColumnTransformer(*, transformers, remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True, force_int_remainder_cols=True)

Bases: PlannedIndividualOp

ColumnTransformer from scikit-learn applies transformers to columns of an array or pandas DataFrame.

This documentation is auto-generated from JSON schemas.

Parameters

transformers (array, not for optimizer) –
Operators or pipelines to be applied to subsets of the data.
- items : tuple, >=3 items, <=3 items
  Tuple of (name, transformer, column(s)).
  - item 0 : string
    
    Name.
  - item 1 : union type
    
    Transformer.
    
    operator
    
    Transformer supporting fit and transform.
    
    or ‘passthrough’ or ‘drop’
  - item 2 : union type
    
    Column(s).
    
    integer
    
    One column by index.
    
    or array of items : integer
    
    Multiple columns by index.
    
    or string
    
    One DataFrame column by name.
    
    or array of items : string
    
    Multiple DataFrame columns by names.
    
    or array of items : boolean
    
    Boolean mask.
    
    or callable, not for optimizer, returning integer or array or string
    
    Callable that is passed the input data X and can return any of the above.
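The (name, transformer, column(s)) tuples described above can be sketched as follows. This is a minimal illustration that uses scikit-learn's own ColumnTransformer directly, on the assumption that the lale operator accepts the same arguments; the toy DataFrame and the helper `numeric_columns` are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: two numeric columns and one categorical column.
X = pd.DataFrame(
    {
        "age": [20.0, 30.0, 40.0],
        "income": [1.0, 2.0, 3.0],
        "city": ["a", "b", "a"],
    }
)


def numeric_columns(df):
    # Callable selector: receives the input data X and returns column names.
    return df.select_dtypes(include="number").columns.tolist()


ct = ColumnTransformer(
    transformers=[
        ("scale", StandardScaler(), numeric_columns),  # columns chosen by a callable
        ("onehot", OneHotEncoder(), ["city"]),         # columns chosen by name
    ]
)
Xt = ct.fit_transform(X)
# Two scaled columns plus two one-hot columns for the categories "a" and "b".
print(Xt.shape)
```

Passing a callable keeps the selection logic next to the transformer and avoids hard-coding column names that may change between datasets.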
remainder (union type, optional, not for optimizer, default 'drop') –
Transformation for columns that were not specified in transformers.
- operator
  
  Transformer supporting fit and transform.
- or ‘passthrough’ or ‘drop’
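The difference between the two string values of remainder can be seen in a small sketch. It uses scikit-learn's ColumnTransformer directly and assumes the lale operator behaves the same way; the array `X` is made-up data.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: only column 0 is listed in transformers.
X = np.array(
    [
        [0.0, 10.0, 5.0],
        [1.0, 20.0, 6.0],
        [2.0, 30.0, 7.0],
    ]
)

dropped = ColumnTransformer([("scale", MinMaxScaler(), [0])], remainder="drop")
kept = ColumnTransformer([("scale", MinMaxScaler(), [0])], remainder="passthrough")

X_drop = dropped.fit_transform(X)  # only the scaled column survives
X_keep = kept.fit_transform(X)     # unlisted columns are appended unchanged

print(X_drop.shape, X_keep.shape)
```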
sparse_threshold (float, >=0.0, <=1.0, optional, not for optimizer, default 0.3) – If the output of the different transformers contains sparse matrices, these will be stacked as a sparse matrix if the overall density is lower than this value. Use sparse_threshold=0 to always return a dense result.
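As a rough illustration of the density rule, the sketch below one-hot encodes four distinct categories, which gives 4 nonzero entries out of 16 (density 0.25). It uses scikit-learn's ColumnTransformer directly, assuming the lale operator follows the same rule; the data is hypothetical.

```python
import numpy as np
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: one categorical column with four distinct values.
X = np.array([["a"], ["b"], ["c"], ["d"]], dtype=object)

# Density 0.25 is below the default threshold of 0.3, so the result stays sparse.
default_ct = ColumnTransformer([("onehot", OneHotEncoder(), [0])])
# A threshold of 0 can never exceed the density, so the result is densified.
dense_ct = ColumnTransformer([("onehot", OneHotEncoder(), [0])], sparse_threshold=0)

out_default = default_ct.fit_transform(X)
out_dense = dense_ct.fit_transform(X)

print(sparse.issparse(out_default), sparse.issparse(out_dense))
```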
n_jobs (union type, optional, not for optimizer, default None) –
Number of jobs to run in parallel.
- None
  
  1 unless in joblib.parallel_backend context.
- or -1
  
  Use all processors.
- or integer, >=1
  
  Number of CPU cores.
transformer_weights (union type, optional, not for optimizer, default None) –
Multiplicative weights for features per transformer. The output of the transformer is multiplied by these weights.
- dict
  
  Keys are transformer names, values the weights.
- or None
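A small sketch of how transformer_weights scales the output of a named transformer. It uses scikit-learn's ColumnTransformer directly (assumed to match the lale operator) with identity transformers so that the effect of the weight is visible; the names "left" and "right" are arbitrary.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer

X = np.array([[1.0, 2.0], [3.0, 4.0]])

ct = ColumnTransformer(
    [
        ("left", FunctionTransformer(), [0]),   # identity transformer on column 0
        ("right", FunctionTransformer(), [1]),  # identity transformer on column 1
    ],
    # Only "left" has a weight; "right" is left unscaled.
    transformer_weights={"left": 10.0},
)
Xt = ct.fit_transform(X)
print(Xt)
```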
verbose (boolean, optional, not for optimizer, default False) – If True, the time elapsed while fitting each transformer will be printed as it is completed.
verbose_feature_names_out (union type, optional, not for optimizer, default True) –
- boolean
  
  If True, get_feature_names_out will prefix all feature names with the name of the transformer that generated that feature.
  If False, get_feature_names_out will not prefix any feature names and will error if feature names are not unique.
- or string
  
  A string ready for formatting. The given string will be formatted using two field names: transformer_name and feature_name. e.g. “{feature_name}__{transformer_name}”
- or callable, not for optimizer
  
  A Callable[[str, str], str]. ColumnTransformer.get_feature_names_out will rename all the features using the name of the transformer. The first argument of the callable is the transformer name and the second argument is the feature name. The returned string will be the new feature name.
force_int_remainder_cols (boolean, optional, not for optimizer, default True) –

Notes

constraint-1 : negated type of ‘X/isSparse’

A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
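The conversion suggested by this constraint might look like the following sketch, which densifies a SciPy sparse matrix before fitting. It uses scikit-learn's ColumnTransformer directly; the data and variable names are hypothetical.

```python
import numpy as np
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical sparse input.
X_sparse = sparse.csr_matrix(np.array([[0.0, 1.0], [2.0, 0.0], [0.0, 3.0]]))

# Convert to a dense numpy array before handing the data to the operator.
X_dense = X_sparse.toarray()

ct = ColumnTransformer([("scale", StandardScaler(), [0, 1])])
Xt = ct.fit_transform(X_dense)
print(Xt.shape)
```

Note that toarray() materializes every entry, so for very large sparse inputs the dense copy may not fit in memory.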

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters

X (array) –
Features; the outer array is over samples.
- items : array
  - items : union type
    
    float
    
    or string
y (any type, optional) – Target for supervised learning (ignored).

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array) –
Features; the outer array is over samples.
- items : array
  - items : union type
    
    float
    
    or string

Returns

result – Features; the outer array is over samples.

Return type

array of items : array of items : float