lale.lib.sklearn.column_transformer module

class lale.lib.sklearn.column_transformer.ColumnTransformer(*, transformers, remainder='drop', sparse_threshold=0.3, n_jobs=None, transformer_weights=None, verbose=False, verbose_feature_names_out=True)

Bases: PlannedIndividualOp

ColumnTransformer from scikit-learn applies transformers to columns of an array or pandas DataFrame.

This documentation is auto-generated from JSON schemas.

Parameters
  • transformers (array, not for optimizer) –

    Operators or pipelines to be applied to subsets of the data.

    • items : tuple, >=3 items, <=3 items

      Tuple of (name, transformer, column(s)).

      • item 0 : string

        Name.

      • item 1 : union type

        Transformer.

        • operator

          Transformer supporting fit and transform.

        • or ‘passthrough’ or ‘drop’

      • item 2 : union type

        Column(s).

        • integer

          One column by index.

        • or array of items : integer

          Multiple columns by index.

        • or string

          One DataFrame column by name.

        • or array of items : string

          Multiple DataFrame columns by name.

        • or array of items : boolean

          Boolean mask.

        • or callable of integer or array or string

          Callable that is passed the input data X and returns any of the above column specifiers.

  • remainder (union type, optional, not for optimizer, default 'drop') –

    Transformation for columns that were not specified in transformers.

    • operator

      Transformer supporting fit and transform.

    • or ‘passthrough’ or ‘drop’

  • sparse_threshold (float, >=0.0, <=1.0, optional, not for optimizer, default 0.3) – If the output of the different transformers contains sparse matrices, these will be stacked as a sparse matrix if the overall density is lower than this value. Use sparse_threshold=0 to always return dense.

  • n_jobs (union type, optional, not for optimizer, default None) –

    Number of jobs to run in parallel.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of CPU cores.

  • transformer_weights (union type, optional, not for optimizer, default None) –

    Multiplicative weights for features per transformer. The output of the transformer is multiplied by these weights.

    • dict

      Keys are transformer names, values the weights.

    • or None

  • verbose (boolean, optional, not for optimizer, default False) – If True, the time elapsed while fitting each transformer will be printed as it is completed.

  • verbose_feature_names_out (boolean, optional, not for optimizer, default True) – If True, get_feature_names_out will prefix all feature names with the name of the transformer that generated that feature. If False, get_feature_names_out will not prefix any feature names and will error if feature names are not unique.

Notes

constraint-1 : negated type of ‘X/isSparse’

A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
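The conversion the constraint suggests can be done with scipy before passing the data in (a minimal sketch; the sparse input is illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix

# A sparse matrix triggers the constraint above ...
X_sparse = csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))

# ... so convert it to a dense numpy array first.
X_dense = X_sparse.toarray()
```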

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array) –

    Features; the outer array is over samples.

    • items : array

      • items : union type

        • float

        • or string

  • y (any type, optional) – Target for supervised learning (ignored).

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array) –

Features; the outer array is over samples.

  • items : array

    • items : union type

      • float

      • or string

Returns

result – Features; the outer array is over samples.

Return type

array of items : array of items : float