lale.lib.rasl.batching module

class lale.lib.rasl.batching.Batching(*, operator, batch_size=64, shuffle=False, num_workers=0, inmemory=False, num_epochs=None, max_resident=None, scoring=None, progress_callback=None, partial_transform=False, priority='resource_aware', verbose=0)

Bases: PlannedIndividualOp

Batching trains the given pipeline using batches.

This documentation is auto-generated from JSON schemas.

The same batch_size is used for every step of the pipeline, and intermediate outputs are serialized if specified.
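
As a rough illustration of what using one batch size across all steps means, the sketch below splits a dataset into fixed-size batches and passes each batch through every step in turn. It is plain Python and not lale's implementation; `iter_batches`, `run_steps_on_batches`, and the two toy steps are hypothetical names introduced only for this example.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(rows: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive slices of at most batch_size rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


def run_steps_on_batches(
    rows: Sequence, steps: List[Callable[[List], List]], batch_size: int
) -> List:
    """Apply every step to each batch, reusing the same batch_size throughout."""
    out: List = []
    for batch in iter_batches(rows, batch_size):
        for step in steps:
            batch = step(batch)  # output of one step feeds the next
        out.extend(batch)
    return out


# Two toy "pipeline steps" (hypothetical stand-ins for operators).
def double(batch: List) -> List:
    return [2 * v for v in batch]


def add_one(batch: List) -> List:
    return [v + 1 for v in batch]


batches = list(iter_batches(list(range(5)), batch_size=2))
result = run_steps_on_batches(list(range(5)), [double, add_one], batch_size=2)
```

The last batch may be smaller than batch_size when the number of rows is not a multiple of it.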

Parameters

operator (operator, optional, not for optimizer) – A lale pipeline object to be used inside of batching
batch_size (integer, >=1, >=32 for optimizer, <=128 for optimizer, uniform distribution, optional, default 64) – Batch size used for transform.
shuffle (boolean, optional, not for optimizer, default False) – Shuffle dataset before batching or not.
num_workers (integer, optional, not for optimizer, default 0) – Number of workers for pytorch dataloader.
inmemory (boolean, optional, not for optimizer, default False) – Whether all the computations are done in memory or intermediate outputs are serialized. Only applies to transform/predict. For fit, use the max_resident argument.
num_epochs (union type, optional, not for optimizer, default None) –
Number of epochs. If the operator has num_epochs as a parameter, that takes precedence.
- integer
- or None
max_resident (union type, optional, not for optimizer, default None) –
Amount of memory to be used in bytes.
- integer
- or None
scoring (union type, optional, not for optimizer, default None) –
Batch-wise scoring metrics from lale.lib.rasl.
- callable
- or None
progress_callback (union type, optional, not for optimizer, default None) –
Callback function to get performance metrics per batch.
- callable
- or None
partial_transform (boolean, optional, not for optimizer, default False) – Whether to allow partially-trained upstream operators to transform data for training downstream operators even before the upstream operator has been fully trained.
priority (‘batch’, ‘step’, or ‘resource_aware’, optional, not for optimizer, default ‘resource_aware’) – Scheduling priority in task graphs. “batch” will execute tasks from earlier batches first. “step” will execute tasks from earlier steps first, like a nested-loop algorithm. “resource_aware” will execute tasks with less non-resident data first.
verbose (integer, optional, not for optimizer, default 0) – Verbosity level, higher values mean more information.
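
The three priority values can be pictured as different sort orders over (batch, step) tasks. The sketch below is a simplified illustration in plain Python, not lale's scheduler: it ignores dependencies between tasks, and `task_order` and the `non_resident` cost table are hypothetical names and numbers invented for the example.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Task = Tuple[int, int]  # (batch index, step index)


def task_order(
    num_batches: int,
    num_steps: int,
    priority: str,
    non_resident: Optional[Dict[Task, int]] = None,
) -> List[Task]:
    """Return tasks in the order a given priority would prefer them."""
    tasks = list(product(range(num_batches), range(num_steps)))
    if priority == "batch":
        # Earlier batches first; within a batch, earlier steps first.
        return sorted(tasks, key=lambda t: (t[0], t[1]))
    if priority == "step":
        # Earlier steps first; within a step, earlier batches first.
        return sorted(tasks, key=lambda t: (t[1], t[0]))
    if priority == "resource_aware":
        # Tasks whose inputs have less non-resident data first.
        # The cost table is an assumption made for this sketch.
        costs = non_resident or {}
        return sorted(tasks, key=lambda t: costs.get(t, 0))
    raise ValueError(f"unknown priority: {priority!r}")


by_batch = task_order(2, 2, "batch")
by_step = task_order(2, 2, "step")
by_resource = task_order(
    2, 2, "resource_aware",
    non_resident={(0, 0): 30, (0, 1): 0, (1, 0): 10, (1, 1): 20},
)
```

With two batches and two steps, "batch" finishes both steps of batch 0 before touching batch 1, while "step" runs step 0 on both batches before step 1.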

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters

X (union type) –
Features; the outer array is over samples.
- array
  - items : union type
    - float
    - or string
    - or boolean
- or array
  - items : array
    - items : union type
      - float
      - or string
      - or boolean
- or dict
y (union type) –
- array
  - items : union type
    - integer
    - or float
    - or string
- or None
classes (union type, optional) –
The total number of classes in the entire training dataset.
- array
  - items : union type
    - float
    - or string
    - or boolean
- or None
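
One reason a classes argument is useful in batched training is that a single batch may not contain every label, so an incrementally trained model cannot infer the full label set from the first batch alone. The sketch below illustrates this with a toy learner in plain Python; `CountingClassifier` and `fit_in_batches` are hypothetical and do not reflect how lale trains its operators.

```python
from collections import Counter
from typing import List, Optional, Sequence


class CountingClassifier:
    """Toy incremental learner that only counts labels (for illustration)."""

    def __init__(self) -> None:
        self.counts: Optional[Counter] = None

    def partial_fit(self, X: Sequence, y: Sequence, classes=None):
        if self.counts is None:
            if classes is None:
                raise ValueError("classes is required on the first call")
            # Register every class up front, even those not in this batch.
            self.counts = Counter({c: 0 for c in classes})
        for label in y:
            if label not in self.counts:
                raise ValueError(f"unexpected label: {label!r}")
            self.counts[label] += 1
        return self


def fit_in_batches(model, X, y, classes, batch_size: int, num_epochs: int = 1):
    """Feed the data to partial_fit one batch at a time, for num_epochs passes."""
    for _ in range(num_epochs):
        for start in range(0, len(X), batch_size):
            stop = start + batch_size
            model.partial_fit(X[start:stop], y[start:stop], classes=classes)
    return model


X = [[0.0], [0.1], [1.0], [2.0]]
y = ["a", "a", "b", "c"]  # the first batch of size 2 contains only "a"

first = CountingClassifier().partial_fit(X[:2], y[:2], classes=["a", "b", "c"])
known_after_first_batch: List[str] = sorted(first.counts)

model = fit_in_batches(
    CountingClassifier(), X, y, classes=["a", "b", "c"],
    batch_size=2, num_epochs=2,
)
```

Here the model already knows all three classes after seeing a batch that contains only one of them.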

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (union type) –
Features; the outer array is over samples.
- array
  - items : union type
    - float
    - or string
    - or boolean
- or array
  - items : array
    - items : union type
      - float
      - or string
      - or boolean
- or any type
y (array, optional) –
- items : union type
  - integer
  - or float

Returns

result – Output data schema for transformed data.

Return type

Any
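
Conceptually, batched prediction applies the trained operator to one batch at a time and concatenates the outputs in order. The sketch below shows that idea in plain Python, together with one possible way to serialize per-batch outputs instead of holding them all in memory. The pickle-to-temporary-file mechanism is an assumption made for illustration, not a description of what lale does when inmemory is False; `predict_in_batches` and `is_positive` are hypothetical names.

```python
import os
import pickle
import tempfile
from typing import Callable, List, Sequence


def predict_in_batches(
    predict_fn: Callable[[Sequence], List],
    rows: Sequence,
    batch_size: int,
    inmemory: bool = True,
) -> List:
    """Predict batch by batch and return the concatenated outputs."""
    starts = range(0, len(rows), batch_size)
    if inmemory:
        out: List = []
        for start in starts:
            out.extend(predict_fn(rows[start:start + batch_size]))
        return out

    # Assumed alternative: write each batch's output to disk, then read back.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i, start in enumerate(starts):
            path = os.path.join(tmp, f"batch_{i}.pkl")
            with open(path, "wb") as f:
                pickle.dump(predict_fn(rows[start:start + batch_size]), f)
            paths.append(path)
        out = []
        for path in paths:  # preserve the original batch order
            with open(path, "rb") as f:
                out.extend(pickle.load(f))
        return out


def is_positive(batch: Sequence) -> List[bool]:
    """Toy stand-in for a trained model's predict function."""
    return [v > 0 for v in batch]


rows = [-1, 2, 0, 5, -3]
in_memory_result = predict_in_batches(is_positive, rows, batch_size=2)
serialized_result = predict_in_batches(is_positive, rows, batch_size=2, inmemory=False)
```

Both paths return the same predictions; they differ only in where the intermediate per-batch outputs live.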

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (union type) –
Features; the outer array is over samples.
- array
  - items : union type
    - float
    - or string
    - or boolean
- or array
  - items : array
    - items : union type
      - float
      - or string
      - or boolean
- or any type
y (array, optional) –
- items : union type
  - integer
  - or float

Returns

result – Output data schema for transformed data.

Return type

Any