lale.lib.rasl.batching module
- class lale.lib.rasl.batching.Batching(*, operator, batch_size=64, shuffle=False, num_workers=0, inmemory=False, num_epochs=None, max_resident=None, scoring=None, progress_callback=None, partial_transform=False, priority='resource_aware', verbose=0)
Bases: PlannedIndividualOp
Batching trains the given pipeline using batches.
This documentation is auto-generated from JSON schemas.
The same batch_size is used for every step of the pipeline. Intermediate outputs are serialized unless inmemory is set.
- Parameters
operator (operator, optional, not for optimizer) – A lale pipeline object to be used inside of batching
batch_size (integer, >=1, >=32 for optimizer, <=128 for optimizer, uniform distribution, optional, default 64) – Batch size used for transform.
shuffle (boolean, optional, not for optimizer, default False) – Whether to shuffle the dataset before batching.
num_workers (integer, optional, not for optimizer, default 0) – Number of workers for the PyTorch DataLoader.
inmemory (boolean, optional, not for optimizer, default False) – Whether all the computations are done in memory or intermediate outputs are serialized. Only applies to transform/predict. For fit, use the max_resident argument.
num_epochs (union type, optional, not for optimizer, default None) –
    Number of epochs. If the operator has num_epochs as a parameter, that takes precedence.
    integer
    or None
max_resident (union type, optional, not for optimizer, default None) –
    Amount of memory to be used in bytes.
    integer
    or None
scoring (union type, optional, not for optimizer, default None) –
    Batch-wise scoring metrics from lale.lib.rasl.
    callable
    or None
progress_callback (union type, optional, not for optimizer, default None) –
    Callback function to get performance metrics per batch.
    callable
    or None
partial_transform (boolean, optional, not for optimizer, default False) – Whether to allow partially-trained upstream operators to transform data for training downstream operators even before the upstream operator has been fully trained.
priority (‘batch’, ‘step’, or ‘resource_aware’, optional, not for optimizer, default ‘resource_aware’) – Scheduling priority in task graphs. “batch” executes tasks from earlier batches first. “step” executes tasks from earlier steps first, like a nested-loop algorithm. “resource_aware” executes tasks with less non-resident data first.
verbose (integer, optional, not for optimizer, default 0) – Verbosity level, higher values mean more information.
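The difference between the “batch” and “step” priorities can be illustrated with a small pure-Python sketch. It does not use lale and is not the library's scheduler: the function name task_order is hypothetical, tasks are reduced to (batch, step) index pairs, and task dependencies, partial_transform, and memory limits are all ignored. The “resource_aware” priority is left out because it depends on residency information that this sketch does not model.

```python
from itertools import product


def task_order(n_batches, n_steps, priority):
    """Order (batch, step) index pairs under a simplified priority rule.

    Hypothetical helper for illustration only; it ignores task
    dependencies and resource limits that a real scheduler must respect.
    """
    tasks = list(product(range(n_batches), range(n_steps)))
    if priority == "batch":
        # Earlier batches first; within a batch, earlier steps first.
        return sorted(tasks, key=lambda t: (t[0], t[1]))
    if priority == "step":
        # Earlier steps first; within a step, earlier batches first.
        return sorted(tasks, key=lambda t: (t[1], t[0]))
    raise ValueError(f"priority not covered by this sketch: {priority!r}")


for priority in ("batch", "step"):
    print(priority, task_order(2, 3, priority))
```

With two batches and three steps, “batch” finishes all three steps on batch 0 before touching batch 1, while “step” runs step 0 on both batches before moving on to step 1.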
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (union type) –
    Features; the outer array is over samples.
    array
        items : union type
            float
            or string
            or boolean
    or array
        items : array
            items : union type
                float
                or string
                or boolean
    or dict
y (union type) –
    array
        items : union type
            integer
            or float
            or string
    or None
classes (union type, optional) –
    The total number of classes in the entire training dataset.
    array
        items : union type
            float
            or string
            or boolean
    or None
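One plausible reason for the classes parameter is that a learner trained one batch at a time cannot infer the full label set from any single batch. The description mentions a number of classes while the type is an array, so this sketch assumes the array holds the distinct labels themselves; that reading is an assumption, not something the schema states. The helper name iter_batches and the toy labels are hypothetical, and the code uses only plain Python.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Toy labels, sorted so that each batch of two sees a single class.
y = ["cat", "cat", "dog", "dog", "bird", "bird"]

# Distinct labels over the whole dataset: the kind of value the
# classes argument is assumed to carry.
classes = sorted(set(y))

# Distinct labels visible inside each individual batch.
labels_per_batch = [sorted(set(batch)) for batch in iter_batches(y, 2)]

print("classes:", classes)
print("labels per batch:", labels_per_batch)
```

Here no batch contains more than one label, so only the dataset-wide list tells an incremental learner that three classes exist.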
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (union type) –
    Features; the outer array is over samples.
    array
        items : union type
            float
            or string
            or boolean
    or array
        items : array
            items : union type
                float
                or string
                or boolean
    or any type
y (array, optional) –
    items : union type
        integer
        or float
- Returns
result – Output data schema for transformed data.
- Return type
Any
- transform(X, y=None)
Transform the data.
Note: The transform method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (union type) –
    Features; the outer array is over samples.
    array
        items : union type
            float
            or string
            or boolean
    or array
        items : array
            items : union type
                float
                or string
                or boolean
    or any type
y (array, optional) –
    items : union type
        integer
        or float
- Returns
result – Output data schema for transformed data.
- Return type
Any
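The general idea of transforming data in batches can be sketched without lale. The names transform_in_batches and scale_rows are hypothetical stand-ins, not part of the library. The sketch keeps every result in a Python list, which loosely corresponds to in-memory operation, and it does not model serialization of intermediate outputs.

```python
def transform_in_batches(transform_fn, X, batch_size=64):
    """Apply transform_fn to consecutive slices of X and concatenate the results.

    Hypothetical helper for illustration; results stay in memory.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out = []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]
        out.extend(transform_fn(batch))
    return out


def scale_rows(rows):
    """Stand-in for a trained, row-wise transformer: doubles every value."""
    return [[2.0 * value for value in row] for row in rows]


X = [[float(i), float(i + 1)] for i in range(5)]
result = transform_in_batches(scale_rows, X, batch_size=2)
print(result)
```

For a row-wise transformer such as this stand-in, the batched result matches a single call on the full dataset; only the amount of data processed at once changes.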