lale.operators module
Classes for Lale operators including individual operators, pipelines, and operator choice.
This module declares several functions for constructing individual operators, pipelines, and operator choices.
Functions make_pipeline and Pipeline compose linear sequential pipelines, where each step has an edge to the next step. Instead of these functions you can also use the >> combinator.
Functions make_union_no_concat and make_union compose pipelines that operate over the same data without edges between their steps. Instead of these functions you can also use the & combinator.
Function make_choice creates an operator choice. Instead of this function you can also use the | combinator.
Function make_pipeline_graph creates a pipeline from steps and edges, thus supporting any arbitrary acyclic directed graph topology.
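The edge semantics of the >> and & combinators can be illustrated with a small toy model. This is a sketch only, not Lale's implementation: the Graph class, the step helper, and the step names are hypothetical, and the model assumes step names are unique. It shows that >> connects every sink of the left operand to every source of the right operand, while & adds no edges.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Graph:
    """Toy stand-in for a pipeline: step names plus directed edges."""

    steps: Tuple[str, ...]
    edges: Tuple[Tuple[str, str], ...] = ()

    def sources(self) -> List[str]:
        # Steps that no edge points to.
        targets = {dst for _, dst in self.edges}
        return [s for s in self.steps if s not in targets]

    def sinks(self) -> List[str]:
        # Steps that no edge leaves from.
        origins = {src for src, _ in self.edges}
        return [s for s in self.steps if s not in origins]

    def __rshift__(self, other: "Graph") -> "Graph":
        # Pipe: add an edge from each sink of self to each source of other.
        new_edges = tuple((u, v) for u in self.sinks() for v in other.sources())
        return Graph(self.steps + other.steps, self.edges + other.edges + new_edges)

    def __and__(self, other: "Graph") -> "Graph":
        # And: place both operands side by side without adding edges.
        return Graph(self.steps + other.steps, self.edges + other.edges)


def step(name: str) -> Graph:
    """Wrap a single (hypothetical) step name as a one-node graph."""
    return Graph((name,))


pca, nystroem, concat, lr = (step(n) for n in ("pca", "nystroem", "concat", "lr"))
pipeline = (pca & nystroem) >> concat >> lr
print(pipeline.edges)
```

In this toy model the two parallel steps both feed the concatenation step, which in turn feeds the final step.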
Function make_operator creates an individual Lale operator from a schema and an implementation class or object. This is called for each of the operators in module lale.lib when it is being imported.
Functions get_available_operators, get_available_estimators, and get_available_transformers return lists of individual operators previously registered by make_operator.
The root of the hierarchy is the abstract class Operator; all other Lale operators inherit from this class, either directly or indirectly.
The abstract classes Operator, PlannedOperator, TrainableOperator, and TrainedOperator correspond to lifecycle states.
The concrete classes IndividualOp, PlannedIndividualOp, TrainableIndividualOp, and TrainedIndividualOp inherit from the corresponding abstract operator classes and encapsulate implementations of individual operators from machine-learning libraries such as scikit-learn.
The concrete classes BasePipeline, PlannedPipeline, TrainablePipeline, and TrainedPipeline inherit from the corresponding abstract operator classes and represent directed acyclic graphs of operators. The steps of a pipeline can be any operators, including individual operators, other pipelines, or operator choices, whose lifecycle state is at least that of the pipeline.
The concrete class OperatorChoice represents a planned operator that offers a choice for automated algorithm selection. The steps of a choice can be any planned operators, including individual operators, pipelines, or other operator choices.
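The rule that pipeline steps must have a lifecycle state at least that of the pipeline can be written as a simple check. The sketch below is illustrative only: the Lifecycle enum and the steps_allowed function are hypothetical names, and the ordering planned < trainable < trained is an assumption based on the lifecycle states listed above.

```python
from enum import IntEnum
from typing import Iterable


class Lifecycle(IntEnum):
    """Hypothetical ordering of the lifecycle states named above."""

    PLANNED = 1
    TRAINABLE = 2
    TRAINED = 3


def steps_allowed(pipeline_state: Lifecycle, step_states: Iterable[Lifecycle]) -> bool:
    """Return True if every step is at least as far along as the pipeline."""
    return all(state >= pipeline_state for state in step_states)


# A trainable pipeline may contain trainable and trained steps.
print(steps_allowed(Lifecycle.TRAINABLE, [Lifecycle.TRAINABLE, Lifecycle.TRAINED]))
# A trained pipeline may not contain a step that is only planned.
print(steps_allowed(Lifecycle.TRAINED, [Lifecycle.TRAINED, Lifecycle.PLANNED]))
```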
The following picture illustrates the core operator class hierarchy.
scikit-learn compatibility:
Lale operators attempt to behave like reasonable scikit-learn operators when possible. In particular, operators support:
get_params to return the hyperparameter settings for an operator.
set_params for updating them (in-place). This is only supported by TrainableIndividualOps and Pipelines. Note that while set_params is supported for compatibility, its use is not encouraged, since it mutates the operator in-place. Instead, we recommend using with_params, a functional alternative that is supported by all operators. It returns a new operator with updated parameters.
sklearn.base.clone works for Lale operators, cloning them as expected. Note that cloning a TrainedOperator will return a TrainableOperator, since the cloned version does not have the result of training.
There are also some known differences (that we are not currently planning on changing):
Lale operators do not inherit from any sklearn base class.
The Operator class constructors do not explicitly declare their set of hyperparameters. However, they do implement get_params (just not using sklearn-style reflection).
There may also be other incompatibilities: our testing currently focuses on ensuring that clone works.
parameter path format:
scikit-learn uses a simple addressing scheme to refer to nested hyperparameters: name__param refers to the param hyperparameter nested under the name object. Since Lale supports richer structures, we conservatively extend this scheme as follows:
__ : separates nested components (as-in sklearn).
? : is the discriminant (choice made) for a choice.
? : is also a prefix for the nested parts of the chosen branch.
x@n : In a pipeline, if multiple components have identical names, all but the first are suffixed with a number (starting with 1) indicating which one we are talking about. For example, given (x >> y >> x), we would treat this much the same as (x >> y >> x@1).
$ : is used in the rare case that sklearn would expect the key of an object, but we allow (and have) a non-object schema. In that case, $ is used as the key. This should only happen at the top level, since nested occurrences should be removed.
# : is a structure indicator, and the value should be one of ‘list’, ‘tuple’, or ‘dict’.
n : is used to represent the nth component in an array or tuple.
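The x@n rule and the __ separator from the list above can be sketched as follows. This is an illustrative reading of the rules, not Lale's code: the helper names disambiguate and param_path are hypothetical, and numbering a third occurrence as @2 is an assumption extrapolated from the two-occurrence example.

```python
from collections import Counter
from typing import List


def disambiguate(names: List[str]) -> List[str]:
    """Suffix repeated step names with @1, @2, ...; leave the first one bare."""
    seen: Counter = Counter()
    result = []
    for name in names:
        occurrence = seen[name]
        result.append(name if occurrence == 0 else f"{name}@{occurrence}")
        seen[name] += 1
    return result


def param_path(step: str, param: str) -> str:
    """Join a step name and a hyperparameter name with the __ separator."""
    return f"{step}__{param}"


names = disambiguate(["x", "y", "x"])
print(names)
print(param_path(names[2], "param"))
```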
- class lale.operators.BasePipeline(steps: List[OpType_co], edges: Optional[Iterable[Tuple[OpType_co, OpType_co]]] = None, _lale_preds: Optional[Union[Dict[int, List[int]], Dict[OpType_co, List[OpType_co]]]] = None, ordered: bool = False)[source]¶
Bases: Operator, Generic[OpType_co]
This is a concrete class that can instantiate a new pipeline operator and provide access to its metadata.
- get_params(deep: Union[bool, Literal[0]] = True) Dict[str, Any] [source]¶
If deep is False, additional ‘_lale_XXX’ fields are added to support cloning. If these are not desired, deep=0 can be used to disable this.
- is_classifier() bool [source]¶
Checks if this operator is a classifier.
- Returns
True if the classifier tag is set.
- Return type
- is_supervised() bool [source]¶
Checks if this operator needs labeled data for learning.
- Returns
True if the fit method requires a y argument.
- Return type
- remove_last(inplace: bool = False) BasePipeline[OpType_co] [source]¶
- set_params(**impl_params)[source]¶
This implements the set_params, as per the scikit-learn convention, extended as documented in the module docstring
- property steps: List[Tuple[str, OpType_co]]¶
This is meant to function similarly to the scikit-learn steps property and for linear pipelines, should behave the same
- class lale.operators.IndividualOp(_lale_name: str, _lale_impl, _lale_schemas, _lale_frozen_hyperparameters=None, **hp)[source]¶
Bases: Operator
This is a concrete class that can instantiate a new individual operator and provide access to its metadata. The enum property can be used to access enumerations for hyper-parameters, auto-generated from the operator’s schema. For example, LogisticRegression.enum.solver.saga. As a shorthand, if the hyper-parameter name does not conflict with any fields of this class, the auto-generated enums can also be accessed directly. For example, LogisticRegression.solver.saga.
Create a new IndividualOp.
- Parameters
- property enum: _DictionaryObjectForEnum¶
- get_defaults() Mapping[str, Any] [source]¶
Returns the default values of hyperparameters for the operator.
- Returns
A dictionary with names of the hyperparameters as keys and their default values as values.
- Return type
- get_forwards() Union[bool, List[str]] [source]¶
Returns the list of attributes (methods/properties) the schema has asked to be forwarded. A boolean value is a blanket opt-in or out of forwarding
- get_param_dist(size=10) Dict[str, List[Any]] [source]¶
Returns a dictionary for discretized hyperparameters.
Each entry is a list of values. For continuous hyperparameters, it returns up to size uniformly distributed values.
Warning: ignores side constraints, unions, and distributions.
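As a rough illustration of the dictionary shape described above, the sketch below discretizes a continuous range into at most size values. It is not the library's implementation: the discretize helper and the example hyperparameters are hypothetical, and "uniformly distributed" is read here as evenly spaced, which is an assumption.

```python
from typing import Any, Dict, List


def discretize(minimum: float, maximum: float, size: int = 10) -> List[float]:
    """Return at most `size` evenly spaced values covering [minimum, maximum]."""
    if size < 2 or minimum == maximum:
        return [minimum]
    step = (maximum - minimum) / (size - 1)
    return [minimum + i * step for i in range(size)]


# Hypothetical search space: one continuous and one categorical hyperparameter.
param_dist: Dict[str, List[Any]] = {
    "tol": discretize(0.0, 1.0, size=5),
    "solver": ["lbfgs", "saga"],
}
print(param_dist["tol"])  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Every entry is a plain list, so a dictionary of this shape can be handed to any search routine that samples from lists of candidate values.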
- get_param_ranges() Tuple[Dict[str, Any], Dict[str, Any]] [source]¶
Returns two dictionaries, ranges and cat_idx, for hyperparameters.
The ranges dictionary has two kinds of entries. Entries for numeric and Boolean hyperparameters are tuples of the form (min, max, default). Entries for categorical hyperparameters are lists of their values.
The cat_idx dictionary has (min, max, default) entries of indices into the corresponding list of values.
Warning: ignores side constraints and unions.
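The relationship between the two returned dictionaries can be sketched with a toy search-space description. The spec format and the param_ranges helper below are hypothetical and exist only to show the shapes described above: numeric entries are (min, max, default) tuples, categorical entries are lists of values, and cat_idx holds (min, max, default) index tuples into those lists.

```python
from typing import Any, Dict, Tuple


def param_ranges(spec: Dict[str, Dict[str, Any]]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Build ranges and cat_idx dictionaries from a toy search-space spec."""
    ranges: Dict[str, Any] = {}
    cat_idx: Dict[str, Any] = {}
    for name, info in spec.items():
        if "values" in info:
            # Categorical: list of values plus an index range into that list.
            values = list(info["values"])
            ranges[name] = values
            cat_idx[name] = (0, len(values) - 1, values.index(info["default"]))
        else:
            # Numeric or Boolean: (min, max, default) tuple.
            ranges[name] = (info["min"], info["max"], info["default"])
    return ranges, cat_idx


# Hypothetical hyperparameters used only for illustration.
spec = {
    "C": {"min": 0.03, "max": 32.0, "default": 1.0},
    "solver": {"values": ["lbfgs", "liblinear", "saga"], "default": "liblinear"},
}
ranges, cat_idx = param_ranges(spec)
print(ranges)
print(cat_idx)
```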
- get_params(deep: Union[bool, Literal[0]] = True) Dict[str, Any] [source]¶
Get parameters for this operator.
This method follows scikit-learn’s convention that all operators have a constructor which takes a list of keyword arguments. This is not required for operator impls which do not desire scikit-compatibility.
- Parameters
deep (boolean, optional) – If True, will return the parameters for this operator, and their nested parameters. If False, will return the parameters for this operator, along with ‘_lale_XXX’ fields needed to support cloning.
- Returns
params – Parameter names mapped to their values.
- Return type
mapping of string to any
- get_schema(schema_kind: str) Dict[str, Any] [source]¶
Return a schema of the operator.
- Parameters
schema_kind (string, 'hyperparams' or 'input_fit' or 'input_partial_fit' or 'input_transform' or 'input_transform_X_y' or 'input_predict' or 'input_predict_proba' or 'input_decision_function' or 'output_transform' or 'output_transform_X_y' or 'output_predict' or 'output_predict_proba' or 'output_decision_function') – Type of the schema to be returned.
- Returns
The Python object containing the JSON schema of the operator. For all the schemas currently present, this would be a dictionary.
- Return type
- get_tags() Dict[str, List[str]] [source]¶
Return the tags of an operator.
- Returns
A list of tags describing the operator.
- Return type
- has_schema(schema_kind: str) bool [source]¶
Return true if the operator has the schema kind.
- Parameters
schema_kind (string, 'hyperparams' or 'input_fit' or 'input_partial_fit' or 'input_transform' or 'input_transform_X_y' or 'input_predict' or 'input_predict_proba' or 'input_decision_function' or 'output_transform' or 'output_transform_X_y' or 'output_predict' or 'output_predict_proba' or 'output_decision_function' or 'input_score_samples' or 'output_score_samples') – Type of the schema to be returned.
- Returns
True if the JSON schema is present, False otherwise.
- has_tag(tag: str) bool [source]¶
Check the presence of a tag for an operator.
- Parameters
tag (string) –
- Returns
Flag indicating the presence or absence of the given tag in this operator’s schemas.
- Return type
boolean
- hyperparam_schema(name: Optional[str] = None) Dict[str, Any] [source]¶
Returns the hyperparameter schema for the operator.
- Parameters
name (string, optional) – Name of the hyperparameter.
- Returns
Full hyperparameter schema for this operator or part of the schema corresponding to the hyperparameter given by parameter name.
- Return type
- hyperparams_all() Optional[Dict[str, Any]] [source]¶
These are the hyperparameters that are currently set. Some of them may not have been set explicitly (e.g. if this is a clone of an operator, some of these may be defaults). To get the hyperparameters that were actually set, use hyperparams().
- property impl: Any¶
Returns the underlying impl. This can be used to access additional fields and methods not exposed by Lale. If only the type of the impl is needed, please use self.impl_class instead, as it can be more efficient.
If the found impl has a _wrapped_model, it will be returned instead.
- property impl_class: type¶
Returns the class of the underlying impl. This should return the same thing as self.impl.__class__, but can be more efficient.
- input_schema_decision_function() Dict[str, Any] [source]¶
Input schema for the decision_function method.
- input_schema_predict_log_proba() Dict[str, Any] [source]¶
Input schema for the predict_log_proba method. We assume that it is the same as the predict_proba method if none has been defined explicitly.
- input_schema_score_samples() Dict[str, Any] [source]¶
Input schema for the score_samples method. We assume that it is the same as the predict method if none has been defined explicitly.
- is_classifier() bool [source]¶
Checks if this operator is a classifier.
- Returns
True if the classifier tag is set.
- Return type
- is_supervised(default_if_missing=True) bool [source]¶
Checks if this operator needs labeled data for learning.
- Returns
True if the fit method requires a y argument.
- Return type
- output_schema_decision_function() Dict[str, Any] [source]¶
Output schema for the decision_function method.
- output_schema_predict_log_proba() Dict[str, Any] [source]¶
Output schema for the predict_log_proba method. We assume that it is the same as the predict_proba method if none has been defined explicitly.
- output_schema_score_samples() Dict[str, Any] [source]¶
Output schema for the score_samples method. We assume that it is the same as the predict method if none has been defined explicitly.
- property shallow_impl: Any¶
Returns the underlying impl. This can be used to access additional fields and methods not exposed by Lale. If only the type of the impl is needed, please use self.impl_class instead, as it can be more efficient.
- class lale.operators.Operator[source]¶
Bases: object
Abstract base class for all Lale operators.
Pipelines and individual operators extend this.
- step_1 >> step_2 -> PlannedPipeline
Pipe combinator, create two-step pipeline with edge from step_1 to step_2.
If step_1 is a pipeline, create edges from all of its sinks. If step_2 is a pipeline, create edges to all of its sources.
- Parameters
- Returns
Pipeline with edge from step_1 to step_2.
- Return type
- step_1 & step_2 -> PlannedPipeline
And combinator, create two-step pipeline without an edge between step_1 and step_2.
- Parameters
- Returns
Pipeline without any additional edges beyond those already inside of step_1 or step_2.
- Return type
- step_1 | step_2 -> OperatorChoice
Or combinator, create operator choice between step_1 and step_2.
- Parameters
- Returns
Algorithmic choice between step_1 or step_2.
- Return type
- property classes_¶
- clone() Operator [source]¶
Return a copy of this operator, with the same hyper-parameters but without training data. This behaves the same as calling sklearn.base.clone(self).
- property coef_¶
- diff(other: Operator, show_imports: bool = True, customize_schema: bool = False, ipython_display: Literal[False] = False) str [source]¶
- diff(other: Operator, show_imports: bool = True, customize_schema: bool = False, ipython_display: bool = False) Optional[str]
Displays a diff between this operator and the given other operator.
- Parameters
other (Operator) – Operator to diff against
show_imports (bool, default True) – Whether to include import statements in the pretty-printed code.
customize_schema (bool, default False) – If True, then individual operators whose schema differs from the lale.lib version of the operator will be printed with calls to customize_schema that reproduce this difference.
ipython_display (bool, default False) – If True, will display Markdown-formatted diff string in Jupyter notebook. If False, returns pretty-printing diff as Python string.
- Returns
If called with ipython_display=False, return pretty-printed diff as a Python string.
- Return type
str or None
- property feature_importances_¶
- get_forwards() Union[bool, List[str]] [source]¶
Returns the list of attributes (methods/properties) the schema has asked to be forwarded. A boolean value is a blanket opt-in or out of forwarding
- get_param_dist(size=10) Dict[str, List[Any]] [source]¶
Returns a dictionary for discretized hyperparameters.
Each entry is a list of values. For continuous hyperparameters, it returns up to size uniformly distributed values.
Warning: ignores side constraints, unions, and distributions.
- get_param_ranges() Tuple[Dict[str, Any], Dict[str, Any]] [source]¶
Returns two dictionaries, ranges and cat_idx, for hyperparameters.
The ranges dictionary has two kinds of entries. Entries for numeric and Boolean hyperparameters are tuples of the form (min, max, default). Entries for categorical hyperparameters are lists of their values.
The cat_idx dictionary has (min, max, default) entries of indices into the corresponding list of values.
Warning: ignores side constraints and unions.
- abstract is_classifier() bool [source]¶
Checks if this operator is a classifier.
- Returns
True if the classifier tag is set.
- Return type
- is_frozen_trainable() bool [source]¶
Return true if all hyperparameters are bound, in other words, search spaces contain no free hyperparameters to be tuned.
- is_frozen_trained() bool [source]¶
Return true if all learnable coefficients are bound, in other words, there are no free parameters to be learned by fit.
- abstract is_supervised() bool [source]¶
Checks if this operator needs labeled data for learning.
- Returns
True if the fit method requires a y argument.
- Return type
- property n_classes_¶
- pretty_print(*, show_imports: bool = True, combinators: bool = True, assign_nested: bool = True, customize_schema: bool = False, astype: Literal['lale', 'sklearn'] = 'lale', ipython_display: Literal[False] = False) str [source]¶
- pretty_print(*, show_imports: bool = True, combinators: bool = True, assign_nested: bool = True, customize_schema: bool = False, astype: Literal['lale', 'sklearn'] = 'lale', ipython_display: Union[bool, Literal['input']] = False) Optional[str]
Returns the Python source code representation of the operator.
- Parameters
show_imports (bool, default True) – Whether to include import statements in the pretty-printed code.
combinators (bool, default True) – If True, pretty-print with combinators (>>, |, &). Otherwise, pretty-print with functions (make_pipeline, make_choice, make_union) instead. Always False when astype is ‘sklearn’.
assign_nested (bool, default True) – If True, then nested operators, such as the base estimator for an ensemble, get assigned to fresh intermediate variables if configured with non-trivial arguments of their own.
customize_schema (bool, default False) – If True, then individual operators whose schema differs from the lale.lib version of the operator will be printed with calls to customize_schema that reproduce this difference.
astype (union type, default 'lale') –
‘lale’
Use lale.operators.make_pipeline and lale.operators.make_union when pretty-printing with functions.
’sklearn’
Set combinators to False and use sklearn.pipeline.make_pipeline and sklearn.pipeline.make_union for pretty-printed functions.
ipython_display (union type, default False) –
False
Return the pretty-printed code as a plain old Python string.
True
Pretty-print in notebook cell output with syntax highlighting.
’input’
Create a new notebook cell with pretty-printed code as input.
- Returns
If called with ipython_display=False, return pretty-printed Python source code as a Python string.
- Return type
str or None
- replace(original_op: Operator, replacement_op: Operator) Operator [source]¶
Replaces an original operator with a replacement operator for the given operator. Replacement also occurs for all operators within the given operator’s steps (i.e. pipelines and choices). If a planned operator is given as original_op, all derived operators (including trainable and trained versions) will be replaced. Otherwise, only the exact operator instance will be replaced.
- Parameters
original_op – Operator to replace within given operator. If operator is a planned operator, all derived operators (including trainable and trained versions) will be replaced. Otherwise, only the exact operator instance will be replaced.
replacement_op – Operator to replace the original with.
- Returns
Modified operator where original operator is replaced with replacement throughout.
- Return type
modified_operator
- to_json() Dict[str, Any] [source]¶
Returns the JSON representation of the operator.
- Returns
JSON representation that describes this operator and is valid with respect to lale.json_operator.SCHEMA.
- Return type
JSON document
- abstract transform_schema(s_X: Dict[str, Any]) Dict[str, Any] [source]¶
Return the output schema given the input schema.
- Parameters
s_X – Input dataset or schema.
- Returns
Schema of the output data given the input data schema.
- Return type
JSON schema
- abstract validate_schema(X: Any, y: Optional[Any] = None)[source]¶
Validate that X and y are valid with respect to the input schema of this operator.
- Parameters
X – Features.
y – Target class labels or None for unsupervised operators.
- Raises
ValueError – If X or y are invalid as inputs.
- visualize(ipython_display: bool = True)[source]¶
Visualize the operator using graphviz (use in a notebook).
- Parameters
ipython_display (bool, default True) – If True, proactively ask Jupyter to render the graph. Otherwise, the graph will only be rendered when visualize() was called in the last statement in a notebook cell.
- Returns
Digraph object from the graphviz package.
- Return type
Digraph
- class lale.operators.OperatorChoice(steps, name: Optional[str] = None)[source]¶
Bases: PlannedOperator, Generic[OperatorChoiceType_co]
- is_classifier() bool [source]¶
Checks if this operator is a classifier.
- Returns
True if the classifier tag is set.
- Return type
- is_frozen_trainable() bool [source]¶
Return true if all hyperparameters are bound, in other words, search spaces contain no free hyperparameters to be tuned.
- is_supervised() bool [source]¶
Checks if this operator needs labeled data for learning.
- Returns
True if the fit method requires a y argument.
- Return type
- set_params(**impl_params)[source]¶
This implements the set_params, as per the scikit-learn convention, extended as documented in the module docstring
- property steps: List[Tuple[str, OperatorChoiceType_co]]¶
This is meant to function similarly to the scikit-learn steps property and for linear pipelines, should behave the same
- class lale.operators.PlannedIndividualOp(_lale_name: str, _lale_impl, _lale_schemas, _lale_frozen_hyperparameters=None, _lale_trained=False, **hp)[source]¶
Bases: IndividualOp, PlannedOperator
This is a concrete class that returns a trainable individual operator through its __call__ method. A configure method can use an optimizer and return the best hyperparameter combination.
Create a new IndividualOp.
- Parameters
- auto_configure(X: Any, y: Optional[Any] = None, optimizer=None, cv=None, scoring=None, **kwargs) TrainedIndividualOp [source]¶
Perform combined algorithm selection and hyperparameter tuning on this planned operator.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
optimizer – lale.lib.lale.Hyperopt or lale.lib.lale.GridSearchCV. Default is None.
cv – cross-validation option that is valid for the optimizer. Default is None, which will use the optimizer’s default value.
scoring – scoring option that is valid for the optimizer. Default is None, which will use the optimizer’s default value.
kwargs – Other keyword arguments to be passed to the optimizer.
- Returns
Best operator discovered by the optimizer.
- Return type
- Raises
ValueError – If an invalid optimizer is provided
- customize_schema(schemas: Optional[Schema] = None, relevantToOptimizer: Optional[List[str]] = None, constraint: Optional[Union[Schema, Dict[str, Any], List[Union[Schema, Dict[str, Any]]]]] = None, tags: Optional[Dict] = None, forwards: Optional[Union[bool, List[str]]] = None, set_as_available: bool = False, **kwargs: Optional[Union[Schema, Dict[str, Any]]]) PlannedIndividualOp [source]¶
- freeze_trainable() TrainableIndividualOp [source]¶
- class lale.operators.PlannedOperator[source]¶
Bases: Operator
Abstract class for Lale operators in the planned lifecycle state.
- step_1 >> step_2 -> PlannedPipeline
Pipe combinator, create two-step pipeline with edge from step_1 to step_2.
If step_1 is a pipeline, create edges from all of its sinks. If step_2 is a pipeline, create edges to all of its sources.
- Parameters
- Returns
Pipeline with edge from step_1 to step_2.
- Return type
- step_1 & step_2 -> PlannedPipeline
And combinator, create two-step pipeline without an edge between step_1 and step_2.
- Parameters
- Returns
Pipeline without any additional edges beyond those already inside of step_1 or step_2.
- Return type
- step_1 | step_2 -> OperatorChoice
Or combinator, create operator choice between step_1 and step_2.
- Parameters
- Returns
Algorithmic choice between step_1 or step_2.
- Return type
- auto_configure(X: Any, y: Optional[Any] = None, optimizer: Optional[PlannedIndividualOp] = None, cv: Optional[Any] = None, scoring: Optional[Any] = None, **kwargs) TrainedOperator [source]¶
Perform combined algorithm selection and hyperparameter tuning on this planned operator.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
optimizer – lale.lib.lale.Hyperopt or lale.lib.lale.GridSearchCV. Default is None.
cv – cross-validation option that is valid for the optimizer. Default is None, which will use the optimizer’s default value.
scoring – scoring option that is valid for the optimizer. Default is None, which will use the optimizer’s default value.
kwargs – Other keyword arguments to be passed to the optimizer.
- Returns
Best operator discovered by the optimizer.
- Return type
- Raises
ValueError – If an invalid optimizer is provided
- class lale.operators.PlannedPipeline(steps: List[PlannedOpType_co], edges: Optional[Iterable[Tuple[PlannedOpType_co, PlannedOpType_co]]] = None, _lale_preds: Optional[Dict[int, List[int]]] = None, ordered: bool = False)[source]¶
Bases: BasePipeline[PlannedOpType_co], PlannedOperator
- auto_configure(X: Any, y: Optional[Any] = None, optimizer=None, cv=None, scoring=None, **kwargs) TrainedPipeline [source]¶
Perform combined algorithm selection and hyperparameter tuning on this planned operator.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
optimizer – lale.lib.lale.Hyperopt or lale.lib.lale.GridSearchCV. Default is None.
cv – cross-validation option that is valid for the optimizer. Default is None, which will use the optimizer’s default value.
scoring – scoring option that is valid for the optimizer. Default is None, which will use the optimizer’s default value.
kwargs – Other keyword arguments to be passed to the optimizer.
- Returns
Best operator discovered by the optimizer.
- Return type
- Raises
ValueError – If an invalid optimizer is provided
- is_frozen_trainable() bool [source]¶
Return true if all hyperparameters are bound, in other words, search spaces contain no free hyperparameters to be tuned.
- is_frozen_trained() bool [source]¶
Return true if all learnable coefficients are bound, in other words, there are no free parameters to be learned by fit.
- remove_last(inplace: bool = False) PlannedPipeline[PlannedOpType_co] [source]¶
- class lale.operators.TrainableIndividualOp(_lale_name, _lale_impl, _lale_schemas, _lale_frozen_hyperparameters=None, **hp)[source]¶
Bases: PlannedIndividualOp, TrainableOperator
Create a new IndividualOp.
- Parameters
- convert_to_trained() TrainedIndividualOp [source]¶
- customize_schema(schemas: Optional[Schema] = None, relevantToOptimizer: Optional[List[str]] = None, constraint: Optional[Union[Schema, Dict[str, Any], List[Union[Schema, Dict[str, Any]]]]] = None, tags: Optional[Dict] = None, forwards: Optional[Union[bool, List[str]]] = None, set_as_available: bool = False, **kwargs: Optional[Union[Schema, Dict[str, Any]]]) TrainableIndividualOp [source]¶
- decision_function(X=None)[source]¶
Deprecated since version 0.0.0: The decision_function method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call decision_function on the trained operator returned by fit instead.
- fit(X: Any, y: Optional[Any] = None, **fit_params) TrainedIndividualOp [source]¶
Train the learnable coefficients of this operator, if any.
Return a trained version of this operator. If this operator has free learnable coefficients, bind them to values that fit the data according to the operator’s algorithm. Do nothing if the operator implementation lacks a fit method or if the operator has been marked as is_frozen_trained.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
fit_params (Dictionary, optional) – A dictionary of keyword parameters to be used during training.
- Returns
A new copy of this operator that is the same except that its learnable coefficients are bound to their trained values.
- Return type
- freeze_trainable() TrainableIndividualOp [source]¶
Return a copy of the trainable parts of this operator that is the same except that all hyperparameters are bound and none are free to be tuned. If there is an operator choice, it is kept as is.
- freeze_trained() TrainedIndividualOp [source]¶
Deprecated since version 0.0.0: The freeze_trained method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call freeze_trained on the trained operator returned by fit instead.
- get_pipeline(pipeline_name: Optional[str] = None, astype: Literal['lale', 'sklearn'] = 'lale') Optional[TrainableOperator] [source]¶
Deprecated since version 0.0.0: The get_pipeline method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call get_pipeline on the trained operator returned by fit instead.
- predict(X=None, **predict_params) Any [source]¶
Deprecated since version 0.0.0: The predict method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict on the trained operator returned by fit instead.
- predict_log_proba(X=None)[source]¶
Deprecated since version 0.0.0: The predict_log_proba method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict_log_proba on the trained operator returned by fit instead.
- predict_proba(X=None)[source]¶
Deprecated since version 0.0.0: The predict_proba method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict_proba on the trained operator returned by fit instead.
- score(X, y, **score_params) Any [source]¶
Deprecated since version 0.0.0: The score method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call score on the trained operator returned by fit instead.
- score_samples(X=None)[source]¶
Deprecated since version 0.0.0: The score_samples method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call score_samples on the trained operator returned by fit instead.
- set_params(**impl_params)[source]¶
This implements the set_params, as per the scikit-learn convention, extended as documented in the module docstring
- summary() DataFrame [source]¶
Deprecated since version 0.0.0: The summary method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call summary on the trained operator returned by fit instead.
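The deprecation notes above share one rationale: fit returns a new trained operator, and prediction belongs on that returned object rather than on the trainable one. A minimal toy sketch of this pattern follows. TrainableMean and TrainedMean are hypothetical classes written for illustration; they are not part of Lale.

```python
from typing import List, Sequence


class TrainedMean:
    """Toy trained operator: its learned coefficient is fixed at creation."""

    def __init__(self, mean: float) -> None:
        self.mean = mean

    def predict(self, X: Sequence[float]) -> List[float]:
        # Predict the learned mean for every input row.
        return [self.mean for _ in X]


class TrainableMean:
    """Toy trainable operator: it can be fit, but holds no learned state."""

    def fit(self, X: Sequence[float], y: Sequence[float]) -> TrainedMean:
        # Return a new trained object instead of mutating self.
        return TrainedMean(sum(y) / len(y))


trainable = TrainableMean()
trained_a = trainable.fit([1, 2], [10.0, 20.0])
trained_b = trainable.fit([1, 2], [0.0, 2.0])  # retraining leaves trained_a intact
print(trained_a.predict([3]), trained_b.predict([3]))
```

Because each call to fit yields a separate trained object, retraining cannot silently overwrite coefficients that earlier predictions relied on.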
- class lale.operators.TrainableOperator[source]¶
Bases: PlannedOperator
Abstract class for Lale operators in the trainable lifecycle state.
- step_1 >> step_2 -> PlannedPipeline
Pipe combinator, create two-step pipeline with edge from step_1 to step_2.
If step_1 is a pipeline, create edges from all of its sinks. If step_2 is a pipeline, create edges to all of its sources.
- Parameters
- Returns
Pipeline with edge from step_1 to step_2.
- Return type
- step_1 & step_2 -> PlannedPipeline
And combinator, create two-step pipeline without an edge between step_1 and step_2.
- Parameters
- Returns
Pipeline without any additional edges beyond those already inside of step_1 or step_2.
- Return type
- step_1 | step_2 -> OperatorChoice
Or combinator, create operator choice between step_1 and step_2.
- Parameters
- Returns
Algorithmic choice between step_1 and step_2.
- Return type
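To make the edge semantics of the three combinators concrete, here is a minimal, self-contained sketch. It is not Lale's implementation: `Combinable`, `Step`, `Choice`, `Graph`, and `_as_graph` are hypothetical stand-ins that only track steps and edges, assuming that "sources" are steps without incoming edges and "sinks" are steps without outgoing edges.

```python
from dataclasses import dataclass, field


class Combinable:
    """Hypothetical mixin giving toy operators the three combinators."""

    def __rshift__(self, other):
        return _as_graph(self) >> other

    def __and__(self, other):
        return _as_graph(self) & other

    def __or__(self, other):
        return Choice((self, other))


@dataclass(frozen=True)
class Step(Combinable):
    """Hypothetical stand-in for an individual operator."""
    name: str


@dataclass(frozen=True)
class Choice(Combinable):
    """Hypothetical stand-in for an operator choice."""
    alternatives: tuple


@dataclass
class Graph:
    """Hypothetical stand-in for a pipeline: steps plus directed edges."""
    steps: list
    edges: set = field(default_factory=set)

    def sources(self):
        # Steps with no incoming edge.
        targets = {dst for _, dst in self.edges}
        return [s for s in self.steps if s not in targets]

    def sinks(self):
        # Steps with no outgoing edge.
        origins = {src for src, _ in self.edges}
        return [s for s in self.steps if s not in origins]

    def __rshift__(self, other):
        other = _as_graph(other)
        # Pipe: connect every sink on the left to every source on the right.
        bridge = {(a, b) for a in self.sinks() for b in other.sources()}
        return Graph(self.steps + other.steps, self.edges | other.edges | bridge)

    def __and__(self, other):
        other = _as_graph(other)
        # And: place both sides next to each other without adding edges.
        return Graph(self.steps + other.steps, self.edges | other.edges)


def _as_graph(op):
    # Wrap a single step or choice as a one-step graph.
    return op if isinstance(op, Graph) else Graph([op])


pca, nys, concat, lr, tree = (
    Step(n) for n in ("PCA", "Nystroem", "Concat", "LR", "Tree")
)
pipeline = (pca & nys) >> concat >> (lr | tree)
```

The parentheses in the last line matter: in Python, `>>` binds more tightly than `&`, which in turn binds more tightly than `|`, so mixed expressions usually need explicit grouping.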
- abstract fit(X: Any, y: Optional[Any] = None, **fit_params) TrainedOperator [source]¶
Train the learnable coefficients of this operator, if any.
Return a trained version of this operator. If this operator has free learnable coefficients, bind them to values that fit the data according to the operator’s algorithm. Do nothing if the operator implementation lacks a fit method or if the operator has been marked as is_frozen_trained.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
fit_params (Dictionary, optional) – A dictionary of keyword parameters to be used during training.
- Returns
A new copy of this operator that is the same except that its learnable coefficients are bound to their trained values.
- Return type
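The contract described here (fit returns a new trained copy rather than mutating the receiver) is also the reason prediction methods are deprecated on trainable operators. The following sketch illustrates that contract with a hypothetical `MeanRegressor` class; it is not a Lale operator. Returning `self` in the frozen case is an assumption made for brevity.

```python
import copy
from statistics import fmean


class MeanRegressor:
    """Hypothetical operator whose only learnable coefficient is a mean."""

    def __init__(self):
        self.mean_ = None            # free learnable coefficient
        self.frozen_trained = False  # stand-in for is_frozen_trained

    def fit(self, X, y):
        if self.frozen_trained:
            # Coefficients are already bound, so fit does nothing.
            return self
        trained = copy.copy(self)    # leave the trainable operator untouched
        trained.mean_ = fmean(y)
        return trained

    def predict(self, X):
        return [self.mean_ for _ in X]


trainable = MeanRegressor()
trained = trainable.fit([[0], [1], [2]], [1.0, 2.0, 6.0])

# Freezing a trained copy turns later fit calls into no-ops.
frozen = copy.copy(trained)
frozen.frozen_trained = True
refit = frozen.fit([[0]], [100.0])
```

Here `trained.predict([[5]])` uses the learned mean, while `trainable.mean_` is still unset, so retraining `trainable` cannot silently change the predictions of `trained`.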
- fit_transform(X: Any, y: Optional[Any] = None, **fit_params)[source]¶
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
fit_params (Dictionary, optional) – A dictionary of keyword parameters to be used during training.
- Returns
Transformed features; see output_transform schema of the operator.
- Return type
result
- abstract freeze_trainable() TrainableOperator [source]¶
Return a copy of the trainable parts of this operator that is the same except that all hyperparameters are bound and none are free to be tuned. If there is an operator choice, it is kept as is.
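As a rough illustration of what "all hyperparameters are bound" means, the sketch below models an operator as a set of defaults plus explicitly bound values. `ToyOp` and its methods are hypothetical, and binding the remaining free hyperparameters to their defaults is an assumption of this sketch rather than something stated above.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ToyOp:
    """Hypothetical operator: schema defaults plus explicitly bound values."""
    defaults: dict
    bound: dict = field(default_factory=dict)

    def free_hyperparams(self):
        # Hyperparameters a tuner would still be allowed to search over.
        return set(self.defaults) - set(self.bound)

    def freeze_trainable(self):
        # Return a copy in which every hyperparameter has a value;
        # explicitly bound values take precedence over defaults (assumption).
        return replace(self, bound={**self.defaults, **self.bound})


op = ToyOp(defaults={"C": 1.0, "penalty": "l2"}, bound={"C": 0.1})
frozen = op.freeze_trainable()
```

After the call, `frozen.free_hyperparams()` is empty, while `op` still has `penalty` free.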
- class lale.operators.TrainablePipeline(steps: List[TrainableOpType_co], edges: Optional[Iterable[Tuple[TrainableOpType_co, TrainableOpType_co]]] = None, _lale_preds: Optional[Dict[int, List[int]]] = None, ordered: bool = False, _lale_trained=False)[source]¶
Bases:
PlannedPipeline[TrainableOpType_co], TrainableOperator
- convert_to_trained() TrainedPipeline[TrainedIndividualOp] [source]¶
- decision_function(X)[source]¶
Deprecated since version 0.0.0: The decision_function method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call decision_function on the trained operator returned by fit instead.
- fit(X: Any, y: Optional[Any] = None, **fit_params) TrainedPipeline[TrainedIndividualOp] [source]¶
Train the learnable coefficients of this operator, if any.
Return a trained version of this operator. If this operator has free learnable coefficients, bind them to values that fit the data according to the operator’s algorithm. Do nothing if the operator implementation lacks a fit method or if the operator has been marked as is_frozen_trained.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
fit_params (Dictionary, optional) – A dictionary of keyword parameters to be used during training.
- Returns
A new copy of this operator that is the same except that its learnable coefficients are bound to their trained values.
- Return type
- freeze_trainable() TrainablePipeline [source]¶
Return a copy of the trainable parts of this operator that is the same except that all hyperparameters are bound and none are free to be tuned. If there is an operator choice, it is kept as is.
- freeze_trained() TrainedPipeline [source]¶
- partial_fit(X: Any, y: Optional[Any] = None, freeze_trained_prefix: bool = True, unsafe: bool = False, **fit_params) TrainedPipeline[TrainedIndividualOp] [source]¶
Incrementally fit a pipeline. This method assumes that all nodes except the last are frozen_trained, so only the last node needs to be fit using its partial_fit method. If that is not the case and freeze_trained_prefix is True, the method freezes the nodes in the prefix (everything except the last node), provided they are trained.
- Parameters
X – Features; see partial_fit schema of the last node.
y – Labels/target
freeze_trained_prefix – If True, all but the last node are freeze_trained and only the last node is partial_fit.
unsafe – boolean. Overrides the validation that raises an error when the operators in the prefix of this pipeline are not tagged with has_partial_transform. Setting unsafe to True performs the transform as if it were row-wise, even where it may not be.
fit_params – dict. Additional keyword arguments to be passed to partial_fit of the estimator.
- Returns
A partially trained pipeline, which can be trained further by other calls to partial_fit
- Return type
- Raises
ValueError – The pipeline has a non-frozen prefix.
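The batch-wise flow described for partial_fit can be sketched as follows. All classes here (`Scale`, `OriginLine`, `ToyPipeline`) are hypothetical and simplified: the toy pipeline mutates its last node and returns itself, and it omits the freeze_trained_prefix and unsafe flags, whereas the method documented above returns a trained pipeline.

```python
class Scale:
    """Hypothetical row-wise transformer; `frozen` marks it as frozen-trained."""

    def __init__(self, factor, frozen=True):
        self.factor, self.frozen = factor, frozen

    def transform(self, X):
        return [[v * self.factor for v in row] for row in X]


class OriginLine:
    """Hypothetical estimator: least-squares line through the origin,
    updated one batch at a time from running sums."""

    def __init__(self):
        self.sxy, self.sxx = 0.0, 0.0

    def partial_fit(self, X, y):
        for (x,), target in zip(X, y):
            self.sxy += x * target
            self.sxx += x * x
        return self

    def predict(self, X):
        slope = self.sxy / self.sxx
        return [slope * x for (x,) in X]


class ToyPipeline:
    def __init__(self, prefix, last):
        self.prefix, self.last = prefix, last

    def _through_prefix(self, X):
        for step in self.prefix:
            X = step.transform(X)
        return X

    def partial_fit(self, X, y):
        # Only the last node learns; the prefix must already be frozen.
        if not all(step.frozen for step in self.prefix):
            raise ValueError("The pipeline has a non-frozen prefix")
        self.last.partial_fit(self._through_prefix(X), y)
        return self

    def predict(self, X):
        return self.last.predict(self._through_prefix(X))


pipe = ToyPipeline([Scale(0.5)], OriginLine())
pipe.partial_fit([[2.0], [4.0]], [6.0, 12.0])  # first batch
pipe.partial_fit([[6.0]], [18.0])              # second batch
prediction = pipe.predict([[10.0]])
```

Because the prefix transform is applied independently to each batch, this only gives the same result as fitting on all data at once when the prefix is row-wise, which is what the has_partial_transform tag and the unsafe flag are about.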
- predict(X, **predict_params) Any [source]¶
Deprecated since version 0.0.0: The predict method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict on the trained operator returned by fit instead.
- predict_log_proba(X)[source]¶
Deprecated since version 0.0.0: The predict_log_proba method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict_log_proba on the trained operator returned by fit instead.
- predict_proba(X)[source]¶
Deprecated since version 0.0.0: The predict_proba method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict_proba on the trained operator returned by fit instead.
- remove_last(inplace: bool = False) TrainablePipeline[TrainableOpType_co] [source]¶
- score(X, y, **score_params)[source]¶
Deprecated since version 0.0.0: The score method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call score on the trained operator returned by fit instead.
- class lale.operators.TrainedIndividualOp(*args, _lale_trained=False, _lale_impl=None, **kwargs)[source]¶
Bases:
TrainableIndividualOp, TrainedOperator
Create a new IndividualOp.
- Parameters
- customize_schema(schemas: Optional[Schema] = None, relevantToOptimizer: Optional[List[str]] = None, constraint: Optional[Union[Schema, Dict[str, Any], List[Union[Schema, Dict[str, Any]]]]] = None, tags: Optional[Dict] = None, forwards: Optional[Union[bool, List[str]]] = None, set_as_available: bool = False, **kwargs: Optional[Union[Schema, Dict[str, Any]]]) TrainedIndividualOp [source]¶
- decision_function(X: Any = None)[source]¶
Confidence scores for all classes.
- Parameters
X – Features; see input_decision_function schema of the operator.
- Returns
Confidences; see output_decision_function schema of the operator.
- Return type
result
- fit(X: Any, y: Optional[Any] = None, **fit_params) TrainedIndividualOp [source]¶
Train the learnable coefficients of this operator, if any.
Return a trained version of this operator. If this operator has free learnable coefficients, bind them to values that fit the data according to the operator’s algorithm. Do nothing if the operator implementation lacks a fit method or if the operator has been marked as is_frozen_trained.
- Parameters
X – Features that conform to the X property of input_schema_fit.
y (optional) – Labels that conform to the y property of input_schema_fit. Default is None.
fit_params (Dictionary, optional) – A dictionary of keyword parameters to be used during training.
- Returns
A new copy of this operator that is the same except that its learnable coefficients are bound to their trained values.
- Return type
- freeze_trainable() TrainedIndividualOp [source]¶
Return a copy of the trainable parts of this operator that is the same except that all hyperparameters are bound and none are free to be tuned. If there is an operator choice, it is kept as is.
- freeze_trained() TrainedIndividualOp [source]¶
Deprecated since version 0.0.0: The freeze_trained method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call freeze_trained on the trained operator returned by fit instead.
- get_pipeline(pipeline_name: None = None, astype: astype_type = 'lale') Optional[TrainedOperator] [source]¶
- get_pipeline(pipeline_name: str, astype: astype_type = 'lale') Optional[TrainableOperator]
Deprecated since version 0.0.0: The get_pipeline method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call get_pipeline on the trained operator returned by fit instead.
- is_frozen_trained() bool [source]¶
Return true if all learnable coefficients are bound, in other words, there are no free parameters to be learned by fit.
- predict(X: Any = None, **predict_params) Any [source]¶
Make predictions.
- Parameters
X – Features; see input_predict schema of the operator.
predict_params – Additional parameters that should be passed to the predict method
- Returns
Predictions; see output_predict schema of the operator.
- Return type
result
- predict_log_proba(X: Any = None)[source]¶
Predicted class log-probabilities for X.
- Parameters
X – Features.
- Returns
Class log probabilities.
- Return type
result
- predict_proba(X: Any = None)[source]¶
Probability estimates for all classes.
- Parameters
X – Features; see input_predict_proba schema of the operator.
- Returns
Probabilities; see output_predict_proba schema of the operator.
- Return type
result
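The relationship between predict_proba and predict_log_proba can be shown with plain numbers. This sketch assumes the underlying implementation follows the scikit-learn convention, where log-probabilities are the element-wise natural logarithm of the class probabilities; the helper `log_proba` and the sample values are made up for illustration.

```python
import math


def log_proba(proba_rows):
    """Element-wise natural log of per-class probabilities (hypothetical helper)."""
    return [[math.log(p) for p in row] for row in proba_rows]


# Two samples, three classes; each row sums to 1.
proba = [[0.5, 0.25, 0.25], [0.125, 0.75, 0.125]]
log_p = log_proba(proba)

# Exponentiating the log-probabilities recovers the probabilities.
recovered = [[math.exp(v) for v in row] for row in log_p]
```

Working in log space is mainly useful when many probabilities are multiplied, since sums of logs avoid floating-point underflow.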
- score(X: Any, y: Any, **score_params) Any [source]¶
Performance evaluation with a default metric.
- Parameters
X – Features.
y – Ground truth labels.
score_params – Any additional parameters expected by the score function of the underlying operator.
- Returns
Performance metric value.
- Return type
score
- score_samples(X: Any = None)[source]¶
Scores for each sample in X. The type of scores depends on the operator.
- Parameters
X – Features.
- Returns
Scores per sample.
- Return type
result
- summary() DataFrame [source]¶
Deprecated since version 0.0.0: The summary method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call summary on the trained operator returned by fit instead.
- transform(X: Any, y: Any = None) Any [source]¶
Transform the data.
- Parameters
X – Features; see input_transform schema of the operator.
y (None) –
- Returns
Transformed features; see output_transform schema of the operator.
- Return type
result
- transform_X_y(X: Any, y: Any) Any [source]¶
Transform the data and target.
- Parameters
X – Features; see input_transform schema of the operator.
y – Target; see input_transform schema of the operator.
- Returns
Transformed features and target; see output_transform schema of the operator.
- Return type
result
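A typical reason to transform features and target together is an operation that changes the number of rows, where transforming X alone would leave y misaligned. The sketch below shows this with a hypothetical `DropMissingTarget` class; returning an `(X, y)` tuple is an assumption of the sketch.

```python
class DropMissingTarget:
    """Hypothetical operator that removes rows whose target is None."""

    def transform_X_y(self, X, y):
        # Filter features and target together so they stay aligned.
        kept = [(row, target) for row, target in zip(X, y) if target is not None]
        X_out = [row for row, _ in kept]
        y_out = [target for _, target in kept]
        return X_out, y_out


X = [[1], [2], [3]]
y = [0, None, 1]
X_new, y_new = DropMissingTarget().transform_X_y(X, y)
```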
- class lale.operators.TrainedOperator[source]¶
Bases:
TrainableOperator
Abstract class for Lale operators in the trained lifecycle state.
- step_1 >> step_2 -> PlannedPipeline
Pipe combinator, create two-step pipeline with edge from step_1 to step_2.
If step_1 is a pipeline, create edges from all of its sinks. If step_2 is a pipeline, create edges to all of its sources.
- Parameters
- Returns
Pipeline with edge from step_1 to step_2.
- Return type
- step_1 & step_2 -> PlannedPipeline
And combinator, create two-step pipeline without an edge between step_1 and step_2.
- Parameters
- Returns
Pipeline without any additional edges beyond those already inside of step_1 or step_2.
- Return type
- step_1 | step_2 -> OperatorChoice
Or combinator, create operator choice between step_1 and step_2.
- Parameters
- Returns
Algorithmic choice between step_1 and step_2.
- Return type
- abstract decision_function(X: Any)[source]¶
Confidence scores for all classes.
- Parameters
X – Features; see input_decision_function schema of the operator.
- Returns
Confidences; see output_decision_function schema of the operator.
- Return type
result
- abstract freeze_trained() TrainedOperator [source]¶
Return a copy of this trainable operator that is the same except that all learnable coefficients are bound and thus fit is a no-op.
- abstract predict(X: Any, **predict_params) Any [source]¶
Make predictions.
- Parameters
X – Features; see input_predict schema of the operator.
predict_params – Additional parameters that should be passed to the predict method
- Returns
Predictions; see output_predict schema of the operator.
- Return type
result
- abstract predict_log_proba(X: Any)[source]¶
Predicted class log-probabilities for X.
- Parameters
X – Features.
- Returns
Class log probabilities.
- Return type
result
- abstract predict_proba(X: Any)[source]¶
Probability estimates for all classes.
- Parameters
X – Features; see input_predict_proba schema of the operator.
- Returns
Probabilities; see output_predict_proba schema of the operator.
- Return type
result
- abstract score(X: Any, y: Any, **score_params)[source]¶
Performance evaluation with a default metric.
- Parameters
X – Features.
y – Ground truth labels.
score_params – Any additional parameters expected by the score function of the underlying operator.
- Returns
Performance metric value.
- Return type
score
- class lale.operators.TrainedPipeline(*args, _lale_trained=False, **kwargs)[source]¶
Bases:
TrainablePipeline[TrainedOpType_co], TrainedOperator
- decision_function(X: Any)[source]¶
Confidence scores for all classes.
- Parameters
X – Features; see input_decision_function schema of the operator.
- Returns
Confidences; see output_decision_function schema of the operator.
- Return type
result
- freeze_trainable() TrainedPipeline [source]¶
Return a copy of the trainable parts of this operator that is the same except that all hyperparameters are bound and none are free to be tuned. If there is an operator choice, it is kept as is.
- partial_fit(X: Any, y: Optional[Any] = None, freeze_trained_prefix: bool = True, unsafe: bool = False, classes: Optional[Any] = None, **fit_params) TrainedPipeline[TrainedIndividualOp] [source]¶
Incrementally fit a pipeline. This method assumes that all nodes except the last are frozen_trained, so only the last node needs to be fit using its partial_fit method. If that is not the case and freeze_trained_prefix is True, the method freezes the nodes in the prefix (everything except the last node), provided they are trained.
- Parameters
X – Features; see partial_fit schema of the last node.
y – Labels/target
freeze_trained_prefix – If True, all but the last node are freeze_trained and only the last node is partial_fit.
unsafe – boolean. Overrides the validation that raises an error when the operators in the prefix of this pipeline are not tagged with has_partial_transform. Setting unsafe to True performs the transform as if it were row-wise, even where it may not be.
fit_params – dict. Additional keyword arguments to be passed to partial_fit of the estimator.
classes (Any) –
- Returns
A partially trained pipeline, which can be trained further by other calls to partial_fit
- Return type
- Raises
ValueError – The pipeline has a non-frozen prefix.
- predict(X, **predict_params) Any [source]¶
Deprecated since version 0.0.0: The predict method is deprecated on a trainable operator, because the learned coefficients could be accidentally overwritten by retraining. Call predict on the trained operator returned by fit instead.
- predict_log_proba(X: Any)[source]¶
Predicted class log-probabilities for X.
- Parameters
X – Features.
- Returns
Class log probabilities.
- Return type
result
- predict_proba(X: Any)[source]¶
Probability estimates for all classes.
- Parameters
X – Features; see input_predict_proba schema of the operator.
- Returns
Probabilities; see output_predict_proba schema of the operator.
- Return type
result
- remove_last(inplace: bool = False) TrainedPipeline[TrainedOpType_co] [source]¶
- score(X: Any, y: Any, **score_params)[source]¶
Performance evaluation with a default metric based on the final estimator.
- Parameters
X – Features.
y – Ground truth labels.
score_params – Any additional parameters expected by the score function of the final estimator. These are currently ignored.
- Returns
Performance metric value.
- Return type
score
- score_samples(X: Any = None)[source]¶
Scores for each sample in X. The type of scores is based on the last operator in the pipeline.
- Parameters
X – Features.
- Returns
Scores per sample.
- Return type
result
- lale.operators.clone_op(op: CloneOpType, name: Optional[str] = None) CloneOpType [source]¶
Clone any operator.
- lale.operators.customize_schema(op: CustomizeOpType, schemas: Optional[Schema] = None, relevantToOptimizer: Optional[List[str]] = None, constraint: Optional[Union[Schema, Dict[str, Any], List[Union[Schema, Dict[str, Any]]]]] = None, tags: Optional[Dict] = None, forwards: Optional[Union[bool, List[str]]] = None, set_as_available: bool = False, **kwargs: Optional[Union[Schema, Dict[str, Any]]]) CustomizeOpType [source]¶
Return a new operator with a customized schema
- Parameters
op (Operator) – The base operator to customize
schemas (Schema) – A dictionary of JSON schemas for the operator. Overrides the entire schema and ignores other arguments.
input (Schema) – (or input_*) Override the input schema for method *. input_* must be an existing method (already defined in the schema for lale operators, existing method for external operators)
output (Schema) – (or output_*) Override the output schema for method *. output_* must be an existing method (already defined in the schema for lale operators, existing method for external operators)
relevantToOptimizer (String list) – Update the set of parameters that will be optimized.
constraint (Schema) – Add a constraint in JSON schema format.
tags (Dict) – Override the tags of the operator.
forwards (boolean or a list of strings) – Which methods/properties to forward to the underlying impl. (False for none, True for all).
set_as_available (bool) – Override the list of available operators so get_available_operators returns this customized operator.
kwargs (Schema) – Override the schema of the hyperparameter. param must be an existing parameter (already defined in the schema for lale operators, __init__ parameter for external operators)
- Returns
Copy of the operator with a customized schema
- Return type
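The keyword-argument form (overriding the schema of one existing hyperparameter and getting back a copy) can be sketched on plain dictionaries. The function `customize` and the example schemas below are hypothetical and do not call Lale; raising ValueError for an unknown hyperparameter is an assumption of this sketch.

```python
import copy


def customize(hyperparam_schemas, **overrides):
    """Return a copy of the schemas with selected hyperparameters replaced."""
    unknown = sorted(set(overrides) - set(hyperparam_schemas))
    if unknown:
        # Only existing hyperparameters may be overridden (assumed behavior).
        raise ValueError(f"unknown hyperparameters: {unknown}")
    customized = copy.deepcopy(hyperparam_schemas)
    customized.update(copy.deepcopy(overrides))
    return customized


# Hyperparameter schemas written as JSON Schema fragments.
base = {
    "C": {"type": "number", "minimum": 0.0, "default": 1.0},
    "penalty": {"enum": ["l1", "l2"], "default": "l2"},
}

# Narrow the search range of C; the original schemas are left unchanged.
narrowed = customize(
    base, C={"type": "number", "minimum": 0.1, "maximum": 10.0, "default": 1.0}
)
```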
- lale.operators.get_available_estimators(tags: Optional[AbstractSet[str]] = None) List[PlannedOperator] [source]¶
- lale.operators.get_available_operators(tag: str, more_tags: Optional[AbstractSet[str]] = None) List[PlannedOperator] [source]¶
- lale.operators.get_available_transformers(tags: Optional[AbstractSet[str]] = None) List[PlannedOperator] [source]¶
- lale.operators.get_op_from_lale_lib(impl_class, wrapper_modules=None) Optional[IndividualOp] [source]¶
- lale.operators.make_choice(*orig_steps: Union[Operator, Any], name: Optional[str] = None) OperatorChoice [source]¶
- lale.operators.make_operator(impl, schemas=None, name: Optional[str] = None, set_as_available: bool = True) PlannedIndividualOp [source]¶
- lale.operators.make_pipeline(*orig_steps: TrainedOperator) TrainedPipeline [source]¶
- lale.operators.make_pipeline(*orig_steps: TrainableOperator) TrainablePipeline
- lale.operators.make_pipeline(*orig_steps: Union[Operator, Any]) PlannedPipeline
- lale.operators.make_pipeline_graph(steps: List[TrainedOperator], edges: List[Tuple[Operator, Operator]], ordered: bool = False) TrainedPipeline [source]¶
- lale.operators.make_pipeline_graph(steps: List[TrainableOperator], edges: List[Tuple[Operator, Operator]], ordered: bool = False) TrainablePipeline
- lale.operators.make_pipeline_graph(steps: List[Operator], edges: List[Tuple[Operator, Operator]], ordered: bool = False) PlannedPipeline
Decide the appropriate type for a new pipeline based on the lifecycle states of its steps, then create and return a pipeline of that type. TODO: If multiple independently trained components are composed into a pipeline, should the result be a TrainedPipeline? Currently it is a TrainablePipeline, so it must be trained again.
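Since a pipeline built from steps and edges must form a directed acyclic graph, it can help to see how such a specification yields an execution order. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later); the function `execution_order` is hypothetical, operates on plain strings rather than operators, and is not how Lale orders its steps internally.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(steps, edges):
    """Return the steps in an order where every edge points forward.

    Raises graphlib.CycleError if the edges contain a cycle.
    """
    predecessors = {step: set() for step in steps}
    for src, dst in edges:
        predecessors[dst].add(src)
    return list(TopologicalSorter(predecessors).static_order())


steps = ["pca", "nystroem", "concat", "lr"]
edges = [("pca", "concat"), ("nystroem", "concat"), ("concat", "lr")]
order = execution_order(steps, edges)
```

In the resulting order, both feature transformers come before `concat`, and `concat` comes before `lr`; the relative order of the two independent transformers is not significant.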
- lale.operators.make_pretrained_operator(impl, schemas=None, name: Optional[str] = None) TrainedIndividualOp [source]¶
- lale.operators.make_union(*orig_steps: TrainedOperator) TrainedPipeline [source]¶
- lale.operators.make_union(*orig_steps: TrainableOperator) TrainablePipeline
- lale.operators.make_union(*orig_steps: Union[Operator, Any]) PlannedPipeline
- lale.operators.make_union_no_concat(*orig_steps: TrainedOperator) TrainedPipeline [source]¶
- lale.operators.make_union_no_concat(*orig_steps: TrainableOperator) TrainablePipeline
- lale.operators.make_union_no_concat(*orig_steps: Union[Operator, Any]) PlannedPipeline