lale.helpers module

class lale.helpers.GenSym(names: Set[str])

Bases: object

lale.helpers.add_missing_values(orig_X, missing_rate=0.1, seed=None)
lale.helpers.append_batch(data, batch_data)
lale.helpers.are_hyperparameters_equal(hyperparam1, hyperparam2)
lale.helpers.arg_name(pos=0, level=1) → Optional[str]
lale.helpers.assignee_name(level=1) → Optional[str]
lale.helpers.create_data_loader(X: Any, y: Optional[Any] = None, batch_size: int = 1, num_workers: int = 0, shuffle: bool = True)

Takes a dataset as input and returns a PyTorch DataLoader.

Parameters
  • X (input data) – Supported formats are a Pandas DataFrame, a NumPy array, a sparse matrix, a torch.tensor, a torch.utils.data.Dataset, the path to an HDF5 file, a lale.util.batch_data_dictionary_dataset.BatchDataDict, or a Python dictionary of the form {“dataset”: torch.utils.data.Dataset, “collate_fn”: collate_fn for torch.utils.data.DataLoader}

  • y (labels, optional) – Supported formats are a NumPy array or a Pandas Series, by default None

  • batch_size (int, optional) – Number of samples in each batch, by default 1

  • num_workers (int, optional) – Number of workers used by the data loader, by default 0

  • shuffle (bool, optional) – Whether to create batches with a RandomSampler (True) or a SequentialSampler (False), by default True

Return type

torch.utils.data.DataLoader

Raises

TypeError – If the input format is not supported.
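
To make the roles of batch_size and shuffle concrete, here is a minimal pure-Python sketch of batching. It does not use PyTorch or lale: iter_batches and its seed argument are hypothetical names introduced for illustration, and the sequential-versus-random ordering is an assumption modeled on the two samplers named above.

```python
import random
from typing import Iterator, List, Optional, Sequence, Tuple


def iter_batches(
    X: Sequence,
    y: Optional[Sequence] = None,
    batch_size: int = 1,
    shuffle: bool = True,
    seed: Optional[int] = None,
) -> Iterator[Tuple[List, Optional[List]]]:
    """Yield (batch_X, batch_y) pairs; a hypothetical stand-in for a data loader."""
    indices = list(range(len(X)))
    if shuffle:
        # Random order, analogous to a random sampler.
        random.Random(seed).shuffle(indices)
    # Without shuffling the original order is kept, analogous to a sequential sampler.
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        batch_X = [X[i] for i in chunk]
        batch_y = [y[i] for i in chunk] if y is not None else None
        yield batch_X, batch_y


X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0, 1, 0, 1, 0]
sequential = list(iter_batches(X, y, batch_size=2, shuffle=False))
```

With five samples and batch_size=2, the sketch yields three batches, the last one holding the single leftover sample.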

lale.helpers.create_individual_op_using_reflection(class_name, operator_name, param_dict)
lale.helpers.create_instance_from_hyperopt_search_space(lale_object, hyperparams) → Operator

hyperparams is an n-tuple of dictionaries of hyper-parameters, where each dictionary corresponds to one operator in the pipeline.

lale.helpers.cross_val_score(estimator, X, y=None, scoring: Any = <function accuracy_score>, cv: Any = 5)

Use the given estimator to perform fit and predict for splits defined by ‘cv’ and compute the given score on each of the splits.

Returns

cv_results – a list of scores corresponding to each cross validation fold
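
The fit, predict, and score loop described above can be sketched in plain Python. This is not lale's implementation: MajorityClassifier, k_fold_indices, accuracy, and cross_val_score_sketch are hypothetical names, and the contiguous, unshuffled folds are an assumption made to keep the example small.

```python
from typing import Any, Callable, List, Sequence, Tuple


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


class MajorityClassifier:
    """Toy estimator (hypothetical): always predicts the most common training label."""

    def fit(self, X: Sequence, y: Sequence) -> "MajorityClassifier":
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X: Sequence) -> List[Any]:
        return [self.label_ for _ in X]


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous (train, test) index pairs."""
    splits = []
    for fold in range(k):
        test = list(range(fold * n // k, (fold + 1) * n // k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        splits.append((train, test))
    return splits


def cross_val_score_sketch(
    estimator: Any,
    X: Sequence,
    y: Sequence,
    scoring: Callable[[Sequence, Sequence], float] = accuracy,
    cv: int = 5,
) -> List[float]:
    """Fit on each training split, predict on the test split, and score it."""
    scores = []
    for train, test in k_fold_indices(len(X), cv):
        estimator.fit([X[i] for i in train], [y[i] for i in train])
        y_pred = estimator.predict([X[i] for i in test])
        scores.append(scoring([y[i] for i in test], y_pred))
    return scores


X = [[i] for i in range(10)]
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
scores = cross_val_score_sketch(MajorityClassifier(), X, y, cv=5)
```

The result is one score per fold, so its length equals the number of splits defined by cv.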

lale.helpers.cross_val_score_track_trials(estimator, X, y=None, scoring: Any = <function accuracy_score>, cv: Any = 5, args_to_scorer: Optional[Dict[str, Any]] = None, args_to_cv: Optional[Dict[str, Any]] = None, **fit_params)

Use the given estimator to perform fit and predict for splits defined by ‘cv’ and compute the given score on each of the splits.

Returns

cv_results – a list of scores corresponding to each cross validation fold

lale.helpers.data_to_json(data, subsample_array: bool = True) → Union[list, dict, int, float]
lale.helpers.dict_without(orig_dict: Dict[str, Any], key: str) → Dict[str, Any]
lale.helpers.find_lale_wrapper(sklearn_obj: Any) → Optional[Any]
Parameters

sklearn_obj – An sklearn-compatible object that may have a lale wrapper

Returns

The lale wrapper type, or None if one could not be found

lale.helpers.fold_schema(X, y, cv=1, is_classifier=True)
lale.helpers.get_estimator_param_name_from_hyperparams(hyperparams)
lale.helpers.get_name_and_index(name: str) → Tuple[str, int]

Given a name of the form “name@i”, returns (name, i). Given a name of the form “name”, returns (name, 0).
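
The two cases can be sketched as follows. The function name get_name_and_index_sketch is hypothetical, and splitting on the last “@” is an assumption; the description above does not say how names containing several “@” characters are handled.

```python
from typing import Tuple


def get_name_and_index_sketch(name: str) -> Tuple[str, int]:
    """Split "name@i" into (name, i); a bare "name" maps to (name, 0)."""
    base, sep, index = name.rpartition("@")
    if not sep:
        # No "@" found: rpartition leaves the whole string in the last slot.
        return name, 0
    return base, int(index)


indexed = get_name_and_index_sketch("pca@2")
plain = get_name_and_index_sketch("pca")
```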

lale.helpers.get_sklearn_estimator_name() → str

Some higher-order sklearn operators changed the name of the nested estimator in later versions. This returns the appropriate version-dependent parameter name.

lale.helpers.import_from_sklearn(sklearn_obj: Any, fitted: bool = True, in_place: bool = False)

This method takes an object and tries to wrap sklearn objects (at the top level or contained within hyperparameters of other sklearn objects). It will modify the object to add in the appropriate lale wrappers. It may also return a wrapper or a different object than the one given.

Parameters
  • sklearn_obj – the object to try to wrap

  • fitted – whether to return a TrainedOperator

  • in_place – whether to mutate what can be mutated in place, or to aggressively deepcopy everything

Returns

The wrapped object (or the input object if we could not wrap it)

lale.helpers.import_from_sklearn_pipeline(sklearn_pipeline: Any, fitted: bool = True)

Note: Same as import_from_sklearn. This alternative name exists for backwards compatibility.

This method takes an object and tries to wrap sklearn objects (at the top level or contained within hyperparameters of other sklearn objects). It will modify the object to add in the appropriate lale wrappers. It may also return a wrapper or a different object than the one given.

Parameters
  • sklearn_pipeline – the object to try to wrap

  • fitted – whether to return a TrainedOperator

Returns

The wrapped object (or the input object if we could not wrap it)

lale.helpers.instantiate_from_hyperopt_search_space(obj_hyperparams, new_hyperparams)
lale.helpers.is_empty_dict(val) → bool
lale.helpers.is_numeric_structure(structure_type: str)
lale.helpers.json_lookup(ptr, jsn, default=None)
lale.helpers.make_array_index_name(index, is_tuple: bool = False)
lale.helpers.make_degen_indexed_name(name, index)
lale.helpers.make_indexed_name(name, index)
lale.helpers.make_nested_hyperopt_space(sub_space)
lale.helpers.ndarray_to_json(arr: ndarray, subsample_array: bool = True) → Union[list, dict]
lale.helpers.nest_HPparam(name: str, key: str)
lale.helpers.nest_HPparams(name: str, grid: Mapping[str, V]) → Dict[str, V]
lale.helpers.nest_all_HPparams(name: str, grids: Iterable[Mapping[str, V]]) → List[Dict[str, V]]

Given the name of an operator in a pipeline, this transforms every key (parameter name) in the grids to use the operator name as a prefix (separated by __). This is the convention in scikit-learn pipelines.
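
A minimal sketch of this key transformation, assuming it amounts to prepending “<name>__” to each key. The function name nest_all_sketch and the sample grids are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping, TypeVar

V = TypeVar("V")


def nest_all_sketch(name: str, grids: Iterable[Mapping[str, V]]) -> List[Dict[str, V]]:
    """Prefix every key in every grid with "<name>__"."""
    return [
        {f"{name}__{key}": value for key, value in grid.items()}
        for grid in grids
    ]


grids = [{"C": [0.1, 1.0]}, {"C": [10.0], "penalty": ["l2"]}]
nested = nest_all_sketch("lr", grids)
```

The prefixed keys follow the same “step__parameter” shape that scikit-learn pipelines accept in set_params and in parameter grids.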

lale.helpers.nest_choice_HPparam(key: str)
lale.helpers.nest_choice_HPparams(grid: Mapping[str, V]) → Dict[str, V]
lale.helpers.nest_choice_all_HPparams(grids: Iterable[Mapping[str, V]]) → List[Dict[str, V]]

This transforms every key (parameter name) in the grids to be nested under a choice, using a ? as a prefix (separated by __). This is the convention in scikit-learn pipelines.
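
Read literally, the description above turns a key such as “C” into “?__C”. The sketch below follows that reading; the exact string produced by lale is an assumption here, and nest_choice_all_sketch, CHOICE_PREFIX, and SEPARATOR are hypothetical names.

```python
from typing import Dict, Iterable, List, Mapping, TypeVar

V = TypeVar("V")

CHOICE_PREFIX = "?"  # assumed marker meaning "nested under a choice"
SEPARATOR = "__"     # separator between the prefix and the parameter name


def nest_choice_all_sketch(grids: Iterable[Mapping[str, V]]) -> List[Dict[str, V]]:
    """Prefix every key in every grid with "?__"."""
    prefix = CHOICE_PREFIX + SEPARATOR
    return [
        {prefix + key: value for key, value in grid.items()}
        for grid in grids
    ]


choice_grids = nest_choice_all_sketch([{"n_components": [2, 3]}, {"C": [1.0]}])
```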

lale.helpers.partition_sklearn_choice_params(d: Dict[str, Any]) → Tuple[int, Dict[str, Any]]
lale.helpers.partition_sklearn_params(d: Dict[str, Any]) → Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]
lale.helpers.split_with_schemas(estimator, all_X, all_y, indices, train_indices=None)
lale.helpers.to_graphviz(lale_operator: Operator, ipython_display: bool = True, call_depth: int = 1, **dot_graph_attr)
lale.helpers.unnest_HPparams(k: str) → List[str]
lale.helpers.unnest_choice(k: str) → str
class lale.helpers.val_wrapper(base)

Bases: object

This is used to wrap values that cause problems for hyper-optimizer backends; lale unwraps them when they are given as the value of a hyper-parameter.

classmethod unwrap(obj)
unwrap_self()
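
The wrap and unwrap pattern can be sketched as below. ValWrapperSketch is a hypothetical stand-in, not lale's class: the private attribute name and the choice to peel nested wrappers and pass unwrapped values through unchanged are assumptions.

```python
from typing import Any


class ValWrapperSketch:
    """Holds a value that a hyper-optimizer backend should not inspect directly."""

    def __init__(self, base: Any) -> None:
        self._base = base

    def unwrap_self(self) -> Any:
        """Return the value held by this wrapper."""
        return self._base

    @classmethod
    def unwrap(cls, obj: Any) -> Any:
        """Peel off wrappers until a plain value remains; pass other values through."""
        while isinstance(obj, cls):
            obj = obj.unwrap_self()
        return obj


wrapped_none = ValWrapperSketch(None)
unwrapped = ValWrapperSketch.unwrap(wrapped_none)
```

Because unwrap accepts both wrapped and plain values, calling code can apply it to every hyper-parameter value without checking the type first.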
lale.helpers.with_fixed_estimator_name(**kwargs)

Some higher order sklearn operators changed the name of the nested estimator in later versions. This fixes up the arguments, renaming estimator and base_estimator appropriately.
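
The renaming step can be sketched as follows. This is an illustration only: the real function takes just **kwargs and presumably picks the name from the installed sklearn version, whereas the hypothetical with_fixed_estimator_name_sketch below receives the desired name explicitly as target_name.

```python
from typing import Any, Dict


def with_fixed_estimator_name_sketch(target_name: str, **kwargs: Any) -> Dict[str, Any]:
    """Rename "estimator" or "base_estimator" to target_name; keep other keys as-is."""
    aliases = {"estimator", "base_estimator"}
    fixed: Dict[str, Any] = {}
    for key, value in kwargs.items():
        fixed[target_name if key in aliases else key] = value
    return fixed


fixed_args = with_fixed_estimator_name_sketch(
    "estimator", base_estimator="tree", n_estimators=10
)
```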

lale.helpers.write_batch_output_to_file(file_obj, file_path, total_len, batch_idx, batch_X, batch_y, batch_out_X, batch_out_y)