lale.lib.lale.halving_grid_search_cv module

class lale.lib.lale.halving_grid_search_cv.HalvingGridSearchCV(*, estimator=None, scoring, cv=5, verbose=0, factor=3, resource='n_samples', max_resources='auto', min_resources='exhaust', aggressive_elimination=False, refit=True, error_score=nan, return_train_score=False, random_state=None, n_jobs=None, lale_num_samples=None, lale_num_grids=None, param_grid=None, pgo=None, observer=None, max_opt_time=None)

Bases: PlannedIndividualOp

HalvingGridSearchCV searches over a discretized space using successive halving.

This documentation is auto-generated from JSON schemas.

Parameters
  • estimator (union type, default None) –

    Planned Lale individual operator or pipeline.

    • operator

    • or None

      Use lale.lib.sklearn.LogisticRegression.

  • scoring (union type, not for optimizer) –

    Scorer object, or known scorer named by string.

    • None

      When not specified, use accuracy for classification tasks and r2 for regression.

    • or union type

      Scorer object, or known scorer named by string.

      • callable

        Callable with signature scoring(estimator, X, y) as documented in sklearn scoring.

        The callable has to return a scalar value, such that a higher score is better. This may be created from one of the sklearn metrics using make_scorer. Or it can be one of the scoring callables returned by the factory functions in lale.lib.aif360 metrics, for example, symmetric_disparate_impact(**fairness_info). Or it can be a completely custom user-written Python callable.

      • or ‘accuracy’, ‘explained_variance’, ‘max_error’, ‘roc_auc’, ‘roc_auc_ovr’, ‘roc_auc_ovo’, ‘roc_auc_ovr_weighted’, ‘roc_auc_ovo_weighted’, ‘balanced_accuracy’, ‘average_precision’, ‘neg_log_loss’, or ‘neg_brier_score’

        Known scorer for classification task.

      • or ‘r2’, ‘neg_mean_squared_error’, ‘neg_mean_absolute_error’, ‘neg_root_mean_squared_error’, ‘neg_mean_squared_log_error’, or ‘neg_median_absolute_error’

        Known scorer for regression task.

  • cv (union type, not for optimizer, default 5) –

    Cross-validation as integer or as object that has a split function.

    The fit method performs cross-validation on the input dataset for each trial, and uses the mean cross-validation performance for optimization. This behavior is also impacted by the handle_cv_failure flag.

    • union type

      • integer, >=2, >=3 for optimizer, <=4 for optimizer, uniform distribution, default 5

        Number of folds for cross-validation.

      • or None, not for optimizer

        Use the default 5-fold cross-validation.

    • or CrossvalGenerator, not for optimizer

      Object with split function: generator yielding (train, test) splits as arrays of indices. Can use any of the iterators from https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators

  • verbose (integer, >=0, optional, not for optimizer, default 0) – Controls the verbosity: the higher, the more messages.

  • factor (float, >1, >=2 for optimizer, <=5 for optimizer, optional, not for optimizer, default 3) – The halving parameter, which determines the proportion of candidates that are selected for each subsequent iteration. For example, factor=3 means that only one third of the candidates are selected.

  • resource (string, optional, not for optimizer, default 'n_samples') – Defines the resource that increases with each iteration. By default, the resource is the number of samples. It can also be set to any parameter of the base estimator that accepts positive integer values, e.g. ‘n_iterations’ or ‘n_estimators’ for a gradient boosting estimator.

  • max_resources (union type, optional, not for optimizer, default 'auto') –

    The maximum amount of resource that any candidate is allowed to use for a given iteration.

    • ‘auto’

    • or integer, >=1, not for optimizer

  • min_resources (union type, optional, not for optimizer, default 'exhaust') –

    The minimum amount of resource that any candidate is allowed to use for a given iteration.

    • ‘smallest’

      A heuristic that sets r0, the amount of resource used in the first iteration, to a small value.

    • or ‘exhaust’

      Sets r0 such that the last iteration uses as much resources as possible.

    • or integer, >=1, not for optimizer

  • aggressive_elimination (boolean, optional, not for optimizer, default False) – Enable aggressive elimination when there aren’t enough resources to reduce the remaining candidates to at most factor after the last iteration.

  • refit (boolean, optional, not for optimizer, default True) – Refit an estimator using the best found parameters on the whole dataset.

  • error_score (union type, optional, not for optimizer, default nan) –

    Value to assign to the score if an error occurs in estimator fitting.

    • ‘raise’

      Raise the error.

    • or nan

    • or float, not for optimizer

  • return_train_score (boolean, optional, not for optimizer, default False) – Include training scores.

  • random_state (union type, optional, not for optimizer, default None) –

    Pseudo random number generator state used for subsampling the dataset when resource != ‘n_samples’. Ignored otherwise.

    • None

      RandomState used by np.random

    • or numpy.random.RandomState

      Use the provided random state, only affecting other users of that same random state instance.

    • or integer

      Explicit seed.

  • n_jobs (union type, not for optimizer, default None) –

    Number of jobs to run in parallel.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of jobs to run in parallel.

  • lale_num_samples (union type, not for optimizer, default None) –

    How many samples to draw when discretizing a continuous hyperparameter.

    • integer, >=1

    • or None

      Use lale.search.lale_grid_search_cv.DEFAULT_SAMPLES_PER_DISTRIBUTION.

  • lale_num_grids (union type, not for optimizer, default None) –

    How many top-level disjuncts to explore.

    • None

      If not set, keep all grids.

    • or float, >0.0, <1.0

      Fraction of grids to keep.

    • or integer, >=1

      Number of grids to keep.

  • param_grid (union type, optional, not for optimizer, default None) –

    • None

      Generated automatically.

    • or any type

      Dictionary of hyperparameter ranges in the grid.

  • pgo (union type, not for optimizer, default None) –

    • any type

      lale.search.PGO

    • or None

  • observer (Any, optional, not for optimizer, default None) – A class or object with callbacks for observing the state of the optimization.

  • max_opt_time (union type, not for optimizer, default None) –

    Maximum amount of time in seconds for the optimization.

    • float, >=0.0

    • or None

      No runtime bound.
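
The interplay of factor, min_resources, and max_resources described above can be illustrated with a small pure-Python sketch. It is not the library's implementation: the function name halving_schedule is hypothetical, and the sketch assumes that each iteration keeps roughly one in factor candidates (rounded up) and multiplies the per-candidate resource by factor.

```python
import math


def halving_schedule(n_candidates, min_resources, max_resources, factor=3):
    """Return a list of (n_candidates, n_resources) pairs, one per iteration.

    Hypothetical helper for illustration only; it assumes candidates shrink
    by `factor` and resources grow by `factor` at every iteration.
    """
    schedule = []
    candidates, resources = n_candidates, min_resources
    while resources <= max_resources:
        schedule.append((candidates, resources))
        if candidates <= 1:
            break  # a single surviving candidate is the winner
        # Keep roughly one in `factor` candidates, rounding up ...
        candidates = math.ceil(candidates / factor)
        # ... and give each survivor `factor` times more resource.
        resources = int(resources * factor)
    return schedule


for n, r in halving_schedule(n_candidates=27, min_resources=10, max_resources=1000):
    print(f"{n:>2} candidates, {r:>3} resources each")
```

With a smaller max_resources the loop stops while several candidates remain, which is the situation that aggressive_elimination is meant to address.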

Notes

constraint-1 : any type

max_resources is set to ‘auto’ if and only if resource is set to ‘n_samples’.
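
Read as a logical equivalence, the constraint on max_resources and resource holds when both conditions are true or both are false. The following sketch spells that out; satisfies_constraint is a hypothetical helper, not part of Lale, and it only mirrors the note as stated here.

```python
def satisfies_constraint(resource="n_samples", max_resources="auto"):
    """Check the documented constraint between `resource` and `max_resources`.

    Hypothetical helper: "if and only if" means the two comparisons
    must agree (both True or both False).
    """
    return (max_resources == "auto") == (resource == "n_samples")


# The defaults satisfy the constraint.
print(satisfies_constraint())
# A non-default resource paired with an explicit integer bound also does.
print(satisfies_constraint(resource="n_estimators", max_resources=100))
# Mixing 'auto' with a non-default resource does not.
print(satisfies_constraint(resource="n_estimators", max_resources="auto"))
```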

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (any type) –

  • y (any type) –
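
Because fit cross-validates on X and y, the cv argument may be an object with a split function that yields (train, test) index collections. The sketch below shows the shape of such an object under simplifying assumptions: ContiguousKFold is a hypothetical class, it does not shuffle, and it yields plain Python lists where a real splitter would typically yield index arrays.

```python
class ContiguousKFold:
    """Hypothetical splitter producing k contiguous, unshuffled folds."""

    def __init__(self, n_splits=3):
        self.n_splits = n_splits

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits

    def split(self, X, y=None, groups=None):
        n = len(X)
        indices = list(range(n))
        # Spread the remainder over the first folds so sizes differ by at most 1.
        fold_sizes = [
            n // self.n_splits + (1 if i < n % self.n_splits else 0)
            for i in range(self.n_splits)
        ]
        start = 0
        for size in fold_sizes:
            test = indices[start:start + size]
            train = indices[:start] + indices[start + size:]
            yield train, test
            start += size


X = [[float(i)] for i in range(7)]
for train, test in ContiguousKFold(n_splits=3).split(X):
    print("train:", train, "test:", test)
```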

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (any type) –

Returns

result

Return type

any type
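
A custom scoring callable, as accepted by the scoring parameter, receives a trained estimator and typically calls its predict method. The sketch below shows the scoring(estimator, X, y) signature with a scalar return value where higher is better. MajorityClassifier and accuracy_scorer are hypothetical names used only for illustration; the stand-in estimator is not a Lale operator.

```python
class MajorityClassifier:
    """Hypothetical stand-in estimator that always predicts the most common label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy_scorer(estimator, X, y):
    """Scorer with the signature scoring(estimator, X, y); higher is better."""
    predictions = estimator.predict(X)
    correct = sum(1 for p, t in zip(predictions, y) if p == t)
    return correct / len(y)


X = [[0], [1], [2], [3]]
y = [1, 1, 1, 0]
model = MajorityClassifier().fit(X, y)
print(accuracy_scorer(model, X, y))
```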