lale.lib.sklearn.logistic_regression module

class lale.lib.sklearn.logistic_regression.LogisticRegression(*, solver='lbfgs', penalty='l2', dual=False, C=1.0, tol=0.0001, fit_intercept=True, intercept_scaling=1.0, class_weight=None, random_state=None, max_iter=100, multi_class='deprecated', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None)

Bases: PlannedIndividualOp

Logistic regression linear model from scikit-learn for classification.

This documentation is auto-generated from JSON schemas.

Parameters
  • solver (‘lbfgs’, ‘liblinear’, ‘newton-cg’, ‘newton-cholesky’, ‘sag’, or ‘saga’, default ‘lbfgs’) –

    Algorithm to use in the optimization problem. Default is ‘lbfgs’. To choose a solver, you might want to consider the following aspects:

    • For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ and ‘saga’ are faster for large ones.

    • For multiclass problems, only ‘newton-cg’, ‘sag’, ‘saga’, and ‘lbfgs’ handle multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes.

    • ‘newton-cholesky’ is a good choice for n_samples >> n_features, especially with one-hot encoded categorical features with rare categories. It is limited to binary classification and the one-versus-rest reduction for multiclass classification. Its memory usage has a quadratic dependency on n_features because it explicitly computes the Hessian matrix.

    See also constraint-1, constraint-2, constraint-3, constraint-4, constraint-6.

  • penalty (‘l1’, ‘l2’, ‘elasticnet’, or None, not for optimizer, default ‘l2’) –

    Norm used in the penalization.

    See also constraint-1, constraint-2, constraint-4, constraint-5, constraint-6.

  • dual (boolean, default False) –

    Dual or primal formulation. Prefer dual=False when n_samples > n_features.

    See also constraint-2.

  • C (float, >0.0, >=0.03125 for optimizer, <=32768 for optimizer, loguniform distribution, not for optimizer, default 1.0) – Inverse regularization strength. Smaller values specify stronger regularization.

  • tol (float, >0.0, >=1e-08 for optimizer, <=0.01 for optimizer, default 0.0001) – Tolerance for stopping criteria.

  • fit_intercept (boolean, default True) – Specifies whether a constant (bias or intercept) should be added to the decision function.

  • intercept_scaling (float, >=0.0, <=1.0, uniform distribution, default 1.0) – Useful only when the solver ‘liblinear’ is used and self.fit_intercept is set to True. In this case, X becomes [X, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes “intercept_scaling * synthetic_feature_weight”. Note that the synthetic feature weight is subject to l1/l2 regularization, like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.

  • class_weight (union type, not for optimizer, default None) –

    • None

      By default, all classes have weight 1.

    • or ‘balanced’

      Uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as “n_samples / (n_classes * np.bincount(y))”.

    • or dict, not for optimizer

      Weights associated with classes in the form “{class_label: weight}”.

  • random_state (union type, not for optimizer, default None) –

    Seed of the pseudo-random number generator used to shuffle the data when solver is ‘sag’, ‘saga’, or ‘liblinear’.

    • None

      RandomState used by np.random

    • or numpy.random.RandomState

      Use the provided random state, only affecting other users of that same random state instance.

    • or integer

      Explicit seed.

  • max_iter (integer, >=1, >=10 for optimizer, <=1000 for optimizer, uniform distribution, default 100) – Maximum number of iterations for solvers to converge.

  • multi_class (union type, default 'deprecated') –

    Deprecated. Once this parameter is removed, the recommended ‘multinomial’ will always be used for n_classes >= 3, and solvers that do not support ‘multinomial’ will raise an error. Use sklearn.multiclass.OneVsRestClassifier(LogisticRegression()) if you still want to use OvR.

    • ‘ovr’, ‘multinomial’, or ‘auto’

    • or ‘deprecated’

    See also constraint-3.

  • verbose (integer, not for optimizer, default 0) – For the liblinear and lbfgs solvers set verbose to any positive number for verbosity.

  • warm_start (boolean, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution. Has no effect with the liblinear solver.

  • n_jobs (union type, not for optimizer, default None) –

    Number of CPU cores used when parallelizing over classes if multi_class is ‘ovr’. This parameter is ignored when solver is set to ‘liblinear’, regardless of whether multi_class is specified.

    • None

      1 unless in joblib.parallel_backend context.

    • or -1

      Use all processors.

    • or integer, >=1

      Number of CPU cores.

  • l1_ratio (union type, optional, not for optimizer, default None) –

    The Elastic-Net mixing parameter.

    • float, >=0.0, <=1.0

    • or None

    See also constraint-5.
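The ‘balanced’ option of class_weight above is defined by the formula “n_samples / (n_classes * np.bincount(y))”. The following is a minimal sketch of that arithmetic in plain Python, using collections.Counter in place of np.bincount. The helper name balanced_class_weights and the sample labels are hypothetical and not part of lale or scikit-learn.

```python
from collections import Counter


def balanced_class_weights(y):
    """Return {class_label: weight} with weight = n_samples / (n_classes * count)."""
    counts = Counter(y)        # occurrences of each class label, like np.bincount(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


# Hypothetical labels: class 0 occurs three times, class 1 occurs once.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights)
```

The rarer class receives the larger weight, and the result has the same “{class_label: weight}” shape that the dict option of class_weight accepts.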

Notes

constraint-1 : union type

The newton-cg, newton-cholesky, sag, and lbfgs solvers support only the l2 penalty or no penalty.

  • solver : negated type of ‘newton-cg’, ‘newton-cholesky’, ‘sag’, or ‘lbfgs’

  • or penalty : ‘l2’, ‘none’, or None

constraint-2 : union type

The dual formulation is only implemented for l2 penalty with the liblinear solver.

  • dual : False

  • or dict

    • penalty : ‘l2’

    • solver : ‘liblinear’

constraint-3 : union type

The multi_class multinomial option is unavailable when the solver is liblinear or newton-cholesky.

  • multi_class : negated type of ‘multinomial’

  • or solver : negated type of ‘liblinear’

constraint-4 : union type, not for optimizer

penalty=‘none’ (or None) is not supported for the liblinear solver.

  • solver : negated type of ‘liblinear’

  • or penalty : negated type of ‘none’ or None

constraint-5 : union type, not for optimizer

When penalty is elasticnet, l1_ratio must be between 0 and 1.

  • penalty : negated type of ‘elasticnet’

  • or l1_ratio : float, >0.0, <=1.0

constraint-6 : union type, not for optimizer

Only the ‘saga’ solver supports the elasticnet penalty.

  • penalty : negated type of ‘elasticnet’

  • or solver : ‘saga’
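The six constraints above are each a disjunction: a hyperparameter combination is acceptable when at least one alternative holds. As a rough illustration, the sketch below encodes them as plain Python predicates, following the structured form of each constraint rather than its prose summary. The function violated_constraints is a hypothetical helper for this example only; it is not how lale itself validates schemas.

```python
def violated_constraints(solver="lbfgs", penalty="l2", dual=False,
                         multi_class="deprecated", l1_ratio=None):
    """Return the names of the documented constraints that a combination breaks."""
    no_penalty = penalty in ("none", None)
    l2_only_solvers = ("newton-cg", "newton-cholesky", "sag", "lbfgs")
    checks = {
        # constraint-1: these solvers accept only l2 or no penalty.
        "constraint-1": solver not in l2_only_solvers or penalty == "l2" or no_penalty,
        # constraint-2: dual=True requires penalty 'l2' with solver 'liblinear'.
        "constraint-2": dual is False or (penalty == "l2" and solver == "liblinear"),
        # constraint-3: 'multinomial' cannot be combined with 'liblinear'.
        "constraint-3": multi_class != "multinomial" or solver != "liblinear",
        # constraint-4: 'liblinear' needs an actual penalty.
        "constraint-4": solver != "liblinear" or not no_penalty,
        # constraint-5: elasticnet needs l1_ratio with 0.0 < l1_ratio <= 1.0.
        "constraint-5": penalty != "elasticnet"
                        or (l1_ratio is not None and 0.0 < l1_ratio <= 1.0),
        # constraint-6: only 'saga' supports elasticnet.
        "constraint-6": penalty != "elasticnet" or solver == "saga",
    }
    return [name for name, ok in checks.items() if not ok]


print(violated_constraints())                                  # defaults
print(violated_constraints(solver="lbfgs", penalty="l1"))
print(violated_constraints(solver="saga", penalty="elasticnet", l1_ratio=0.5))
```

Checking a combination this way before constructing the operator can make it easier to see which constraint a rejected configuration runs into.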

decision_function(X)

Confidence scores for all classes.

Note: The decision_function method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Confidence scores for samples for each class in the model.

  • array of items : array of items : float

    In the multi-way case, score per (sample, class) combination.

  • or array of items : float

    In the binary case, score for self._classes[1].

Return type

union type
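To make the two result shapes concrete, here is a small sketch of how confidence scores relate to predicted labels. It assumes the usual scikit-learn convention for linear classifiers: in the binary case a positive score favours the second class, and in the multi-way case the class with the largest score wins. The helper labels_from_scores and the sample scores are hypothetical.

```python
def labels_from_scores(scores, classes):
    """Map decision_function-style scores to class labels.

    scores:  list of floats (binary case) or list of per-class score lists
             (multi-way case); the outer list is over samples.
    classes: class labels in the order used by the model.
    """
    labels = []
    for row in scores:
        if isinstance(row, (int, float)):
            # Binary case: one score per sample, positive means classes[1].
            labels.append(classes[1] if row > 0 else classes[0])
        else:
            # Multi-way case: one score per (sample, class); take the largest.
            best = max(range(len(row)), key=row.__getitem__)
            labels.append(classes[best])
    return labels


binary_labels = labels_from_scores([-1.2, 0.3], ["no", "yes"])
multi_labels = labels_from_scores([[0.1, 2.0, -1.0], [3.0, 0.0, 1.0]], ["a", "b", "c"])
print(binary_labels, multi_labels)
```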

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array of items : array of items : float) – Features; the outer array is over samples.

  • y (union type) –

    Target class labels; the array is over samples.

    • array of items : float

    • or array of items : string

    • or array of items : boolean

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Predicted class label per sample.

  • array of items : float

  • or array of items : string

  • or array of items : boolean

Return type

union type

predict_proba(X)

Probability estimates for all classes.

Note: The predict_proba method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Probability of the sample for each class in the model.

Return type

array of items : array of items : float
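As a rough illustration of how the probability estimates relate to the confidence scores from decision_function, the sketch below applies the standard logistic function in the binary case and a softmax in the multinomial case. This assumes the usual logistic regression links; the function names are hypothetical, and the underlying scikit-learn implementation may differ in numerical details.

```python
import math


def binary_proba(score):
    """Return [P(classes[0]), P(classes[1])] for one binary confidence score."""
    p = 1.0 / (1.0 + math.exp(-score))   # logistic (sigmoid) function
    return [1.0 - p, p]


def multinomial_proba(scores):
    """Return one probability per class for one row of multi-way scores."""
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


print(binary_proba(0.0))
print(multinomial_proba([1.0, 2.0, 3.0]))
```

In both cases each row sums to 1, which matches the documented return type: one inner array per sample with one probability per class.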