lale.lib.sklearn.logistic_regression module

class lale.lib.sklearn.logistic_regression.LogisticRegression(*, solver='lbfgs', penalty='l2', dual=False, C=1.0, tol=0.0001, fit_intercept=True, intercept_scaling=1.0, class_weight=None, random_state=None, max_iter=100, multi_class='deprecated', verbose=0, warm_start=False, n_jobs=None, l1_ratio=None)

Bases: PlannedIndividualOp

Logistic regression linear model from scikit-learn for classification.

This documentation is auto-generated from JSON schemas.

Parameters

solver (‘lbfgs’, ‘liblinear’, ‘newton-cg’, ‘newton-cholesky’, ‘sag’, or ‘saga’, default ‘lbfgs’) –

Algorithm to use in the optimization problem. To choose a solver, consider the following aspects:
- For small datasets, ‘liblinear’ is a good choice, whereas ‘sag’ and ‘saga’ are faster for large ones.
- For multiclass problems, only ‘newton-cg’, ‘sag’, ‘saga’, and ‘lbfgs’ handle multinomial loss; ‘liblinear’ is limited to one-versus-rest schemes.
- ‘newton-cholesky’ is a good choice for n_samples >> n_features, especially with one-hot encoded categorical features with rare categories. Note that it is limited to binary classification and the one-versus-rest reduction for multiclass classification. Be aware that the memory usage of this solver has a quadratic dependency on n_features because it explicitly computes the Hessian matrix.

See also constraint-1, constraint-2, constraint-3, constraint-4, constraint-6.
penalty (‘l1’, ‘l2’, ‘elasticnet’, or None, not for optimizer, default ‘l2’) –
Norm used in the penalization.

See also constraint-1, constraint-2, constraint-4, constraint-5, constraint-6.
dual (boolean, default False) –
Dual or primal formulation. Prefer dual=False when n_samples > n_features.

See also constraint-2.
C (float, >0.0, >=0.03125 for optimizer, <=32768 for optimizer, loguniform distribution, not for optimizer, default 1.0) – Inverse regularization strength. Smaller values specify stronger regularization.
tol (float, >0.0, >=1e-08 for optimizer, <=0.01 for optimizer, default 0.0001) – Tolerance for stopping criteria.
fit_intercept (boolean, default True) – Specifies whether a constant (bias or intercept) should be added to the decision function.
intercept_scaling (float, >=0.0, <=1.0, uniform distribution, default 1.0) – Useful only when the solver ‘liblinear’ is used and self.fit_intercept is set to True. In this case, X becomes [X, self.intercept_scaling], i.e. a “synthetic” feature with constant value equal to intercept_scaling is appended to the instance vector. The intercept becomes “intercept_scaling * synthetic_feature_weight”. Note that the synthetic feature weight is subject to l1/l2 regularization like all other features. To lessen the effect of regularization on the synthetic feature weight (and therefore on the intercept), intercept_scaling has to be increased.
class_weight (union type, not for optimizer, default None) –
- None
  
  By default, all classes have weight 1.
- or ‘balanced’
  
  Uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as “n_samples / (n_classes * np.bincount(y))”.
- or dict, not for optimizer
  
  Weights associated with classes in the form “{class_label: weight}”.
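The ‘balanced’ formula above can be checked by hand. The following is a minimal sketch using only the standard library; the helper name balanced_class_weights is hypothetical and not part of lale or scikit-learn, and collections.Counter stands in for np.bincount so that non-integer labels also work.

```python
from collections import Counter


def balanced_class_weights(y):
    """Hypothetical helper: n_samples / (n_classes * count) for each class label."""
    counts = Counter(y)          # occurrences of each class label
    n_samples = len(y)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# One rare class and one frequent class: the rare class gets the larger weight.
y = ["spam", "ham", "ham", "ham"]
weights = balanced_class_weights(y)
print(weights)
```

The resulting dictionary has the same “{class_label: weight}” shape that the dict option of class_weight accepts.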
random_state (union type, not for optimizer, default None) –
Seed of the pseudo-random number generator used for shuffling data when solver is ‘sag’, ‘saga’, or ‘liblinear’.
- None
  
  RandomState used by np.random
- or numpy.random.RandomState
  
  Use the provided random state, only affecting other users of that same random state instance.
- or integer
  
  Explicit seed.
max_iter (integer, >=1, >=10 for optimizer, <=1000 for optimizer, uniform distribution, default 100) – Maximum number of iterations for solvers to converge.
multi_class (union type, default 'deprecated') –
Deprecated. In the future, the recommended ‘multinomial’ will always be used for n_classes >= 3. Solvers that do not support ‘multinomial’ will raise an error. Use sklearn.multiclass.OneVsRestClassifier(LogisticRegression()) if you still want to use OvR.
- ‘ovr’, ‘multinomial’, or ‘auto’
- or ‘deprecated’
See also constraint-3.
verbose (integer, not for optimizer, default 0) – For the liblinear and lbfgs solvers, set verbose to any positive number to enable verbose output.
warm_start (boolean, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit as initialization; otherwise, erase the previous solution. Has no effect for the liblinear solver.
n_jobs (union type, not for optimizer, default None) –
Number of CPU cores used when parallelizing over classes if multi_class is ‘ovr’. This parameter is ignored when solver is set to ‘liblinear’, regardless of whether multi_class is specified.
- None
  
  1 unless in joblib.parallel_backend context.
- or -1
  
  Use all processors.
- or integer, >=1
  
  Number of CPU cores.
l1_ratio (union type, optional, not for optimizer, default None) –
The Elastic-Net mixing parameter.
- float, >=0.0, <=1.0
- or None
See also constraint-5.
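The intercept_scaling description above can be made concrete with a small sketch. Both helper names below are hypothetical illustrations of the stated behavior (X becomes [X, intercept_scaling], and the intercept is intercept_scaling * synthetic_feature_weight); they are not functions from lale or scikit-learn.

```python
def append_synthetic_feature(X, intercept_scaling=1.0):
    """Hypothetical helper: append a constant column equal to intercept_scaling."""
    return [list(row) + [intercept_scaling] for row in X]


def effective_intercept(synthetic_feature_weight, intercept_scaling=1.0):
    """Hypothetical helper: intercept implied by the learned synthetic-feature weight."""
    return intercept_scaling * synthetic_feature_weight


X = [[0.5, 1.5], [2.0, -1.0]]
X_aug = append_synthetic_feature(X, intercept_scaling=0.5)
intercept = effective_intercept(0.25, intercept_scaling=0.5)
print(X_aug, intercept)
```

Because the synthetic feature's weight is regularized like any other weight, a larger intercept_scaling lets the same intercept be reached with a smaller, less penalized weight.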

Notes

constraint-1 : union type

The newton-cg, newton-cholesky, sag, and lbfgs solvers support only l2 or no penalties.

solver : negated type of ‘newton-cg’, ‘newton-cholesky’, ‘sag’, or ‘lbfgs’

or penalty : ‘l2’, ‘none’, or None

constraint-2 : union type

The dual formulation is only implemented for l2 penalty with the liblinear solver.

dual : False

or dict

penalty : ‘l2’

solver : ‘liblinear’

constraint-3 : union type

The multi_class multinomial option is unavailable when the solver is liblinear or newton-cholesky.

multi_class : negated type of ‘multinomial’

or solver : negated type of ‘liblinear’

constraint-4 : union type, not for optimizer

penalty=‘none’ is not supported for the liblinear solver.

solver : negated type of ‘liblinear’

or penalty : negated type of ‘none’ or None

constraint-5 : union type, not for optimizer

When penalty is elasticnet, l1_ratio must be between 0 and 1.

penalty : negated type of ‘elasticnet’

or l1_ratio : float, >0.0, <=1.0

constraint-6 : union type, not for optimizer

Only the ‘saga’ solver supports the elasticnet penalty.

penalty : negated type of ‘elasticnet’

or solver : ‘saga’
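The constraints above can be read as simple boolean checks over a hyperparameter configuration. The sketch below transcribes each schema as listed on this page into plain Python; the function name violated_constraints is hypothetical, and this is an illustration of the rules rather than the validation code lale itself runs.

```python
def violated_constraints(solver="lbfgs", penalty="l2", dual=False,
                         multi_class="deprecated", l1_ratio=None):
    """Hypothetical helper: names of the constraints a configuration violates."""
    l1_ratio_ok = isinstance(l1_ratio, (int, float)) and 0.0 < l1_ratio <= 1.0
    checks = {
        # Each entry is True when the constraint is satisfied.
        "constraint-1": solver not in {"newton-cg", "newton-cholesky", "sag", "lbfgs"}
                        or penalty in {"l2", "none", None},
        "constraint-2": dual is False or (penalty == "l2" and solver == "liblinear"),
        "constraint-3": multi_class != "multinomial" or solver != "liblinear",
        "constraint-4": solver != "liblinear" or penalty not in {"none", None},
        "constraint-5": penalty != "elasticnet" or l1_ratio_ok,
        "constraint-6": penalty != "elasticnet" or solver == "saga",
    }
    return [name for name, satisfied in checks.items() if not satisfied]


print(violated_constraints())                                # defaults
print(violated_constraints(solver="lbfgs", penalty="l1"))    # l1 with lbfgs
print(violated_constraints(solver="saga", penalty="elasticnet", l1_ratio=0.5))
```

An empty list means the configuration satisfies every constraint listed in these notes.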

decision_function(X)

Confidence scores for all classes.

Note: The decision_function method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Confidence scores for samples for each class in the model.

array of items : array of items : float

In the multi-way case, score per (sample, class) combination.
or array of items : float

In the binary case, score for self._classes[1].

Return type

union type
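The two return shapes of decision_function can be handled uniformly when turning scores into labels. The sketch below assumes the usual convention for linear classifiers in scikit-learn, namely that a positive binary score selects the second class and that the multi-way case picks the highest-scoring class; the helper name labels_from_scores and the classes argument are hypothetical.

```python
def labels_from_scores(scores, classes):
    """Hypothetical helper: map decision scores to class labels."""
    labels = []
    for s in scores:
        if isinstance(s, (list, tuple)):
            # Multi-way case: one score per (sample, class); pick the largest.
            best = max(range(len(s)), key=lambda i: s[i])
            labels.append(classes[best])
        else:
            # Binary case: the score refers to classes[1].
            labels.append(classes[1] if s > 0 else classes[0])
    return labels


binary_labels = labels_from_scores([-1.2, 0.7], ["no", "yes"])
multi_labels = labels_from_scores([[0.1, 2.0, -0.5], [1.5, 0.2, 0.3]], ["a", "b", "c"])
print(binary_labels, multi_labels)
```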

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.
y (union type) –
Target class labels; the array is over samples.
- array of items : float
- or array of items : string
- or array of items : boolean
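The input schemas for fit can be approximated with a few type checks. The following is a rough sketch under stated assumptions: the helper name matches_fit_schema is hypothetical, plain Python lists stand in for arrays, and the requirement that X and y have the same length is inferred from both being "over samples" rather than stated in the schema.

```python
def matches_fit_schema(X, y):
    """Hypothetical helper: rough check of X and y against the schemas above."""
    def is_number(v):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    x_ok = all(
        isinstance(row, (list, tuple)) and all(is_number(v) for v in row)
        for row in X
    )
    # y must be uniformly numeric, uniformly string, or uniformly boolean.
    y_ok = (
        all(is_number(v) for v in y)
        or all(isinstance(v, str) for v in y)
        or all(isinstance(v, bool) for v in y)
    )
    return x_ok and y_ok and len(X) == len(y)


print(matches_fit_schema([[1.0, 2.0], [3.0, 4.0]], ["a", "b"]))
print(matches_fit_schema([[1.0, 2.0], [3.0, 4.0]], [1.0, "b"]))
```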

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Predicted class label per sample.

array of items : float
or array of items : string
or array of items : boolean

Return type

union type

predict_proba(X)

Probability estimates for all classes.

Note: The predict_proba method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Probability of the sample for each class in the model.

Return type

array of items : array of items : float
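For intuition about how probability estimates relate to decision scores, the sketch below applies the logistic function in the binary case and a softmax in the multi-way case. This assumes the multinomial formulation and is not taken from this page; the helper names binary_proba and softmax are hypothetical, and the actual computation is delegated to scikit-learn.

```python
import math


def binary_proba(score):
    """Hypothetical helper: [P(classes[0]), P(classes[1])] from one binary score."""
    # Numerically stable logistic function.
    if score >= 0:
        p1 = 1.0 / (1.0 + math.exp(-score))
    else:
        e = math.exp(score)
        p1 = e / (1.0 + e)
    return [1.0 - p1, p1]


def softmax(row):
    """Hypothetical helper: per-class probabilities from multi-way scores."""
    m = max(row)                       # subtract the max to avoid overflow
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


print(binary_proba(0.0))
print(softmax([1.0, 2.0, 3.0]))
```

In both cases each row of probabilities sums to 1, and the most probable class is the one with the highest decision score.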