lale.lib.sklearn.sgd_classifier module

class lale.lib.sklearn.sgd_classifier.SGDClassifier(*, loss='hinge', penalty='l2', alpha=0.0001, l1_ratio=0.15, fit_intercept=True, max_iter=1000, tol=0.001, shuffle=True, verbose=0, epsilon=0.1, n_jobs=None, random_state=None, learning_rate='optimal', eta0=0.0, power_t=0.5, early_stopping=False, validation_fraction=0.1, n_iter_no_change=5, class_weight=None, warm_start=False, average=False)

Bases: PlannedIndividualOp

SGD classifier from scikit-learn uses linear classifiers (SVM, logistic regression, among others) with stochastic gradient descent training.

This documentation is auto-generated from JSON schemas.

Parameters
  • loss (union type, default 'hinge') –

    The loss function to be used. Defaults to ‘hinge’, which gives a linear SVM. The possible options are ‘hinge’, ‘log_loss’, ‘modified_huber’, ‘squared_hinge’, ‘perceptron’, or a regression loss: ‘squared_error’, ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’. The ‘log_loss’ loss gives logistic regression, a probabilistic classifier. ‘modified_huber’ is another smooth loss that brings tolerance to outliers as well as probability estimates. ‘squared_hinge’ is like hinge but is quadratically penalized. ‘perceptron’ is the linear loss used by the perceptron algorithm. The other losses are designed for regression but can be useful in classification as well; see SGDRegressor for a description. More details about the loss formulas can be found in the scikit-learn User Guide.

    • ‘hinge’, ‘log_loss’, ‘modified_huber’, ‘squared_hinge’, ‘perceptron’, ‘squared_error’, ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’

    • or ‘squared_loss’ (deprecated alias of ‘squared_error’), not for optimizer

  • penalty (‘elasticnet’, ‘l1’, or ‘l2’, default ‘l2’) – The penalty (aka regularization term) to be used. Defaults to ‘l2’.

  • alpha (float, >=1e-10 for optimizer, <=1.0 for optimizer, loguniform distribution, default 0.0001) – Constant that multiplies the regularization term. Defaults to 0.0001.

  • l1_ratio (float, >=1e-09 for optimizer, <=1.0 for optimizer, loguniform distribution, default 0.15) – The Elastic Net mixing parameter, with 0 <= l1_ratio <= 1.

  • fit_intercept (boolean, default True) – Whether the intercept should be estimated or not. If False, the data is assumed to be already centered.

  • max_iter (integer, >=10 for optimizer, <=1000 for optimizer, uniform distribution, default 1000) – The maximum number of passes over the training data (aka epochs).

  • tol (union type, default 0.001) –

    The stopping criterion.

    • float, >=1e-08 for optimizer, <=0.01 for optimizer

    • or None

  • shuffle (boolean, default True) – Whether or not the training data should be shuffled after each epoch.

  • verbose (integer, not for optimizer, default 0) – The verbosity level.

  • epsilon (float, >=1e-08 for optimizer, <=1.35 for optimizer, loguniform distribution, default 0.1) – Epsilon in the epsilon-insensitive loss functions; only if loss is ‘huber’, ‘epsilon_insensitive’, or ‘squared_epsilon_insensitive’.

  • n_jobs (union type, not for optimizer, default None) –

    The number of CPUs to use to do the OVA (One Versus All, for multi-class problems) computation.

    • integer

    • or None

  • random_state (union type, not for optimizer, default None) –

    The seed of the pseudo random number generator to use when shuffling the data.

    • integer

    • or numpy.random.RandomState

    • or None

  • learning_rate (‘optimal’, ‘constant’, ‘invscaling’, or ‘adaptive’, default ‘optimal’) –

    The learning rate schedule: ‘constant’ keeps eta = eta0; ‘optimal’ uses eta = 1.0 / (alpha * (t + t0)); ‘invscaling’ uses eta = eta0 / pow(t, power_t); ‘adaptive’ keeps eta = eta0 as long as training keeps decreasing the loss, then divides the learning rate by 5 when n_iter_no_change consecutive epochs fail to improve.

    See also constraint-1.

  • eta0 (float, >=0.01 for optimizer, <=1.0 for optimizer, loguniform distribution, default 0.0) –

    The initial learning rate for the ‘constant’, ‘invscaling’, or ‘adaptive’ schedules.

    See also constraint-1.

  • power_t (float, >=1e-05 for optimizer, <=1.0 for optimizer, uniform distribution, default 0.5) – The exponent for inverse scaling learning rate.

  • early_stopping (boolean, not for optimizer, default False) – Whether to use early stopping to terminate training when the validation score is not improving.

  • validation_fraction (float, >=0.0, <=1.0, not for optimizer, default 0.1) – The proportion of training data to set aside as validation set for early stopping.

  • n_iter_no_change (integer, >=5 for optimizer, <=10 for optimizer, default 5) – Number of iterations with no improvement to wait before early stopping.

  • class_weight (union type, not for optimizer, default None) –

    Preset for the class_weight fit parameter.

    • dict

    • or ‘balanced’ or None

  • warm_start (boolean, not for optimizer, default False) – When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution.

  • average (union type, not for optimizer, default False) –

    When set to True, computes the averaged SGD weights and stores the result in the coef_ attribute.

    • boolean

    • or integer, not for optimizer

Notes

constraint-1 : union type

eta0 must be greater than 0 if the learning_rate is not ‘optimal’.

  • learning_rate : ‘optimal’

  • or eta0 : float, >0.0

decision_function(X)

Confidence scores for all classes.

Note: The decision_function method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) –

Returns

result – Confidence scores for samples for each class in the model.

  • array of items : array of items : float

    In the multi-way case, score per (sample, class) combination.

  • or array of items : float

    In the binary case, score for self.classes_[1].

Return type

union type

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array of items : array of items : float) –

  • y (union type) –

    • array of items : string

    • or array of items : float

    • or array of items : boolean

  • coef_init (array, optional of items : array of items : float) – The initial coefficients to warm-start the optimization.

  • intercept_init (array, optional of items : float) – The initial intercept to warm-start the optimization.

  • sample_weight (union type, optional, default None) –

    Weights applied to individual samples.

    • array of items : float

    • or None

      Uniform weights.

partial_fit(X, y=None, **fit_params)

Incremental fit to train the operator on a batch of samples.

Note: The partial_fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array of items : array of items : float) –

  • y (union type) –

    • array of items : string

    • or array of items : float

    • or array of items : boolean

  • classes (union type, optional) –

    • array of items : string

    • or array of items : float

    • or array of items : boolean

  • sample_weight (union type, optional, default None) –

    Weights applied to individual samples.

    • array of items : float

    • or None

      Uniform weights.

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) –

Returns

result

  • array of items : string

  • or array of items : float

  • or array of items : boolean

Return type

union type

predict_proba(X)

Probability estimates for all classes.

Note: The predict_proba method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) –

Returns

result – Returns the probability of the sample for each class in the model, where classes are ordered as they are in self.classes_.

Return type

array of items : array of items : float