lale.lib.sklearn.pca module

class lale.lib.sklearn.pca.PCA(*, n_components=None, copy=True, whiten=False, svd_solver='auto', tol=0.0, iterated_power='auto', random_state=None, n_oversamples=10, power_iteration_normalizer='auto')

Bases: PlannedIndividualOp

Principal component analysis transformer from scikit-learn for linear dimensionality reduction.

This documentation is auto-generated from JSON schemas.

Parameters
  • n_components (union type, default None) –

    • None

      If not set, keep all components.

    • or ‘mle’

      Use Minka’s MLE to guess the dimension.

    • or float, >0.0, <1.0

      Select the smallest number of components such that the fraction of variance explained is greater than the specified value.

    • or integer, >=1, <=’X/items/maxItems’, not for optimizer

      Number of components to keep.

    See also constraint-2, constraint-3.

  • copy (boolean, not for optimizer, default True) – If false, overwrite data passed to fit.

  • whiten (boolean, default False) – When true, multiply the component vectors by the square root of n_samples and then divide by the singular values to ensure uncorrelated outputs with unit component-wise variances.

  • svd_solver (‘auto’, ‘full’, ‘arpack’, or ‘randomized’, default ‘auto’) –

    Algorithm to use.

    See also constraint-2, constraint-3, constraint-4.

  • tol (float, >=0.0, <=1 for optimizer, not for optimizer, default 0.0) – Tolerance for singular values computed by svd_solver arpack.

  • iterated_power (union type, not for optimizer, default 'auto') –

    • integer, >=0, <=10 for optimizer

      Number of iterations for the power method computed by svd_solver randomized.

    • or ‘auto’

      Pick automatically.

    See also constraint-4.

  • random_state (union type, not for optimizer, default None) –

    Seed of the pseudo-random number generator, used when svd_solver is arpack or randomized.

    • None

      RandomState used by np.random

    • or numpy.random.RandomState

      Use the provided random state, only affecting other users of that same random state instance.

    • or integer

      Explicit seed.

  • n_oversamples (integer, >=0, <=1000 for optimizer, optional, not for optimizer, default 10) – This parameter is only relevant when svd_solver="randomized". It corresponds to the additional number of random vectors to sample the range of X so as to ensure proper conditioning. See randomized_svd for more details.

  • power_iteration_normalizer (‘auto’, ‘QR’, ‘LU’, or ‘none’, optional, not for optimizer, default ‘auto’) – Power iteration normalizer for randomized SVD solver. Not used by ARPACK. See randomized_svd for more details.
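The float option of n_components can be illustrated with a small standalone sketch. It assumes a list of per-component explained-variance ratios sorted in descending order and returns the smallest number of components whose cumulative ratio exceeds the threshold. The helper name components_for_variance is hypothetical and this is not the scikit-learn implementation, only an illustration of the selection rule described above.

```python
from itertools import accumulate


def components_for_variance(explained_variance_ratio, threshold):
    """Return the smallest k whose cumulative explained variance exceeds threshold.

    Hypothetical helper for illustration; the ratios are assumed to be sorted
    in descending order, one entry per principal component.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must satisfy 0.0 < threshold < 1.0")
    running_totals = accumulate(explained_variance_ratio)
    for k, cumulative in enumerate(running_totals, start=1):
        if cumulative > threshold:
            return k
    # Fall back to keeping every component if the threshold is never exceeded.
    return len(explained_variance_ratio)


ratios = [0.5, 0.25, 0.125, 0.125]
kept = components_for_variance(ratios, 0.8)  # cumulative sums: 0.5, 0.75, 0.875, 1.0
```

With these example ratios, a threshold of 0.8 is first exceeded by the third cumulative sum, so three components are kept.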

Notes

constraint-1 : negated type of ‘X/isSparse’

This class does not support sparse input. See TruncatedSVD for an alternative with sparse data.

constraint-2 : union type

Option n_components mle can only be set for svd_solver full or auto.

  • n_components : negated type of ‘mle’

  • or svd_solver : ‘full’ or ‘auto’

constraint-3 : union type

Setting 0 < n_components < 1 only works for svd_solver full.

  • n_components : negated type of float, >0.0, <1.0

  • or svd_solver : ‘full’

constraint-4 : union type

Option iterated_power can only be set to a value other than 'auto' for svd_solver randomized.

  • iterated_power : ‘auto’

  • or svd_solver : ‘randomized’
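The cross-parameter constraints above can be read as simple implications. The following is a minimal sketch that checks a hyperparameter combination against constraint-2, constraint-3, and constraint-4 exactly as they are worded here. The function check_pca_constraints is a hypothetical helper, not part of Lale or scikit-learn, and constraint-1 (sparse input) is left out because it concerns the data rather than the hyperparameters.

```python
def check_pca_constraints(n_components=None, svd_solver="auto", iterated_power="auto"):
    """Return the names of the violated constraints (empty list if valid).

    Hypothetical helper mirroring the constraints as documented above.
    """
    violations = []

    # constraint-2: 'mle' requires svd_solver 'full' or 'auto'.
    if n_components == "mle" and svd_solver not in ("full", "auto"):
        violations.append("constraint-2")

    # constraint-3: a float strictly between 0 and 1 requires svd_solver 'full'.
    is_fraction = isinstance(n_components, float) and 0.0 < n_components < 1.0
    if is_fraction and svd_solver != "full":
        violations.append("constraint-3")

    # constraint-4: a non-'auto' iterated_power requires svd_solver 'randomized'.
    if iterated_power != "auto" and svd_solver != "randomized":
        violations.append("constraint-4")

    return violations


print(check_pca_constraints(n_components="mle", svd_solver="arpack"))
print(check_pca_constraints(n_components=0.9, svd_solver="full"))
```

The first call reports constraint-2 as violated, while the second combination satisfies all three constraints and yields an empty list.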

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (array of items : array of items : float) – Features; the outer array is over samples.

  • y (Any, optional) – Target for supervised learning (ignored).

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – Features; the outer array is over samples.

Returns

result – Features; the outer array is over samples.

Return type

array of items : array of items : float