lale.lib.autogen.mini_batch_k_means module

class lale.lib.autogen.mini_batch_k_means.MiniBatchKMeans(*, n_clusters=8, init='k-means++', max_iter=100, batch_size=100, verbose=0, compute_labels=True, random_state=None, tol=0.0, max_no_improvement=10, init_size=None, n_init=3, reassignment_ratio=0.01)

Bases: PlannedIndividualOp

Combined schema for expected data and hyperparameters.

This documentation is auto-generated from JSON schemas.

Parameters
  • n_clusters (integer, >=2 for optimizer, <=8 for optimizer, uniform distribution, default 8) – The number of clusters to form as well as the number of centroids to generate.

  • init (union type, default 'k-means++') –

    Method for initialization, defaults to ‘k-means++’. ‘k-means++’ selects initial cluster centers for k-means clustering in a way that speeds up convergence.

    • ‘k-means++’ or ‘random’

    • or callable, not for optimizer

  • max_iter (integer, >=10 for optimizer, <=1000 for optimizer, uniform distribution, default 100) – Maximum number of iterations over the complete dataset before stopping independently of any early stopping criterion heuristics.

  • batch_size (integer, >=3 for optimizer, <=128 for optimizer, uniform distribution, default 100) – Size of the mini batches.

  • verbose (union type, not for optimizer, default 0) –

    Verbosity mode.

    • boolean

    • or integer

  • compute_labels (boolean, default True) – Compute label assignment and inertia for the complete dataset once the minibatch optimization has converged in fit.

  • random_state (union type, not for optimizer, default None) –

    Determines random number generation for centroid initialization and random reassignment

    • integer

    • or numpy.random.RandomState

    • or None

  • tol (float, >=1e-08 for optimizer, <=0.01 for optimizer, default 0.0) – Control early stopping based on the relative center changes, as measured by a smoothed, variance-normalized estimate of the mean squared center position changes.

  • max_no_improvement (integer, >=10 for optimizer, <=11 for optimizer, uniform distribution, default 10) – Control early stopping based on the number of consecutive mini batches that do not yield an improvement on the smoothed inertia.

  • init_size (None, not for optimizer, default None) – Number of samples to randomly sample for speeding up the initialization (sometimes at the expense of accuracy): the algorithm is initialized by running a batch KMeans on a random subset of the data.

  • n_init (integer, >=3 for optimizer, <=10 for optimizer, uniform distribution, default 3) – Number of random initializations that are tried

  • reassignment_ratio (float, not for optimizer, default 0.01) – Control the fraction of the maximum number of counts for a center to be reassigned
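
To make the roles of n_clusters, batch_size, and random_state more concrete, the following is a minimal pure-Python sketch of a mini-batch k-means update loop. It is an illustration under simplifying assumptions, not the code this operator wraps: the function name mini_batch_kmeans_sketch and the parameter n_steps are hypothetical, n_steps counts mini batches rather than passes over the dataset (so it is not the same as max_iter), initialization mimics the ‘random’ option only, and early stopping (tol, max_no_improvement) and center reassignment (reassignment_ratio) are left out.

```python
import random


def mini_batch_kmeans_sketch(X, n_clusters=2, batch_size=4, n_steps=50, random_state=0):
    """Toy mini-batch k-means on lists of floats; illustration only."""
    rng = random.Random(random_state)  # seeded, so runs are reproducible
    # 'random'-style init: use n_clusters distinct samples as starting centers.
    centers = [list(x) for x in rng.sample(X, n_clusters)]
    counts = [0] * n_clusters  # number of samples each center has absorbed
    for _ in range(n_steps):
        # Draw one mini batch of at most batch_size samples.
        batch = rng.sample(X, min(batch_size, len(X)))
        for x in batch:
            # Assign the sample to its nearest center (squared Euclidean distance).
            j = min(
                range(n_clusters),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(x, centers[k])),
            )
            counts[j] += 1
            rate = 1.0 / counts[j]  # per-center step size shrinks over time
            # Move the chosen center toward the sample (running mean update).
            centers[j] = [(1 - rate) * c + rate * a for c, a in zip(centers[j], x)]
    return centers


# Hypothetical toy data: two loose groups of 2-D points.
data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [9.9, 10.0], [10.1, 9.8], [10.0, 10.2]]
centers = mini_batch_kmeans_sketch(data, n_clusters=2, batch_size=4, random_state=0)
```

Because every update is a weighted average of a center and a sample, the centers always stay inside the range spanned by the data, and fixing random_state makes repeated runs return the same centers.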

Notes

constraint-1 : any type

constraint-2 : any type

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters
  • X (union type) –

    Training instances to cluster

    • array of items : Any

    • or array of items : array of items : float

  • y (any type) – Not used; present here for API consistency by convention.

  • sample_weight (union type, optional, default 'deprecated') –

    The parameter sample_weight is deprecated in version 1.3 and will be removed in 1.5.

    • array of items : float

    • or None or ‘deprecated’

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters
  • X (array of items : array of items : float) – New data to predict.

  • sample_weight (union type, optional, default None) –

    The weights for each observation in X

    • array of items : float

    • or None

Returns

result – Index of the cluster each sample belongs to.

Return type

array of items : float

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – New data to transform.

Returns

result – X transformed in the new space.

Return type

array of items : array of items : float
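
The difference between predict and transform can be sketched in a few lines of pure Python. This assumes the operator behaves like scikit-learn's k-means estimators, where the "new space" produced by transform is the cluster-distance space (one column per cluster center) and predict returns the index of the nearest center. The helper names transform_sketch and predict_sketch and the centers below are hypothetical and stand in for a trained operator.

```python
import math


def transform_sketch(X, centers):
    """Map each sample to its Euclidean distance from every center."""
    return [[math.dist(x, c) for c in centers] for x in X]


def predict_sketch(X, centers):
    """Return, per sample, the index of the nearest center."""
    return [row.index(min(row)) for row in transform_sketch(X, centers)]


centers = [[0.0, 0.0], [3.0, 4.0]]  # hypothetical fitted cluster centers
X_new = [[0.0, 0.0], [3.0, 0.0], [3.0, 5.0]]
distances = transform_sketch(X_new, centers)  # shape: 3 samples x 2 centers
labels = predict_sketch(X_new, centers)       # one cluster index per sample
```

In this sketch each row of distances has as many entries as there are clusters, so the transformed data has n_clusters columns regardless of the number of input features.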