lale.lib.autogen.mini_batch_k_means module

class lale.lib.autogen.mini_batch_k_means.MiniBatchKMeans(*, n_clusters=8, init='k-means++', max_iter=100, batch_size=100, verbose=0, compute_labels=True, random_state=None, tol=0.0, max_no_improvement=10, init_size=None, n_init=3, reassignment_ratio=0.01)

Bases: PlannedIndividualOp

Combined schema for expected data and hyperparameters.

This documentation is auto-generated from JSON schemas.

Parameters

n_clusters (integer, >=2 for optimizer, <=8 for optimizer, uniform distribution, default 8) – The number of clusters to form as well as the number of centroids to generate.
init (union type, default 'k-means++') –
Method for initialization; defaults to ‘k-means++’, which selects initial cluster centers for k-means clustering in a smart way to speed up convergence.
- ‘k-means++’ or ‘random’
- or callable, not for optimizer
max_iter (integer, >=10 for optimizer, <=1000 for optimizer, uniform distribution, default 100) – Maximum number of iterations over the complete dataset before stopping independently of any early stopping criterion heuristics.
batch_size (integer, >=3 for optimizer, <=128 for optimizer, uniform distribution, default 100) – Size of the mini batches.
verbose (union type, not for optimizer, default 0) –
Verbosity mode.
- boolean
- or integer
compute_labels (boolean, default True) – Compute label assignment and inertia for the complete dataset once the minibatch optimization has converged in fit.
random_state (union type, not for optimizer, default None) –
Determines random number generation for centroid initialization and random reassignment.
- integer
- or numpy.random.RandomState
- or None
tol (float, >=1e-08 for optimizer, <=0.01 for optimizer, default 0.0) – Control early stopping based on the relative center changes, as measured by a smoothed, variance-normalized measure of the mean squared change in center positions.
max_no_improvement (integer, >=10 for optimizer, <=11 for optimizer, uniform distribution, default 10) – Control early stopping based on the number of consecutive mini batches that do not yield an improvement on the smoothed inertia.
init_size (None, not for optimizer, default None) – Number of samples to randomly sample for speeding up the initialization (sometimes at the expense of accuracy): the algorithm is initialized by running a batch KMeans on a random subset of the data.
n_init (integer, >=3 for optimizer, <=10 for optimizer, uniform distribution, default 3) – Number of random initializations that are tried.
reassignment_ratio (float, not for optimizer, default 0.01) – Control the fraction of the maximum number of counts for a center to be reassigned.
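As an illustration of what the "for optimizer" bounds above mean, the following sketch draws one random configuration inside those ranges using only the Python standard library. The helper name sample_config and the constant names are hypothetical, and this is not how Lale's optimizers sample internally. The schema states no distribution for tol, so sampling it uniformly is an assumption.

```python
import random

# Search ranges copied from the "for optimizer" bounds in the parameter list.
INT_RANGES = {
    "n_clusters": (2, 8),
    "max_iter": (10, 1000),
    "batch_size": (3, 128),
    "max_no_improvement": (10, 11),
    "n_init": (3, 10),
}
TOL_RANGE = (1e-08, 0.01)
INIT_CHOICES = ("k-means++", "random")  # callables are "not for optimizer"


def sample_config(rng):
    """Draw one hyperparameter configuration inside the optimizer ranges.

    Hypothetical helper for illustration only.
    """
    # randint is inclusive on both ends, matching the >= / <= bounds.
    config = {name: rng.randint(lo, hi) for name, (lo, hi) in INT_RANGES.items()}
    # Assumption: uniform sampling for tol (the schema names no distribution).
    config["tol"] = rng.uniform(*TOL_RANGE)
    config["init"] = rng.choice(INIT_CHOICES)
    config["compute_labels"] = rng.choice((True, False))
    return config


config = sample_config(random.Random(0))
print(config)
```

Parameters marked "not for optimizer" (verbose, random_state, init_size, reassignment_ratio) are left out of the sketch, so they would keep their defaults.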

Notes

constraint-1 : any type

constraint-2 : any type

fit(X, y=None, **fit_params)

Train the operator.

Note: The fit method is not available until this operator is trainable.

Once this method is available, it will have the following signature:

Parameters

X (union type) –
Training instances to cluster
- array of items : Any
- or array of items : array of items : float
y (any type) – Not used; present here for API consistency by convention.
sample_weight (union type, optional, default 'deprecated') –
The parameter sample_weight is deprecated in version 1.3 and will be removed in 1.5.
- array of items : float
- or None or ‘deprecated’
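A minimal sketch of fitting follows. It uses scikit-learn's sklearn.cluster.MiniBatchKMeans directly, on the assumption that this operator wraps the estimator of the same name. With Lale installed, the same constructor arguments and fit call should carry over to the operator, but that is not verified here. The data is synthetic.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)

# Synthetic data: three well-separated 2-D blobs of 50 points each.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Hyperparameter values chosen inside the documented optimizer ranges.
model = MiniBatchKMeans(n_clusters=3, batch_size=32, n_init=3, random_state=0)

# y is omitted because it is not used; sample_weight is left at its default.
trained = model.fit(X)

# One centroid per cluster, each with as many coordinates as X has features.
print(trained.cluster_centers_.shape)  # (3, 2)
```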

predict(X, **predict_params)

Make predictions.

Note: The predict method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters

X (array of items : array of items : float) – New data to predict.
sample_weight (union type, optional, default None) –
The weights for each observation in X
- array of items : float
- or None

Returns

result – Index of the cluster each sample belongs to.

Return type

array of items : float
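The following sketch shows predict on new data. It again uses scikit-learn's MiniBatchKMeans as a stand-in for the wrapped estimator (an assumption), with synthetic data. In scikit-learn the returned cluster indices are integers, even though the schema above lists the items as float.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)

# Synthetic training data: three well-separated 2-D blobs.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

trained = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0).fit(X)

# New samples, one near each blob; predict needs a 2-D array of floats.
new_points = np.array([[0.2, -0.1], [9.8, 10.3], [-10.1, 9.7]])
labels = trained.predict(new_points)

# One cluster index per input row, each in range(n_clusters).
print(labels)
```

The numeric value of each index depends on the initialization, so only the grouping of samples is meaningful, not the specific label numbers.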

transform(X, y=None)

Transform the data.

Note: The transform method is not available until this operator is trained.

Once this method is available, it will have the following signature:

Parameters: X (array of items : array of items : float) – New data to transform.
Returns: result – X transformed in the new space.
Return type: array of items : array of items : float
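To make "the new space" concrete: in scikit-learn's MiniBatchKMeans, transform maps each sample to its distances from the cluster centers, giving one column per cluster. The sketch below uses the scikit-learn estimator directly, on the assumption that this operator wraps it, with synthetic data.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.RandomState(0)

# Synthetic training data: three well-separated 2-D blobs.
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

trained = MiniBatchKMeans(n_clusters=3, n_init=3, random_state=0).fit(X)

# Each row holds the distances from one sample to every cluster center.
distances = trained.transform(X)
print(distances.shape)  # (150, 3)

# The nearest center (smallest distance) is the cluster that predict reports.
nearest = distances.argmin(axis=1)
print(np.array_equal(nearest, trained.predict(X)))
```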