lale.lib.autogen.mini_batch_k_means module
- class lale.lib.autogen.mini_batch_k_means.MiniBatchKMeans(*, n_clusters=8, init='k-means++', max_iter=100, batch_size=100, verbose=0, compute_labels=True, random_state=None, tol=0.0, max_no_improvement=10, init_size=None, n_init=3, reassignment_ratio=0.01)
Bases: PlannedIndividualOp
Combined schema for expected data and hyperparameters.
This documentation is auto-generated from JSON schemas.
- Parameters
n_clusters (integer, >=2 for optimizer, <=8 for optimizer, uniform distribution, default 8) – The number of clusters to form as well as the number of centroids to generate.
init (union type, default 'k-means++') –
Method for initialization. ‘k-means++’ selects initial cluster centers for k-means clustering in a smart way to speed up convergence; ‘random’ is the alternative choice.
‘k-means++’ or ‘random’
or callable, not for optimizer
max_iter (integer, >=10 for optimizer, <=1000 for optimizer, uniform distribution, default 100) – Maximum number of iterations over the complete dataset before stopping independently of any early stopping criterion heuristics.
batch_size (integer, >=3 for optimizer, <=128 for optimizer, uniform distribution, default 100) – Size of the mini batches.
verbose (union type, not for optimizer, default 0) –
Verbosity mode.
boolean
or integer
compute_labels (boolean, default True) – Compute label assignment and inertia for the complete dataset once the minibatch optimization has converged in fit.
random_state (union type, not for optimizer, default None) –
Determines random number generation for centroid initialization and random reassignment
integer
or numpy.random.RandomState
or None
tol (float, >=1e-08 for optimizer, <=0.01 for optimizer, default 0.0) – Control early stopping based on the relative center changes, as measured by a smoothed, variance-normalized estimate of the mean squared center position changes.
max_no_improvement (integer, >=10 for optimizer, <=11 for optimizer, uniform distribution, default 10) – Control early stopping based on the number of consecutive mini batches that do not yield an improvement on the smoothed inertia.
init_size (None, not for optimizer, default None) – Number of samples to randomly sample for speeding up the initialization (sometimes at the expense of accuracy): the algorithm is initialized by running a batch KMeans on a random subset of the data.
n_init (integer, >=3 for optimizer, <=10 for optimizer, uniform distribution, default 3) – Number of random initializations that are tried
reassignment_ratio (float, not for optimizer, default 0.01) – Control the fraction of the maximum number of counts for a center to be reassigned
Notes
constraint-1 : any type
constraint-2 : any type
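The "for optimizer" bounds listed above describe the ranges a hyperparameter search may draw from. As a minimal sketch of how those ranges could be represented and sampled, the snippet below copies the bounds from the parameter list into plain Python. The names INT_RANGES, TOL_RANGE, INIT_CHOICES, sample_config and in_range are hypothetical helpers for illustration and are not part of lale, and sampling tol uniformly is an assumption, since the list states no distribution for it.

```python
import random

# Integer ranges copied from the parameter list above ("for optimizer" bounds).
# This layout is illustrative only; lale stores these in JSON schemas.
INT_RANGES = {
    "n_clusters": (2, 8),
    "max_iter": (10, 1000),
    "batch_size": (3, 128),
    "max_no_improvement": (10, 11),
    "n_init": (3, 10),
}
TOL_RANGE = (1e-08, 0.01)
# The callable option for init is marked "not for optimizer", so it is omitted.
INIT_CHOICES = ["k-means++", "random"]


def sample_config(rng):
    """Draw one configuration from the documented optimizer ranges."""
    config = {name: rng.randint(lo, hi) for name, (lo, hi) in INT_RANGES.items()}
    config["init"] = rng.choice(INIT_CHOICES)
    # Assumption: uniform sampling for tol (no distribution is documented).
    config["tol"] = rng.uniform(*TOL_RANGE)
    return config


def in_range(config):
    """Check that a configuration respects the documented optimizer bounds."""
    ints_ok = all(lo <= config[name] <= hi for name, (lo, hi) in INT_RANGES.items())
    tol_ok = TOL_RANGE[0] <= config["tol"] <= TOL_RANGE[1]
    return ints_ok and tol_ok and config["init"] in INIT_CHOICES


config = sample_config(random.Random(0))
print(config)
```

Note that the default tol=0.0 lies outside the optimizer range for tol, so the defaults are valid hyperparameters without being a point in the search space.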
- fit(X, y=None, **fit_params)
Train the operator.
Note: The fit method is not available until this operator is trainable.
Once this method is available, it will have the following signature:
- Parameters
X (union type) –
Training instances to cluster
array of items : Any
or array of items : array of items : float
y (any type) – Not used; present here for API consistency by convention.
sample_weight (union type, optional, default 'deprecated') –
The parameter sample_weight is deprecated in version 1.3 and will be removed in 1.5.
array of items : float
or None or ‘deprecated’
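The following sketch illustrates the fit call on a small synthetic dataset. It uses scikit-learn's sklearn.cluster.MiniBatchKMeans directly, on the assumption that this auto-generated operator forwards its hyperparameters to that class; the data and variable names are invented for illustration.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Two well-separated synthetic blobs (illustrative data, not from the docs).
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])

# y is omitted because it is not used by this clustering estimator.
model = MiniBatchKMeans(n_clusters=2, batch_size=20, n_init=3, random_state=0)
model.fit(X)

centers = model.cluster_centers_  # one row per cluster
labels = model.labels_            # available because compute_labels=True by default
print(centers.shape, labels.shape)
```

The deprecated sample_weight argument is left out, in line with the deprecation notice above.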
- predict(X, **predict_params)
Make predictions.
Note: The predict method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – New data to predict.
sample_weight (union type, optional, default None) –
The weights for each observation in X
array of items : float
or None
- Returns
result – Index of the cluster each sample belongs to.
- Return type
array of items : float
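As a hedged illustration of predict, the sketch below fits scikit-learn's MiniBatchKMeans (assumed here to be the estimator this operator wraps) and then assigns two new points to clusters. The data and variable names are hypothetical.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Illustrative training data: two well-separated blobs.
rng = np.random.RandomState(0)
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])
model = MiniBatchKMeans(n_clusters=2, batch_size=20, n_init=3, random_state=0)
model.fit(X_train)

# One new point near each blob; predict returns one cluster index per row.
X_new = np.array([[0.2, -0.1], [9.8, 10.3]])
new_labels = model.predict(X_new)
print(new_labels)
```

Which blob receives index 0 depends on the initialization, so only the grouping is meaningful, not the numeric label itself.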
- transform(X, y=None)
Transform the data.
Note: The transform method is not available until this operator is trained.
Once this method is available, it will have the following signature:
- Parameters
X (array of items : array of items : float) – New data to transform.
- Returns
result – X transformed in the new space.
- Return type
array of items : array of items : float
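In scikit-learn's MiniBatchKMeans, which this operator is assumed to wrap, the "new space" is a cluster-distance space: each output column holds the distance from a sample to one cluster center. The sketch below, using invented data, shows the resulting shape and its relation to predict.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Illustrative training data: two well-separated blobs.
rng = np.random.RandomState(0)
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])
model = MiniBatchKMeans(n_clusters=2, batch_size=20, n_init=3, random_state=0)
model.fit(X_train)

X_new = np.array([[0.2, -0.1], [9.8, 10.3]])
distances = model.transform(X_new)  # shape: (n_samples, n_clusters)

# The nearest center (smallest distance per row) matches the predicted cluster.
nearest = np.argmin(distances, axis=1)
print(distances.shape, nearest, model.predict(X_new))
```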