lale.lib.imblearn.borderline_smote module

class lale.lib.imblearn.borderline_smote.BorderlineSMOTE(*, operator, sampling_strategy='auto', random_state=None, k_neighbors=5, n_jobs=1, m_neighbors=10, kind='borderline-1')

Bases: PlannedIndividualOp

Over-sampling using Borderline SMOTE, which is a variant of the original SMOTE algorithm.

This documentation is auto-generated from JSON schemas.

Borderline samples will be detected and used to generate new synthetic samples.

Parameters

operator (operator, optional) –
Trainable Lale pipeline that is trained using the data obtained from the current imbalance corrector.

Calls to predict, transform, predict_proba, or decision_function are forwarded to the trained pipeline. If operator is a Planned pipeline, the current imbalance corrector cannot be trained until an optimizer has chosen a trainable operator. See lale/examples for more examples.
sampling_strategy (union type, optional, not for optimizer, default 'auto') –
Sampling information to resample the data set.
- float, not for optimizer
  
  Desired ratio of the number of samples in the minority class over the number of samples in the majority class after resampling. Therefore, the ratio is expressed as $\alpha_{os} = N_{rm} / N_{M}$ where $N_{rm}$ is the number of samples in the minority class after resampling and $N_{M}$ is the number of samples in the majority class.
  
  Warning
  
  Only available for binary classification. An error is raised for multi-class classification.
- or ‘minority’, ‘not minority’, ‘not majority’, ‘all’, or ‘auto’
  The class targeted by the resampling. The number of samples in the different classes will be equalized. Possible choices are:
  - 'minority': resample only the minority class;
  - 'not minority': resample all classes but the minority class;
  - 'not majority': resample all classes but the majority class;
  - 'all': resample all classes;
  - 'auto': equivalent to 'not majority'.
- or dict, not for optimizer
  
  Keys correspond to the targeted classes and values correspond to the desired number of samples for each targeted class.
- or callable, not for optimizer
  
  Function that takes y and returns a dict. The keys correspond to the targeted classes and the values correspond to the desired number of samples for each class.
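The float and callable forms of sampling_strategy can be illustrated with a small, library-free sketch. It assumes the ratio definition given above ($\alpha_{os} = N_{rm} / N_{M}$) and shows how a ratio translates into a desired minority count, plus a callable that returns a dict of target counts. The helper names `target_counts_from_ratio` and `equalize_minority` are hypothetical and are not part of Lale or imbalanced-learn.

```python
from collections import Counter


def target_counts_from_ratio(y, ratio):
    """Translate a float ratio N_rm / N_M into {minority_label: desired_count}.

    Hypothetical helper for illustration; only meaningful for binary targets.
    """
    counts = Counter(y)
    if len(counts) != 2:
        raise ValueError("a float ratio only makes sense for binary targets")
    # most_common() sorts by count, so the majority class comes first.
    (_, n_majority), (minority, n_minority) = counts.most_common()
    desired = int(ratio * n_majority)
    if desired < n_minority:
        raise ValueError("ratio would require removing minority samples")
    return {minority: desired}


def equalize_minority(y):
    """Hypothetical callable strategy: raise every smaller class to the majority count."""
    counts = Counter(y)
    n_majority = max(counts.values())
    return {label: n_majority for label, n in counts.items() if n < n_majority}


y = [0] * 90 + [1] * 10
print(target_counts_from_ratio(y, 0.5))  # {1: 45}
print(equalize_minority(y))              # {1: 90}
```

A function shaped like `equalize_minority` takes y and returns a dict, which is the form the callable option describes. Whether a particular rounding rule matches the underlying library is an assumption of this sketch.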
random_state (union type, optional, not for optimizer, default None) –
Control the randomization of the algorithm.
- None
  
  RandomState used by np.random
- or integer
  
  The seed used by the random number generator
- or numpy.random.RandomState
  
  Random number generator instance.
k_neighbors (union type, optional, not for optimizer, default 5) –
Number of nearest neighbours to use to construct synthetic samples.
- integer
  
  Number of nearest neighbours to use to construct synthetic samples.
- or Any
  
  An estimator that inherits from sklearn.neighbors.base.KNeighborsMixin that will be used to find the k_neighbors.
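As a rough illustration of how k_neighbors and random_state interact, the following sketch builds synthetic points by interpolating between a minority sample and one of its k nearest minority neighbours, which is the general SMOTE idea. It is a simplified, pure-Python approximation under stated assumptions (Euclidean distance, brute-force search, uniform interpolation gap), not the wrapped library's implementation; all function names are hypothetical.

```python
import random


def nearest_neighbors(point, candidates, k):
    """Return the k candidates closest to point, excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(
        others,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(point, c)),
    )[:k]


def interpolate(point, neighbor, rng):
    """Pick a random position on the segment from point towards neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return tuple(a + gap * (b - a) for a, b in zip(point, neighbor))


def synthesize(point, minority, k, n_new, seed=None):
    """Create n_new synthetic samples around point from its k minority neighbours."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    neighbors = nearest_neighbors(point, minority, k)
    return [interpolate(point, rng.choice(neighbors), rng) for _ in range(n_new)]


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
new_points = synthesize(minority[0], minority, k=2, n_new=3, seed=0)
print(new_points)
```

With k=2 the distant point (5.0, 5.0) is never chosen, so every synthetic sample lies on a segment between (0.0, 0.0) and one of its two closest neighbours. Passing the same seed again yields the same samples.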
n_jobs (integer, optional, not for optimizer, default 1) – The number of threads to open if possible.
m_neighbors (union type, optional, not for optimizer, default 10) –
Number of nearest neighbours to use to determine if a minority sample is in danger.
- integer
  
  Number of nearest neighbours to use to determine if a minority sample is in danger.
- or Any
  
  An estimator that inherits from sklearn.neighbors.base.KNeighborsMixin that will be used to find the m_neighbors.
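The role of m_neighbors can be sketched as follows. The rule used here follows the original Borderline-SMOTE description: among the m nearest neighbours of a minority sample (taken from all classes), the sample counts as noise if all neighbours belong to other classes, as in danger if at least half but not all do, and as safe otherwise. The function names and the brute-force neighbour search are assumptions for illustration; the wrapped library may differ in detail.

```python
def m_nearest_labels(index, X, y, m):
    """Labels of the m samples closest to X[index], excluding the sample itself."""
    def sq_dist(j):
        return sum((a - b) ** 2 for a, b in zip(X[index], X[j]))

    order = sorted((j for j in range(len(X)) if j != index), key=sq_dist)
    return [y[j] for j in order[:m]]


def classify_minority_sample(neighbor_labels, minority_label):
    """Return 'noise', 'danger', or 'safe' from the labels of the m neighbours."""
    m = len(neighbor_labels)
    n_other = sum(1 for label in neighbor_labels if label != minority_label)
    if n_other == m:
        return "noise"    # surrounded entirely by other classes
    if n_other >= m / 2:
        return "danger"   # borderline: at least half of the neighbours differ
    return "safe"


# One-dimensional toy data: label 1 is the minority class.
X = [(0.0,), (0.1,), (0.2,), (0.8,), (0.9,), (3.0,),
     (1.0,), (1.1,), (1.2,), (2.9,), (3.1,), (3.2,)]
y = [1] * 6 + [0] * 6

status = {
    i: classify_minority_sample(m_nearest_labels(i, X, y, m=3), minority_label=1)
    for i, label in enumerate(y)
    if label == 1
}
print(status)
```

In this toy data the samples at 0.8 and 0.9 sit next to the majority cluster and are flagged as in danger, the isolated sample at 3.0 is flagged as noise, and the remaining minority samples are safe. Only samples in danger would then be used to generate synthetic points.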
kind (‘borderline-1’ or ‘borderline-2’, optional, not for optimizer, default ‘borderline-1’) – The type of SMOTE algorithm to use.