lale.lib.rasl.datasets module

lale.lib.rasl.datasets.arff_data_loader(file_name: str, label_name: str, rows_per_batch: int) -> Iterable[Tuple[DataFrame, Series]]

Incrementally load an ARFF file and yield it one (X, y) batch at a time.

lale.lib.rasl.datasets.csv_data_loader(file_name: str, label_name: str, rows_per_batch: int) -> Iterable[Tuple[DataFrame, Series]]

Incrementally load a CSV file and yield it one (X, y) batch at a time.
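The batch-at-a-time pattern described above can be sketched with plain pandas. This is an illustrative sketch only, not the Lale implementation: the helper name `csv_batches_sketch` is hypothetical, and it assumes the label is an ordinary column of the CSV file.

```python
import io
from typing import Iterable, Tuple

import pandas as pd


def csv_batches_sketch(
    file_or_buffer, label_name: str, rows_per_batch: int
) -> Iterable[Tuple[pd.DataFrame, pd.Series]]:
    """Hypothetical helper: yield (X, y) batches from a CSV source."""
    # With chunksize set, read_csv returns an iterator of DataFrames
    # instead of loading the whole file at once.
    for chunk in pd.read_csv(file_or_buffer, chunksize=rows_per_batch):
        y = chunk[label_name]                 # label column as a Series
        X = chunk.drop(columns=[label_name])  # remaining feature columns
        yield X, y


# Small in-memory CSV standing in for a file on disk.
csv_text = "a,b,label\n1,2,0\n3,4,1\n5,6,0\n"
batches = list(csv_batches_sketch(io.StringIO(csv_text), "label", 2))
for X_batch, y_batch in batches:
    print(X_batch.shape, y_batch.tolist())
```

Because each batch is produced lazily, a consumer that processes one batch at a time only needs to hold `rows_per_batch` rows in memory.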

lale.lib.rasl.datasets.mockup_data_loader(X: DataFrame, y: Series, n_batches: int, astype: Literal['pandas'], shuffle: bool = False) -> Iterable[Tuple[DataFrame, Series]]
lale.lib.rasl.datasets.mockup_data_loader(X: DataFrame, y: Series, n_batches: int, astype: Literal['pandas', 'spark'], shuffle: bool = False) -> Iterable[Tuple[DataFrame, Series]]

Split (X, y) into batches to emulate loading them incrementally.

Intended only for testing: if X and y are already materialized in memory, there is little reason to batch them.
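As a rough illustration of splitting in-memory data into batches, the sketch below partitions row positions with NumPy. It is an assumption-laden stand-in, not how Lale performs the split: the name `split_into_batches_sketch` and the `seed` parameter are hypothetical, and only the pandas case is covered.

```python
from typing import Iterable, Tuple

import numpy as np
import pandas as pd


def split_into_batches_sketch(
    X: pd.DataFrame, y: pd.Series, n_batches: int,
    shuffle: bool = False, seed: int = 0,
) -> Iterable[Tuple[pd.DataFrame, pd.Series]]:
    """Hypothetical helper: yield n_batches (X, y) slices of in-memory data."""
    positions = np.arange(len(X))
    if shuffle:
        # Permute row positions so batches are not contiguous blocks.
        positions = np.random.default_rng(seed).permutation(positions)
    # array_split tolerates lengths that are not a multiple of n_batches.
    for batch_positions in np.array_split(positions, n_batches):
        yield X.iloc[batch_positions], y.iloc[batch_positions]


X = pd.DataFrame({"f1": range(10), "f2": range(10, 20)})
y = pd.Series([i % 2 for i in range(10)], name="label")
mock_batches = list(split_into_batches_sketch(X, y, n_batches=3))
print([len(X_batch) for X_batch, _ in mock_batches])
```

Every row lands in exactly one batch, so code written against a real incremental loader can be exercised on a small in-memory dataset.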

lale.lib.rasl.datasets.openml_data_loader(dataset_name: str, batch_size: int) -> Iterable[Tuple[DataFrame, Series]]

Download the OpenML dataset, load it incrementally, and yield it one (X, y) batch at a time.