lale.lib.rasl.functions module
- class lale.lib.rasl.functions.ColumnMonoidFactory(col_maker: Callable[[Union[str, int]], MonoidFactory[Any, bool, _D]])
Bases: ColumnSelector[DictMonoid[_D]]
Given a function that builds, for each column, a MonoidFactory deciding whether that column is valid, this returns the list of valid columns.
- class lale.lib.rasl.functions.ColumnSelector(*args, **kwargs)
Bases: MonoidFactory[Any, List[Union[str, int]], _D], Protocol
- class lale.lib.rasl.functions.DictMonoid(m: Dict[Any, _D])
Given a monoid, this class lifts it to a dictionary pointwise.
- combine(other: DictMonoid[_D])
Combines this monoid instance with another, producing a result. This operation must be observationally associative, satisfying
x.from_monoid(a.combine(b.combine(c))) == x.from_monoid(a.combine(b).combine(c))
where x is the instance of MonoidFactory that created these instances.
- property is_absorbing
A monoid value x is absorbing if for all y, x.combine(y) == x. This can help stop training early for monoids with learned coefficients.
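The pointwise lifting and the associativity requirement can be illustrated without Lale itself. The following is a minimal sketch, not the library's implementation: `SumMonoid` and `PointwiseDict` are hypothetical names, and the sketch assumes that both operands carry the same keys and that a dictionary is absorbing only when every entry is.

```python
from dataclasses import dataclass
from typing import Dict, Hashable


@dataclass
class SumMonoid:
    """Hypothetical base monoid: integer addition."""

    value: int

    def combine(self, other: "SumMonoid") -> "SumMonoid":
        return SumMonoid(self.value + other.value)

    @property
    def is_absorbing(self) -> bool:
        # No integer absorbs addition, so this monoid is never absorbing.
        return False


@dataclass
class PointwiseDict:
    """Hypothetical stand-in for DictMonoid: combines values key by key."""

    m: Dict[Hashable, SumMonoid]

    def combine(self, other: "PointwiseDict") -> "PointwiseDict":
        # Assumption: both operands carry exactly the same keys.
        return PointwiseDict({k: v.combine(other.m[k]) for k, v in self.m.items()})

    @property
    def is_absorbing(self) -> bool:
        # Assumption: the dictionary is absorbing only if every entry is.
        return all(v.is_absorbing for v in self.m.values())


a = PointwiseDict({"x": SumMonoid(1), "y": SumMonoid(10)})
b = PointwiseDict({"x": SumMonoid(2), "y": SumMonoid(20)})
c = PointwiseDict({"x": SumMonoid(3), "y": SumMonoid(30)})

# Both groupings must give the same result for combine to be associative.
left = a.combine(b.combine(c))
right = a.combine(b).combine(c)
```

Because each key is combined independently, associativity of the lifted dictionary follows directly from associativity of the underlying monoid.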
- class lale.lib.rasl.functions.categorical(max_values: int = 5)
Bases: ColumnMonoidFactory
Creates a MonoidFactory (and callable) for projecting categorical columns with sklearn’s ColumnTransformer or Lale’s Project operator.
- Parameters
max_values (int) – Maximum number of unique values in a column for it to be considered categorical.
- Returns
Function that, given a dataset X, returns a list of columns, containing either string column names or integer column indices.
- Return type
callable
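The selection rule described by max_values can be sketched in plain Python. This is an illustration only: `select_categorical` is a hypothetical helper, the dataset is modeled as a mapping from column name to values rather than a real dataframe, and only the unique-value criterion stated above is modeled (the actual class may apply further checks).

```python
from typing import Any, Dict, List


def select_categorical(X: Dict[str, List[Any]], max_values: int = 5) -> List[str]:
    """Hypothetical helper: names of columns with at most max_values unique values."""
    return [name for name, values in X.items() if len(set(values)) <= max_values]


# Toy dataset represented as a mapping from column name to values.
X = {
    "color": ["red", "green", "red", "blue", "green", "red", "blue"],
    "size": ["S", "M", "L", "M", "S", "L", "M"],
    "price": [1.5, 2.25, 3.0, 4.75, 5.5, 6.0, 7.25],
}

# "price" has seven distinct values, so it is not selected.
selected = select_categorical(X, max_values=5)
```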
- class lale.lib.rasl.functions.categorical_column(col: Union[str, int], threshold: int = 5)
Bases: MonoidFactory[Any, bool, _column_distinct_count_data]
Determines whether a column should be considered categorical by checking that it contains at most threshold distinct values.
- class lale.lib.rasl.functions.count_distinct_column(col: Union[str, int], limit: Optional[int] = None)
Bases: MonoidFactory[Any, int, _column_distinct_count_data]
Counts the number of distinct elements in a given column. If a limit is specified, then once the limit is reached, the count may no longer be accurate (but it will always remain over the limit).
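One way to obtain the limit behavior described above is to cap the set of distinct values once it grows past the limit. The sketch below is an assumption about how such a monoid could work, not Lale's implementation; `CappedDistinct` and its methods are hypothetical names.

```python
from itertools import islice
from typing import Any, FrozenSet, Iterable, Optional


class CappedDistinct:
    """Hypothetical monoid: set of distinct values, capped at limit + 1 entries."""

    def __init__(self, values: Iterable[Any], limit: Optional[int] = None):
        self.limit = limit
        distinct = frozenset(values)
        if limit is not None and len(distinct) > limit:
            # Past the limit the exact count is not needed; keep just enough
            # values so that the reported count stays over the limit.
            distinct = frozenset(islice(distinct, limit + 1))
        self.values: FrozenSet[Any] = distinct

    def combine(self, other: "CappedDistinct") -> "CappedDistinct":
        # Union of both sides, re-capped by the constructor.
        return CappedDistinct(self.values | other.values, self.limit)

    @property
    def is_absorbing(self) -> bool:
        # Once over the limit, combining more data no longer changes the count.
        return self.limit is not None and len(self.values) > self.limit

    def count(self) -> int:
        return len(self.values)


batches = [[1, 2, 2], [2, 3], [4, 5, 6, 7]]

exact = CappedDistinct([], limit=None)
capped = CappedDistinct([], limit=3)
for batch in batches:
    exact = exact.combine(CappedDistinct(batch, limit=None))
    capped = capped.combine(CappedDistinct(batch, limit=3))
```

Here the uncapped count is exact, while the capped one stops at limit + 1. That is enough to answer "are there more than limit distinct values?" while bounding the memory held per column.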
- class lale.lib.rasl.functions.date_time(fmt)
Bases: object
Creates a callable for projecting date/time columns with sklearn’s ColumnTransformer or Lale’s Project operator.
- Parameters
fmt (str) – Format string for strptime(), see https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
- Returns
Function that, given a dataset X, returns a list of columns, containing either string column names or integer column indices.
- Return type
callable
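The role of fmt can be illustrated with the standard library alone. This sketch assumes one plausible criterion, namely that a column is selected when every value is a string that parses with strptime() under fmt. The actual selection logic of date_time may differ, and `matches_format` and `select_date_time` are hypothetical helpers.

```python
from datetime import datetime
from typing import Any, Dict, List


def matches_format(values: List[Any], fmt: str) -> bool:
    """True if every value is a string that strptime can parse with fmt."""
    for value in values:
        if not isinstance(value, str):
            return False
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            return False
    return True


def select_date_time(X: Dict[str, List[Any]], fmt: str) -> List[str]:
    """Hypothetical helper: names of columns whose values all match fmt."""
    return [name for name, values in X.items() if matches_format(values, fmt)]


# Toy dataset represented as a mapping from column name to values.
X = {
    "joined": ["2021-03-01", "2021-04-15", "2022-12-31"],
    "name": ["ann", "bob", "cy"],
    "visits": [3, 7, 1],
}

selected = select_date_time(X, "%Y-%m-%d")
```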