kumoai.encoder

The Kumo platform infers an encoder for each column from its data and semantic type. Kumo also supports custom encoder overrides for individual columns, set through the ColumnProcessingPlan specification when defining a model plan. The following objects can be used as encoder overrides. The selected encoder must be supported for the semantic type of the column being overridden.

Enums

NAStrategy

Kumo-supported null value imputation strategies.

Scaler

Kumo-supported numerical value scaling strategies.

Encoders

Null

A Null encoder skips encoding its corresponding column.

Numerical

A Numerical encoder encodes its corresponding numerical column using the normalization specified by scaler and the null value imputation strategy specified by na_strategy.
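
To make the roles of scaler and na_strategy concrete, here is a minimal sketch in plain Python, not Kumo's implementation. It assumes mean imputation and standard (zero mean, unit variance) scaling purely for illustration; the actual members of NAStrategy and Scaler are not listed on this page, and the function name is hypothetical.

```python
import math
from typing import List, Optional


def standard_scale_with_mean_imputation(values: List[Optional[float]]) -> List[float]:
    """Hypothetical helper: impute nulls with the mean, then standardize."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    variance = sum((v - mean) ** 2 for v in observed) / len(observed)
    std = math.sqrt(variance) or 1.0  # guard against constant columns
    # Assumed null strategy: replace missing values with the observed mean.
    filled = [mean if v is None else v for v in values]
    # Assumed scaler: subtract the mean and divide by the standard deviation.
    return [(v - mean) / std for v in filled]


scaled = standard_scale_with_mean_imputation([1.0, None, 3.0])
print(scaled)
```

With mean imputation, a missing value lands exactly at zero after standardization, which is one reason this pairing is a common default in tabular pipelines.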

MaxLogNumerical

A MaxLogNumerical encoder encodes its corresponding numerical column, after applying the transformation

MinLogNumerical

A MinLogNumerical encoder encodes its corresponding numerical column, after applying the transformation

Index

An Index encoder encodes its corresponding categorical column by assigning each unique value with frequency above min_occ to an embedding of size channels from the model plan.
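
The effect of min_occ can be sketched as a vocabulary-building step. This is an illustrative plain-Python version, not Kumo's code: the function names are hypothetical, "above" is read as strictly greater than min_occ, and reserving index 0 for rare or unseen values is an assumption.

```python
from collections import Counter
from typing import Dict, Hashable, List


def build_index_vocab(values: List[Hashable], min_occ: int) -> Dict[Hashable, int]:
    """Hypothetical helper: map each sufficiently frequent value to an index."""
    counts = Counter(values)
    # Keep only values whose frequency is strictly above min_occ (assumed reading).
    kept = sorted(v for v, c in counts.items() if c > min_occ)
    # Index 0 is reserved (by assumption) for rare or unseen values.
    return {v: i + 1 for i, v in enumerate(kept)}


def encode_index(values: List[Hashable], vocab: Dict[Hashable, int]) -> List[int]:
    """Look up each value; anything outside the vocabulary falls back to 0."""
    return [vocab.get(v, 0) for v in values]


vocab = build_index_vocab(["a", "a", "a", "b", "b", "c"], min_occ=1)
indices = encode_index(["a", "c", "z"], vocab)
print(vocab, indices)
```

Each resulting index would then select one row of an embedding table whose width is channels.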

Hash

A Hash encoder encodes its corresponding categorical column by hashing each value into the range [0..num_components] and using the hashed value to select the corresponding embedding (of size channels from the model plan).
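
The hashing trick described here can be sketched as follows. This is a generic illustration rather than Kumo's implementation: the choice of MD5, the helper name, and the randomly initialized table are all assumptions. The sketch uses a modulo, so indices fall in 0..num_components-1; whether Kumo's upper bound is inclusive is not stated on this page.

```python
import hashlib
import random
from typing import List


def hash_bucket(value: str, num_components: int) -> int:
    """Hypothetical helper: deterministically map a string to a bucket index."""
    # hashlib is used instead of the built-in hash(), which is salted per process.
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_components


num_components = 8  # number of hash buckets
channels = 4        # embedding width, standing in for the model plan's channels

# Stand-in embedding table with one row per bucket (random values for illustration).
rng = random.Random(0)
table: List[List[float]] = [
    [rng.uniform(-1.0, 1.0) for _ in range(channels)] for _ in range(num_components)
]

bucket = hash_bucket("user_12345", num_components)
embedding = table[bucket]
print(bucket, embedding)
```

Unlike an index vocabulary, the table size is fixed up front, so distinct values may collide in the same bucket. That trade-off is what makes hashing attractive for very high-cardinality columns.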

MultiCategorical

A MultiCategorical encoder encodes its corresponding multicategorical column by treating each categorical value independently, and fusing the results.
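
One way to picture "treating each value independently, and fusing the results" is to embed every category in a cell and then pool the vectors. The sketch below is an assumption-laden illustration, not Kumo's method: the comma separator, the mean pooling, and the helper name are all hypothetical.

```python
from typing import Dict, List


def fuse_multicategorical(
    cell: str, embeddings: Dict[str, List[float]], sep: str = ","
) -> List[float]:
    """Hypothetical helper: embed each category in a cell, then average them."""
    dim = len(next(iter(embeddings.values())))
    categories = [c.strip() for c in cell.split(sep) if c.strip()]
    # Each category is looked up independently; unknown categories are skipped.
    vectors = [embeddings[c] for c in categories if c in embeddings]
    if not vectors:
        return [0.0] * dim
    # Assumed fusion: element-wise mean over the per-category embeddings.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


embeddings = {"red": [1.0, 0.0], "blue": [0.0, 1.0]}
fused = fuse_multicategorical("red, blue", embeddings)
print(fused)
```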

GloVe

A GloVe encoder uses embeddings from the GloVe project to embed text in a semantically meaningful manner.

NumericalList

A NumericalList encoder encodes numerical sequences by using them directly as input features, without applying any transformation.

Datetime

A Datetime encoder encodes a date or time value, representing it with various user-specified granularities.
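
As a rough illustration of representing a timestamp at several granularities, the sketch below decomposes a datetime into a raw year plus cyclic sine/cosine features. The granularity names, the cyclic encoding, and the function names are assumptions for this example; the options the Datetime encoder actually accepts are not listed on this page.

```python
import math
from datetime import datetime
from typing import Dict, Iterable, Tuple


def cyclic(value: float, period: float) -> Tuple[float, float]:
    """Map a periodic quantity onto the unit circle so its endpoints are adjacent."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encode_datetime(ts: datetime, granularities: Iterable[str]) -> Dict[str, float]:
    """Hypothetical helper: emit features for each requested granularity."""
    wanted = set(granularities)
    features: Dict[str, float] = {}
    if "year" in wanted:
        features["year"] = float(ts.year)
    if "month" in wanted:
        features["month_sin"], features["month_cos"] = cyclic(ts.month - 1, 12)
    if "dayofweek" in wanted:
        features["dow_sin"], features["dow_cos"] = cyclic(ts.weekday(), 7)
    if "hour" in wanted:
        features["hour_sin"], features["hour_cos"] = cyclic(ts.hour, 24)
    return features


features = encode_datetime(datetime(2024, 1, 1, 6, 0), ["year", "month", "hour"])
print(features)
```

The cyclic form keeps 23:00 close to 00:00 and December close to January, which a plain integer feature would not.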