Querying KumoRFM#

Predictive Query (PQL) is a query language for defining predictive problems. A query specifies:

The target aggregation expression
The entities to predict for
Optional filters that refine the context data

For a thorough introduction to Predictive Query, refer to the predictive query tutorial. This page covers how to use Predictive Query to interact with KumoRFM.

Note

KumoRFM is currently in an experimental phase. Some Predictive Query features are not yet fully supported.

Writing Queries in Kumo#

In general, follow these five steps to author a PQL query:

Choose your entity – the table and primary key to predict for.
Define the target – a raw column or an aggregation over a future window.
Pin the entity list – pass a single ID or multiple IDs to make predictions for.
(Optional) Refine the context – filters to restrict which historical rows are used for feature generation.
Run & fetch – run KumoRFM.predict() or KumoRFM.evaluate().
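
The five steps above can be sketched in Python as plain string assembly followed by a call on the KumoRFM object. This is a minimal sketch rather than the library's API: `compose_query` and `StubRFM` are hypothetical names introduced for illustration, and the sketch assumes that `predict()` accepts the query string in the same way `evaluate()` does elsewhere on this page.

```python
from typing import Optional


def compose_query(target: str, entity: str, context_filter: Optional[str] = None) -> str:
    """Assemble a PQL string from its parts (hypothetical helper)."""
    query = f"PREDICT {target} FOR {entity}"
    if context_filter:
        query += f" WHERE {context_filter}"
    return query


class StubRFM:
    """Stand-in for a KumoRFM object so the sketch runs without a connection."""

    def predict(self, query: str) -> str:
        # A real object would return predictions; the stub only echoes the query.
        return f"would run: {query}"


# Steps 1-3: the entity table and key, the target, and the IDs to predict for.
entity = "users.user_id IN (1, 2, 3)"
target = "COUNT(orders.*, 0, 30, days) > 0"
# Step 4 (optional): restrict the context examples.
context_filter = "COUNT(orders.*, -30, 0, days) > 0"

query = compose_query(target, entity, context_filter)

# Step 5: run the query (assumed call shape: rfm.predict(query)).
rfm = StubRFM()
result = rfm.predict(query)
print(result)
```

With a real KumoRFM object in place of the stub, step 5 would be the `predict()` or `evaluate()` call.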

Defining entities#

The general PQL structure is:

PREDICT <target_expression> FOR <entity_specification> WHERE <filters>

Component	Purpose
`PREDICT <target_expression>`	Declares the value or aggregate the model should predict
`FOR <entity_specification>`	Specifies the single ID or list of IDs to predict for
`WHERE <filters>` (optional)	Filters which historical rows are used to generate features

Unlike the enterprise product, KumoRFM makes predictions for a handful of selected entities at a time. The entities for each query can therefore be specified in one of two ways:

By specifying a single entity ID, e.g. users.user_id=1
By specifying a tuple of entity IDs, e.g. users.user_id IN (1, 2, 3)
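
When the IDs come from application code, both forms can be generated from the same input. The helper below is a hypothetical sketch (`entity_spec` is not part of KumoRFM) and handles integer IDs only, since quoting rules for string IDs are not covered on this page.

```python
from typing import Iterable, Union


def entity_spec(table: str, key: str, ids: Union[int, Iterable[int]]) -> str:
    """Render the FOR part of a query for one ID or several (hypothetical helper)."""
    column = f"{table}.{key}"
    if isinstance(ids, int):
        # Single entity: <table>.<key>=<id>
        return f"{column}={ids}"
    id_list = list(ids)
    if not id_list:
        raise ValueError("at least one entity ID is required")
    # Several entities: <table>.<key> IN (<id>, <id>, ...)
    return f"{column} IN ({', '.join(str(i) for i in id_list)})"


print(entity_spec("users", "user_id", 1))
print(entity_spec("users", "user_id", [1, 2, 3]))
```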

Improving the context through entity filters#

KumoRFM makes its entity-specific predictions based on context examples collected from the database. Just as entity filters let you control the training data in the Kumo enterprise product, they give you more control over the KumoRFM context examples. For example, to exclude users without recent activity from the context, we can write:

PREDICT COUNT(orders.*, 0, 30, days) > 0
FOR users.user_id=1 WHERE COUNT(orders.*, -30, 0, days) > 0

This restricts the context examples to users who placed at least one order in the previous 30 days, so the context contains only examples relevant to your case, which can improve prediction quality. These filters are NOT applied to the provided entity list.
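
The query above uses the same aggregation with two different windows: the target counts orders in the 30 days after the prediction time (0 to 30), while the filter counts orders in the 30 days before it (-30 to 0). A minimal sketch that makes this explicit follows; `count_window` is a hypothetical helper, and the sketch assumes the line break in the query above is only for readability.

```python
def count_window(table: str, start: int, end: int, unit: str = "days") -> str:
    """Render COUNT(<table>.*, start, end, unit) (hypothetical helper)."""
    return f"COUNT({table}.*, {start}, {end}, {unit})"


# Target: at least one order in the next 30 days (window looks forward).
target = count_window("orders", 0, 30) + " > 0"
# Context filter: at least one order in the past 30 days (window looks back).
recent_activity = count_window("orders", -30, 0) + " > 0"

query = f"PREDICT {target} FOR users.user_id=1 WHERE {recent_activity}"
print(query)
```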

Evaluation mode#

Besides making predictions, KumoRFM also provides an evaluation mode that automatically evaluates a sample of predictions.

>>> query = "EVALUATE PREDICT COUNT(orders.*, 0, 30, days) FOR users.user_id=1"
>>> metrics = rfm.evaluate(query)
>>> print(metrics)

Unsupported features#

Due to the experimental nature of KumoRFM, the following features are not yet fully supported and will be added soon:

Only numerical and categorical columns are valid target columns, except for LIST_DISTINCT() aggregation, where only foreign key targets are supported.
The ASSUMING clause is not permitted.
Filtering by column value (e.g., WHERE users.age > 21) is only supported for columns within the same table. The same applies to predicting a single non-aggregated value, e.g., PREDICT users.age.
LIST_DISTINCT() without a time interval is not supported.
LAST() and FIRST() aggregations are not supported.
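
Some of these restrictions can be caught before a query is sent. The check below is a rough, hypothetical sketch (`find_unsupported` is not part of KumoRFM): it scans the query text for the ASSUMING clause and the LAST() and FIRST() aggregations only, it is not a PQL parser, and it matches case-insensitively because this page does not state whether PQL keywords are case-sensitive.

```python
import re
from typing import List

# Constructs this page lists as unsupported that a simple text scan can detect.
_UNSUPPORTED = {
    "ASSUMING clause": re.compile(r"\bASSUMING\b", re.IGNORECASE),
    "LAST() aggregation": re.compile(r"\bLAST\s*\(", re.IGNORECASE),
    "FIRST() aggregation": re.compile(r"\bFIRST\s*\(", re.IGNORECASE),
}


def find_unsupported(query: str) -> List[str]:
    """Return the names of unsupported constructs found in the query text."""
    return [name for name, pattern in _UNSUPPORTED.items() if pattern.search(query)]


issues = find_unsupported("PREDICT COUNT(orders.*, 0, 30, days) FOR users.user_id=1")
print(issues)
```

An empty list means none of the three constructs was found; it does not guarantee that the query is otherwise supported.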
