Querying KumoRFM
Predictive Query is a query language for defining a predictive problem.
A query has two main components: the entities to make predictions for, and the target to predict for each entity.
For a thorough introduction to Predictive Query, please refer to the predictive query tutorial.
This page describes how to use Predictive Query to interact with KumoRFM.
Note
KumoRFM is currently in an experimental phase. Some Predictive Query features are not yet fully supported.
Defining entities
The general PQL structure stays the same:
PREDICT <aggregation_expression> FOR <entity_specification>
Unlike the enterprise product, KumoRFM makes a prediction for a handful of selected entities at a time. As such, entities for each query can be specified in one of two ways:
By specifying a single entity id, e.g.
users.user_id=1
By specifying a tuple of entity ids, e.g.
users.user_id IN (1, 2, 3)
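The two entity forms above can also be assembled programmatically. The following is a minimal sketch in Python: `entity_spec` and `build_query` are hypothetical helpers written for illustration, not part of the KumoRFM SDK, and they assume integer entity ids.

```python
from typing import Sequence, Union


def entity_spec(column: str, ids: Union[int, Sequence[int]]) -> str:
    """Render an entity specification for a single id or several ids.

    Hypothetical helper (not part of the KumoRFM SDK); assumes integer ids.
    """
    if isinstance(ids, int):
        # Single entity: users.user_id=1
        return f"{column}={ids}"
    if len(ids) == 0:
        raise ValueError("at least one entity id is required")
    # Several entities: users.user_id IN (1, 2, 3)
    return f"{column} IN ({', '.join(str(i) for i in ids)})"


def build_query(target: str, column: str, ids: Union[int, Sequence[int]]) -> str:
    """Combine a target expression with an entity specification."""
    return f"PREDICT {target} FOR {entity_spec(column, ids)}"


target = "COUNT(orders.*, 0, 30, days) > 0"
single = build_query(target, "users.user_id", 1)
multi = build_query(target, "users.user_id", [1, 2, 3])
print(single)
print(multi)
```

Keeping the entity list separate from the target expression makes it easy to reuse one target across different small batches of entities.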
Improving the context through entity filters
KumoRFM makes its entity-specific predictions based on context examples collected from the database. Just as entity filters let you control the training data in the Kumo enterprise product, they give you more control over the context examples that KumoRFM uses.
For example, to exclude users without recent activity from the context, we
can write:
PREDICT COUNT(orders.*, 0, 30, days) > 0
FOR users.user_id=1 WHERE COUNT(orders.*, -30, 0, days) > 0
This restricts the context examples to active users, that is, users with at least one order in the past 30 days. Keeping the context relevant to your case in this way can improve prediction performance. These filters are NOT applied to the provided entity list.
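To make the distinction between context examples and the requested entities concrete, here is a toy sketch in plain Python. It only mimics the documented behaviour on made-up data: `select_context` and the `recent_orders` dictionary are hypothetical, the sketch ignores time entirely, and it is not how KumoRFM samples its context internally.

```python
# Toy data: number of orders each user placed in the past 30 days (made up).
recent_orders = {1: 0, 2: 4, 3: 0, 4: 2, 5: 7}


def select_context(requested_ids, recent_orders):
    """Return (context_ids, prediction_ids) under a 'recently active' filter.

    Illustration only: mimics the documented behaviour, not KumoRFM's
    actual context sampling.
    """
    # The filter COUNT(orders.*, -30, 0, days) > 0 narrows the context pool.
    context_ids = sorted(u for u, n in recent_orders.items() if n > 0)
    # The requested entities are passed through unchanged, filter or not.
    prediction_ids = list(requested_ids)
    return context_ids, prediction_ids


context_ids, prediction_ids = select_context([1], recent_orders)
print(context_ids)     # [2, 4, 5]
print(prediction_ids)  # [1]
```

User 1 has no recent orders and so would not serve as a context example in this sketch, yet a prediction is still produced for it because the filter does not touch the entity list.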
Evaluation mode
Besides making predictions, KumoRFM also provides an evaluation mode that automatically evaluates the query on a sample of predictions.
# Query the model
query = "EVALUATE PREDICT COUNT(orders.*, 0, 30, days) FOR users.user_id=1"
metrics = rfm.evaluate(query)
print(metrics) # MAE MSE RMSE
# ... ... ...
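The output above lists MAE, MSE and RMSE. As a reminder of what these names mean, the sketch below computes them from their standard definitions on made-up values. `regression_metrics` is a hypothetical helper; it shows the textbook formulas only and makes no claim about how KumoRFM computes or samples its evaluation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE and RMSE from their standard definitions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)   # mean squared error
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Made-up order counts: observed values vs. predicted values.
metrics = regression_metrics([3, 0, 5, 2], [2, 1, 5, 4])
print(metrics)
```

RMSE is in the same unit as the target (here, number of orders) and penalizes large errors more heavily than MAE, which is why the two are often reported together.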
Unsupported features
Due to the experimental nature of KumoRFM, some features are not yet fully supported and will be added soon.
Only numerical and categorical columns are valid target columns, except for the LIST_DISTINCT() aggregation, where only foreign key targets are supported.
The ASSUMING clause is not permitted.
Filtering by column value (e.g., WHERE user.age > 21) is only supported for columns within the same table. The same goes for predicting a single non-aggregated value, e.g., PREDICT user.age.
LIST_DISTINCT() without a time interval is not supported.
LAST() and FIRST() aggregations are not supported.
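Some of the restrictions above can be caught with a simple text check before a query is sent. The sketch below is a hypothetical pre-check, not part of the KumoRFM SDK: it uses regular expressions to flag the ASSUMING clause and the LAST() and FIRST() aggregations, and it does not cover the target-type or same-table rules, which require knowledge of the schema.

```python
import re

# Patterns for constructs listed as unsupported (illustrative, not exhaustive).
UNSUPPORTED = {
    "ASSUMING clause": re.compile(r"\bASSUMING\b", re.IGNORECASE),
    "LAST() aggregation": re.compile(r"\bLAST\s*\(", re.IGNORECASE),
    "FIRST() aggregation": re.compile(r"\bFIRST\s*\(", re.IGNORECASE),
}


def find_unsupported(query):
    """Return the names of unsupported constructs found in the query text."""
    return [name for name, pattern in UNSUPPORTED.items() if pattern.search(query)]


ok = find_unsupported(
    "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1"
)
bad = find_unsupported(
    "PREDICT LAST(orders.price, 0, 30, days) FOR users.user_id=1 "
    "ASSUMING COUNT(orders.*, 0, 7, days) > 0"
)
print(ok)   # []
print(bad)  # ['ASSUMING clause', 'LAST() aggregation']
```

A text-level check like this is only a convenience; the service remains the authority on which queries are accepted.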