Supported Prediction Types

KumoRFM supports a variety of prediction types, organized into temporal tasks (which involve a future time horizon) and static tasks (which infer attributes without a time component).

Note

Placeholder: Task taxonomy diagram showing temporal vs static tasks

Temporal Tasks (Forecast)

All temporal tasks predict future outcomes over a defined time horizon using historical data in relational tables. Every temporal prediction is defined by:

  • Target: What is being predicted (an aggregation expression)

  • Entity: Who the prediction is for (a table primary key)

  • Horizon: When the prediction applies (a future time window)

The general PQL pattern for temporal tasks is:

PREDICT <aggregation>(table.column, <start>, <end>, <unit>) FOR entity_table.pk=value

where <start> and <end> define the future time window relative to “now”, and <unit> is the time granularity (e.g., days, hours, minutes).
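As a minimal sketch of how the three parts (target, entity, horizon) map onto the pattern above, the query string can be assembled programmatically. The helper `build_temporal_query` and its parameter names are hypothetical and not part of KumoRFM; the check that the window is non-empty is likewise an assumption made for illustration.

```python
def build_temporal_query(aggregation, target, start, end, unit,
                         entity_pk, entity_value):
    """Assemble a temporal PQL string following the general pattern.

    Hypothetical helper for illustration only; not a KumoRFM API.
    """
    # Assumption: a valid horizon is a non-empty window, so end > start.
    if end <= start:
        raise ValueError("end must be greater than start")
    return (
        f"PREDICT {aggregation}({target}, {start}, {end}, {unit}) "
        f"FOR {entity_pk}={entity_value}"
    )


# Target: SUM(orders.price); Horizon: next 30 days; Entity: items.item_id=42
query = build_temporal_query(
    "SUM", "orders.price", 0, 30, "days", "items.item_id", 42
)
print(query)
```

Keeping the horizon as explicit arguments makes it easy to generate the same query for several window lengths without editing strings by hand.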

Forecast: Regression

Predict a continuous numeric value for an entity over a future time horizon.

Use case: Demand forecasting, revenue prediction, quantity estimation.

Supported aggregations: SUM, AVG, COUNT, MAX, MIN

-- Predict total revenue for item_id=42 in the next 30 days
PREDICT SUM(orders.price, 0, 30, days)
FOR items.item_id=42

result = model.predict(
    "PREDICT SUM(orders.price, 0, 30, days) FOR items.item_id=42"
)

Output: Numeric value per entity.

Metrics: MAE, RMSE, MAPE, R².
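For readers who want to see what these regression metrics measure, the following is a sketch using their standard textbook definitions. It is not KumoRFM's evaluation code; the function name `regression_metrics` and the choice to skip zero-valued targets in MAPE are assumptions made for this example.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, MAPE and R² with their standard definitions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Assumption: MAPE is undefined for a true value of zero, so this
    # sketch leaves those rows out of the average.
    pct = [abs(e / t) for e, t in zip(errors, y_true) if t != 0]
    mape = sum(pct) / len(pct) if pct else float("nan")
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "mape": mape, "r2": r2}


# Hypothetical true and predicted 30-day revenue for three items.
metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])
print(metrics)
```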

Forecast: Binary Classification

Predict whether an entity will or will not experience an event within a future time window. This is defined by applying a boolean condition to an aggregation expression.

Use case: Customer churn prediction, event occurrence prediction.

Supported aggregations: SUM, AVG, COUNT, MAX, MIN (with a boolean condition such as = 0, > 100)

-- Predict whether user_id=42 will make zero orders in the next 90 days (churn)
PREDICT COUNT(orders.*, 0, 90, days) = 0
FOR users.user_id=42

result = model.predict(
    "PREDICT COUNT(orders.*, 0, 90, days) = 0 FOR users.user_id=42"
)

The boolean condition (= 0, > 100, etc.) on the aggregation makes this a binary classification task.
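To make the meaning of the condition concrete, here is a sketch of how the label COUNT(orders.*, 0, 90, days) = 0 would be evaluated against known order timestamps. It illustrates the semantics of the target only and is not how KumoRFM computes predictions; the function `churn_label` and the choice of a half-open window (exclusive start, inclusive end) are assumptions.

```python
from datetime import datetime, timedelta


def churn_label(order_times, anchor_time, start_days=0, end_days=90):
    """Return True when no order falls inside the future window.

    Assumption: the window is (anchor + start, anchor + end].
    """
    window_start = anchor_time + timedelta(days=start_days)
    window_end = anchor_time + timedelta(days=end_days)
    count = sum(1 for t in order_times if window_start < t <= window_end)
    # The boolean condition on the aggregation yields the binary label.
    return count == 0


anchor = datetime(2024, 1, 1)  # plays the role of "now"
active_user = [datetime(2023, 12, 20), datetime(2024, 2, 15)]
lapsed_user = [datetime(2023, 11, 5), datetime(2023, 12, 20)]

print(churn_label(active_user, anchor))  # False: one order in the window
print(churn_label(lapsed_user, anchor))  # True: no orders in the window
```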

Output: Boolean (True/False) and probability per entity.

Metrics: AUC-ROC, Log Loss, Precision, Recall, F1-Score.

Forecast: Multi-Class Classification

Predict which class or state an entity will belong to at a future point in time. Use FIRST() to predict the first value that will occur in the window, or LAST() to predict the final value.
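The difference between FIRST() and LAST() can be sketched on a small event history. This shows the semantics of the two aggregations only, not KumoRFM internals; the function `first_and_last`, the (timestamp, value) event layout, and the window boundaries are assumptions.

```python
from datetime import datetime, timedelta


def first_and_last(events, anchor_time, start_days, end_days):
    """Return the (first, last) values observed inside the window.

    `events` is a list of (timestamp, value) pairs. Returns (None, None)
    when the window is empty. Assumed window: (anchor + start, anchor + end].
    """
    window_start = anchor_time + timedelta(days=start_days)
    window_end = anchor_time + timedelta(days=end_days)
    in_window = sorted(
        (e for e in events if window_start < e[0] <= window_end),
        key=lambda e: e[0],
    )
    if not in_window:
        return None, None
    return in_window[0][1], in_window[-1][1]


anchor = datetime(2024, 1, 1)
tier_events = [
    (datetime(2023, 12, 1), "free"),     # before the window
    (datetime(2024, 1, 5), "basic"),     # first value in the window
    (datetime(2024, 1, 20), "premium"),  # last value in the window
    (datetime(2024, 3, 1), "basic"),     # after the window
]
first, last = first_and_last(tier_events, anchor, 0, 30)
print(first, last)  # basic premium
```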

Use case: Tier migration, lifecycle stage prediction, feature engagement.

Supported aggregations: FIRST, LAST

-- Predict the first subscription tier recorded for user_id=42 in the next 30 days
PREDICT FIRST(subscriptions.tier, 0, 30, days)
FOR users.user_id=42

result = model.predict(
    "PREDICT FIRST(subscriptions.tier, 0, 30, days) FOR users.user_id=42"
)

Output: Class label and class probabilities per entity.

Metrics: Accuracy, Macro/Weighted F1, Log Loss.

Recommendations

Predict a ranked list of items an entity is most likely to interact with over a future time window. Use LIST_DISTINCT() with RANK TOP N to get the top N recommended items.

Use case: Product recommendations, content ranking, next best action.

Supported aggregations: LIST_DISTINCT with RANK TOP N

-- Predict the top 10 items user_id=42 is most likely to order in the next 30 days
PREDICT LIST_DISTINCT(orders.item_id, 0, 30, days) RANK TOP 10
FOR users.user_id=42

result = model.predict(
    "PREDICT LIST_DISTINCT(orders.item_id, 0, 30, days) RANK TOP 10 "
    "FOR users.user_id=42"
)

Output: Ranked list of item IDs per entity.

Metrics: Recall@K, Precision@K, NDCG@K.
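The ranking metrics listed above can be sketched with their common binary-relevance definitions. The exact variants KumoRFM reports may differ; the function `ranking_metrics` and the sample item IDs are hypothetical.

```python
import math


def ranking_metrics(recommended, relevant, k):
    """Precision@K, Recall@K and NDCG@K with binary relevance."""
    top_k = recommended[:k]
    relevant = set(relevant)
    hits = [1 if item in relevant else 0 for item in top_k]
    precision = sum(hits) / k
    recall = sum(hits) / len(relevant) if relevant else 0.0
    # DCG discounts each hit by the log of its 1-based rank plus one.
    dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
    # Ideal DCG: all relevant items (at most k) ranked first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1 / math.log2(i + 2) for i in range(ideal_hits))
    ndcg = dcg / idcg if idcg else 0.0
    return {"precision@k": precision, "recall@k": recall, "ndcg@k": ndcg}


recommended = [101, 205, 333, 412, 518]  # ranked list for one user
actually_ordered = {205, 412, 999}       # items the user did order
scores = ranking_metrics(recommended, actually_ordered, k=5)
print(scores)
```

In this example two of the three ordered items appear in the top 5, so Recall@5 is 2/3 and Precision@5 is 2/5.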

Multi-Horizon Regression (Forecasting)

Predict a numeric value for an entity across multiple future time steps. This produces a time series of predictions.

Use case: Multi-step demand forecasting, time series prediction.

Supported aggregations: SUM, AVG, COUNT, MAX, MIN

-- Predict weekly revenue for item_id=42 over the next 60 weeks
PREDICT SUM(orders.price, 0, 7, days) FORECAST 60 TIMEFRAMES
FOR items.item_id=42

result = model.predict(
    "PREDICT SUM(orders.price, 0, 7, days) FORECAST 60 TIMEFRAMES "
    "FOR items.item_id=42"
)

Note

Placeholder: Multi-horizon forecasting diagram showing multiple future prediction steps

The FORECAST N TIMEFRAMES clause tells KumoRFM to produce N predictions, one per consecutive window of the length specified in the aggregation (7 days in this example). FORECAST 60 TIMEFRAMES with a 7-day window therefore covers 60 × 7 = 420 days in total.
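The window arithmetic can be checked with a short sketch. It assumes the timeframes are consecutive, non-overlapping windows of equal length, which is one reading of the description above; the helper `forecast_windows` is hypothetical.

```python
def forecast_windows(start, end, num_timeframes):
    """List the (start, end) offsets covered by each forecast step.

    Assumption: each step shifts the base window by its own length.
    """
    step = end - start
    return [(start + i * step, end + i * step) for i in range(num_timeframes)]


# PREDICT SUM(orders.price, 0, 7, days) FORECAST 60 TIMEFRAMES
windows = forecast_windows(0, 7, 60)
print(windows[:3])     # [(0, 7), (7, 14), (14, 21)]
print(windows[-1])     # (413, 420)
print(windows[-1][1])  # 420 days covered in total
```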

Output: Time-indexed numeric values (one per horizon).

Metrics: MAE@h, RMSE@h, MAPE@h, Mean MAE, Mean RMSE.

Static Tasks

Static tasks infer latent or unknown entity attributes without modeling temporal evolution. There is no time horizon: the prediction concerns the current state of the entity, based on its attributes and relational context.

Every static prediction is defined by a Target and an Entity only; there is no Horizon.

The general PQL pattern for static tasks is:

PREDICT table.column FOR entity_table.pk=value
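Because the static pattern has the same shape for every task, the task type follows from the type of the target column (numeric, boolean, or categorical). The sketch below mimics that mapping on raw Python values. It is illustrative only: KumoRFM derives types from its own column metadata, and the function `infer_static_task` is hypothetical.

```python
def infer_static_task(values):
    """Guess the static task type from sample values of the target column."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        raise ValueError("need at least one non-null value")
    # bool is checked first because bool is a subclass of int in Python.
    if all(isinstance(v, bool) for v in non_null):
        return "binary_classification"
    if all(isinstance(v, (int, float)) for v in non_null):
        return "regression"
    # Anything else is treated as a categorical target in this sketch.
    return "multiclass_classification"


print(infer_static_task([34, 27, None, 51]))             # regression
print(infer_static_task([True, False, False]))           # binary_classification
print(infer_static_task(["gold", "silver", "bronze"]))   # multiclass_classification
```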

Static Regression

Infer a continuous numeric attribute of an entity.

Use case: Age estimation, price imputation, value scoring.

Supported target type: Numeric columns

-- Predict the age of user_id=42
PREDICT users.age
FOR users.user_id=42

result = model.predict("PREDICT users.age FOR users.user_id=42")

Output: Numeric value per entity.

Metrics: MAE, RMSE, MAPE, R².

Static Binary Classification

Infer whether an entity belongs to one of two classes based on its attributes.

Use case: Fraud detection, quality classification.

Supported target type: Boolean columns

-- Predict whether transaction_id=42 is fraudulent
PREDICT transactions.is_fraudulent
FOR transactions.transaction_id=42

result = model.predict(
    "PREDICT transactions.is_fraudulent FOR transactions.transaction_id=42"
)

Output: Boolean (True/False) and probability per entity.

Metrics: AUC-ROC, Log Loss, Precision, Recall, F1-Score.

Static Multi-Class Classification

Infer which single class an entity belongs to from a set of possible classes.

Use case: Customer segmentation, category prediction.

Supported target type: Categorical columns

-- Predict the customer segment for customer_id=42
PREDICT customers.segment
FOR customers.customer_id=42

result = model.predict(
    "PREDICT customers.segment FOR customers.customer_id=42"
)

Output: Class label and class probabilities per entity.

Metrics: Accuracy, Macro/Weighted F1, Log Loss.

Summary

| Task Type | PQL Pattern | Output | Category |
|---|---|---|---|
| Temporal Regression | PREDICT SUM(t.col, 0, N, days) FOR ... | Numeric | Temporal |
| Temporal Binary Classification | PREDICT COUNT(t.*, 0, N, days) = 0 FOR ... | Boolean + Probability | Temporal |
| Temporal Multi-Class Classification | PREDICT FIRST(t.col, 0, N, days) FOR ... | Class + Probabilities | Temporal |
| Recommendations | PREDICT LIST_DISTINCT(t.col, 0, N, days) RANK TOP K FOR ... | Ranked item list | Temporal |
| Multi-Horizon Forecasting | PREDICT SUM(t.col, 0, N, days) FORECAST K TIMEFRAMES FOR ... | Time-indexed numerics | Temporal |
| Static Regression | PREDICT t.numeric_col FOR ... | Numeric | Static |
| Static Binary Classification | PREDICT t.bool_col FOR ... | Boolean + Probability | Static |
| Static Multi-Class Classification | PREDICT t.categorical_col FOR ... | Class + Probabilities | Static |
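As a rough illustration of how the PQL patterns in this summary distinguish the task types, the sketch below classifies a query string by its syntax. It is a heuristic written for this page, not KumoRFM's parser; the function `classify_query` is hypothetical, and the set of comparison operators it accepts beyond = and > is an assumption. Static queries share one pattern, so their exact type cannot be read from the string and depends on the target column.

```python
import re

AGGREGATION = r"\b(SUM|AVG|COUNT|MAX|MIN)\s*\([^)]*\)\s*(>=|<=|!=|=|>|<)?"


def classify_query(query):
    """Heuristically map a PQL string to a task type from the summary."""
    q = " ".join(query.split())  # normalize whitespace and line breaks
    if not q.upper().startswith("PREDICT "):
        raise ValueError("query must start with PREDICT")
    if re.search(r"\bFORECAST\s+\d+\s+TIMEFRAMES\b", q, re.I):
        return "multi-horizon forecasting"
    if re.search(r"\bLIST_DISTINCT\s*\(", q, re.I):
        return "recommendations"
    if re.search(r"\b(FIRST|LAST)\s*\(", q, re.I):
        return "temporal multi-class classification"
    match = re.search(AGGREGATION, q, re.I)
    if match:
        # A comparison right after the aggregation makes the target boolean.
        if match.group(2):
            return "temporal binary classification"
        return "temporal regression"
    return "static (type depends on the target column)"


print(classify_query("PREDICT COUNT(orders.*, 0, 90, days) = 0 FOR users.user_id=42"))
print(classify_query("PREDICT users.age FOR users.user_id=42"))
```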