Getting Started with KumoRFM

This guide walks you through setting up KumoRFM and making your first prediction.

Authentication

Before using KumoRFM, you need to authenticate. There are several ways to do this:

Option 1: API Key

import kumoai.experimental.rfm as rfm

rfm.init(api_key="YOUR_API_KEY")

Option 2: OAuth2 Browser Login

import kumoai.experimental.rfm as rfm

rfm.authenticate()  # Opens a browser window for login

Option 3: Google Colab

In Google Colab, rfm.authenticate() automatically detects the environment and provides a widget-based login flow.

Option 4: Environment Variables

Set the KUMO_API_KEY environment variable, and optionally RFM_API_URL, before running your script:

export KUMO_API_KEY="YOUR_API_KEY"

Then call rfm.init() without arguments:

import kumoai.experimental.rfm as rfm

rfm.init()
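
When credentials come from the environment, a missing variable is easy to overlook. The sketch below is a small pre-flight check that uses only the Python standard library. The helper name require_env is hypothetical and not part of kumoai, and this guide does not describe how rfm.init() itself reports a missing key.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    Hypothetical helper for illustration; not part of the kumoai package.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; "
            "export it before starting Python."
        )
    return value
```

Calling require_env("KUMO_API_KEY") just before rfm.init() makes a missing key fail early with a readable message.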

Option 5: Snowflake Native App

When running inside a Snowflake notebook with KumoRFM deployed as a Snowflake Native App:

import kumoai.experimental.rfm as rfm

rfm.init(snowflake_application="YOUR_APP_NAME")

End-to-End Example

Here is a complete example that uses local pandas DataFrames to predict whether a user will place an order in the next 30 days (the complement of a churn prediction):

import pandas as pd
import kumoai.experimental.rfm as rfm

# 1. Authenticate
rfm.init(api_key="YOUR_API_KEY")

# 2. Prepare your data as pandas DataFrames
df_users = pd.DataFrame({
    "user_id": [1, 2, 3],
    "signup_date": pd.to_datetime(["2023-01-01", "2023-02-15", "2023-03-20"]),
    "location": ["US", "UK", "US"],
})

df_orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "user_id": [1, 1, 2, 3],
    "price": [50.0, 30.0, 100.0, 75.0],
    "timestamp": pd.to_datetime([
        "2024-01-10", "2024-02-15", "2024-01-20", "2024-03-05"
    ]),
})

# 3. Create a Graph (automatically infers metadata and links)
graph = rfm.Graph.from_data({
    "users": df_users,
    "orders": df_orders,
})

# 4. Initialize KumoRFM
model = rfm.KumoRFM(graph)

# 5. Make a prediction
query = "PREDICT COUNT(orders.*, 0, 30, days) > 0 FOR users.user_id=1"
result = model.predict(query)
print(result)

The result is a pandas DataFrame containing the prediction for each entity in the query, here the user with user_id=1.
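
To make the target expression COUNT(orders.*, 0, 30, days) > 0 more concrete, the sketch below evaluates the same quantity on the historical orders from the example, using only the standard library. It is not how KumoRFM produces predictions; it only illustrates what is being predicted. The function name, the explicit anchor time, and the window boundaries (start exclusive, end inclusive) are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Order history copied from the df_orders example: (user_id, timestamp).
ORDERS = [
    (1, datetime(2024, 1, 10)),
    (1, datetime(2024, 2, 15)),
    (2, datetime(2024, 1, 20)),
    (3, datetime(2024, 3, 5)),
]


def has_order_in_window(user_id, anchor, start_days=0, end_days=30):
    """Historical analogue of COUNT(orders.*, start, end, days) > 0.

    Assumption: the window is (anchor + start, anchor + end], i.e. the
    start is exclusive and the end is inclusive.
    """
    lo = anchor + timedelta(days=start_days)
    hi = anchor + timedelta(days=end_days)
    count = sum(1 for uid, ts in ORDERS if uid == user_id and lo < ts <= hi)
    return count > 0


print(has_order_in_window(1, datetime(2024, 1, 1)))  # True: order on 2024-01-10
print(has_order_in_window(1, datetime(2024, 3, 1)))  # False: no order through 2024-03-31
```

A churn-style question is the negation of this value: a user counts as churned for the window when no order falls inside it.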

Using Other Data Sources

KumoRFM supports multiple data backends beyond pandas DataFrames:

# From a SQLite database:
graph = rfm.Graph.from_sqlite("my_database.db")

# From Snowflake:
graph = rfm.Graph.from_snowflake(schema="MY_SCHEMA")

# From a RelBench benchmark dataset:
graph = rfm.Graph.from_relbench("f1")

See Data Requirements for KumoRFM for full details on each data connector.
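
To try the SQLite path without an existing database, the sketch below writes the users and orders tables from the end-to-end example into a SQLite file using the standard library sqlite3 module. The helper name create_demo_db is hypothetical. Whether Graph.from_sqlite reads the primary key and foreign key declarations from the schema, and how it interprets timestamps stored as TEXT, are assumptions to check against the data requirements page.

```python
import sqlite3


def create_demo_db(path: str) -> None:
    """Write the users and orders example tables to a new SQLite file at `path`."""
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE users ("
                "user_id INTEGER PRIMARY KEY, signup_date TEXT, location TEXT)"
            )
            conn.execute(
                "CREATE TABLE orders ("
                "order_id INTEGER PRIMARY KEY, "
                "user_id INTEGER REFERENCES users(user_id), "
                "price REAL, timestamp TEXT)"
            )
            conn.executemany(
                "INSERT INTO users VALUES (?, ?, ?)",
                [
                    (1, "2023-01-01", "US"),
                    (2, "2023-02-15", "UK"),
                    (3, "2023-03-20", "US"),
                ],
            )
            conn.executemany(
                "INSERT INTO orders VALUES (?, ?, ?, ?)",
                [
                    (101, 1, 50.0, "2024-01-10"),
                    (102, 1, 30.0, "2024-02-15"),
                    (103, 2, 100.0, "2024-01-20"),
                    (104, 3, 75.0, "2024-03-05"),
                ],
            )
    finally:
        conn.close()
```

After create_demo_db("my_database.db") has run once, the file can be passed to rfm.Graph.from_sqlite as shown above. The helper expects a file that does not yet contain these tables, so a second run against the same path raises an error.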

Next Steps