SQLite Connector
KumoRFM can connect directly to SQLite databases, automatically inferring table metadata and relationships from the database schema.
Installation
The SQLite backend requires the ADBC SQLite driver:
pip install "kumoai[sqlite]"
Or install the driver directly:
pip install adbc_driver_sqlite
Quick Start
The simplest way to create a graph from a SQLite database:
import kumoai.experimental.rfm as rfm
graph = rfm.Graph.from_sqlite("my_database.db")
This will:
Connect to the SQLite database
Discover all tables automatically
Infer column metadata (data types, semantic types, primary keys, time columns)
Detect foreign key relationships from the database schema
Print a summary of the inferred metadata and links
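To get a feel for the schema information such a discovery step can draw on, the sketch below uses Python's standard sqlite3 module to list tables, declared column types, and primary keys. It is only an illustration and not KumoRFM's implementation; the helper summarize_schema and the sample tables are hypothetical.

```python
import sqlite3


def summarize_schema(conn: sqlite3.Connection) -> dict:
    """Map each user table to its declared column types and primary key."""
    summary = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # pk > 0 gives the column's position within the primary key.
        pk_rows = sorted((row for row in info if row[5] > 0), key=lambda r: r[5])
        summary[table] = {
            "columns": {row[1]: row[2] for row in info},
            "primary_key": [row[1] for row in pk_rows],
        }
    return summary


# Hypothetical sample schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE USERS (user_id INTEGER PRIMARY KEY, signup_date TEXT);
    CREATE TABLE ORDERS (
        order_id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES USERS(user_id),
        created_at TEXT
    );
    """
)
schema = summarize_schema(conn)
for table, details in schema.items():
    print(table, details)
```

Running a check like this before building the graph can help explain the printed summary, for example when a table has no declared primary key.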
Specifying Tables
You can control which tables to include and customize their configuration:
graph = rfm.Graph.from_sqlite("data.db", tables=[
    "USERS",                                        # Include by name
    dict(name="ORDERS", source_name="ORDERS_V2"),   # Rename source table
    dict(name="ITEMS", primary_key="ITEM_ID"),      # Override primary key
])
Table configuration options:
| Key | Description | Required |
|---|---|---|
| name | The table name used in PQL queries | Yes |
| source_name | The actual table name in the database (if different from name) | No |
| primary_key | Override the auto-detected primary key | No |
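Before passing a tables list, it can be useful to confirm that every referenced source table and primary key column actually exists. The following is a minimal sketch using only the standard sqlite3 module; check_table_config is a hypothetical helper, not part of KumoRFM, and it assumes the entry shapes shown above (a plain string, or a dict with name, source_name, and primary_key).

```python
import sqlite3


def check_table_config(conn: sqlite3.Connection, tables: list) -> list:
    """Return a list of problems found in a `tables` argument (empty if none)."""
    problems = []
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    for entry in tables:
        cfg = {"name": entry} if isinstance(entry, str) else dict(entry)
        # Fall back to the graph-level name when no source_name is given.
        source = cfg.get("source_name", cfg["name"])
        # Exact-case comparison, which is stricter than SQLite's own name lookup.
        if source not in existing:
            problems.append(f"{cfg['name']}: source table {source!r} not found")
            continue
        columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{source}")')}
        pk = cfg.get("primary_key")
        if pk is not None and pk not in columns:
            problems.append(f"{cfg['name']}: column {pk!r} not in {source!r}")
    return problems


# Hypothetical sample database matching the names used above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE USERS (USER_ID INTEGER PRIMARY KEY);
    CREATE TABLE ORDERS_V2 (ORDER_ID INTEGER PRIMARY KEY, USER_ID INTEGER);
    CREATE TABLE ITEMS (ITEM_ID INTEGER, LABEL TEXT);
    """
)
good = [
    "USERS",
    dict(name="ORDERS", source_name="ORDERS_V2"),
    dict(name="ITEMS", primary_key="ITEM_ID"),
]
bad = [dict(name="ITEMS", primary_key="SKU"), "PAYMENTS"]
print(check_table_config(conn, good))
print(check_table_config(conn, bad))
```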
Connection Options
You can pass a file path or an existing ADBC connection:
# From a file path (string or Path):
graph = rfm.Graph.from_sqlite("path/to/database.db")
# From an existing ADBC connection:
from kumoai.experimental.rfm.backend.sqlite import connect
conn = connect("path/to/database.db")
graph = rfm.Graph.from_sqlite(conn)
Controlling Metadata Inference
By default, Graph.from_sqlite() infers metadata and links. You can
disable this for manual configuration:
graph = rfm.Graph.from_sqlite(
    "data.db",
    infer_metadata=False,  # Skip automatic type inference
    verbose=False,         # Suppress output
)
# Run inference explicitly afterwards, once you are ready:
graph.infer_metadata()
graph.infer_links()
Manual Edge Specification
Override automatic link detection by providing edges explicitly:
graph = rfm.Graph.from_sqlite("data.db", edges=[
    ("ORDERS", "user_id", "USERS"),
    ("ORDERS", "item_id", "ITEMS"),
])
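As a starting point for a hand-written edge list, the foreign keys declared in the schema can be read with the standard sqlite3 module and then edited. The sketch below assumes each edge tuple is ordered as (source table, foreign key column, destination table), which is inferred from the example above; declared_edges is a hypothetical helper, not a KumoRFM function.

```python
import sqlite3


def declared_edges(conn: sqlite3.Connection) -> list:
    """Collect (table, fk_column, referenced_table) for every declared foreign key."""
    edges = []
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    for table in tables:
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            edges.append((table, row[3], row[2]))
    # Sort for a stable result, independent of SQLite's reporting order.
    return sorted(edges)


# Hypothetical sample schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE USERS (user_id INTEGER PRIMARY KEY);
    CREATE TABLE ITEMS (item_id INTEGER PRIMARY KEY);
    CREATE TABLE ORDERS (
        order_id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES USERS(user_id),
        item_id INTEGER REFERENCES ITEMS(item_id)
    );
    """
)
edges = declared_edges(conn)
print(edges)
```

The resulting list has the same shape as the edges argument shown above, so entries can be removed or added by hand before it is passed to Graph.from_sqlite().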
Optimizing Performance
When initializing KumoRFM with a SQLite-backed graph, the
optimize=True flag creates database indices for faster context sampling:
model = rfm.KumoRFM(graph, optimize=True)
This is recommended for larger databases where sampling performance matters. Because creating indices modifies the database, it requires write access to the database file.
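Since index creation needs write access, a quick best-effort check before enabling the flag can avoid a late failure. This is a minimal sketch using only the standard library; is_writable_database and the file names are hypothetical, and the check does not cover every case (for example, a file locked by another process).

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def is_writable_database(path) -> bool:
    """Best-effort check that a SQLite file and its directory are writable."""
    path = Path(path)
    # SQLite also writes journal files next to the database, so the
    # containing directory has to be writable as well.
    return (
        path.is_file()
        and os.access(path, os.W_OK)
        and os.access(path.parent, os.W_OK)
    )


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "example.db"  # hypothetical database file
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE USERS (user_id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()

    existing_ok = is_writable_database(db_path)
    missing_ok = is_writable_database(Path(tmp) / "missing.db")

print(existing_ok, missing_ok)
```

If the check fails, one option is to skip optimize=True or to work on a writable copy of the file.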