kumoai.experimental.rfm.LocalGraph#

class kumoai.experimental.rfm.LocalGraph[source]#

Bases: object

A graph of LocalTable objects, akin to relationships between tables in a relational database.

Creating a graph is the final step of data definition; after a LocalGraph is created, you can use it to initialize the Kumo Relational Foundation Model (KumoRFM).

import pandas as pd

import kumoai.experimental.rfm as rfm

# dataframes
df1 = pd.DataFrame(...)
df2 = pd.DataFrame(...)
df3 = pd.DataFrame(...)

# define tables
table1 = rfm.LocalTable(name="table1", data=df1)
table2 = rfm.LocalTable(name="table2", data=df2)
table3 = rfm.LocalTable(name="table3", data=df3)

# create a graph from the tables
graph = rfm.LocalGraph(
    tables={
        "table1": table1,
        "table2": table2,
        "table3": table3,
    },
    edges=[],
)

# infer links
graph.infer_links()

# remove the edge between two tables (tables are referenced by name)
graph.unlink(src_table="table1", fkey="id1", dst_table="table2")

# infer metadata
graph.infer_metadata()

# validate graph
graph.validate()

# alternatively, construct a graph directly from dataframes
graph = rfm.LocalGraph.from_data(data={
    "table1": df1,
    "table2": df2,
    "table3": df3,
})

# remove the edge between two tables
graph.unlink(src_table="table1", fkey="id1", dst_table="table2")

# validate graph
graph.validate()

# re-link tables
graph.link(src_table="table1", fkey="id1", dst_table="table2")

__init__(tables, edges=None)[source]#

static from_data(data, edges=None)[source]#

Creates a LocalGraph from a dictionary of pandas.DataFrame objects.

Parameters:
  • data (Dict[str, DataFrame]) – A dictionary of data frames, where the keys are the names of the tables and the values hold table data.

  • edges (Optional[List[Edge]]) – An optional list of Edge objects to add to the graph. If not provided, edges will be automatically inferred from the data.

Return type:

LocalGraph

Note

This method will automatically infer metadata and links for the graph.

Example

>>> import pandas as pd
>>> import kumoai.experimental.rfm as rfm
>>> df1 = pd.DataFrame(...)
>>> df2 = pd.DataFrame(...)
>>> df3 = pd.DataFrame(...)
>>> graph = rfm.LocalGraph.from_data(data={
...     "table1": df1,
...     "table2": df2,
...     "table3": df3,
... })
>>> graph.validate()

has_table(name)[source]#

Returns True if this graph has a table with name name; False otherwise.

Return type:

bool

table(name)[source]#

Returns the table with name name in this graph.

Raises:

KeyError – If name is not present in this graph.

Return type:

LocalTable
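
Because table() raises KeyError for unknown names, a lookup can be guarded with has_table() first. The helper below is a minimal sketch of that pattern. `table_or_default` and the `_StubGraph` stand-in are hypothetical names, introduced only so the snippet runs without the library; the helper itself relies solely on the two methods documented above.

```python
from typing import Any, Dict, Optional


def table_or_default(graph: Any, name: str, default: Optional[Any] = None) -> Any:
    """Return graph.table(name) if the table exists, otherwise `default`."""
    if graph.has_table(name):
        return graph.table(name)
    return default


class _StubGraph:
    """Hypothetical stand-in exposing only has_table() and table()."""

    def __init__(self, tables: Dict[str, Any]) -> None:
        self._tables = dict(tables)

    def has_table(self, name: str) -> bool:
        return name in self._tables

    def table(self, name: str) -> Any:
        # A plain dict lookup raises KeyError for unknown names.
        return self._tables[name]


stub = _StubGraph({"users": "users-table"})
found = table_or_default(stub, "users")
missing = table_or_default(stub, "orders")
```

Checking first avoids a try/except block when a missing table is an expected case rather than an error.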

property tables: Dict[str, LocalTable]#

Returns the dictionary of table objects.

infer_metadata()[source]#

Infers metadata for the tables in this LocalGraph by inferring the metadata of each LocalTable in the graph.

Return type:

LocalGraph

Note

For more information, please see kumoai.experimental.rfm.LocalTable.infer_metadata().

property edges: List[Edge]#

Returns the edges of this graph.

link(*args, **kwargs)[source]#

Links two tables (src_table and dst_table) from the foreign key fkey in the source table to the primary key in the destination table.

These edges are treated as bidirectional.

Parameters:
  • *args (Union[str, Edge, None]) – Any arguments to construct an Edge, or an Edge object itself.

  • **kwargs (str) – Any keyword arguments to construct an Edge.

Raises:

ValueError – If the edge is already present in the graph, if the source or destination table does not exist in the graph, if the source key does not exist in the source table, or if the primary key of the source table is used as the foreign key.

Return type:

LocalGraph
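
The error conditions listed under Raises can be sketched in plain Python. This is an illustrative stand-in, not the library's implementation: `ToyEdge`, `ToyTable`, and `ToyGraph` are hypothetical classes, and returning `self` from `link` is an assumption based only on the documented return type.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ToyEdge:
    """Hypothetical edge: fkey in src_table points at the pkey of dst_table."""
    src_table: str
    fkey: str
    dst_table: str


@dataclass
class ToyTable:
    """Hypothetical table: column names plus an optional primary key."""
    columns: List[str]
    primary_key: Optional[str] = None


@dataclass
class ToyGraph:
    tables: Dict[str, ToyTable]
    edges: List[ToyEdge] = field(default_factory=list)

    def link(self, src_table: str, fkey: str, dst_table: str) -> "ToyGraph":
        edge = ToyEdge(src_table, fkey, dst_table)
        if edge in self.edges:
            raise ValueError(f"{edge} is already present in the graph")
        if src_table not in self.tables:
            raise ValueError(f"source table '{src_table}' does not exist")
        if dst_table not in self.tables:
            raise ValueError(f"destination table '{dst_table}' does not exist")
        src = self.tables[src_table]
        if fkey not in src.columns:
            raise ValueError(f"column '{fkey}' does not exist in '{src_table}'")
        if fkey == src.primary_key:
            raise ValueError(f"primary key '{fkey}' cannot act as a foreign key")
        self.edges.append(edge)
        return self  # assumption: the graph is returned to allow chaining


toy = ToyGraph({
    "users": ToyTable(columns=["user_id", "age"], primary_key="user_id"),
    "orders": ToyTable(columns=["order_id", "user_id"], primary_key="order_id"),
})
toy.link("orders", "user_id", "users")
```

In this sketch every check runs before the edge list is modified, so a failed call leaves the graph unchanged.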

unlink(*args, **kwargs)[source]#

Removes an Edge from the graph.

Parameters:
  • *args (Union[str, Edge, None]) – Any arguments to construct an Edge, or an Edge object itself.

  • **kwargs (str) – Any keyword arguments to construct an Edge.

Raises:

ValueError – If the edge is not present in the graph.

Return type:

LocalGraph

infer_links()[source]#

Infers links for the tables and adds them as edges to the graph.

Return type:

LocalGraph

Note

This method expects the graph to have no edges defined beforehand.

Raises:

ValueError – If edges are not empty.
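
The precondition above (no pre-existing edges) can be illustrated with a small sketch. The source does not describe how links are actually inferred, so the matching rule here is purely an assumption: a column is treated as a foreign key when its name equals the primary key of another table. `infer_links_sketch` and the dict-based inputs are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical edge representation: (src_table, fkey, dst_table).
EdgeTuple = Tuple[str, str, str]


def infer_links_sketch(
    columns: Dict[str, List[str]],
    primary_keys: Dict[str, str],
    edges: List[EdgeTuple],
) -> List[EdgeTuple]:
    """Infer edges by name matching; refuses to run if edges already exist."""
    if edges:
        raise ValueError("edges must be empty before links are inferred")
    inferred: List[EdgeTuple] = []
    for src, cols in columns.items():
        for col in cols:
            # A table's own primary key is never used as a foreign key.
            if col == primary_keys.get(src):
                continue
            for dst, pkey in primary_keys.items():
                if dst != src and col == pkey:
                    inferred.append((src, col, dst))
    return inferred


columns = {
    "users": ["user_id", "age"],
    "orders": ["order_id", "user_id", "amount"],
}
primary_keys = {"users": "user_id", "orders": "order_id"}
inferred_edges = infer_links_sketch(columns, primary_keys, edges=[])
```

Requiring an empty edge list keeps inferred links from silently mixing with manually defined ones; to re-run inference on a graph that already has edges, remove them first with unlink().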

validate()[source]#

Validates the graph to ensure that all relevant metadata is specified for its tables and edges.

Concretely, validation ensures that all tables are valid (see kumoai.experimental.rfm.LocalTable.validate() for more information), and that edges properly link primary keys and foreign keys between valid tables. It additionally ensures that primary and foreign keys between tables in an Edge are of the same data type.

Return type:

LocalGraph

Example

>>> import kumoai.experimental.rfm as rfm
>>> graph = rfm.LocalGraph(...)
>>> graph.validate()
ValueError: ...

Raises:

ValueError – If validation fails.
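
The data type check described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's validation logic: `check_key_dtypes` is a hypothetical helper, data types are represented as plain strings, and edges as (src_table, fkey, dst_table) tuples.

```python
from typing import Dict, List, Tuple


def check_key_dtypes(
    dtypes: Dict[str, Dict[str, str]],
    primary_keys: Dict[str, str],
    edges: List[Tuple[str, str, str]],
) -> None:
    """Raise ValueError unless every fkey matches the dtype of its target pkey."""
    for src, fkey, dst in edges:
        if src not in dtypes or dst not in dtypes:
            raise ValueError(f"edge ({src}, {fkey}, {dst}) references an unknown table")
        pkey = primary_keys.get(dst)
        if pkey is None:
            raise ValueError(f"destination table '{dst}' has no primary key")
        if fkey not in dtypes[src]:
            raise ValueError(f"column '{fkey}' does not exist in '{src}'")
        if dtypes[src][fkey] != dtypes[dst][pkey]:
            raise ValueError(
                f"'{src}.{fkey}' ({dtypes[src][fkey]}) and "
                f"'{dst}.{pkey}' ({dtypes[dst][pkey]}) have different data types"
            )


dtypes = {
    "users": {"user_id": "int64", "age": "int64"},
    "orders": {"order_id": "int64", "user_id": "int64"},
}
primary_keys = {"users": "user_id", "orders": "order_id"}
check_key_dtypes(dtypes, primary_keys, [("orders", "user_id", "users")])  # passes
```

A frequent cause of a mismatch is a key stored as an integer in one table and as a string in the other; casting one side before building the graph resolves it.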

to_kumo_graph()[source]#

Uploads the table data and converts this LocalGraph to a kumo.graph.Graph object, handling both steps in a single call.

Return type:

Graph

Returns:

A kumo Graph object ready for use.

Raises:

ValueError – If the total size of all tables exceeds 10 GB.
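
To avoid a failed upload, the combined table size can be checked ahead of time. The sketch below is a hypothetical pre-flight check, not part of the library: `check_total_size` is an invented helper, and both the way sizes are measured and the reading of the limit as 10 GiB are assumptions, since the source does not specify either.

```python
from typing import Dict

# Assumption: the documented limit is interpreted as 10 GiB.
MAX_TOTAL_BYTES = 10 * 1024**3


def check_total_size(table_sizes: Dict[str, int], limit: int = MAX_TOTAL_BYTES) -> int:
    """Return the combined size in bytes, or raise ValueError if it exceeds `limit`."""
    total = sum(table_sizes.values())
    if total > limit:
        raise ValueError(
            f"total table size is {total} bytes, which exceeds the limit of {limit} bytes"
        )
    return total


# Sizes in bytes per table (made-up numbers for illustration).
sizes = {"table1": 2 * 1024**3, "table2": 3 * 1024**3, "table3": 512 * 1024**2}
total_bytes = check_total_size(sizes)
```

For pandas data frames, one way to obtain the per-table byte counts is `df.memory_usage(deep=True).sum()`, although in-memory size may differ from the size the library measures.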