kumoai.experimental.rfm.LocalGraph
- class kumoai.experimental.rfm.LocalGraph
Bases: object
A graph of LocalTable objects, akin to relationships between tables in a relational database.

Creating a graph is the final step of data definition; after a LocalGraph is created, you can use it to initialize the Kumo Relational Foundation Model (KumoRFM).

    import pandas as pd
    import kumoai.experimental.rfm as rfm

    # dataframes
    df1 = pd.DataFrame(...)
    df2 = pd.DataFrame(...)
    df3 = pd.DataFrame(...)

    # define tables
    table1 = rfm.LocalTable(name="table1", data=df1)
    table2 = rfm.LocalTable(name="table2", data=df2)
    table3 = rfm.LocalTable(name="table3", data=df3)

    # create a graph from a list of tables
    graph = rfm.LocalGraph(
        tables={
            "table1": table1,
            "table2": table2,
            "table3": table3,
        },
        edges=[],
    )

    # infer links
    graph.infer_links()

    # remove edges between tables
    graph.unlink(table1, table2, fkey="id1")

    # infer metadata
    graph.infer_metadata()

    # validate graph
    graph.validate()

    # construct a graph from dataframes
    graph = rfm.LocalGraph.from_data(data={
        "table1": df1,
        "table2": df2,
        "table3": df3,
    })

    # remove edge between tables
    graph.unlink(table1, table2, fkey="id1")

    # validate graph
    graph.validate()

    # re-link tables
    graph.link(table1, table2, fkey="id1")
- static from_data(data, edges=None)
Creates a LocalGraph from a dictionary of pandas.DataFrame objects.
- Parameters:
- Return type:
Note
This method will automatically infer metadata and links for the graph.
Example
>>> import pandas as pd
>>> import kumoai.experimental.rfm as rfm
>>> df1 = pd.DataFrame(...)
>>> df2 = pd.DataFrame(...)
>>> df3 = pd.DataFrame(...)
>>> graph = rfm.LocalGraph.from_data(data={
...     "table1": df1,
...     "table2": df2,
...     "table3": df3,
... })
>>> graph.validate()
- has_table(name)
Returns True if this graph contains a table with the given name, and False otherwise.
- Return type:
- table(name)
Returns the table with the given name from this graph.
- Raises:
KeyError – If name is not present in this graph.
- Return type:
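The lookup contract of has_table and table can be illustrated without the library. The sketch below uses a hypothetical stand-in class, ToyGraph (not part of kumoai), that stores tables in a dictionary. It only mirrors the documented behavior: has_table returns a boolean, and table raises KeyError for an unknown name.

```python
class ToyGraph:
    """Hypothetical stand-in for illustration; not the kumoai implementation."""

    def __init__(self, tables):
        # Map table name -> table object (any Python object in this sketch).
        self._tables = dict(tables)

    def has_table(self, name):
        return name in self._tables

    def table(self, name):
        if not self.has_table(name):
            raise KeyError(f"Table '{name}' not found in graph")
        return self._tables[name]


def get_table_or_none(graph, name):
    # Guard the lookup so a missing table yields None instead of KeyError.
    return graph.table(name) if graph.has_table(name) else None


toy = ToyGraph({"users": object()})
```

Checking has_table first is one way to avoid handling KeyError at every call site; catching KeyError around table is an equally reasonable choice.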
- property tables: Dict[str, LocalTable]
Returns the dictionary of table objects.
- infer_metadata()
Infers metadata for the tables in this LocalGraph by inferring the metadata of each LocalTable in the graph.
- Return type:
LocalGraph
Note
For more information, please see kumoai.experimental.rfm.LocalTable.infer_metadata().
- link(*args, **kwargs)
Links two tables (src_table and dst_table) from the foreign key fkey in the source table to the primary key in the destination table.
These edges are treated as bidirectional.
- Parameters:
- Raises:
ValueError – if the edge is already present in the graph, if the source table does not exist in the graph, if the destination table does not exist in the graph, if the source key does not exist in the source table, or if the primary key of the source table is being treated as a foreign key.
- Return type:
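The error conditions listed for link can be sketched as a sequence of checks. The function add_link and the dictionary layout of tables below are hypothetical and chosen only for illustration; this is not the library's implementation, and the real method operates on LocalTable objects.

```python
def add_link(tables, edges, src_table, fkey, dst_table):
    """Append (src_table, fkey, dst_table) to `edges` after the documented checks.

    `tables` maps a table name to {"columns": [...], "primary_key": str or None}.
    This layout is an assumption made for the sketch.
    """
    edge = (src_table, fkey, dst_table)
    if edge in edges:
        raise ValueError(f"Edge {edge} is already present in the graph")
    if src_table not in tables:
        raise ValueError(f"Source table '{src_table}' does not exist in the graph")
    if dst_table not in tables:
        raise ValueError(f"Destination table '{dst_table}' does not exist in the graph")
    if fkey not in tables[src_table]["columns"]:
        raise ValueError(f"Column '{fkey}' does not exist in table '{src_table}'")
    if fkey == tables[src_table]["primary_key"]:
        raise ValueError(f"Primary key '{fkey}' of '{src_table}' cannot be a foreign key")
    edges.append(edge)


tables = {
    "users": {"columns": ["user_id", "age"], "primary_key": "user_id"},
    "orders": {"columns": ["order_id", "user_id"], "primary_key": "order_id"},
}
edges = []
add_link(tables, edges, "orders", "user_id", "users")
```

After the call, edges holds the single edge from orders.user_id to users; repeating the same call would raise ValueError because the edge already exists.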
- unlink(*args, **kwargs)
Removes an Edge from the graph.
- Parameters:
- Raises:
ValueError – if the edge is not present in the graph.
- Return type:
- infer_links()
Infers links for the tables and adds them as edges to the graph.
- Return type:
LocalGraph
Note
This method expects the graph to have no edges defined beforehand.
- Raises:
ValueError – If edges are not empty.
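The precondition above can be sketched as follows. The inference rule shown here (a non-primary-key column whose name equals another table's primary key) is purely an assumption for illustration; this page does not describe the heuristics the library actually uses. The function name and the dictionary layout are likewise hypothetical.

```python
def infer_links_sketch(tables, edges):
    """Return `edges` extended with inferred (src_table, fkey, dst_table) tuples."""
    # Documented precondition: edges must be empty before inference.
    if edges:
        raise ValueError("Cannot infer links: edges are not empty")
    for src, src_meta in tables.items():
        for dst, dst_meta in tables.items():
            pkey = dst_meta["primary_key"]
            if src == dst or pkey is None:
                continue
            # Assumed rule: match a source column to the destination primary
            # key by name, skipping the source table's own primary key.
            if pkey in src_meta["columns"] and pkey != src_meta["primary_key"]:
                edges.append((src, pkey, dst))
    return edges


tables = {
    "users": {"columns": ["user_id", "age"], "primary_key": "user_id"},
    "orders": {"columns": ["order_id", "user_id"], "primary_key": "order_id"},
}
inferred = infer_links_sketch(tables, [])
```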
- validate()
Validates the graph to ensure that all relevant metadata is specified for its tables and edges.
Concretely, validation ensures that all tables are valid (see kumoai.experimental.rfm.LocalTable.validate() for more information), and that edges properly link primary keys and foreign keys between valid tables. It additionally ensures that primary and foreign keys between tables in an Edge are of the same data type.
- Return type:
Example
>>> import kumoai.experimental.rfm as rfm
>>> graph = rfm.LocalGraph(...)
>>> graph.validate()
ValueError: ...
- Raises:
ValueError – if validation fails.
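The data-type requirement on edges can be sketched in isolation. The function validate_edge_dtypes and the dictionary layout below are hypothetical; the sketch only shows the kind of check described above (a foreign key and the primary key it points to must share a data type), not the library's actual validation code.

```python
def validate_edge_dtypes(tables, edges):
    """Raise ValueError if any foreign key's dtype differs from its primary key's.

    `tables` maps a table name to {"primary_key": str, "dtypes": {column: dtype}}.
    `edges` is a list of (src_table, fkey, dst_table) tuples.
    """
    mismatches = []
    for src, fkey, dst in edges:
        pkey = tables[dst]["primary_key"]
        fkey_dtype = tables[src]["dtypes"][fkey]
        pkey_dtype = tables[dst]["dtypes"][pkey]
        if fkey_dtype != pkey_dtype:
            mismatches.append(
                f"{src}.{fkey} ({fkey_dtype}) vs {dst}.{pkey} ({pkey_dtype})"
            )
    if mismatches:
        raise ValueError("Key data types differ: " + "; ".join(mismatches))


tables = {
    "users": {"primary_key": "user_id", "dtypes": {"user_id": "int64"}},
    "orders": {
        "primary_key": "order_id",
        "dtypes": {"order_id": "int64", "user_id": "int64"},
    },
}
edges = [("orders", "user_id", "users")]
validate_edge_dtypes(tables, edges)  # passes: both keys are int64
```

A common cause of such a mismatch is an identifier column read as a string in one table and as an integer in another; casting one side before building the graph resolves it.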
- to_kumo_graph()
Uploads the table data and converts this LocalGraph to a kumo.graph.Graph object.
- Return type:
- Returns:
A kumo Graph object ready for use.
- Raises:
ValueError – If the total size of all tables exceeds 10 GB.
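The size limit can be checked ahead of time with a small helper. This is a sketch under stated assumptions: the helper check_total_size is hypothetical, the per-table byte counts are supplied by the caller, and the limit is interpreted as 10 GiB (10 * 1024**3 bytes), which may differ from how the library measures it.

```python
# Assumption: "10 GB" is read as 10 GiB; the library may count differently.
MAX_TOTAL_BYTES = 10 * 1024**3


def check_total_size(table_sizes):
    """Return the summed size in bytes, or raise ValueError if over the limit.

    `table_sizes` maps a table name to its size in bytes.
    """
    total = sum(table_sizes.values())
    if total > MAX_TOTAL_BYTES:
        raise ValueError(
            f"Total table size {total} bytes exceeds limit of {MAX_TOTAL_BYTES} bytes"
        )
    return total


total = check_total_size({"users": 1024, "orders": 2048})
```

For pandas data, one way to obtain the byte counts is df.memory_usage(deep=True).sum() per table, although the in-memory size is not necessarily the size the upload measures.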