kumoai.experimental.rfm.LocalTable
- class kumoai.experimental.rfm.LocalTable
Bases: object
A table backed by a pandas.DataFrame.
A LocalTable fully specifies the relevant metadata, i.e. selected columns, column semantic types, primary keys and time columns. LocalTable is used to create a LocalGraph.

    import pandas as pd
    import kumoai.experimental.rfm as rfm

    # Load data from a CSV file:
    df = pd.read_csv("data.csv")

    # Create a table from a `pandas.DataFrame` and infer its metadata ...
    table = rfm.LocalTable(df, name="my_table").infer_metadata()

    # ... or create a table explicitly:
    table = rfm.LocalTable(
        df=df,
        name="my_table",
        primary_key="id",
        time_column="time",
        end_time_column=None,
    )

    # Verify metadata:
    table.print_metadata()

    # Change the semantic type of a column:
    table[column].stype = "text"

- Parameters:
df (DataFrame) – The data frame to create the table from.
name (str) – The name of the table.
primary_key (Optional[str]) – The name of the primary key of this table, if it exists.
time_column (Optional[str]) – The name of the time column of this table, if it exists.
end_time_column (Optional[str]) – The name of the end time column of this table, if it exists.
- has_column(name)
Returns True if this table holds a column with the given name; False otherwise.
- Return type:
bool
- property columns: List[Column]
Returns a list of Column objects that represent the columns in this table.
- has_primary_key()
Returns True if this table has a primary key; False otherwise.
- Return type:
bool
- property primary_key: Column | None
The primary key column of this table.
The getter returns the primary key column of this table, or None if no primary key is present.
The setter sets a column as the primary key of this table, and raises a ValueError if the column has a non-ID semantic type or if the column name does not match a column in the data frame.
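
The setter behaviour described above lends itself to a small guard when the key column is not known in advance. The following is a minimal sketch, not part of the library: pick_primary_key is a hypothetical helper, and it assumes that the primary_key setter accepts a column name given as a string (as the constructor argument does). It relies only on has_column() and on the documented ValueError.

```python
from typing import Any, Iterable, Optional


def pick_primary_key(table: Any, candidates: Iterable[str]) -> Optional[str]:
    """Assign the first usable candidate as primary key and return its name.

    `table` is expected to behave like a LocalTable. Assumption: the
    `primary_key` setter accepts a column name given as a string.
    """
    for name in candidates:
        # Skip names that are not columns of the table at all.
        if not table.has_column(name):
            continue
        try:
            table.primary_key = name
        except ValueError:
            # Documented failure mode, e.g. a non-ID semantic type.
            continue
        return name
    # No candidate was accepted; the table is left unchanged.
    return None
```

With a real table this might be called as pick_primary_key(table, ["id", "customer_id"]); the candidate names are placeholders.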
- has_time_column()
Returns True if this table has a time column; False otherwise.
- Return type:
bool
- property time_column: Column | None
The time column of this table.
The getter returns the time column of this table, or None if no time column is present.
The setter sets a column as the time column of this table, and raises a ValueError if the column has a non-timestamp semantic type or if the column name does not match a column in the data frame.
- has_end_time_column()
Returns True if this table has an end time column; False otherwise.
- Return type:
bool
- property end_time_column: Column | None
The end time column of this table.
The getter returns the end time column of this table, or None if no end time column is present.
The setter sets a column as the end time column of this table, and raises a ValueError if the column has a non-timestamp semantic type or if the column name does not match a column in the data frame.
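
The three has_*() checks can be combined into a quick sanity check before a table is used further. This is only an illustrative sketch: missing_roles is a hypothetical helper, not a library function, and it calls nothing beyond has_primary_key(), has_time_column() and has_end_time_column() as documented above.

```python
from typing import Any, List


def missing_roles(table: Any, require_end_time: bool = False) -> List[str]:
    """Return the names of the column roles that are not set on `table`.

    `table` is expected to behave like a LocalTable.
    """
    checks = [
        ("primary_key", table.has_primary_key()),
        ("time_column", table.has_time_column()),
    ]
    # The end time column is optional, so it is only checked on request.
    if require_end_time:
        checks.append(("end_time_column", table.has_end_time_column()))
    return [role for role, present in checks if not present]
```

An empty list means every requested role is assigned.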
- property metadata: DataFrame
Returns a pandas.DataFrame object containing metadata information about the columns in this table.
The returned dataframe has columns name, dtype, stype, is_primary_key, is_time_column and is_end_time_column, which provide an aggregate view of the properties of the columns of this table.
Example
    >>> import kumoai.experimental.rfm as rfm
    >>> table = rfm.LocalTable(df=..., name=...).infer_metadata()
    >>> table.metadata
             name    dtype  stype  is_primary_key  is_time_column  is_end_time_column
    0  CustomerID  float64     ID            True           False               False
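
Because metadata is an ordinary pandas.DataFrame, it can be queried with standard boolean indexing. The sketch below builds a stand-in frame by hand with the column layout listed above, so that it runs without a table; every row other than CustomerID, including the stype values shown for those rows, is invented for illustration.

```python
import pandas as pd

# Hand-built stand-in for `table.metadata`; only the CustomerID row comes
# from the documentation, the other two rows are hypothetical.
metadata = pd.DataFrame({
    "name": ["CustomerID", "SignupDate", "Notes"],
    "dtype": ["float64", "datetime64[ns]", "object"],
    "stype": ["ID", "timestamp", "text"],
    "is_primary_key": [True, False, False],
    "is_time_column": [False, True, False],
    "is_end_time_column": [False, False, False],
})

# Names of the columns that carry a special role.
primary_keys = metadata.loc[metadata["is_primary_key"], "name"].tolist()
time_columns = metadata.loc[metadata["is_time_column"], "name"].tolist()

# Columns with no special role at all.
role_flags = ["is_primary_key", "is_time_column", "is_end_time_column"]
plain_columns = metadata.loc[~metadata[role_flags].any(axis=1), "name"].tolist()
```

The same expressions should apply unchanged to the frame returned by a real table, provided its columns match the list above.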
- print_metadata()
Prints the metadata of the table.
- Return type:
None