kumoai.experimental.rfm.Table

class kumoai.experimental.rfm.Table

Bases: ABC

A Table fully specifies the relevant metadata of a single table, i.e. its selected columns, data types, semantic types, primary keys and time columns.

Parameters:
  • name (str) – The name of this table.

  • columns (Optional[Sequence[str]]) – The selected columns of this table.

  • primary_key (Optional[str]) – The name of the primary key of this table, if it exists.

  • time_column (Optional[str]) – The name of the time column of this table, if it exists.

  • end_time_column (Optional[str]) – The name of the end time column of this table, if it exists.

__init__(name, columns=None, primary_key=None, time_column=None, end_time_column=None)

property name: str

The name of this table.

has_column(name)

Returns True if this table holds a column with name name; False otherwise.

Return type:

bool

column(name)

Returns the column named name in this table.

Parameters:

name (str) – The name of the column.

Raises:

KeyError – If name is not present in this table.

Return type:

Column

property columns: List[Column]

Returns a list of Column objects that represent the columns in this table.

add_column(name)

Adds a column to this table.

Parameters:

name (str) – The name of the column.

Raises:

KeyError – If name is already present in this table.

Return type:

Column

remove_column(name)

Removes a column from this table.

Parameters:

name (str) – The name of the column.

Raises:

KeyError – If name is not present in this table.

Return type:

Self
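
The column methods documented above (has_column, column, add_column, remove_column) can be sketched with a small stand-in class. SketchTable and SketchColumn are hypothetical names invented for this illustration; they are not part of kumoai and only mirror the documented behaviour (KeyError on unknown or duplicate names, remove_column returning the table itself).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SketchColumn:
    """Hypothetical stand-in for a Column; it holds only a name."""
    name: str


class SketchTable:
    """Hypothetical stand-in that mirrors the documented column methods."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._columns: Dict[str, SketchColumn] = {}

    def has_column(self, name: str) -> bool:
        return name in self._columns

    def column(self, name: str) -> SketchColumn:
        # Documented contract: KeyError if the name is not present.
        if not self.has_column(name):
            raise KeyError(f"Column '{name}' not found in table '{self.name}'")
        return self._columns[name]

    @property
    def columns(self) -> List[SketchColumn]:
        return list(self._columns.values())

    def add_column(self, name: str) -> SketchColumn:
        # Documented contract: KeyError if the name is already present.
        if self.has_column(name):
            raise KeyError(f"Column '{name}' already exists in table '{self.name}'")
        self._columns[name] = SketchColumn(name)
        return self._columns[name]

    def remove_column(self, name: str) -> "SketchTable":
        # Documented contract: KeyError if the name is not present.
        if not self.has_column(name):
            raise KeyError(f"Column '{name}' not found in table '{self.name}'")
        del self._columns[name]
        return self  # returning the table allows call chaining


table = SketchTable("users")
table.add_column("user_id")
table.add_column("signup_date")
# remove_column returns the table, so another call can be chained onto it.
table.remove_column("signup_date").add_column("country")
column_names = [c.name for c in table.columns]
```

Because remove_column has return type Self, several removals can be chained in one expression, whereas add_column returns the new Column and therefore ends a chain.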

has_primary_key()

Returns True if this table has a primary key; False otherwise.

Return type:

bool

property primary_key: Column | None

The primary key column of this table.

The getter returns the primary key column of this table, or None if no such primary key is present.

The setter sets a column as a primary key on this table, and raises a ValueError if the primary key has a non-ID semantic type or if the column name does not match a column in the data frame.
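
The validation performed by the primary_key setter can be sketched as follows. This is a minimal stand-in under stated assumptions, not the library's implementation: SketchTable is a hypothetical class, semantic types are modelled as plain strings, the "numerical" label is an assumed example value, and the getter returns the column name rather than a Column object to keep the sketch short.

```python
from typing import Dict, Optional


class SketchTable:
    """Hypothetical stand-in; maps column names to semantic-type strings."""

    def __init__(self, stypes: Dict[str, str]) -> None:
        self._stypes = dict(stypes)
        self._primary_key: Optional[str] = None

    def has_primary_key(self) -> bool:
        return self._primary_key is not None

    @property
    def primary_key(self) -> Optional[str]:
        # The documented getter returns a Column or None; the sketch
        # returns the column name instead.
        return self._primary_key

    @primary_key.setter
    def primary_key(self, name: str) -> None:
        # Documented contract: ValueError for an unknown column name ...
        if name not in self._stypes:
            raise ValueError(f"Column '{name}' does not exist in this table")
        # ... and ValueError for a column whose semantic type is not ID.
        if self._stypes[name] != "ID":
            raise ValueError(f"Column '{name}' has a non-ID semantic type")
        self._primary_key = name


table = SketchTable({"user_id": "ID", "age": "numerical"})
table.primary_key = "user_id"

rejected = []
for candidate in ("age", "missing"):
    try:
        table.primary_key = candidate
    except ValueError:
        rejected.append(candidate)
```

A rejected assignment leaves the previously set primary key in place in this sketch. The time_column and end_time_column setters are documented with the same shape, checking for a timestamp semantic type instead of ID.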

has_time_column()

Returns True if this table has a time column; False otherwise.

Return type:

bool

property time_column: Column | None

The time column of this table.

The getter returns the time column of this table, or None if no such time column is present.

The setter sets a column as a time column on this table, and raises a ValueError if the time column has a non-timestamp semantic type or if the column name does not match a column in the data frame.

has_end_time_column()

Returns True if this table has an end time column; False otherwise.

Return type:

bool

property end_time_column: Column | None

The end time column of this table.

The getter returns the end time column of this table, or None if no such end time column is present.

The setter sets a column as an end time column on this table, and raises a ValueError if the end time column has a non-timestamp semantic type or if the column name does not match a column in the data frame.

property metadata: DataFrame

Returns a pandas.DataFrame object containing metadata information about the columns in this table.

The returned dataframe has columns name, dtype, stype, is_primary_key, is_time_column and is_end_time_column, which provide an aggregate view of the properties of the columns of this table.

Example

>>> import kumoai.experimental.rfm as rfm
>>> table = rfm.LocalTable(df=..., name=...).infer_metadata()
>>> table.metadata
    name        dtype    stype  is_primary_key  is_time_column  is_end_time_column
0   CustomerID  float64  ID     True            False           False

print_metadata()

Prints the metadata of this table.

Return type:

None

infer_metadata(verbose=True)

Infers metadata, i.e., primary keys and time columns, in the table.

Parameters:

verbose (bool) – Whether to print verbose output.

Return type:

Self
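
The documentation does not state how infer_metadata decides which columns are primary keys or time columns. The sketch below is a deliberately naive, hypothetical heuristic written only to illustrate the idea of such inference; infer_sketch and both of its rules (an "id" name suffix with unique values, and columns holding only datetime values) are assumptions and should not be read as the library's logic.

```python
from datetime import datetime
from typing import Dict, List, Optional


def infer_sketch(rows: List[Dict[str, object]]) -> Dict[str, Optional[str]]:
    """Hypothetical heuristic; NOT the library's actual inference logic."""
    primary_key: Optional[str] = None
    time_column: Optional[str] = None
    if not rows:
        return {"primary_key": None, "time_column": None}

    for name in rows[0]:
        values = [row[name] for row in rows]
        # Assumption: a key-like column has an "id" name suffix and
        # unique, non-null values.
        if (primary_key is None
                and name.lower().endswith("id")
                and None not in values
                and len(set(values)) == len(values)):
            primary_key = name
        # Assumption: a time column holds only datetime values.
        elif time_column is None and all(isinstance(v, datetime) for v in values):
            time_column = name

    return {"primary_key": primary_key, "time_column": time_column}


rows = [
    {"CustomerID": 1, "signup": datetime(2024, 1, 5), "country": "DE"},
    {"CustomerID": 2, "signup": datetime(2024, 2, 9), "country": "FR"},
]
inferred = infer_sketch(rows)
```

Whatever the real heuristic is, its result can be inspected through the metadata property and corrected afterwards by assigning to primary_key, time_column, or end_time_column.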