kumoai.connector.S3Connector

class kumoai.connector.S3Connector

Bases: Connector

Defines a connector to tables stored as files (or partitioned sets of files) on the Amazon S3 object store. Any table in an S3 bucket that the shared external IAM role can access is available through this connector.

import kumoai
connector = kumoai.S3Connector(root_dir="s3://...")  # an S3 path.

# List all tables:
print(connector.table_names())  # Returns: ['articles', 'customers', 'users']

# Check whether a table is present:
assert "articles" in connector

# Fetch a source table (both approaches are equivalent):
source_table = connector["articles"]
source_table = connector.table("articles")

Parameters:

root_dir (Optional[str]) – The root directory of this connector. If provided, the root directory is used as a prefix for tables in this connector. If not provided, all tables must be specified by their full S3 paths.
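The prefix rule described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: `resolve_table_path` is a hypothetical helper, the bucket name `my-bucket` is a placeholder, and raising `ValueError` for a non-S3 name is an assumption made for the example.

```python
from typing import Optional


def resolve_table_path(root_dir: Optional[str], name: str) -> str:
    """Hypothetical helper mirroring the documented root_dir rule."""
    if root_dir is None:
        # Without a root directory, the name must already be a full S3 path.
        # (Rejecting other names with ValueError is an assumption.)
        if not name.startswith("s3://"):
            raise ValueError(f"Expected a full S3 path, got {name!r}")
        return name
    # With a root directory, the table is addressed as root_dir/name.
    return f"{root_dir.rstrip('/')}/{name}"


# Both calls address the same (placeholder) location.
print(resolve_table_path("s3://my-bucket/data", "articles"))
print(resolve_table_path(None, "s3://my-bucket/data/articles"))
```

Under this sketch, a connector created with a root directory lets callers use short table names, while a connector created without one needs the full path in every lookup.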

__init__(root_dir=None, _connector_id=None)

property name: str

Not supported by S3Connector; returns an internal specifier.

property source_type: DataSourceType

Returns the data source type accessible by this connector.

table(name)

Returns a SourceTable object corresponding to a source table on Amazon S3.

Parameters:

name (str) – The name of the table on S3. If root_dir is provided, the table path is resolved as root_dir/name. If root_dir is not provided, name must be the full path (e.g. starting with s3://).

Raises:

ValueError – if name does not exist in the backing connector.

Return type:

SourceTable
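Because table() is documented to raise ValueError for an unknown name, callers may want to handle that case explicitly. The following is a minimal sketch that relies only on that documented behavior; `table_or_none` is a hypothetical helper, not part of kumoai, and it accepts any object exposing a `table(name)` method.

```python
from typing import Any, Optional


def table_or_none(connector: Any, name: str) -> Optional[Any]:
    """Return connector.table(name), or None if the table is missing."""
    try:
        return connector.table(name)
    except ValueError:
        # Documented failure mode: name does not exist in the connector.
        return None
```

With a connector built as in the class example, this would be called as table_or_none(connector, "articles"). Returning None instead of raising is a design choice for the sketch; code that cannot proceed without the table should let the ValueError propagate.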

classmethod get_by_name(name)

Returns an instance of a named S3 Connector, created in the Kumo UI.

Note

Named S3 connectors are read-only: if you would like to modify the root directory, please do so from the UI.

Parameters:

name (str) – The name of the existing connector.

Return type:

Self

Example

>>> import kumoai
>>> connector = kumoai.S3Connector.get_by_name("name")  

has_table(name)

Returns True if the table exists in this connector, False otherwise.

Parameters:

name (str) – The table name.

Return type:

bool

table_names()

Returns a list of table names accessible through this connector.

Return type:

List[str]
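A common use of table_names() is to check up front that every table a pipeline needs is present. The sketch below assumes only the documented return type (a list of names); `missing_tables` is a hypothetical helper, not part of kumoai.

```python
from typing import Any, Iterable, List


def missing_tables(connector: Any, required: Iterable[str]) -> List[str]:
    """List the required table names that the connector does not expose."""
    # table_names() is documented to return the accessible table names.
    available = set(connector.table_names())
    # Preserve the caller's ordering of the required names.
    return [name for name in required if name not in available]
```

For a single name, has_table(name) or the `in` operator shown in the class example is simpler; this helper fetches the listing once, which may be preferable when checking many names.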