kumoai.connector.S3Connector

class kumoai.connector.S3Connector

Bases: Connector

Defines a connector to tables stored as files (or partitioned sets of files) in the Amazon S3 object store. Any table in an S3 bucket that the shared external IAM role can access is available through this connector.

import kumoai
connector = kumoai.S3Connector(root_dir="s3://...")  # an S3 path.

# List all tables:
print(connector.table_names())  # Returns: ['articles', 'customers', 'users']

# Check whether a table is present:
assert "articles" in connector

# Fetch a source table (both approaches are equivalent):
source_table = connector["articles"]
source_table = connector.table("articles")

Parameters:

root_dir (Optional[str]) – The root directory of this connector. If provided, the root directory is used as a prefix for tables in this connector. If not provided, all tables must be specified by their full S3 paths.

__init__(root_dir=None)

property name: str

Not supported by S3Connector; returns an internal specifier.

property source_type: DataSourceType

Returns the data source type accessible by this connector.

has_table(name)

Returns True if the table exists in this connector, False otherwise.

Parameters:

name (str) – The name of the table on S3. If root_dir is provided, the table is looked up at the path root_dir/name. If root_dir is not provided, name must be the table's full path (e.g. starting with s3://).

Return type:

bool
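
The naming rule above can be illustrated with a small standalone sketch. The helper resolve_table_path and the bucket s3://example-bucket/data are hypothetical and not part of kumoai; in particular, how the real connector treats trailing slashes or validates names is an assumption here.

```python
from typing import Optional


def resolve_table_path(name: str, root_dir: Optional[str] = None) -> str:
    """Hypothetical helper (not part of kumoai): build the S3 path for a table."""
    if root_dir is None:
        # Without a root directory, the name must already be a full S3 path.
        if not name.startswith("s3://"):
            raise ValueError(f"expected a full S3 path, got {name!r}")
        return name
    # With a root directory, the table is addressed as root_dir/name.
    # Stripping the slashes is an assumption made to avoid a doubled "/".
    return f"{root_dir.rstrip('/')}/{name.lstrip('/')}"


# Both calls refer to the same (hypothetical) location.
print(resolve_table_path("articles", root_dir="s3://example-bucket/data"))
print(resolve_table_path("s3://example-bucket/data/articles"))
```

In other words, a short name such as "articles" is only meaningful when the connector was created with a root_dir.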

table(name)

Returns a SourceTable object corresponding to a source table on Amazon S3.

Parameters:

name (str) – The name of the table on S3. If root_dir is provided, the table is looked up at the path root_dir/name. If root_dir is not provided, name must be the table's full path (e.g. starting with s3://).

Raises:

ValueError – if name does not exist in the backing connector.

Return type:

SourceTable
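
Because table() raises ValueError for an unknown name, callers that want a soft failure may prefer to check has_table() first. The sketch below is one possible pattern, not a kumoai utility: fetch_if_present is a hypothetical name, and it relies only on the has_table() and table() methods documented on this page.

```python
from typing import Any, Optional


def fetch_if_present(connector: Any, name: str) -> Optional[Any]:
    """Hypothetical helper: return connector.table(name), or None if missing."""
    # has_table() is documented to return a bool, so check it first.
    if not connector.has_table(name):
        return None
    try:
        return connector.table(name)
    except ValueError:
        # table() is documented to raise ValueError for unknown names; this
        # covers a table that disappears between the check and the fetch.
        return None
```

With a connector created as in the example at the top of this page, fetch_if_present(connector, "articles") would return the same object as connector["articles"], and None for a name that does not exist.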

table_names()

Returns a list of table names accessible through this connector.

Return type:

List[str]
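
A common follow-up is to fetch every listed table at once. The helper below is a minimal sketch under the assumption that each name returned by table_names() is accepted by table(); load_all_tables is a hypothetical name and not part of kumoai.

```python
from typing import Any, Dict


def load_all_tables(connector: Any) -> Dict[str, Any]:
    """Hypothetical helper: map each listed table name to its source table."""
    # table_names() is documented to return List[str]; table() is called
    # once per name, so a ValueError from table() propagates to the caller.
    return {name: connector.table(name) for name in connector.table_names()}
```

For the connector in the example at the top of this page, the resulting dictionary would have the keys 'articles', 'customers', and 'users'.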