kumoai.connector#

The Kumo Connector and SourceTable interfaces allow users to access and inspect the raw data behind backing connectors. This data can be used to create Table and Graph objects, which are then used for downstream machine learning.

[Figure: data_source.png]

Uploading Your Own Data#

Kumo supports uploading your own tables. Each table must be smaller than 1 GB and must be a single Parquet or CSV file on your local machine. Uploaded tables can be accessed in Kumo through FileUploadConnector and deleted with delete_uploaded_table().

upload_table

Synchronously uploads a table located on your local machine to the Kumo data plane.

delete_uploaded_table

Synchronously deletes a previously uploaded table from the Kumo data plane.
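
The size and format constraints stated for uploads can be checked locally before attempting an upload. The following is a minimal sketch using only the Python standard library; check_uploadable is a hypothetical helper and not part of the Kumo SDK, and it assumes that "1 GB" means 10**9 bytes (the stricter reading) and that the file type can be inferred from the file extension.

```python
import os
import tempfile

# Assumption: "1 GB" is interpreted as 10**9 bytes (the stricter reading).
MAX_UPLOAD_BYTES = 10**9
# Assumption: the file type is inferred from the extension.
ALLOWED_EXTENSIONS = {".parquet", ".csv"}


def check_uploadable(path: str) -> str:
    """Hypothetical pre-flight check; not part of the Kumo SDK.

    Returns 'parquet' or 'csv' if the file meets the documented
    constraints, otherwise raises ValueError.
    """
    # The table must be a single file on the local machine.
    if not os.path.isfile(path):
        raise ValueError(f"{path!r} is not a single local file")

    # Only Parquet and CSV files are accepted.
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type {ext!r}; expected Parquet or CSV")

    # The table must be smaller than the size limit.
    size = os.path.getsize(path)
    if size >= MAX_UPLOAD_BYTES:
        raise ValueError(f"{path!r} is {size} bytes; limit is {MAX_UPLOAD_BYTES}")

    return ext.lstrip(".")


# Usage: validate a small temporary CSV file.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "users.csv")
    with open(csv_path, "w") as f:
        f.write("user_id,signup_date\n1,2024-01-01\n")
    print(check_uploadable(csv_path))
```

Running a check like this first gives a clear local error for oversized or unsupported files instead of a failure during the synchronous upload.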

Connector#

Connectors link Kumo to data held in a backing data store. The Kumo SDK currently supports the Amazon S3 object store, the BigQuery data warehouse, the Snowflake data warehouse, and the Databricks data warehouse as stores for source tables.

S3Connector

Defines a connector to a table stored as a file (or partitioned set of files) on the Amazon S3 object store.

SnowflakeConnector

Establishes a connection to a Snowflake database.

DatabricksConnector

Establishes a connection to a Databricks database.

BigQueryConnector

Establishes a connection to a BigQuery database.

FileUploadConnector

Defines a connector to files directly uploaded to Kumo, either as 'parquet' or 'csv' (non-partitioned) data.

Source Data#

Tables accessed from connectors are represented as SourceTable objects, with source columns represented as SourceColumn objects.

SourceTable

A source table is a reference to a table stored behind a backing Connector.

SourceTableFuture

A representation of an ongoing SourceTable generation process.

SourceColumn

The metadata of a column in a source table.
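
The relationship described in this section (a connector exposes source tables, and each source table carries column metadata) can be pictured with a toy in-memory model. This is only an illustrative sketch: the Toy* classes, their fields (name, dtype), and their methods are hypothetical stand-ins, not the kumoai classes or their actual attributes, and the sample table is made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ToySourceColumn:
    """Metadata of one column (hypothetical fields: name and dtype)."""
    name: str
    dtype: str


@dataclass
class ToySourceTable:
    """A reference to a named table behind a connector (toy stand-in)."""
    name: str
    columns: List[ToySourceColumn] = field(default_factory=list)

    def column_names(self) -> List[str]:
        # Column names in the order they were declared.
        return [column.name for column in self.columns]


@dataclass
class ToyConnector:
    """Stands in for a backing data store that holds several tables."""
    tables: Dict[str, ToySourceTable] = field(default_factory=dict)

    def table_names(self) -> List[str]:
        # Table names in sorted order.
        return sorted(self.tables)

    def __getitem__(self, name: str) -> ToySourceTable:
        # Look up a table by name; raises KeyError if it does not exist.
        return self.tables[name]


# Illustrative data only.
users = ToySourceTable(
    "users",
    [ToySourceColumn("user_id", "int"), ToySourceColumn("signup_date", "date")],
)
connector = ToyConnector({"users": users})

print(connector.table_names())
print(connector["users"].column_names())
```

The point of the sketch is the layering: inspection starts at the connector, narrows to one table, and then to per-column metadata, without copying the underlying data.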