kumoai.connector

The Kumo Connector and SourceTable interfaces let users access and inspect the raw data behind backing connectors. This data can then be used to create Table and Graph objects, which feed downstream machine learning.

[Figure: data_source.png]

Uploading Your Own Data

Kumo supports uploading your own tables. Each table must be a single Parquet or CSV file on your local machine. Files larger than 1GB are supported by default through automatic partitioning. Tables are uploaded with upload() and deleted with delete(), and uploaded tables are accessed through a FileUploadConnector.
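The constraints above can be checked locally before anything is sent to Kumo. The following is a minimal sketch, not part of the SDK: `describe_upload` and `upload_table` are hypothetical helper names, the 1GB threshold is assumed to mean 1024**3 bytes, and the `FileUploadConnector(file_type=...)` and `upload(name=..., path=...)` signatures are assumptions to verify against the class reference. The SDK import is deferred so that the check runs without `kumoai` installed.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to the connector's file type.
SUPPORTED_SUFFIXES = {".parquet": "parquet", ".csv": "csv"}
ONE_GB = 1024 ** 3  # assumption: "1GB" is read as 1024**3 bytes


def describe_upload(path):
    """Check that `path` is a single local Parquet or CSV file.

    Returns (file_type, exceeds_1gb); raises ValueError otherwise.
    """
    p = Path(path)
    if not p.is_file():
        raise ValueError(f"{p} is not a single local file")
    file_type = SUPPORTED_SUFFIXES.get(p.suffix.lower())
    if file_type is None:
        raise ValueError(f"{p.name} is neither a Parquet nor a CSV file")
    return file_type, p.stat().st_size > ONE_GB


def upload_table(name, path):
    """Upload one table through a FileUploadConnector.

    Assumes the SDK is installed and already initialized; the constructor
    and upload() arguments used here are assumptions, not confirmed above.
    """
    import kumoai  # deferred so describe_upload works without the SDK

    file_type, _ = describe_upload(path)
    connector = kumoai.FileUploadConnector(file_type=file_type)
    connector.upload(name=name, path=str(path))
    return connector
```

Running the check first means a wrong file type or a directory path fails with a clear local error rather than partway through an upload.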

Connector

Connectors link Kumo to data held in a backing data store. The Kumo SDK currently supports the Amazon S3 object store and the BigQuery, Snowflake, and Databricks data warehouses as stores for source tables.

S3Connector

Defines a connector to a table stored as a file (or partitioned set of files) on the Amazon S3 object store.

SnowflakeConnector

Establishes a connection to a Snowflake database.

DatabricksConnector

Establishes a connection to a Databricks database.

BigQueryConnector

Establishes a connection to a BigQuery database.

FileUploadConnector

Defines a connector to files directly uploaded to Kumo, either as 'parquet' or 'csv' (non-partitioned) data.
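Whichever connector is used, the usual next step is to look up tables by name. The sketch below illustrates that pattern under stated assumptions: it assumes a connector exposes `table_names()` and supports `connector[name]` lookup, and that `S3Connector` accepts a `root_dir` argument. `open_s3_connector` and `fetch_tables` are hypothetical helpers, not SDK functions, so check the individual class pages before relying on these calls.

```python
def open_s3_connector(root_dir):
    """Create an S3 connector rooted at `root_dir` (an s3:// path).

    Assumes the SDK is installed and initialized; the `root_dir` keyword
    is an assumption about the S3Connector constructor.
    """
    import kumoai  # deferred so fetch_tables can be used without the SDK

    return kumoai.S3Connector(root_dir=root_dir)


def fetch_tables(connector, wanted):
    """Return {name: source_table} for each requested table name.

    Relies only on `table_names()` and item lookup, both assumed to be
    available on every connector type listed above.
    """
    available = set(connector.table_names())
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"tables not found behind connector: {missing}")
    return {name: connector[name] for name in wanted}
```

Because `fetch_tables` depends only on those two operations, the same helper would work for an S3, Snowflake, Databricks, BigQuery, or file upload connector if the assumption holds.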

Source Data

Tables accessed from connectors are represented as SourceTable objects, with source columns represented as SourceColumn objects.

SourceTable

A source table is a reference to a table stored behind a backing Connector.

SourceTableFuture

A representation of an ongoing SourceTable generation process.

LLMSourceTableFuture

A representation of an ongoing SourceTable generation process for LLM.

SourceColumn

The metadata of a column in a source table.
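To make the relationship between SourceTable and SourceColumn concrete, the sketch below collects column metadata into plain dictionaries for inspection. It assumes a SourceTable exposes a `columns` sequence and that each SourceColumn carries `name`, `stype`, and `dtype` attributes; those attribute names are assumptions to confirm on the class pages, and `summarize_columns` is a hypothetical helper.

```python
def summarize_columns(source_table):
    """Return one dict of metadata per column of a source table.

    Assumes `source_table.columns` yields objects with `name`, `stype`,
    and `dtype` attributes (attribute names are assumptions).
    """
    return [
        {
            "name": column.name,
            "stype": str(column.stype),  # semantic type, stringified
            "dtype": str(column.dtype),  # storage data type, stringified
        }
        for column in source_table.columns
    ]


# Hypothetical usage, given a connector that holds a "users" table:
# for row in summarize_columns(connector["users"]):
#     print(row)
```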