kumoai.connector

The Kumo Connector and SourceTable interfaces let users access and inspect the raw data behind backing connectors. This data can then be used to create Table and Graph objects, which are used for downstream machine learning.

[Figure: data_source.png]

Uploading Your Own Data

Kumo supports uploading your own tables. Each table must be a single Parquet or CSV file on your local machine. Files larger than 1 GB are supported by default through automatic partitioning. Uploaded tables can be used in Kumo with FileUploadConnector, and can be deleted with delete_uploaded_table().

`upload_table`	Synchronously uploads a table located on your local machine to the Kumo data plane.
`delete_uploaded_table`	Synchronously deletes a previously uploaded table from the Kumo data plane.
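
Because an upload expects a single local Parquet or CSV file, it can help to validate the path before calling the SDK. The following is a minimal sketch, not the library's own method: `infer_file_type` and `upload_local_table` are hypothetical helper names, and the call `upload_table(name=..., path=...)` assumes those keyword arguments and an already initialized SDK session.

```python
from pathlib import Path

# Extensions the section names as uploadable, mapped to a file-type label.
SUPPORTED_SUFFIXES = {".parquet": "parquet", ".csv": "csv"}


def infer_file_type(path):
    """Return 'parquet' or 'csv' for a supported path, else raise ValueError."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"Unsupported file type {suffix!r}; expected .parquet or .csv"
        )
    return SUPPORTED_SUFFIXES[suffix]


def upload_local_table(name, path):
    """Validate a local file, then upload it (hypothetical wrapper)."""
    file_type = infer_file_type(path)
    if not Path(path).is_file():
        raise FileNotFoundError(f"{path} is not a single local file")
    # Imported lazily so the validation above works without the SDK installed.
    # Assumption: upload_table accepts `name` and `path` keyword arguments,
    # and the SDK has already been initialized for your deployment.
    from kumoai.connector import upload_table

    upload_table(name=name, path=str(path))
    return file_type
```

The returned file type is the same 'parquet' or 'csv' label that FileUploadConnector distinguishes between, so it can be kept alongside the table name for later lookups or deletion.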

Connector

Connectors link Kumo to data in a backing data store. The Kumo SDK currently supports the Amazon S3 object store and the BigQuery, Snowflake, and Databricks data warehouses as stores for source tables.

`S3Connector`	Defines a connector to a table stored as a file (or partitioned set of files) on the Amazon S3 object store.
`SnowflakeConnector`	Establishes a connection to a Snowflake database.
`DatabricksConnector`	Establishes a connection to a Databricks database.
`BigQueryConnector`	Establishes a connection to a BigQuery database.
`FileUploadConnector`	Defines a connector to files directly uploaded to Kumo, either as 'parquet' or 'csv' (non-partitioned) data.

Source Data

Tables accessed from connectors are represented as SourceTable objects, with source columns represented as SourceColumn objects.

`SourceTable`	A source table is a reference to a table stored behind a backing `Connector`.
`SourceTableFuture`	A representation of an ongoing `SourceTable` generation process.
`LLMSourceTableFuture`	A representation of an ongoing `SourceTable` generation process for LLM.
`SourceColumn`	The metadata of a column in a source table.
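
To illustrate how a connector, its SourceTable objects, and their SourceColumn metadata relate, here is a minimal sketch of a helper that summarizes one table's columns. `describe_source_table` is a hypothetical name. The sketch assumes that a connector offers `has_table()` and index access by table name, and that a source table exposes a `columns` sequence whose items carry `name` and `dtype` attributes. Check these against the SDK version you are using.

```python
def describe_source_table(connector, table_name):
    """Map each source column name to its data type, rendered as a string."""
    # Assumption: connectors expose has_table() and index access by name.
    if not connector.has_table(table_name):
        raise KeyError(f"Table {table_name!r} not found behind this connector")
    source_table = connector[table_name]
    # Assumption: `columns` yields SourceColumn-like objects with
    # `name` and `dtype` attributes.
    return {column.name: str(column.dtype) for column in source_table.columns}
```

The helper only reads the attributes named above, so it works with any connector type listed in this section, provided those assumptions hold.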