Airbyte to ClickHouse

Synchronize data from over 350+ sources to Propel’s Serverless ClickHouse.

Get started with Airbyte

Step-by-step instructions to connect any Airbyte source to Propel.

Features

Data Pools created with the Airbyte destination are Webhook Data Pools that support the following features:

Feature name	Supported	Notes
Syncs inserts, updates, and deletes	✅	Propel collects inserts, updates, and deletes sent from Airbyte.
Re-sync	✅	You can trigger a resync in Airbyte.
Configurable sync interval	✅	You can configure sync interval in Airbyte.
Sync Pausing / Resuming	✅	You can pause / resume syncs in Airbyte.
Sync mode: Full refresh - Overwrite	✅	Supports the Full refresh - Overwrite Airbyte sync mode.
Sync mode: Full refresh - Append	✅	Supports the Full refresh - Append Airbyte sync mode.
Sync mode: Incremental - Append	✅	Supports the Incremental - Append Airbyte sync mode.
Sync mode: Incremental - Append + Deduped	✅	Supports the Incremental - Append + Deduped Airbyte sync mode.
Schema changes	✅	When Airbyte detects a column was added in your source table, Propel will automatically add it to the payload in the `_propel_payload` column. It will not add it as a column in your destination Data Pool.
Delete Job API	✅	See Delete Job API.
API configurable	✅	See API docs.
Terraform configurable	✅	See Terraform docs.

How does the Airbyte destination work?

Propel creates a Webhook Data Pool for each Airbyte source table. Airbyte streams inserts, updates, and deletes as individual records to Propel.

Data Pools mirror the source table’s structure, with additional columns from Airbyte and Propel:

Column	Type	Description
`_propel_received_at`	`TIMESTAMP`	The timestamp when the record was received by Propel in UTC.
`_propel_payload`	`JSON`	The JSON Payload of the record.
`_airbyte_raw_id`	`STRING`	The unique ID of the record generated by Airbyte.
`_airbyte_extracted_at`	`TIMESTAMP`	The timestamp when the data was extracted from the source by Airbyte in UTC.

After connecting, Airbyte performs initial and scheduled syncs, sending data to Propel’s Webhook Data Pool API. Propel then processes this data, making it available for SQL and the Query APIs.

Key configurations

Default Timestamp: _airbyte_extracted_at
Primary Key: _airbyte_raw_id
Sort Key: Combination of primary key and default timestamp

You can modify the sort key using a Materialized View.

Data handling

Airbyte streams capture inserts, updates, and deletes. Process these changes using a Materialized View to create a new, updated Data Pool.

Schema changes

Source schema changes are reflected in the _propel_payload JSON, but not in the Data Pool schema after initial setup.

Data types

The following table illustrates how we transform Airbyte data types into Propel-supported types:

Airbyte Type	Airbyte Format	Propel Type
(Not specified or Null)	-	`String`
`String`	-	`String`
`String`	`Date`	`Date`
`String`	`DateTime (TimestampWOTZ)`	`String`
`String`	`DateTime`	`Timestamp`
`String`	`Time`	`String`
`Boolean`	-	`Boolean`
`Number`	-	`Double`
`Integer`	-	`Int64`
`Object`	-	`JSON`
`Array`	-	`JSON`

Transforming data

Once your data is in a Webhook Data Pool, you can use Materialized Views to:

Get Started

Streaming

Data warehouses

Databases

ETL Platforms

Airbyte to ClickHouse

Get started with Airbyte

Features

How does the Airbyte destination work?

Key configurations

Data handling

Schema changes

Data types

Transforming data

Get Started

Streaming

Data warehouses

Databases

ETL Platforms

Get started with Airbyte

​Features

​How does the Airbyte destination work?

​Key configurations

​Data handling

​Schema changes

​Data types

​Transforming data

Features

How does the Airbyte destination work?

Key configurations

Data handling

Schema changes

Data types

Transforming data