Snowflake
The Snowflake Data Pool lets you synchronize a Snowflake table or view to Propel, providing an easy way to power your analytic dashboards, reports, and workflows with a low-latency data API on top of Snowflake.
Consider using Propel on top of Snowflake when:
- You require sub-second query performance for dashboards or reports.
- You need to support high-concurrency and high-availability data workloads, such as customer-facing or mission-critical applications.
- You require fast data access through an API for web and mobile apps.
- You are building B2B SaaS or consumer applications that require multi-tenant access controls.
Read the post about Propel on the Snowflake ❄️ blog: APIs on top of Snowflake.
Get started
Follow our step-by-step Snowflake setup guide to connect your Snowflake data warehouse with Propel.
Architecture Overview
Snowflake Data Pools connect to the specified Snowflake table, view, or Dynamic Table and automatically synchronize its data into your Data Pool in Propel.
Features
Snowflake Data Pools support the following features:
| Feature name | Supported | Notes |
| --- | --- | --- |
| Syncs inserts, updates, and deletes | ✅ | See How Propel syncs inserts, updates, and deletes. |
| Re-sync | ✅ | See the Re-syncing data section. |
| Configurable sync interval | ✅ | Syncs can run at intervals ranging from every minute to every 24 hours. See the How Propel syncs section. |
| Sync pausing / resuming | ✅ | |
| Batch deletes | ✅ | See the Batch deletes section. |
| API configurable | ✅ | See the API reference docs. |
| Terraform configurable | ✅ | See the Propel Terraform docs. |
How Propel syncs inserts, updates, and deletes
Propel uses a Snowflake stream, a form of change data capture, to synchronize inserted, updated, and deleted records from Snowflake to your Data Pool. It uses each record's primary key, consisting of the primary timestamp and a unique ID, to determine whether it should be inserted, deleted, or updated within the Data Pool.
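The merge logic can be illustrated with a short sketch. This is not Propel's implementation; it only shows how a primary key composed of (primary timestamp, unique ID) decides whether a change record results in an insert, update, or delete:

```python
# Illustrative sketch of applying change-data-capture records to a table
# keyed by (primary timestamp, unique ID). Not Propel's actual code.

def apply_changes(table, changes):
    """Apply CDC records to `table`, a dict keyed by (timestamp, id).
    Each change is a tuple: (action, timestamp, record_id, payload)."""
    for action, ts, record_id, payload in changes:
        key = (ts, record_id)
        if action == "DELETE":
            table.pop(key, None)   # drop the record if it exists
        else:
            table[key] = payload   # INSERT and UPDATE both upsert by key
    return table

table = {}
changes = [
    ("INSERT", "2024-01-01T00:00:00Z", "a", {"amount": 10}),
    ("INSERT", "2024-01-01T00:05:00Z", "b", {"amount": 20}),
    ("UPDATE", "2024-01-01T00:00:00Z", "a", {"amount": 15}),
    ("DELETE", "2024-01-01T00:05:00Z", "b", None),
]
apply_changes(table, changes)
```

After applying the four changes, only record `a` remains, with its updated payload.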
Data syncing can be configured to occur at intervals ranging from every minute to every 24 hours. The frequency at which you sync data into Propel should depend on your data freshness requirements and your upstream pipeline. For instance, if you promised your customers a data freshness of 3 hours and you have a dbt job that runs every hour, it would not make sense to run syncs every minute since there will be no new data to sync. Running syncs every hour would be sufficient.
Propel wakes up the Snowflake warehouse every time a sync occurs, so keep the associated compute costs in mind when selecting the sync interval. We recommend setting the warehouse's auto-suspend period to the minimum, which is 60 seconds.
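If you manage the warehouse Propel uses, you can set this directly in Snowflake. A minimal example, assuming a warehouse named `PROPEL_WH` (substitute your own warehouse name):

```sql
-- Suspend the warehouse after 60 seconds of inactivity (the minimum),
-- so sync runs don't keep it running longer than necessary.
ALTER WAREHOUSE PROPEL_WH SET AUTO_SUSPEND = 60;
```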
Supported data structures
Below are the Snowflake data structures that Propel supports.
| Name | Description |
| --- | --- |
| Table | A Snowflake table. Must have change tracking enabled. |
| View | A Snowflake view. The underlying tables must have change tracking enabled. |
| Dynamic Table | A Snowflake dynamic table. |
Data requirements
To use any of the supported data structures with Propel, they must meet the following requirements:
- It must have at least one `DATE`, `DATETIME`, `TIMESTAMP`, `TIMESTAMP_LTZ`, `TIMESTAMP_NTZ`, or `TIMESTAMP_TZ` column as the primary timestamp. Propel uses the primary timestamp to order and partition your data in Data Pools, and it serves as the time dimension on your Metrics. It must be included, cannot be nullable, and cannot be changed after the Data Pool is created. Timestamps without a timezone will be synced as UTC. Check our Selecting the right primary timestamp column for your Data Pool guide to learn more.
- It must have a unique identifier column. Propel combines the primary timestamp and the unique ID into a primary key that determines whether a record should be inserted, updated, or deleted in the Data Pool.
- The table or view must have change tracking enabled so Propel can detect inserts, updates, and deletes. Dynamic tables have change tracking enabled by default. Check our How to enable change tracking guide for step-by-step instructions.

Use the `TIMESTAMP_LTZ` type for date-time columns, as it carries timezone information.
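Putting the requirements together, a qualifying table might look like the following. This is an illustrative sketch using a hypothetical `orders` table, not a required schema:

```sql
CREATE TABLE orders (
  order_id   STRING        NOT NULL,  -- unique identifier column
  created_at TIMESTAMP_LTZ NOT NULL,  -- primary timestamp, timezone-aware
  amount     NUMBER(18, 2)
);

-- Enable change tracking so Propel can detect inserts, updates, and deletes.
ALTER TABLE orders SET CHANGE_TRACKING = TRUE;
```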
Re-syncing data
Snowflake Data Pools support re-syncing all of your data. This is a safe operation: Propel will continue to serve data to your application during a re-sync, will not create duplicates, and will not retain data that has been deleted in Snowflake.
You should consider re-syncing your Data Pool when the Snowflake stream is deleted, failed, or stale, causing the Data Pool sync to fail. This happens when the table is re-created or the stream is manually deleted. It also happens when you run a dbt `full-refresh` operation that re-creates the table.
During a re-sync, Propel will recreate the Snowflake stream on the table or view. To determine uniqueness, the re-sync process relies on the primary key, composed of the timestamp and unique ID columns. If a record has a different primary key, Propel will treat it as a new record.
Re-syncing is also available via the API with the `reSyncSnowflakeDataPool` mutation.
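For example, a re-sync request might look like the following GraphQL document. The argument and selection set shown here are illustrative; check the API reference for the mutation's exact signature:

```graphql
mutation {
  reSyncSnowflakeDataPool(id: "DPO00000000000000000000000000") {
    id
  }
}
```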
Data Types
The table below describes default data type mappings from Snowflake to Propel types. When creating a Snowflake Data Pool, you can modify these default mappings. For instance, if you know that a column originally typed as a NUMBER contains a UNIX timestamp, you can convert it to a TIMESTAMP by changing the default mapping.
| Snowflake Type | Propel Type | Notes |
| --- | --- | --- |
| DATETIME, TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ, TIMESTAMP(p) | TIMESTAMP | Timestamps without a timezone will be synced as UTC. |
| DATE | DATE | |
| BOOLEAN | BOOLEAN | |
| NUMBER(p≤9, s=0) | INT32 | |
| NUMBER(p≤18, s=0) | INT64 | |
| NUMBER(p≤9, s>0) | FLOAT | |
| NUMBER(p≤18, s>0), NUMBER(p>18, s) | DOUBLE | |
| NUMBER(p, s), DECIMAL(p, s), NUMERIC(p, s) | Depends on precision and scale | |
| NUMBER(38, s>0), INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT | DOUBLE | Snowflake maps INT types to NUMBER(38, 0), which Propel represents as a DOUBLE since it can hold numbers larger than INT64. See Snowflake numeric types. |
| FLOAT, FLOAT4, FLOAT8, DOUBLE, DOUBLE PRECISION, REAL | DOUBLE | |
| VARCHAR, CHAR, CHARACTER, STRING, TEXT, BINARY, VARBINARY | STRING | |
| TIME | STRING | |
| VARIANT, OBJECT, ARRAY | JSON | |
| GEOGRAPHY, GEOMETRY | Not supported | |
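The NUMBER(p, s) rows above follow a precision-and-scale rule that can be sketched as follows. This is an illustration of the table, not Propel's code:

```python
# Sketch of the default NUMBER(p, s) -> Propel type mapping from the
# table above: precision p and scale s determine the Propel type.

def map_number(precision: int, scale: int) -> str:
    if scale == 0:
        if precision <= 9:
            return "INT32"
        if precision <= 18:
            return "INT64"
        return "DOUBLE"  # p > 18, e.g. Snowflake's default NUMBER(38, 0)
    if precision <= 9:
        return "FLOAT"
    return "DOUBLE"      # p > 9 with a nonzero scale
```

For example, `map_number(38, 0)` (the type Snowflake assigns to INT and INTEGER columns) returns `"DOUBLE"`, matching the INT row in the table.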
Key Guides
- How to set up a Snowflake Data Pool
- How to enable change tracking on a Snowflake table
- Selecting the right timestamp column for your Data Pool
Frequently Asked Questions
How long does the data take to sync from Snowflake to Propel?
Data is synchronized based on your configured sync interval, which can range from every minute to every 24 hours. Once a sync is initiated, the data is available in Propel and served via the API within 1 to 3 minutes.
Does Propel read through to Snowflake on every query?
No. Propel only queries Snowflake to sync newly inserted, updated, or deleted records.
API reference documentation
Below is the relevant API documentation for the Snowflake Data Pool.
Queries
Mutations
- Create Data Pool
- Modify Data Pool
- Delete Data Pool by ID
- Delete Data Pool by unique name
- Create Snowflake Data Source
- Modify Snowflake Data Source
- Delete Data Source by ID
- Delete Data Source by unique name
Limits
There are currently no limits.
Related blog posts
- How to build a Snowflake API
- How to Deduplicate Events in Snowflake with dbt
- How to set up development and production environments in Snowflake
- How to get your data from an AWS RDS database into Snowflake
- How Does Snowflake Storage Work? (Databases & Schemas)
- What Is the Difference Between a Database and a Warehouse in Snowflake?
- How to build Snowflake data apps with GraphQL
- What Is a Multi-Cluster Virtual Warehouse in Snowflake Data Platform?
- How to build in-product analytics with Snowflake and GraphQL
- What Are Warehouses in Snowflake Data Analytics Platform?