Today we are thrilled to announce Propel's AWS S3 Data Source. The AWS S3 Data Source enables you to power your customer-facing analytics from Parquet files in your S3 bucket. Whether you have a data lake in S3, are landing Parquet files in S3 as part of your data pipeline or event-driven architecture, or are extracting data using services like Airbyte or Fivetran, you can now define Metrics and query them blazingly fast via Propel's GraphQL API.
The AWS S3 Data Source supports the Parquet file format. Parquet is a columnar file format supported by many data processing frameworks, including Spark, Flink, and Impala. It provides efficient storage and encoding of data, as well as optimized query performance.
The Propel AWS S3 Data Source is an extremely flexible way to integrate a wide range of data architectures with Propel. Consider using it in the following scenarios:
When you have an event-driven architecture and need to power customer-facing analytics from those events. These events typically go through an event bus like Amazon EventBridge. A consumer like Kinesis Firehose then picks them up, performs the necessary transformations, and lands them in S3.
When you have your data in a Parquet-based data lake or lakehouse and need to expose it to your customers via your web or mobile app. These could be homegrown data lakes or data lakehouses like Dremio or AWS Lake Formation.
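For the event-driven scenario, files are often landed under time-partitioned prefixes so that each hour's data sits in its own folder. The sketch below builds such a key in Python. The function name `landing_key`, the `events` prefix, and the file identifier are hypothetical, and the `YYYY/MM/DD/HH` layout is an assumption for illustration, not a layout Propel requires.

```python
from datetime import datetime, timezone


def landing_key(prefix: str, event_time: datetime, file_id: str) -> str:
    """Build an hour-partitioned S3 key for a Parquet file (hypothetical layout)."""
    # Normalize to UTC so files from different producers share one partitioning scheme.
    utc = event_time.astimezone(timezone.utc)
    return f"{prefix.strip('/')}/{utc:%Y/%m/%d/%H}/{file_id}.parquet"


if __name__ == "__main__":
    when = datetime(2023, 1, 15, 9, 30, tzinfo=timezone.utc)
    print(landing_key("events/", when, "batch-0001"))
```

Keeping every file under one common prefix means a single path can cover all of the Parquet files you want to expose.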
The Propel S3 Data Source integrates with an S3 bucket in your AWS account, so there is no need to move data around. You will have to provide AWS credentials and specify the path to the Parquet files inside a given bucket. Propel will use these credentials to access the Parquet files, cache the data, and make it available for querying via Propel’s GraphQL API.
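When creating credentials for this kind of integration, a common practice is to scope them to read-only access on the relevant path. The following is a minimal sketch of such an IAM policy, generated in Python. The bucket name, prefix, and function name are hypothetical, and the exact permissions Propel needs are an assumption here, so check Propel's documentation before using it.

```python
import json


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy that allows listing and reading objects under one prefix."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, restricted to the chosen prefix.
                "Sid": "ListParquetPrefix",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Reading is granted only on objects under the prefix.
                "Sid": "ReadParquetObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and path, for illustration only.
    print(json.dumps(read_only_policy("example-analytics-bucket", "events/parquet"), indent=2))
```

The policy grants no write or delete actions, so the credentials cannot modify the files in your bucket.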
When new Parquet files land in the S3 bucket, Propel automatically detects them and caches the data, making it available via the API within a couple of minutes.
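To make the idea of detecting new files concrete, here is a small sketch of one generic approach: compare the current listing of object keys against the set already processed. This is an illustration only and not a description of how Propel implements detection. The function name and sample keys are hypothetical.

```python
def new_parquet_keys(listed_keys, seen_keys):
    """Return Parquet keys present in the listing that have not been processed yet."""
    fresh = [
        key
        for key in listed_keys
        if key.endswith(".parquet") and key not in seen_keys
    ]
    # Sort so files are handled in a stable, predictable order.
    return sorted(fresh)


if __name__ == "__main__":
    seen = {"events/2023/01/15/09/batch-0001.parquet"}
    listing = [
        "events/2023/01/15/09/batch-0001.parquet",
        "events/2023/01/15/10/batch-0002.parquet",
        "events/2023/01/15/10/_SUCCESS",
    ]
    print(new_parquet_keys(listing, seen))
```

Non-Parquet objects, such as marker files written by some pipelines, are ignored by the suffix check.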
The Propel AWS S3 Data Source can simplify your data architecture by leveraging the data you already have. With Propel, you don't need to worry about caching or aggregating the data for your different use cases.
If you don't have a Propel account yet, join our waitlist!
We are onboarding users as fast as we can. We can’t wait to see what you build with Propel!