We had a blast at Snowflake Summit last month and wanted to share some of our takeaways on where the Snowflake ecosystem is going. After the Summit, I took a week off, which gave me a chance to digest everything, so here is my recap. I hope you find it useful.
1. Dynamic tables are the sh*t.
Dynamic tables are set to become the default data transformation and pipelining tool for Snowflake. They reduce the need for other transformation tools: you only write a SQL statement and set a target lag, and Snowflake incrementally materializes the result. You can also chain dynamic tables into pipelines that update continuously with minimal operational overhead. Dynamic tables were announced in private preview last year and entered public preview this year. We’ve had a chance to play with them, and they do what they say they do. Before reaching for additional transformation tools, we recommend trying dynamic tables first.
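To give a feel for how little is involved, here is a sketch of the DDL (table, schema, and warehouse names are illustrative):

```sql
-- A dynamic table: Snowflake keeps the result incrementally refreshed.
CREATE OR REPLACE DYNAMIC TABLE daily_revenue
  TARGET_LAG = '5 minutes'     -- how stale the result is allowed to get
  WAREHOUSE = transform_wh     -- compute used for the refreshes
  AS
    SELECT order_date, SUM(amount) AS revenue
    FROM raw.orders
    GROUP BY order_date;

-- Chaining: a downstream dynamic table can inherit its freshness
-- requirement from its consumers instead of setting its own lag.
CREATE OR REPLACE DYNAMIC TABLE top_days
  TARGET_LAG = DOWNSTREAM
  WAREHOUSE = transform_wh
  AS
    SELECT order_date, revenue
    FROM daily_revenue
    WHERE revenue > 10000;
```

That second statement is the chaining mentioned above: the pipeline refreshes end to end without any orchestration code.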
2. BI tools will be built by data teams
It might sound crazy, but hear me out. For those unfamiliar, Streamlit is a Python framework for building interactive data visualizations, and it has made it significantly easier to build user interfaces on top of Snowflake for internal use cases. It is a natural fit, since data teams are already familiar with Python. Data teams could benefit from deploying smaller, purpose-specific Streamlit apps rather than old-school business intelligence tools that charge per seat. One neat thing is that Streamlit apps can be shared freely, which can completely change the dynamics of seat-based BI tools.
Although Snowflake was pushing hard to get Streamlit hosting out by the Summit, it didn't quite make it. Once Streamlit app hosting is available within Snowsight, it will make perfect sense to start replacing internal dashboards and reports delivered by traditional BI tools with lightweight, purpose-specific Streamlit apps.
3. Data APIs let “the rest of the company”, not just the data team, build with data
I went to all the talks and workshops about Snowflake's SQL API, and every one was packed. Curious about what people were looking to build, I asked them. Overwhelmingly, they expressed a need to empower other engineering teams across their companies to build with Snowflake data. Attendees from software companies to large enterprises were looking to make Snowflake data available to business and product teams responsible for customer-facing web applications, mobile apps, and external data APIs. Crucially, they wanted to offer a secure and performant self-service API without giving every team access to their Snowflake account.
The Snowflake SQL API falls short in many ways. From the talks, it was evident that data teams still need to build a lot of infrastructure, including multi-tenant authentication and access controls, caching, and rate limiting, before they can securely and efficiently expose an API to application teams within their organizations.
If you are looking for a Data API on top of Snowflake, you should check out Propel.
4. The app marketplace might change the nature of data companies
Traditionally, building a data company required massive venture investment to develop all the compliance, infrastructure, operations, and go-to-market functions necessary to serve enterprise customers. Consequently, data companies typically target massive markets and offer a broad range of functionality, leading to the functionality overlap that exists in today's market.
However, the app marketplace has the potential to change this. Native apps run in the end customer's Snowflake account, so the data never leaves it and compliance is largely a non-issue. Since apps are discovered and distributed via the marketplace, distribution is handled by Snowflake, eliminating the need for sophisticated go-to-market teams. Finally, operations are massively simplified since the app runs on Snowflake's infrastructure, creating the opportunity for simple, purpose-specific apps. You could imagine single-developer apps: one that optimizes SQL queries, another that validates and formats street addresses.
Does this open the door for solo-founder millionaires from data apps on the Snowflake marketplace? Not sure, but it could. It will be interesting to see this play out.
5. Snowpark Containers: bring the apps to the data
One of the main themes was bringing the applications to the data. The underlying thesis is that moving applications is easier than moving data: Snowflake is betting that the gravitational pull of data is stronger than that of a broad set of applications. While it is unlikely that a platform like Salesforce or Twilio would run inside Snowflake, it is feasible to imagine a GraphQL API layer like Propel running in the customer's Snowflake account.
I can see entire categories of apps that are well suited to run where the data lives. For the rest, the apps that cannot move but still need performant and secure access to the data, a Data API like Propel is the key enabler.
6. Support for Iceberg tables is Snowflake’s bet on open standards
Many companies store vast amounts of data in cloud storage, such as Amazon S3, that is never loaded into Snowflake. Until last year, using this data in Snowflake as a first-class citizen was impossible. Then Snowflake made two announcements: read-only access to Iceberg tables and a data catalog integration to support writes and updates. This approach had two issues: 1) performance lagged significantly behind native Snowflake tables, and 2) it required a data catalog to support write and update operations.
This year, Snowflake introduced an evolution of that model with managed and unmanaged Iceberg tables. Unmanaged Iceberg tables allow you to query (read-only) your data in cloud storage organized as Iceberg tables. What is more interesting is that managed Iceberg tables provide a first-class Snowflake experience, including reads, writes, DDL updates, and comparable performance.
Apache Iceberg is Snowflake’s bet on open standards. By supporting the unmanaged and managed versions, they pave a clear path to first bring the data into the Snowflake ecosystem and, second, take advantage of all the capabilities of the Snowflake platform while still owning the underlying files and data store.
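As a rough sketch of the two flavors (preview-era syntax; the volume, catalog, and object names here are illustrative, so check the current docs before copying):

```sql
-- Managed Iceberg table: Snowflake acts as the catalog, so you get
-- reads, writes, and DDL, while the files live in your own cloud storage.
CREATE ICEBERG TABLE analytics.events (
  event_id BIGINT,
  event_ts TIMESTAMP,
  payload  VARCHAR
)
  CATALOG = 'SNOWFLAKE'
  EXTERNAL_VOLUME = 'my_s3_volume'   -- points at your S3 bucket
  BASE_LOCATION = 'events/';

-- Unmanaged Iceberg table: an external catalog owns the table, and
-- Snowflake queries it read-only through a catalog integration.
CREATE ICEBERG TABLE analytics.events_external
  CATALOG = 'my_glue_catalog'        -- a catalog integration you define
  EXTERNAL_VOLUME = 'my_s3_volume'
  CATALOG_TABLE_NAME = 'events';
```

Either way, the Parquet and metadata files stay in your bucket, which is exactly the point of the open-standards bet.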
7. LLMs and ML: inference and training running on Snowflake
Last but not least: LLMs, ML, and AI. This was a big focus of the keynote and the sessions. You could break it into three themes: 1) bringing ML apps to the data, 2) AI-powered features, and 3) an AI-powered Snowflake experience.
Bring the ML apps to the data
ML workloads, both for inference and training, are data-intensive, so it makes sense to move them close to the data. Snowpark Container Services, together with the Nvidia partnership to support GPU-accelerated workloads, makes this approach very compelling.
AI-powered features
The point above is mostly about infrastructure. Although that is an important part of Snowflake's AI and ML strategy, it does not stop there. Snowflake is baking ML capabilities natively into Snowflake SQL, with the ability to run predictions and forecasting. It might not serve the most sophisticated use cases, but it will be a great starting point for many companies. Features they demoed, like Document AI, go further up the stack into more targeted ML workloads.
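For a sense of what forecasting from SQL looks like, here is a sketch based on the preview-era syntax shown at the Summit (the model, view, and column names are illustrative, and the exact signature may have changed since):

```sql
-- Train a forecasting model on a view of daily sales.
CREATE SNOWFLAKE.ML.FORECAST sales_forecaster(
  INPUT_DATA => SYSTEM$REFERENCE('VIEW', 'daily_sales'),
  TIMESTAMP_COLNAME => 'sale_date',
  TARGET_COLNAME => 'revenue'
);

-- Predict the next 14 days of revenue.
CALL sales_forecaster!FORECAST(FORECASTING_PERIODS => 14);
```

No Python, no notebooks, no model-serving infrastructure: the whole loop stays inside the warehouse.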
AI-powered Snowflake experience
The latest addition is a Copilot-like experience designed to help you write SQL queries. Simply write a comment with your request, such as "Give me all the sales for California broken down by county," and it will generate the corresponding SQL query for you.
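To make that concrete, a prompt like the one above might yield SQL along these lines (the table and column names depend entirely on your schema; this is an illustrative sketch, not actual Copilot output):

```sql
-- Give me all the sales for California broken down by county
SELECT county,
       SUM(sale_amount) AS total_sales
FROM sales
WHERE state = 'CA'
GROUP BY county
ORDER BY total_sales DESC;
```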
That’s all for now. If you want to learn more about Propel + Snowflake, check us out.