A data warehouse is a central repository where information arrives from various sources, allowing data to be ingested, transformed, and processed. The goal of a data warehouse is to allow users to make use of data gathered from across the organization in decision-making.
According to the Corporate Finance Institute, there are three main types of data warehouses: an enterprise data warehouse, an operational data store, and a data mart.
A business may use all three of these types of data warehouses together in order to provide business intelligence (BI), metric reporting, and team-specific insights. For example, executives might prefer a big-picture EDW while department heads could want an ODS and/or a data mart.
While most data engineers refer to Snowflake as a data warehouse or data lake, Snowflake itself is a data platform that enables all three types of data warehouses. With Snowflake as a company’s central data repository, it’s possible to build an enterprise data warehouse (EDW), an operational data store (ODS), or a team-specific data mart, depending on your specific needs.
(Note that “virtual warehouses” in Snowflake are virtual computing clusters with a certain allocation of computational resources, so they are different from “data warehouses.”)
For instance, you might need an EDW to merge all the organization’s data in a single place in support of Business Intelligence (BI) and analytics reporting. Using Snowflake as an EDW can help answer questions about overall business profitability and help identify opportunities to cut costs and increase sales. Generally, an EDW is great if a company needs a single, reliable place to keep all data, such as for compliance or in order to get a holistic view of the customer.
Meanwhile, real-time streaming analytics showing current performance statistics would need an ODS to be built. To illustrate, Snowflake could be used as an ODS when building a dashboard for product managers and support staff that analyzes usage statistics based on real-time user events. In this case, the data team would probably be using a tool like Snowpipe to continuously monitor for new events so that the data could then be used to build data apps from Snowflake. Or, they might use Unistore, Snowflake’s new transactional table for this type of workload.
Last but not least, Snowflake can be used to create data marts specific to teams or functions within a company. Since the Snowflake separates storage (the data itself) from compute (the processing time necessary for processing that data), it’s possible to set up many different data marts within the same account. As an example, data engineers could build a data pipeline to automatically load customer data from Salesforce for deeper analysis by the sales team.
There are no hard and fast rules for which type of data warehouse you can build with Snowflake because its SQL query engine has great performance in a variety of different use cases. Once you’re set up your data pipelines to make Snowflake your single source of truth, you’re able to build the type of data warehouse that works best for your organization and specific needs.
Instead of managing multiple types of data warehouses, using a third-party analytics API can make it even easier to build product dashboards and other data apps that support enterprise decision-making. We built Propel Data especially for teams who are developing analytics apps with the data they have in Snowflake. Our easy-to-use GraphQL API is a high-performance and cost-effective choice for building complex data apps, and it requires no extra data engineering.
Databases and schemas ("namespaces") are used to organize data in Snowflake storage, which uses a columnar format internally for analytics.
Multi-cluster virtual warehouses auto-scale compute resources based on the demands on the data warehouse. Here’s how they work in Snowflake.