Snowflake Concepts

How Does Snowflake Storage Work? (Databases & Schemas)

Snowflake data warehouse uses a high-performance columnar database structure internally, though you can’t access the stored data directly. Instead, you use databases and schemas (together called “namespaces”) to manage data storage in Snowflake.

Cover photo by Amariei Mihai on Unsplash

Snowflake’s storage of data for analytics is complicated “under the hood” because it uses columnar storage, as illustrated by this close-up image of a snowflake perched vertically on a block of ice like a column.

Databases and schemas are used to organize data stored inside tables in Snowflake data warehouse, similar to how relational databases are commonly used by data scientists, analysts, and engineers.

Unlike how Snowflake data platform defines warehouses to mean virtual compute resources that process data, the Snowflake terms database and schema have more traditional definitions.

“A database is a logical grouping of schemas. Each database belongs to a single Snowflake account. A schema is a logical grouping of database objects (tables, views, etc.). Each schema belongs to a single database.” – Snowflake Docs

Together, a database and schema create what is called a “namespace” in Snowflake. Typically, the namespace is inferred automatically while using Snowflake, based on the current database and schema that are in use during your session, but you can specify the namespace manually.

One important feature of Snowflake’s storage system is the ability to create a “share,” a set of database objects (such as schemas that contain tables and secure views) that you would like to share with other Snowflake accounts. Shares are one way Snowflake protects data privacy.

You manipulate any of these organizational pieces of Snowflake’s data storage via Snowflake’s data definition language (DDL), which is based on SQL (Structured Query Language). For a database, schema, or share, the main commands are CREATE, ALTER, DROP, and SHOW.

How Does Snowflake Storage Work?

Under the hood, groups of rows in Snowflake’s storage tables are stored as individual micro-partitions, organized internally as groups of columns. Since Snowflake is meant for big data, a Snowflake table in storage may contain thousands or millions of micro-partitions.

“When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage.” – Snowflake Docs

Snowflake’s unique internal architecture avoids unnecessary scanning of micro-partitions during database queries, which significantly accelerates the performance of queries referring to a given set of columnar data. This process happens automatically without additional configuration.

Unlike other database solutions, data stored inside Snowflake is not accessible directly; it’s only accessible via SQL commands. And, while there’s no limit to the number of databases, schema, or tables you can create, there is a storage cost associated with keeping data inside Snowflake.

If you enjoyed this article on Snowflake analytics, please consider signing up for our free email newsletter, following our Twitter account @propeldatacloud, or joining our customer waiting list to stay in touch with the latest news and education from Propel Data.

Related Content

Snowflake warehouses aren’t exactly multi-processor computing clusters with hundreds of nodes, but it can make sense of thinking of Snowflake credits as analogous to nodes, as illustrated by this image of dozens of snowflakes falling at sunset.

Snowflake Concepts

How Many Nodes Are in a Snowflake Virtual Warehouse?

Snowflake uses credits, which are analogous to CPU nodes, in order to pay for the virtual warehouses that power its analytical query engine.

Snowflake is considered a data warehouse because it’s cloud-based platform is central repository of data that separates storage of the data from the compute resources needed to process that data for analytical queries, as illustrated by this image of a single snowflake on lint-covered fabric.

Snowflake Concepts

Is Snowflake a Data Warehouse for Analytics and Insights?

Snowflake data platform is referred to as a data warehouse or data lake because it separates storage (data) from compute (processing power).