“A database is a logical grouping of schemas. Each database belongs to a single Snowflake account. A schema is a logical grouping of database objects (tables, views, etc.). Each schema belongs to a single database.” – Snowflake Docs
Together, a database and schema create what is called a “namespace” in Snowflake. Typically, the namespace is inferred automatically while using Snowflake, based on the current database and schema that are in use during your session, but you can specify the namespace manually.
One important feature of Snowflake’s storage system is the ability to create a “share,” a set of database objects (such as schemas that contain tables and secure views) that you would like to share with other Snowflake accounts. Shares are one way Snowflake protects data privacy.
You manipulate any of these organizational pieces of Snowflake’s data storage via Snowflake’s data definition language (DDL), which is based on SQL (Structured Query Language). For a database, schema, or share, the main commands are CREATE, ALTER, DROP, and SHOW.
Under the hood, groups of rows in Snowflake’s storage tables are stored as individual micro-partitions, organized internally as groups of columns. Since Snowflake is meant for big data, a Snowflake table in storage may contain thousands or millions of micro-partitions.
“When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized, compressed, columnar format. Snowflake stores this optimized data in cloud storage.” – Snowflake Docs
Snowflake’s unique internal architecture avoids unnecessary scanning of micro-partitions during database queries, which significantly accelerates the performance of queries referring to a given set of columnar data. This process happens automatically without additional configuration.
Unlike other database solutions, data stored inside Snowflake is not accessible directly; it’s only accessible via SQL commands. And, while there’s no limit to the number of databases, schema, or tables you can create, there is a storage cost associated with keeping data inside Snowflake.
If you enjoyed this article on Snowflake analytics, please consider signing up for our free email newsletter, following our Twitter account @propeldatacloud, or joining our customer waiting list to stay in touch with the latest news and education from Propel Data.
Snowflake uses credits, which are analogous to CPU nodes, in order to pay for the virtual warehouses that power its analytical query engine.
Snowflake data platform is referred to as a data warehouse or data lake because it separates storage (data) from compute (processing power).