Data Sharing

Contents

  • Introduction
      • Course Intro (1m 46s)
      • Snowflake Intro (8m 43s)
  • Architecture
      • Security (9m 43s)
      • Pricing (6m 39s)
      • Snowpipe (4m 23s)
  • Summary
Difficulty: Beginner
Duration: 1h 33m
Students: 703
Rating: 4.4/5
Description

Snowflake is an insanely cool, next-generation SaaS data warehousing solution that operates in the cloud!

Engineered from the ground up, Snowflake takes advantage of the elasticity that the cloud provides – and is truly revolutionary in every aspect.

Harnessing the power of the cloud, Snowflake has unique capabilities in the form of unlimited and instant scalability, making it perhaps the ultimate data warehouse solution. Cloud elasticity is very much at the heart of Snowflake – making its unique architecture and value proposition difficult to compete with in the market.

From an end user perspective, Snowflake is incredibly appealing. Building data warehouses and petabyte-scale data solutions without having to worry about on-prem compute and storage issues means your focus remains solely on the data itself and, even more importantly, the analytics you derive from it.

In this course, you'll learn about the many distinguishing features that set Snowflake apart from its competitors.

For any feedback, queries, or suggestions relating to this course, please contact us at support@cloudacademy.com.

Learning Objectives

  • Learn about Snowflake and how it can provision cloud-hosted data warehouses
  • Learn how to administer a Snowflake data warehouse
  • Learn how to scale Snowflake data warehouses instantly and on-demand
  • Learn how to use Snowflake to perform analytics on petabyte-scale datasets and beyond

Intended Audience

  • Anyone interested in learning about Snowflake, and the benefits of using it to build a data warehouse in the cloud

Prerequisites

To get the most from this course, it would help to have a basic understanding of:

  • Cloud computing and SaaS
  • Database administration
  • SQL
Transcript

Welcome back. In this lesson, we'll go through the concept of Snowflake data sharing. Snowflake provides you with the ability to easily share various database objects held within your Snowflake account with other Snowflake accounts, and vice versa. Let's jump in and learn how this can be accomplished. The data sharing feature is considered one of Snowflake's key differentiators when compared with competing solutions. Fundamentally, data sharing is nothing more than the process of getting data published out to other enterprises who may have a need or want for the data, and vice versa. From a technology point of view, this process has often been impeded by various technical constraints. Consider transfer mechanisms such as email, FTP, or even USB drives, all of which have an upper limit on the amount of data that can be transferred. From a security point of view, you lose control of the data once it's outside of your system.

Consider also updates. How do you deal with updates and the frequency of them? What about cost? When you're dealing with petabytes of information, who owns the cost of storage and data transfer? Many of these issues simply no longer exist when working with Snowflake's secure data sharing feature, largely because Snowflake itself is underpinned by the cloud. Data sharing within Snowflake involves providers and consumers. Providers are Snowflake accounts which provide access to databases and the data objects stored within them.

Consumers are on the other side of the fence. They are Snowflake accounts which are granted access to read data from a provider account. Consumer accounts can read from multiple provider accounts, and a Snowflake account can be both a provider and a consumer at the same time. From a data flow point of view, the data itself remains physically within the provider account but flows from the provider account to the consumer account, and a single provider account can be accessed by multiple consumer accounts. Unfortunately, for now, Snowflake data sharing can only happen across accounts that exist in the same cloud and region.

Some further notes about consumers. A consumer completes the data sharing process by choosing to create a database from any data share created and made available by a provider. One, and only one, read-only database can be created from any given share. Users within the consumer account can only view and query the data; strictly no modifications. This simple rule implies each of the following: inserts, updates, and deletes: denied. Creating new objects: denied. Time Travel does not exist for shared databases. Creating clones of objects: denied. Additionally, shared databases, and all database objects within them, cannot be re-shared downstream to other Snowflake accounts.
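To make these restrictions concrete, here's a minimal sketch of what is and isn't permitted on the consumer side. It assumes a hypothetical shared database named shared_db containing an orders table; all of the names are illustrative only.

    -- Querying shared data is fine: access is read-only
    SELECT * FROM shared_db.public.orders LIMIT 10;

    -- Each of the following would be rejected against a shared database:
    -- INSERT INTO shared_db.public.orders VALUES (1);              -- no inserts, updates, or deletes
    -- CREATE TABLE shared_db.public.scratch (id INT);              -- no new objects
    -- SELECT * FROM shared_db.public.orders AT (OFFSET => -3600);  -- no Time Travel
    -- CREATE TABLE orders_copy CLONE shared_db.public.orders;      -- no cloning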

Restrictions aside, consumers have the ability to consume unlimited shares from as many providers as they deem necessary. Snowflake data sharing is built around the concept of a share. A share is a first-class object configured and stored at the account level. Establishing data sharing between Snowflake accounts requires the configuration and establishment of a share. Outbound shares are configured within the provider account, and this can only be done by someone who has account admin role privileges.

On the other side, inbound shares are established for consumers. They will only show up in the consumer account based on the sharing configuration applied within the provider account. That is, the provider account configuration drives both ends in terms of share objects. Shares provide a controlled entry point, either out of the provider account or into the consumer account. When working with provider shares, they can be tied down to particular databases, schemas, and/or the objects within them; specifically tables, external tables, secure views, secure materialized views, and/or secure UDFs (User-Defined Functions). Outside of these particular database objects, nothing else can be shared.
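As a rough illustration of what this looks like in SQL on the provider side, the sketch below creates a share and grants it access to a database, a schema, and a few of the shareable object types. All of the names (sales_share, sales_db, reporting, and so on) are hypothetical placeholders.

    -- Create the share object (requires the account admin role)
    CREATE SHARE sales_share;

    -- Grant the share access to the containing database and schema
    GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
    GRANT USAGE ON SCHEMA sales_db.reporting TO SHARE sales_share;

    -- Grant read access to the specific objects being shared
    GRANT SELECT ON TABLE sales_db.reporting.orders TO SHARE sales_share;
    GRANT SELECT ON VIEW sales_db.reporting.orders_secure_v TO SHARE sales_share;
    GRANT USAGE ON FUNCTION sales_db.reporting.mask_customer(VARCHAR) TO SHARE sales_share;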

Some summary notes regarding data shares. One, the sharing feature is performed through Snowflake's global service layer and the metadata layer; therefore, there is no cost associated with establishing a data share. Two, data storage costs always belong to the provider account, whereas compute and virtual warehouse costs belong to the consumer account, except when using reader accounts, in which case both costs go back to the provider. Three, a single outbound share within a provider account can be shared amongst multiple consumers. Four, there are no hard limits on the number of shares you can create on the provider's side, nor on how many consumers can be added to a single share. Five, share objects are stored at the account level, not at the database or schema levels. Six, all shared database objects, tables, views, etc., are read-only; modifications are strictly not permitted. Seven, a single Snowflake account can be both a provider and a consumer at the same time. Eight, Snowflake itself uses data sharing to share out the account usage and sample datasets that it maintains.
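Because shares are account-level objects, you can list them from any worksheet; here's a quick sketch (the output distinguishes inbound from outbound shares, and the name pattern is just an example).

    -- List all inbound and outbound shares visible to the current account
    SHOW SHARES;

    -- Narrow the listing down by name if needed
    SHOW SHARES LIKE 'sales%';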

And some summary notes about share objects. One, share objects encapsulate all the information required to share a database. Two, share objects are secure, configurable, and 100% controlled on the provider side. Three, a share object cannot span multiple databases on the provider side; there is a strict one-to-one mapping between a share and a database. Four, new objects can be added to an existing share, and when this is done they appear in real time on the consumer side. And five, the provider has the right to immediately revoke share access without notice.
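To illustrate points four and five, the sketch below adds a newly created table to an existing share and then withdraws access again; the object, share, and account names are all hypothetical.

    -- Add a new table to an existing share; it appears to consumers in real time
    GRANT SELECT ON TABLE sales_db.reporting.refunds TO SHARE sales_share;

    -- Withdraw access to that object again
    REVOKE SELECT ON TABLE sales_db.reporting.refunds FROM SHARE sales_share;

    -- Or cut a specific consumer account off from the share entirely
    ALTER SHARE sales_share REMOVE ACCOUNTS = ab12345;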

When working within the Snowflake admin console on the consumer side, databases are visually tagged with a small black arrow to indicate to the end user that they have originated from a share. Additionally, when hovering over a shared database, extra information is revealed about the database itself, including the fact that it has come from a share, as seen here in the provided screenshot. When establishing data sharing, Snowflake provides a special account option to address the need to share data with users who may not have a licensed Snowflake account themselves. Reader accounts provide this ability.

Reader accounts are provisioned by a provider account and provide the end user with Snowflake Web Admin console access, much like full Snowflake account holders have. The main difference between a reader account and a normal Snowflake account is that accumulated compute costs within the reader account are passed back to the provider account. With this in mind, it's important to trust those who are given user credentials for the reader account, or to consider putting some type of chargeback model in place to cover incurred costs. This is something that you, as the provider account holder, must own and take on when creating reader accounts. Currently, a provider account can by default create a maximum of 20 reader accounts, although this limit can be raised if requested and approved through the official Snowflake support channels.
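Reader accounts can also be provisioned with SQL rather than through the web console; a minimal sketch follows, where the account name and admin credentials are arbitrary placeholders you would choose yourself.

    -- Provision a reader account from within the provider account
    -- (requires the account admin role); compute costs incurred in
    -- this account are billed back to the provider
    CREATE MANAGED ACCOUNT ca_reader_account_1
      ADMIN_NAME = reader_admin,
      ADMIN_PASSWORD = 'Use-A-Strong-Passw0rd!',
      TYPE = READER;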

Reader accounts, when accessed through the Snowflake admin console, are tagged with the word 'reader', highlighting to the end user that the current account is indeed a reader account and that all associated compute charges, namely those involved in spinning up virtual warehouses, will be passed back up to the provider account. Let's now quickly walk through the process of establishing a data share. Data share management is performed within the Shares area of the Snowflake Web Admin console, as seen here.

Note that the Shares menu option is only available to users who have been assigned the account admin role. The process begins when you click on the 'Create' button. Within the pop-up pane, a name for the share must be specified and a database to be shared must be chosen, remembering that there is a one-to-one mapping between a share and a database. Clicking the tables and views option will open another modal pane on top, as seen in the next slide. Keep in mind that as you proceed through the configuration, you can hit the 'Show SQL' link at the bottom to translate the current configuration into an equivalent set of SQL statements, which in turn can be executed within a Snowflake worksheet. When deciding which database objects to share, you have the option to select specific tables and/or views. In this example, I've selected the Snowflake course's secure view only, but more often than not, multiple tables and/or views will be selected. Having completed this section of the share configuration, proceed by clicking the 'Create' button, which creates the share in the background.

In the screenshot displayed here, you can now see the presence of the new Cloud Academy share within the secure shares table. You're also presented with a review pane, which allows you to quickly review the data about to be shared. Assuming that you are now comfortable with the data to be shared, your next job is to configure the consumers who will be granted access to the share by clicking the 'Add consumers' button. When adding a new consumer, the account type defaults to reader.
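If you prefer to double-check the share from a worksheet rather than the review pane, its contents can be inspected with a couple of statements; the share name, spelled here as cloud_academy_share, is a placeholder.

    -- Describe the objects included in the share
    DESC SHARE cloud_academy_share;

    -- List the privileges that have been granted to the share
    SHOW GRANTS TO SHARE cloud_academy_share;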

As you'll recall, the main implication of this account type is that all charges associated with running future queries within it will be passed back to the current account, which is acting as the provider account. A reader account can be created on the fly or chosen from a drop-down list containing those that already exist within the current Snowflake account. Changing the account type from reader to full implies that the consumer has a fully licensed Snowflake account, in which case they become responsible for any compute or virtual warehouse charges associated with executing queries within it.
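For consumers who hold full Snowflake accounts, adding them to the share boils down to a single statement; the share name and account identifiers below are hypothetical.

    -- Grant one or more full consumer accounts access to the outbound share;
    -- these consumers pay for their own compute
    ALTER SHARE cloud_academy_share ADD ACCOUNTS = ab12345, cd67890;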

The process of adding consumers can also be initiated outside of the create data share wizard by clicking the 'Add consumers' link, either next to the inbound/outbound toggle or within the data share item itself. Going with the reader account option on the previous pane and clicking the 'Create a Reader Account' link to create one on the fly presents the create reader account pane shown here. You need to provide an account name for the new reader account. This is an arbitrary name that you choose, something meaningful which represents the account usage. In the example provided here, the reader account name is set to Cloud Academy reader account 1. When the new reader account is created, an initial admin user is created within it. The username and password credentials for this admin user need to be supplied in the bottom section. Again, both the username and password are arbitrary values that you supply.

With all the required reader account configuration in place, proceed by hitting the 'Create account' button. When the reader account has been provisioned successfully, you'll be notified with a message saying so. Included in this message is the newly formed Snowflake reader account URL, which is the entry point into the reader account. Those unfamiliar with the Snowflake data sharing setup will be tempted to immediately click this link, which you can do; however, it won't work as expected until you return to the previous add consumers pane and actually select and set the newly created reader account as the consumer of choice. To do this, click on the 'Done' button. Back within the add consumers pane, complete the reader account setup by selecting and setting the reader account which you just created.
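If you ever misplace the reader account URL, it can be retrieved again from the provider account; a quick sketch:

    -- List the reader accounts provisioned by this provider account,
    -- including each account's locator and login URL
    SHOW MANAGED ACCOUNTS;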

Finish the consumer setup by clicking the 'Add' button. With the consumer setup now complete, Snowflake informs you with the following message. Again, the reader account URL is presented, and from here you can proceed by clicking the link to open up the Snowflake reader account admin console. Now, in a production environment, before handing this URL over to the consumer, we would normally log into the reader account with the admin user and establish proper least-privilege user accounts within it. But for the purposes of this lesson, I'll simply proceed as a consumer using the admin user credentials on the following screens.

So, as just mentioned, from here onwards I'm acting as a consumer, albeit using the admin user account. Browsing to the newly formed Snowflake reader account via the URL presented in the previous pane opens up a new browser tab with the login page presented. Note that the Snowflake reader account login page may take a few minutes to become functional. Snowflake reader account provisioning times vary, but they usually complete within five minutes, if not quicker.

Having authenticated successfully, you're now welcomed into the reader account, as seen here. The welcome message is mostly informational, with options for further help and guidance provided to help new users understand and navigate the admin interface. Navigating into the Shares area within the reader account, you should see the Cloud Academy data share listed within the inbound list, as we do in the screenshot. Note that the Shares menu option only becomes available once the currently logged in user has assumed the account admin role.

Once you're able to view the inbound data share, the next thing you would do as a consumer of the data share is create a new database from it. To do this, select the inbound data share and then click on the 'Create database from secure share' link, which is displayed both on the secure shares menu and within the right-hand side details panel. From here, it's just a case of specifying initial grants on the database, which default to the sys admin role, and then clicking create database. With all data sharing configuration in place, both on the provider and consumer sides, you can now start up a new worksheet and begin to query the data which exists within the data share. Keep in mind that an existing virtual warehouse will need to be available within the reader account; if one doesn't exist, then one must first be established.
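The consumer-side steps just described can also be expressed in SQL; here's a minimal sketch, assuming a hypothetical provider account locator of ab12345, with the remaining names mirroring the demo as placeholders.

    -- Create a read-only database from the inbound share
    CREATE DATABASE cloud_academy FROM SHARE ab12345.cloud_academy_share;

    -- Allow the sys admin role to query the shared data
    GRANT IMPORTED PRIVILEGES ON DATABASE cloud_academy TO ROLE sysadmin;

    -- A virtual warehouse is needed in the reader account to run queries
    CREATE WAREHOUSE IF NOT EXISTS whx
      WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;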

Remember also to set the correct role, warehouse, database, and schema for the current worksheet's context before executing your SQL queries on the data share. In the example shown here, the worksheet context is set to sys admin for role; WHX for virtual warehouse, which is a virtual warehouse that I happened to create within this reader account; Cloud Academy for database, which happens to be the database we created from the inbound data share; and Online for schema, which is the schema available within the Cloud Academy database. Executing the SQL query returns data from the data share, which can be seen here in the screenshot in the bottom pane.
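Translated into SQL, setting the worksheet context and querying the share might look like the sketch below; the warehouse, database, and schema names mirror the demo, while the table name is purely illustrative.

    -- Set the worksheet context
    USE ROLE SYSADMIN;
    USE WAREHOUSE WHX;
    USE DATABASE CLOUD_ACADEMY;
    USE SCHEMA ONLINE;

    -- Query the shared (read-only) data; the table name is a placeholder
    SELECT * FROM courses LIMIT 10;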

Now that we have completed the end-to-end process of setting up and establishing data sharing between a provider and a consumer, I just want to quickly return to the add consumers config area to mention that you can specify the account type to be full instead of reader. When you do so, you can add in multiple Snowflake accounts, where the actual value provided is the Snowflake account name of the consumer in question. This can be sourced from the very top right-hand drop-down menu, or by running the SQL query SELECT CURRENT_ACCOUNT() within the consumer account. Finally, when creating data shares, you can always consider dropping into a SQL worksheet and creating the data share using the CREATE SHARE SQL statement, together with a set of GRANT statements to grant access to the shared database and the objects within it, as seen here.
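For reference, that consumer-side lookup is a one-liner:

    -- Run inside the consumer account; returns the account identifier the
    -- provider needs when adding this consumer to a share
    SELECT CURRENT_ACCOUNT();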

 

About the Author

Students: 133,772
Labs: 68
Courses: 111
Learning Paths: 191

Jeremy is a Content Lead Architect and DevOps SME here at Cloud Academy where he specializes in developing DevOps technical training documentation.

He has a strong background in software engineering, and has been coding with various languages, frameworks, and systems for the past 25+ years. In recent times, Jeremy has been focused on DevOps, Cloud (AWS, Azure, GCP), Security, Kubernetes, and Machine Learning.

Jeremy holds professional certifications for AWS, Azure, GCP, Terraform, and Kubernetes (CKA, CKAD, CKS).