Getting the Most From Azure Storage
The Azure Storage suite of services forms the core foundation of much of the rest of the Azure services ecosystem. Blobs are low-level data primitives that can store any data type and size. Tables provide inexpensive, scalable NoSQL storage of key/value data pairs. Azure queues provide a messaging substrate to asynchronously and reliably connect distinct elements of a distributed system. Azure files provide an SMB-compatible file system for enabling lift-and-shift scenarios of legacy applications that use file shares. Azure disks provide consistent, high-performance storage for virtual machines running in the cloud.
In this Introduction to Azure Storage course you'll learn about the features of these core services, and see demonstrations of their use. Specifically, you will:
- Define the major components of Azure Storage
- Understand the different types of blobs and their intended use
- Learn basic programming APIs for table storage
- Discover how queues are used to pipeline cloud compute nodes together
- Learn to integrate Azure files with multiple applications
- Understand the tradeoffs between standard/premium storage and unmanaged/managed disks
Let's now review the major features in Azure Storage and demonstrate how it's used to handle many types of data in a scalable, fault-tolerant way. Azure Storage is a general-purpose data storage and management service that forms the foundation of most other core services in Azure, as well as custom services you build and run in the Azure cloud.
It's suitable for a range of data scenarios, from large-scale raw binary or text data, to key/value NoSQL data, to message-oriented queuing, or even file-based access from an operating system like Windows or Linux. Azure Storage offers up to a four-nines SLA for read access to data stored in geo-redundant accounts, and scales automatically to match incoming request levels up to the limit of that SLA.
As noted, Azure Storage is the foundation for other services in Azure like virtual machines and cloud services. Any VM you run in the Azure cloud is backed by the Azure Storage subsystem. This provides a level of trust in the overall reliability and robustness of Azure Storage.
If Microsoft trusts it enough to be a foundational pillar of Azure, you can trust it as well. Note that while Azure Storage does provide these core storage capabilities, in general it doesn't directly compete with higher-level data technologies like relational databases, most other NoSQL stores, or in-memory data caches.
Products like Azure SQL DB, DocumentDB, and Redis tend to offer additional developer- and management-oriented features beyond what Azure Storage itself offers. This makes these products more attractive than Azure Storage for some audiences and use cases, but also typically more expensive. The message here is not that Azure Storage is better or worse than other data storage choices in the cloud, but rather that, like any technology, Azure Storage has strengths and weaknesses relative to other available options and relative to the problem at hand.
Later in the course we'll talk more about good potential use cases for the Azure Storage service. Like many cloud services, Azure Storage is priced on a consumption basis where you pay only for what you use and how much of it you use. The consumption metrics vary a bit depending on the exact type of storage you use.
But in general, you pay for the total amount of data stored and the number of read and write operations you perform against that stored data. Note that you also pay for any data that crosses the network boundary beyond the data center within which it resides. This includes transmitting data from an Azure data center to an external computer.
It also includes transmitting data from one Azure data center to another. Note that it does not include data accessed by compute resources running within the same data center. For this reason, it's very common, and recommended, to co-locate, say, data stored in Azure Blob Storage with a website that interacts with that data.
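The consumption model just described can be sketched as simple arithmetic. This is only an illustration: the `Rates` class, the function name, the per-10,000-operations billing unit, and every number are hypothetical placeholders, not actual Azure prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices; placeholders, not Azure's real rates."""
    per_gb_stored: float        # price per GB stored per month
    per_10k_operations: float   # price per 10,000 read/write operations (assumed unit)
    per_gb_egress: float        # price per GB leaving the data center


def estimate_monthly_cost(rates: Rates, gb_stored: float, operations: int,
                          gb_transferred: float, same_datacenter: bool) -> float:
    """Sum the three charges described above: storage, operations, and egress."""
    storage = gb_stored * rates.per_gb_stored
    ops = (operations / 10_000) * rates.per_10k_operations
    # Data read by compute resources in the same data center incurs no transfer charge.
    egress = 0.0 if same_datacenter else gb_transferred * rates.per_gb_egress
    return storage + ops + egress


# Made-up rates, chosen only to keep the arithmetic easy to follow.
rates = Rates(per_gb_stored=0.5, per_10k_operations=0.25, per_gb_egress=0.125)
colocated = estimate_monthly_cost(rates, 100, 20_000, 8, same_datacenter=True)
remote = estimate_monthly_cost(rates, 100, 20_000, 8, same_datacenter=False)
print(colocated, remote)
```

With these placeholder rates, the only difference between the two results is the egress charge, which is the cost that co-locating the website with its data avoids.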
All interaction with Azure Storage occurs within the context of a top-level storage account. An account is specific to a single Azure subscription. It has a globally unique name that makes the account reachable via DNS lookup and an HTTP REST API. You can allow public access to the resources within the account, or constrain access to authenticated users and resource groups, or even to individual data resources.
There are two types of storage accounts. General-purpose accounts provide access to all types of storage resources: blobs, tables, queues, files, and disks. A general-purpose account can be configured with standard or premium performance tiers. The premium tier provides high-end performance for Azure virtual machines used for I/O-intensive operations like data analytics.
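To make the link between the account name, DNS, and the REST API concrete, here is a minimal sketch of how an account name maps onto service endpoints in the public Azure cloud (other clouds use different DNS suffixes). The account name `examplestore`, the container name `photos`, and the helper function names are hypothetical. The code only builds URLs; it sends no requests, and a real List Blobs call would also need authorization unless the container is public.

```python
import re
from urllib.parse import urlencode

# Services that get their own DNS endpoint under the account name.
SERVICES = ("blob", "table", "queue", "file")


def validate_account_name(name: str) -> str:
    """Storage account names are 3-24 characters, lowercase letters and digits only."""
    if not re.fullmatch(r"[a-z0-9]{3,24}", name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return name


def service_endpoint(account: str, service: str) -> str:
    """Build the base URL for one service of an account (public Azure cloud suffix)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    return f"https://{validate_account_name(account)}.{service}.core.windows.net"


def list_blobs_url(account: str, container: str) -> str:
    """URL for the REST 'List Blobs' operation on a container (issued as an HTTP GET)."""
    query = urlencode({"restype": "container", "comp": "list"})
    return f"{service_endpoint(account, 'blob')}/{container}?{query}"


print(service_endpoint("examplestore", "queue"))
print(list_blobs_url("examplestore", "photos"))
```

Because the account name becomes part of a hostname, it has to be unique across all of Azure rather than just within your subscription.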
As the name implies, blob-only storage accounts support only the blob storage type. These specialty accounts allow configuration of hot and cool storage tiers, which optimize performance and cost for frequent versus infrequent data access patterns. To ensure reliability while also providing cost effectiveness for a variety of customer needs, Azure Storage accounts are configured with a replication strategy upon creation.
The four strategies are as follows. Local replication maintains three copies of your data within the same data center. This provides a reasonable level of reliability and is the most cost-effective replication option. Zone replication maintains three copies of your data across two to three data centers, which increases reliability but also cost.
Note that zone replication can only be used for certain types of blob storage. Geo replication maintains three copies of your data in each of two geographically distinct regions. This maximizes reliability but of course comes at a cost premium relative to other options.
The final option to consider is read-access geo replication, which provides the same benefits as core geo replication but also allows you to leverage replicated data as secondary read-only copies. This can be very useful in some application scenarios with a geographically dispersed user base and low-latency requirements.
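The four strategies above can be summarized in a small lookup table. This is a sketch for comparison only: the class and function names are hypothetical, the copy counts come from the description above, and the keys are the usual Azure abbreviations (LRS, ZRS, GRS, RA-GRS). The `-secondary` hostname is how a read-access geo-replicated account exposes its read-only copy in the public Azure cloud.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ReplicationStrategy:
    name: str
    copies_per_region: int
    regions: int
    readable_secondary: bool  # can clients read from the replicated copy?

    @property
    def total_copies(self) -> int:
        return self.copies_per_region * self.regions


STRATEGIES = {
    "LRS": ReplicationStrategy("local", 3, 1, False),
    "ZRS": ReplicationStrategy("zone", 3, 1, False),
    "GRS": ReplicationStrategy("geo", 3, 2, False),
    "RA-GRS": ReplicationStrategy("read-access geo", 3, 2, True),
}


def blob_read_endpoints(account: str, strategy_code: str) -> List[str]:
    """Blob endpoints a client may read from under the given strategy."""
    strategy = STRATEGIES[strategy_code]
    endpoints = [f"https://{account}.blob.core.windows.net"]
    if strategy.readable_secondary:
        # Only read-access geo replication exposes the secondary region for reads.
        endpoints.append(f"https://{account}-secondary.blob.core.windows.net")
    return endpoints


for code, strategy in STRATEGIES.items():
    print(code, strategy.total_copies, blob_read_endpoints("examplestore", code))
```

Plain geo replication keeps the same six copies as read-access geo replication; the difference is only whether your application is allowed to read from the second region.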
We'll talk more about Azure Storage pricing later in the course. You can interact with Azure Storage in any of several ways depending upon your desired use case and skill set. For account creation and basic management functionality, the Azure portal provides access to many core storage features. You can browse some of the data elements stored within your account like blobs and files.
You can also create new blobs and containers, tables, queues and so on as well as configure secure access to those assets using role-based policies and shared access signatures. Scripting is a common way to automate and batch Azure Storage management tasks like creation or removal of blobs, tables, and queues.
Azure Storage is fully scriptable using a number of PowerShell cmdlets. You can also use the full-featured Azure command-line interface to perform any storage management operation from the Windows command line, PowerShell, or Bash on Windows or Linux. A number of external tools exist for interacting with Azure Storage as well.
One of the most popular tools is Azure Storage Explorer, a desktop application for graphically browsing and interacting with Azure Storage assets across multiple subscriptions and storage accounts. Another very useful tool is AzCopy, a small utility for fast and efficient data transfer across storage accounts.
And of course, all functionality in Azure Storage is accessible from a standard set of REST APIs that you can use from any platform or tool capable of outbound HTTP calls. You provide secure programmatic access to the data in your Azure Storage account by configuring a calling application with either one of your private account keys or a restricted, temporary access token.
Each storage account has two private keys that provide unrestricted access to all contents and management operations of the account. These keys can be regenerated at any time as needed. However, given the potential danger of these keys leaking to undesired third parties, it's not recommended that you rely on them for general-purpose programmatic access.
Instead, you can generate shared access signature, or SAS, tokens to provide narrow and limited access to only the resources you designate, and for only the time that you wish to grant access to them. You can also further restrict access to certain IP addresses if you wish. SAS tokens are merely URIs with an encoded signature component.
Calling applications use these URIs to access storage assets as needed. Their access is validated against the rules for the specific token they present. Note that securing Azure Storage against unintended management activity uses different mechanisms than those presented here. For management security, Azure Storage relies on the standard Azure role-based access control, or RBAC, feature and integration with Azure Active Directory.
We'll talk more about programmatic and management-level security later in the course. For now, let's look closer at creating a storage account and securing access to contained data elements.
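The idea behind a SAS token, a URI carrying its own rules plus a signature derived from the account key, can be illustrated with a toy signer and validator. This is a deliberately simplified sketch and not Azure's actual algorithm: the real string-to-sign contains more fields in a fixed order. The query parameter names `sp` (permissions), `se` (expiry), and `sig` (signature) do mirror real SAS parameters, but the key, URL, and function names are hypothetical.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlencode, urlsplit

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def _sign(account_key: bytes, string_to_sign: str) -> str:
    digest = hmac.new(account_key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def make_toy_sas_url(resource_url: str, account_key: bytes,
                     permissions: str, expiry: datetime) -> str:
    """Append permissions, expiry, and a signature over both to the resource URL."""
    se = expiry.strftime(TIME_FORMAT)
    # Simplified string-to-sign (an assumption): permissions, expiry, resource path.
    string_to_sign = "\n".join([permissions, se, urlsplit(resource_url).path])
    query = urlencode({"sp": permissions, "se": se, "sig": _sign(account_key, string_to_sign)})
    return f"{resource_url}?{query}"


def check_toy_sas_url(url: str, account_key: bytes,
                      needed_permission: str, now: datetime) -> bool:
    """Validate a request against the rules encoded in the token it presents."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        sp, se, sig = params["sp"][0], params["se"][0], params["sig"][0]
    except KeyError:
        return False
    expected = _sign(account_key, "\n".join([sp, se, parts.path]))
    if not hmac.compare_digest(expected, sig):
        return False  # the rules were altered or the wrong key was used
    expiry = datetime.strptime(se, TIME_FORMAT).replace(tzinfo=timezone.utc)
    return needed_permission in sp and now < expiry


key = b"hypothetical-account-key"
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
url = make_toy_sas_url("https://examplestore.blob.core.windows.net/photos/cat.jpg",
                       key, "r", now + timedelta(hours=1))
print(check_toy_sas_url(url, key, "r", now))
```

Because the signature covers the permissions and the expiry, a caller cannot widen their own access by editing the URI; only the holder of the account key can mint a valid token.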
About the Author
Josh Lane is a Microsoft Azure MVP and Azure Trainer and Researcher at Cloud Academy. He’s spent almost twenty years architecting and building enterprise software for companies around the world, in industries as diverse as financial services, insurance, energy, education, and telecom. He loves the challenges that come with designing, building, and running software at scale. Away from the keyboard you'll find him crashing his mountain bike, drumming quasi-rhythmically, spending time outdoors with his wife and daughters, or drinking good beer with good friends.