This course provides the practical knowledge and expertise you will need to master the Design an Application Storage and Data Access Strategy section of the Microsoft Azure 70-534 certification exam. In this session, we will cover: Options for data storage, mobile application back-ends, push notifications, web API and web jobs and hybrid data access patterns. We will also discuss Azure Media Services (streaming, video on-demand and monitoring).
Welcome back. In this lesson we're going to be covering some of the storage options available in Azure.
Let's start with the storage option aptly named Azure Storage. Azure Storage is an extremely flexible and massively scalable storage platform, and it's a fundamental building block for Azure. As an example of how fundamental it is, for IaaS VMs, the virtual hard disks are stored in Azure Storage.
You can use Azure Storage to store up to 500 terabytes of data, and it can handle up to 20,000 IO operations per second. And it's elastic which ensures consistent performance.
By default, Azure Storage is resilient, so it creates three replicas inside of a region, plus you can enable geo-replication to the paired region. Geo-replication by default is for disaster recovery, since the replicated data isn't accessible. It only becomes available if Microsoft initiates a failover. However, you can use read-only geo-replication, and this will, as the name suggests, allow you to use those replicas for read-only access. And that can provide lower latency for end users. You can interact with Azure Storage via a REST API and any of the libraries that are built on top of it.
So let's look into some of the different storage options that are built into Azure Storage. Let's start off with Blob Storage. Azure Blob Storage allows you to store unstructured data in the Cloud as blobs or objects. Blob Storage can store any type of text or binary data such as documents, pictures, videos, backups, et cetera. You can think of it as kind of a file system in the Cloud. Blobs are available in two tiers: hot, which is optimized for storing data that is accessed frequently, and cool, which is optimized for storing data that is infrequently accessed and long lived.
This diagram shows how blobs fit into the storage hierarchy. We first create a storage account and then we create multiple containers to organize our data inside of it. A storage account can store up to 500 terabytes of data shared across all sub services. Inside each container, you can store multiple blobs. In this example we have a storage account for movies, then we have containers for different movie genres such as sci-fi, comedy, action, romance, et cetera. And then we can use these to store videos that fit into those genres.
An Azure Storage Account provides us with a unique address so that we can store and access a set of Azure Storage types. When you create a storage account, the name selected, which must be globally unique, is used to set the URL that we'll use to access the stored data. In the example here, we created an account called moviesstorageaccount, and this generates the URL https://moviesstorageaccount.blob.core.windows.net to access the blobs. And there are also URLs like this for tables, queues, and files.
A container allows you to subdivide blobs into categories. You can create an unlimited number of them, but you're restricted to using lowercase names. And they only operate at a single level, so you can't create real sub folders with containers. However, when uploading blobs, you can specify a folder structure, though these folders are virtual and they're only there to help you organize blobs. The container names are used to generate the URL used to access the blobs by simply appending a forward slash and then the container name to the URL for the storage account. In the example that we have here, we have /scifi.
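To make the idea of virtual folders concrete, here is a minimal sketch in plain Python. It does not call Azure at all; it only assumes that a container holds a flat list of blob names and that a forward slash in a name is treated as a folder separator. The function name `list_virtual_folder` and the sample blob names are hypothetical.

```python
def list_virtual_folder(blob_names, prefix, delimiter="/"):
    """Return the immediate children of a virtual folder in one flat container."""
    children = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # If a delimiter was found, 'head' is a virtual sub folder;
        # otherwise 'head' is the blob itself.
        children.add(head + sep)
    return sorted(children)


# Hypothetical blob names stored in a single container (no real folders exist)
blobs = [
    "1990/totalrecall.avi",
    "1990/extras/trailer.avi",
    "1982/bladerunner.avi",
    "readme.txt",
]

print(list_virtual_folder(blobs, ""))
print(list_virtual_folder(blobs, "1990/"))
```

The folders only exist as a naming convention, which is why the listing is computed from name prefixes rather than from any directory object.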
Once we have a storage account and containers, we can start uploading blobs. The most common type of blob is a Block Blob, and that's designed to hold any type of binary data or text file. However, we have two other types: Append Blobs and Page Blobs. The Append Blob is optimized for data which you intend to add to, for example a log file. And a Page Blob is used for very large files of data such as an entire hard disk. It's optimized for frequent reads and writes, just like you'd have on a hard drive. A blob's name is used to generate the URL used to access the blob by simply appending a forward slash and then the name of the blob to the URL after the container. In this example we'd append a forward slash and then totalrecall.avi.
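The way the account, container, and blob names combine into a URL can be sketched in a few lines of Python. This is only string building based on the pattern described in the lesson, not a call to any Azure library, and the helper name `blob_url` is hypothetical.

```python
from urllib.parse import quote


def blob_url(account, container, blob_name):
    """Build the address of a blob from the three levels of the hierarchy."""
    base = f"https://{account}.blob.core.windows.net"
    # quote() percent-encodes characters such as spaces; forward slashes are
    # kept so that virtual folder paths inside the blob name still work.
    return f"{base}/{container}/{quote(blob_name)}"


print(blob_url("moviesstorageaccount", "scifi", "totalrecall.avi"))
```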
The next storage option that we're going to talk about is File Storage. File Storage allows us to create file shares in the Cloud, which can help us to migrate a legacy application to Azure. Server Message Block, or SMB, is available to use with File Storage, and we can use SMB 2.1 and 3.0. Applications running in Azure can mount the file share in the same way that you would use SMB to mount a file share on an internal machine. And since it's basically just a Cloud-hosted file share, you can use existing tools and APIs. As an example, the standard file system APIs will work, as well as the file system related PowerShell cmdlets.
Other uses apart from migrating legacy applications include sharing application settings and configuration files, storing diagnostic data such as logs, metrics, and crash dumps, and storing tools and utilities for development and administration on Azure.
This diagram illustrates how files fit into the storage hierarchy. You first create a storage account and then you can create file shares. Inside of each file share, you can create a hierarchy of directories to store files in. In this case, we've repeated the Blob Storage example and created a file share for movies. So we have directories for the different movie genres. Our examples are sci-fi, comedy, action, et cetera. And then we can store videos that fit into those genres in those folders.
Okay, up next, let's cover Table Storage. Tables allow you to store structured or semi-structured, albeit non-relational, data in the Cloud in a schemaless design. It fits into the NoSQL category of key/value stores. However, it's not suitable for data sets that require complex joins, foreign keys, stored procedures, and things like this. For those, it's probably better to use a SQL option such as Azure SQL.
One of the main benefits of tables is the ability to quickly execute queries against large amounts of data. And you can use the OData protocol and related libraries to access the data.
This diagram illustrates how tables fit into the storage hierarchy. We have a storage account and then we can create multiple tables. Each table is a collection of entities used to organize our data. Each entity, which is similar to a row in a SQL database, holds a set of properties. And each property is a name-value pair. In this example we have a storage account for movies, then we have tables to hold details about directors and actors.
In addition to the user defined properties, there are also three system properties in each entity. We have the partition key, which is used to segment the data and ensures that data in the same partition can quickly be queried and updated. Then we have the row key, which is used as a unique identifier for an entity within its partition. And then there's the timestamp, which is a system managed last update value.
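To show how the three system properties fit together, here is a small in-memory sketch in Python. It is not the Azure Table Storage API; the class `InMemoryTable` and its methods are hypothetical and only model the ideas from the lesson: entities grouped by partition key, identified by row key, stamped on every update, and free to carry different properties.

```python
from datetime import datetime, timezone


class InMemoryTable:
    """Toy stand-in for a table: PartitionKey -> {RowKey -> entity}."""

    def __init__(self):
        self._partitions = {}

    def upsert(self, partition_key, row_key, **properties):
        entity = dict(properties)            # user defined properties
        entity["PartitionKey"] = partition_key
        entity["RowKey"] = row_key
        # System managed last update value
        entity["Timestamp"] = datetime.now(timezone.utc)
        self._partitions.setdefault(partition_key, {})[row_key] = entity
        return entity

    def get(self, partition_key, row_key):
        # Both keys together identify exactly one entity
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        # Only one partition is touched, which is what keeps this fast
        return list(self._partitions.get(partition_key, {}).values())


directors = InMemoryTable()
directors.upsert("scifi", "verhoeven", name="Paul Verhoeven", country="Netherlands")
directors.upsert("scifi", "scott", name="Ridley Scott")   # no 'country': schemaless
directors.upsert("comedy", "brooks", name="Mel Brooks")

for entity in directors.query_partition("scifi"):
    print(entity["RowKey"], entity["name"])
```

Choosing the genre as the partition key here is just an illustration; in practice the partition key is picked to match the queries the application runs most often.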
The tables are accessed using a similar URL to blobs and files. In this case we have a high level address to access tables within a storage account followed by a string for the table name. Table storage is useful for things such as user information, address books, and variable metadata.
All right, let's move on to Queue Storage. Queues offer a resilient messaging service. These messages can be accessed using HTTP and HTTPS, and this allows you to develop decoupled applications that communicate with each other through queued messages. You can also develop components that run in different environments such as Cloud, desktop, on-prem, and mobile devices. And you can connect these together using queues.
The main uses for queues are things like scheduling a backlog of work to be processed asynchronously, passing messages between web and worker roles, supporting flexible scaling for different components, and building workflows.
This diagram here shows how queues fit into the storage hierarchy. We first create a storage account and then we have multiple queues, and each queue is a collection of messages. In this case we have a storage account for movies. We have a queue to hold messages containing instructions for an application to make updates to the tables holding the director and actor data.
So queue storage gives you a way to create messages to pass instructions between applications, which enables asynchronous workflows. This allows you to design a set of decoupled applications including solutions using services hosted in Azure or On-prem. The queues are accessed using similar URLs to the blobs, files, and tables. In this case we have a high level address to access queues within a storage account, followed by a string for the queue name.
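The decoupling that queues provide can be sketched with Python's standard library. This uses the in-process `queue.Queue` as a stand-in for a queue in Azure Storage, so it illustrates the pattern only, not the service or its API. The message shape and the `worker` function are assumptions made for the example.

```python
import queue
import threading


def worker(messages, table):
    """Consumer: applies queued update instructions to a table, one at a time."""
    while True:
        msg = messages.get()
        if msg is None:                  # sentinel value: no more work
            messages.task_done()
            break
        table[msg["key"]] = msg["value"]
        messages.task_done()


messages = queue.Queue()
actors = {}                              # stands in for the actors table

consumer = threading.Thread(target=worker, args=(messages, actors))
consumer.start()

# Producer: only knows about the queue, never about the worker or the table
messages.put({"key": "schwarzenegger", "value": "Arnold Schwarzenegger"})
messages.put({"key": "stone", "value": "Sharon Stone"})
messages.put(None)

messages.join()                          # wait until every message is processed
consumer.join()
print(actors)
```

Because the producer and the consumer share nothing but the queue, either side can be scaled or replaced without changing the other.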
All right, let's talk about some general security options for Azure Storage. Azure Storage recommends the use of HTTPS, and while supported, HTTP connections are not recommended. Azure Storage objects are private by default. To access the data inside, you need an access key, though it is possible to mark a blob container as public, and then the contained blobs are accessible via the URL. We can also give a user temporary permission to part of our Azure Storage data via a shared access signature, which allows us to define a start time, end time, resource type, permissions, and a signature. And this allows the user access to that resource until the token expires.
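The idea behind a shared access signature, a time-limited grant that is signed with a secret key, can be sketched with Python's standard library. This is a simplified illustration of the concept only. It does not reproduce the real string-to-sign or query parameters that Azure Storage uses, and the functions `sign`, `make_token`, and `is_valid` as well as the key are hypothetical.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone


def sign(account_key, resource, permissions, start, expiry):
    """HMAC-SHA256 over the fields of the grant (simplified, not Azure's format)."""
    string_to_sign = "\n".join(
        [resource, permissions, start.isoformat(), expiry.isoformat()]
    )
    digest = hmac.new(account_key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def make_token(account_key, resource, permissions, start, expiry):
    return {
        "resource": resource,
        "permissions": permissions,
        "start": start,
        "expiry": expiry,
        "signature": sign(account_key, resource, permissions, start, expiry),
    }


def is_valid(account_key, token, now):
    expected = sign(account_key, token["resource"], token["permissions"],
                    token["start"], token["expiry"])
    # Any change to the fields invalidates the signature
    if not hmac.compare_digest(expected, token["signature"]):
        return False
    # The grant only works between its start time and end time
    return token["start"] <= now <= token["expiry"]


key = b"hypothetical-account-key"
start = datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc)
expiry = start + timedelta(hours=1)
token = make_token(key, "/scifi/totalrecall.avi", "r", start, expiry)

print(is_valid(key, token, start + timedelta(minutes=30)))   # inside the window
print(is_valid(key, token, expiry + timedelta(seconds=1)))   # after the end time
```

The holder of the token never sees the account key; they only hold the signed fields, which is what makes it safe to hand out for temporary access.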
Okay, that covers Azure Storage. However, there are more options than just what's in Azure Storage. We also have some SQL and NoSQL databases. Let's start with the SQL databases. Azure SQL provides a relational database as a platform service, so it can be considered a database as a service. Alternatively, you could use IaaS VMs and configure them yourself to host SQL Server or any other database.
Azure SQL has different performance levels determined by service tiers and the VM sizes inside of the given tier. The levels of performance provided are measured in terms of database transaction units, abbreviated DTUs, and are available through three service tiers: basic, standard, and premium.
Azure SQL allows you to select a single database, or if you have multiple databases, you can use Elastic Pools. An Elastic Pool gives you the ability to scale as needed based on the database load. The Elastic Pool option also has the same three service tiers, and pricing is more flexible with pools, making it a cost effective option when our database usage is highly variable. The charges are based on elastic Database Transaction Units, known as eDTUs, which correspond to DTUs, except that elastic databases don't use up any eDTUs until there is some actual database usage.
Azure SQL is without a doubt a feature rich database, and being fully managed makes it quite appealing. However, there are plenty of scenarios where we're going to need to use something else, maybe something like MySQL. And for that we can use the MySQL option from the Marketplace, which is provided by a Microsoft partner called ClearDB. If you're going to lift and shift an existing MySQL based application to Azure, then this is the option to look at. We won't go into detail here. Just know that MySQL, which is a very popular open source database, is provided to Azure through ClearDB as a managed platform.
We can adjust the consistency with DocumentDB, and that ranges from eventual to strong in order to meet different scenarios. This allows us to focus on our use case and adjust accordingly. It's designed for high availability and automatically replicates three times inside of a given region. And it can also replicate geographically for worldwide read-only access.
Using a document database allows us to have an object with all of the data we need, and that means we don't need to normalize the data like we would if we were using SQL. Another NoSQL option, similar in a lot of ways to DocumentDB, is MongoDB. MongoDB is a database that you can install on-prem or on IaaS VMs, and Azure also has a managed version of it in Preview. If you need something that's not in Preview, something that's in general release, you can also use the MongoLabs version which is on the Marketplace. So if you need to use Mongo, check out the Marketplace for some of the options.
All right, we've covered a lot in this lesson, so let's end it here. In our next lesson, we're going to be talking about mobile applications that use mobile app services as their backend. So let's check that out in the next lesson.
About the Author
Ben Lambert is the Director of Engineering and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps.
When he’s not building the first platform to run and measure enterprise transformation initiatives at Cloud Academy, he’s hiking, camping, or creating video games.