Azure Data Fundamentals
Microsoft Azure offers services for a wide variety of data-related needs, including ones you would expect like file storage and relational databases, but also more specialized services, such as for text searching and time-series data. In this course, you will learn which services to choose when implementing a data infrastructure on Azure. Two services that are especially important are Azure SQL Database and Azure Cosmos DB.
Identify the most appropriate Azure services for various data-related needs
People who want to learn Azure fundamentals
General knowledge of IT architecture, especially databases
SQL Server has been one of the most popular relational databases for a very long time, so it’s not surprising that Microsoft has made it available as a managed Azure service as well. But Azure supports other relational databases too. Microsoft offers a surprisingly high level of support for the most popular open-source relational databases, MySQL and PostgreSQL.
If you have existing databases running on either of these systems and you want to migrate them to Azure, then you should use Azure Database for MySQL and Azure Database for PostgreSQL. These are managed services that give you lots of extra goodies, including high availability, backups, security, and compliance. There’s even a free service that can migrate your existing databases to Azure with minimal downtime. The managed service price is quite reasonable, so there isn’t much reason to run these databases directly on virtual machines, because you’d have to manage them yourself.
OK, back to SQL Server. The most obvious way to migrate your SQL Server databases is to use Azure SQL Database. However, if you’re currently running SQL Server 2016 or higher, there’s another option you may want to consider. It’s called SQL Server Stretch Database.
It’s very similar to StorSimple, except it’s for database records instead of files. It migrates cold table rows (that is, infrequently queried rows) to Azure. This happens in the background automatically. And you can still query the data that has migrated to Azure too. The only difference is that the query will take a bit longer than usual.
Stretch Database is a good solution if your on-site data keeps growing. First, instead of buying more storage to handle the growth, you can just migrate your cold data to Azure, which is usually cheaper and easier. Second, as your data grows, backups take longer to run. Eventually, your backups may not be able to run within your backup window. By migrating your cold data to the cloud, your on-site backups run faster and your cloud backups happen automatically.
Of course, if your users almost never need to query your cold data, then it would be cheaper to store it offline than in Azure. But in most organizations, people do need to query cold data occasionally, and it would be a pain to bring that data back online every time it’s needed, so storing it in Azure is usually a better choice.
Microsoft even provides a tool called Stretch Database Advisor that tells you which databases and tables would be good candidates for Stretch Database. It also tells you about potential blocking issues, because there are a number of constraints that can prevent data from being migrated, such as tables that have more than 1,023 columns.
If you’d rather bite the bullet and just move an entire database to Azure, then there are two ways to do it. Ideally, you would move it to Azure SQL Database, which is a managed version of SQL Server. One of the big advantages is that you wouldn’t need to worry about maintenance anymore because Microsoft takes care of all patches and updates, as well as all of the underlying infrastructure.
It isn’t always possible to move an existing SQL Server database to Azure SQL Database, though, because they’re not 100% compatible with each other. Microsoft provides a Data Migration Assistant that tells you what reengineering, if any, would be required to move a database to the SQL Database service.
If the reengineering effort would be too great, then the other option is to move the database to a SQL Server instance running on a virtual machine in Azure. The disadvantage is that you would have to take care of maintaining the instance.
Microsoft has a new option called SQL Database Managed Instance that gives you the best of both worlds. It’s nearly 100% compatible with SQL Server, but it’s managed, so you don’t have to worry about maintenance.
When you create a new instance of Azure SQL Database, you have to decide which service tier to choose. There are three service tiers: Basic, Standard, and Premium.
They all have the same uptime guarantee, but there are big differences in performance and features. Basic has low CPU and very low storage capacity compared to the other two. Its maximum storage size is only 2 gig, while the other two can go up to at least a terabyte. It also lacks support for columnstore indexing, which is useful for data warehouse applications. The main advantages of Premium over Standard are much faster I/O and support for in-memory processing.
Regardless of which service tier you choose, high availability within a region is handled automatically. SQL Database uses Always ON Availability Groups technology from SQL Server to provide HA. The good news is that you don’t have to do anything, even if there is a failure, because failover is automatic too. It may take up to 30 seconds to recover, but that should be acceptable in most cases.
To protect against regional failures, you need to configure active geo-replication. This lets you create up to four read replicas in different regions. If the primary database becomes unavailable, then you (or your application) can failover to one of the secondary databases. That secondary then becomes the new primary.
Active geo-replication also has the side benefit that you can use the secondary databases to make queries faster for users in other regions. This is possible because the secondaries are read replicas, so as long as a user doesn’t need to write to the database, they can connect to the nearest secondary instead of the primary.
Even with high availability, backups are still important, so you can recover from cases of accidental corruption or deletion. Luckily, SQL Database automatically takes care of backups too. It saves them in read-access geo-redundant storage (RA-GRS) to protect against regional failures. Transaction log backups happen about every 5 or 10 minutes, so you won’t lose much data if you do a point-in-time recovery.
And that’s it for relational database storage.
About the Author
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).