Combining On-Premises with the Cloud
Start course
1h 6m

High availability disaster recovery (HADR) is an integral part of any organization’s data strategy. It is also an essential string to every DBA's bow. In an online world that operates 24 hours a day, going offline or losing customers' data cannot be tolerated. This course examines the features that Azure provides to help you make sure your SQL databases, whether they are managed in the cloud or on-premise, are not the point of failure in your systems.

High availability disaster recovery encompasses two fundamental concepts. Firstly, how to minimize the amount of time your databases will be offline in the event of unforeseen events like hardware failures, power outages, or any number of natural disasters. Secondly, it looks at how to minimize data loss when any of these events occur. It is one thing to be offline, but it is another thing to lose data that is already in your custody.

In this course, we will cover the full range of Azure options from turnkey to custom and do-it-yourself solutions. If you have any feedback relating to this course, feel free to reach out to us at

Learning Objectives

  • Understand the concepts and elements of high availability and disaster recovery
  • Learn about hyperscale databases and how they are used
  • Learn how to combine on-premise with the cloud to replicate and sync your data
  • Understand what always on availability groups are and how they operate
  • Implement geo-replication and failover groups
  • Learn about the wide range of options you have for backing up your databases with Azure

Intended Audience

This course is intended for anyone who wants to implement high availability and disaster recovery procedures in place for their Azure SQL databases.


This is an intermediate to advanced course and so to get the most out of it, you should be familiar with SQL Server management studio, database operations like backup and restore, and T-SQL, and also have some familiarity with basic active directory concepts like users and permissions.


There are three scenarios I want to look at regarding on-premise databases using the cloud as a disaster recovery solution. The first one of replicating your databases to an Azure SQL database using publisher and subscriber technology isn’t actually recommended for this purpose but I thought I would mention it just for completeness of possible solutions. While it will keep an up-to-date copy of your data in the cloud there isn’t an automatic failover and restore capability - the synchronization is just one way.

The second is using data sync which is bidirectional replication using a hub and member design and running a data sync agent on the local machine. The data sync technology does not currently work with managed instances though. We shall investigate setting this up next.

Finally, there is the failover group technology which comes in a couple of flavors that can be used entirely on-premise, or with an on-premise instance and an instance running on a VM. Again, we shall investigate this a little later.

As I said earlier one-way transactional replication to the cloud isn’t an ideal way to implement high availability although it does go some way towards disaster recovery if you add a manual restore step at the end. In this scenario, the SQL Server must be publisher and distributor and can be located either on-premise, a virtual machine in the cloud, or a managed instance. It’s one-way replication and requires SQL Server logins to connect to the Azure SQL subscriber database.

All replicated tables must have a primary key. This technology is more suited to migrating your on-premise data to Azure SQL when you want to have minimal downtime.

To set up data sync between an Azure SQL and an on-premise database go into your Azure database and click sync to other databases. Now create a new sync group and give that group a name. I’m going to select an existing database as my metadatabase. The metadatabase managers the syncing of data between the hub and the member databases. I’ll enable automatic syncing, set my sync frequency to 5 minutes and I’ll set up the conflict resolution to “member win” to make the on-premise database the master.

To add sync members first I need to set the login to my hub database which is the Azure SQL DB, so I’ll just enter a username and password to get access. Next, I will add my on-premise database. This will involve creating a new sync agent where I will give it a name and it will generate a key for me to use with my on-premise server to authenticate to the Azure cloud. There is nothing stopping you from synchronizing Azure SQL databases, this is not a hybrid only technology.

Now I’ll just copy that key and install the data sync agent. Installing this piece of software isn’t entirely straightforward when it comes to the username. I log into my PC with a Microsoft account and that account is an administrator, but even when I prepend the account name with the machine name, I still have insufficient privileges to install the software. What I had to do was create a new local user and make them an administrator. This quirky behavior has been around for a long time and caused a fair bit of grief amongst those trying to use it.

So, the first thing I need to do is enter my key and login and password to authorize the data sync agent with my Azure database. Next, I need to register a database for synchronization so I will specify the server and database to be synchronized and I’m just going to use the SQL SA login. With the data sync software all hooked up I will go back to Azure, give my sync member a name, select the on-premise database and then specify bidirectional synchronization. Having set that up now I want to configure the tables to synchronize. I’ll select the on-premise database using the sync agent that I set up earlier. There are only two tables in this small database so I will select both and click save.

Straightaway the processing has begun, and we can see the sync group status is progressing. No errors, so let’s have a look at SQL Management studio and see what’s happening there. Well, the two tables have been created along with additional DataSync tables to help manage the replication, and if I do a couple of select statements, we can see that it has successfully synchronized the data between on-premise and the Azure database.

In terms of high availability and disaster recovery both of these methods outlined so far for a hybrid architecture are less than ideal. When I set the synchronization frequency to 5 minutes I did that because it is the most frequent interval currently allowed. And while data sync is bidirectional, it is not transactional which could potentially cause issues with data integrity and consistency. Another limitation is that DataSync can’t be used with SQL Managed Instance databases.


Course Introduction - Overview - High Availability and Disaster Recovery Concepts - Hyperscale - Always On Availability Groups - Failover in the Cloud - Database Backups - Course Summary

About the Author
Learning Paths

Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a  Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.