1. Home
  2. Training Library
  3. Microsoft Azure
  4. Courses
  5. Implementing Azure SQL Databases

Partitioning Data

Contents

keyboard_tab
Introduction & Overview
1
Upgrade Scenarios
Course Summary
15

The course is part of these learning paths

DP-900 Exam Preparation: Microsoft Azure Data Fundamentals
course-steps
7
certification
2
lab-steps
2
description
1
Microsoft Azure for Solution Architects
course-steps
27
certification
10
lab-steps
14
AZ-303 Exam Preparation: Technologies for Microsoft Azure Architects
course-steps
28
certification
7
lab-steps
13
description
1
more_horizSee 2 more
play-arrow
Start course
Overview
Difficulty
Intermediate
Duration
1h 19m
Students
567
Ratings
4.8/5
starstarstarstarstar-half
Description

This course explores how to plan and implement data platform resources specifically with regards to Azure SQL offerings. In particular, we will explore the benefits and features of the SQL PaaS offerings along with billing models, service tiers, and high availability options. We'll also cover migration scenarios and provide a demo that migrates a sample database from an on-premise SQL Server to an Azure SQL managed instance using DMS.

If you have any feedback relating to this course, feel free to contact us at support@cloudacademy.com.

Learning Objectives

  • Get a general understanding of Azure SQL databases
  • Learn how to deploy Azure SQL databases
  • Understand business continuity and security tie in with SQL databases
  • Learn how to scale, upgrade, and partition your databases
  • Learn how to migrate a database from an on-premise SQL Server to an Azure SQL instance

Intended Audience

  • Anyone who wants to learn about Azure SQL Offerings 
  • Those preparing for Microsoft’s DP-300 exam

Prerequisites

To get the most out of this course, you have should a general understanding of the fundamentals of Microsoft Azure. Experience using databases — especially SQL Server — would also be beneficial.

Transcript

Data partitioning is a broad topic that encompasses several types of partitioning. Within a database, data can be split both vertically and horizontally across tables and data files. Data can also be partitioned across databases or physically separate data stores to mitigate resource limitations.

A typical resource limit for data is storage size. As we've seen, single instances of database products have maximum capacities in terms of storage that can be accessed. There are limits on computing power, which will dictate how many user requests can be simultaneously handled. Even if there is no limit to computing power, they will be network bottlenecks if all requests are trying to access the same resource.

Bandwidth limits maybe hit sooner rather than later depending on the type of information being transferred across the network. Think of the difference between streaming video and pinging geo location coordinates. 

Speaking of geography, data can be physically sighted closer to users for a global application to reduce latency. Different jurisdictions around the world have different legal obligations regarding user data. They require it to be stored separately or within the physical region.

One way to address some of these issues is to throw in more hardware resource at the problem by way of more memory, storage, and CPU. Better get a bigger computer may solve performance problems temporarily, but it mostly postpones the inevitable while incurring more costs.

From a database perspective, sharding is a possible solution. Sharding is where a database is split into multiple copies that all share the same schema, but are located on different servers. Each database will serve a subset of the data. For example, we would store users with surnames starting with the letters A through to M in shard one, and users with surnames beginning with N through to Z in shard two.

Distributing data like this allows a system to be scaled out spreading workload amongst multiple resources, while reducing bottlenecks and latency as shards can be located closer to users in a global environment. It will come as no surprise. There are several strategies for allocating data to different shards.

Lookup strategy is analogous to using an index on a primary key, and is suitable for a multi-tenant scenario. Our request will come in, and depending on the tenant ID will be directed to the appropriate database shard. The strategy offers a high level of control, but also a greater overhead as lookup tables need to be maintained, and each request must be accompanied by the primary key.

Range strategy is a little bit like a lookup except the key value is not an exact mapping, but falls within a range. The strategy is suitable for retrieving a lot of similar data like sales or orders where the information is shattered on the date. In this scenario, the primary key is a natural part of the data, but if you are not careful about retrieval patterns, you could still end up in a bottleneck situation with one shard being hit a lot more often than the others.

A hash strategy will help distribute data and load across the system evenly using a hash generated from attributes in the data. On the positive side, hash functions tend to be quite efficient and don't require artificial keys, but the same property makes rebalancing the shards more problematic. When data is split across multiple shards, there is no guarantee that it will remain evenly distributed over time.

Successful data partitioning depends a lot on the characteristics of the data in conjunction with the sharding strategy employed. As data volumes change, you will no doubt have to scale in or scale out database shards, and this will inevitably involve rebalancing shards. As the combination of data applications and usage patterns are unique, there is no one right sharding solution. Finding what works best for you will come down to a combination of estimation, best guess, and testing and evaluation.

Lectures

Course Introduction - Azure SQL Databases Overview - Deployment Options - ARM Templates Deployment - DEMO: Deploying Azure SQL Databases - Business Continuity - Security Database Services - Scale and Performance - DEMO: Scaling Azure SQL Databases - Migrating to Azure - Migration Scenarios - DEMO: DMS Azure SQL Database Migration Process - Upgrade Scenarios - Summary

About the Author
Avatar
Hallam Webber
Software Architect
Students
10818
Courses
17
Learning Paths
1

Hallam is a software architect with over 20 years experience across a wide range of industries. He began his software career as a  Delphi/Interbase disciple but changed his allegiance to Microsoft with its deep and broad ecosystem. While Hallam has designed and crafted custom software utilizing web, mobile and desktop technologies, good quality reliable data is the key to a successful solution. The challenge of quickly turning data into useful information for digestion by humans and machines has led Hallam to specialize in database design and process automation. Showing customers how leverage new technology to change and improve their business processes is one of the key drivers keeping Hallam coming back to the keyboard.